
Generalist Evaluator – AI Prompt Engineering & QA

Remote (United States & Canada) | Contract

About the Role

A high-impact AI research initiative is seeking detail-oriented writing professionals to contribute to the development and evaluation of advanced language models. This opportunity is ideal for individuals who can translate complex ideas into structured, high-quality written outputs with clarity and precision.

The work involves designing prompts, defining evaluation standards, and assessing model responses within structured workflows where consistency and accuracy are critical.

What You'll Do

  • Design structured prompts with multiple constraints and clear instructions
  • Develop evaluation rubrics that define high-quality outputs across general-use scenarios
  • Generate high-quality reference (“gold-standard”) responses
  • Test model outputs and assess performance against defined criteria
  • Contribute to benchmarking and quality assurance processes
  • Ensure consistency and rigor across prompt design and evaluation workflows

Requirements

  • Strong writing skills with clarity, structure, and precision
  • Strong critical thinking and analytical reasoning ability
  • Ability to work independently and meet deadlines
  • Familiarity with AI tools such as ChatGPT or similar systems
  • Currently based in the United States or Canada
  • Bachelor’s degree (completed or in progress)
  • Background in teaching, research, or structured writing (preferred)
  • Experience with rubrics, evaluation systems, or analytical frameworks (preferred)

Application Note: By submitting your profile for this partnered position, you enable our team to quickly review your background and reach out with this specific opportunity, or to match you with similar AI training projects.