Medical

Medical AI Evaluation Specialist

United States · Contract

About the Role

A high-impact AI initiative is building advanced healthcare reasoning systems by incorporating expert-level clinical judgment into model training workflows. This effort focuses on improving how AI interprets, evaluates, and responds to complex medical scenarios across multiple specialties.

This opportunity is ideal for licensed physicians, senior trainees, and specialists who want to apply their clinical expertise in a non-traditional setting. It suits individuals comfortable with evidence-based medicine, structured evaluation, and translating real-world decision-making into high-quality written outputs.

The work involves designing realistic patient scenarios, assessing AI-generated clinical responses, and producing reference-standard answers. Success in this role depends on precision, clinical depth, and the ability to clearly articulate reasoning aligned with established medical guidelines.

What You'll Do

  • Design clinically accurate case scenarios reflecting real-world diagnostic and treatment challenges
  • Develop high-quality reference responses aligned with attending-level clinical standards
  • Evaluate AI-generated outputs using structured scoring frameworks
  • Provide detailed written feedback to improve model reasoning and accuracy
  • Apply evidence-based guidelines across diverse clinical cases
  • Participate in calibration sessions to maintain evaluation consistency
  • Contribute to iterative improvement of clinical reasoning datasets

Requirements

  • Board-certified physician with active, unrestricted medical license
  • OR final-year resident with board eligibility
  • OR fellow with board certification/eligibility in primary specialty and active license
  • Demonstrated expertise in at least one specialty (e.g., Internal Medicine, Emergency Medicine, Psychiatry, Oncology, Radiology, Cardiology)
  • Strong clinical reasoning and evidence-based decision-making skills
  • Ability to write clear, structured, and high-quality medical explanations
  • Comfort working in asynchronous, remote environments
  • Availability for ~20 hours per week, with the possibility of additional hours
  • Experience with clinical guidelines and standardized care frameworks
  • Familiarity with AI tools or structured evaluation systems (preferred)

Application Note: When you submit your profile for this partnered position, our team can quickly review your background and contact you about this specific opportunity or match you with similar AI training projects.