Senior C++ Software Engineer (AI Evaluation)
About the Role
This high-impact AI initiative focuses on improving the accuracy, reasoning quality, and reliability of conversational systems in software engineering contexts. The work centers on evaluating how advanced models generate, explain, and validate code across diverse technical scenarios.
This opportunity is ideal for experienced software engineers with deep expertise in C++ and strong problem-solving ability. It suits individuals who are comfortable analyzing complex logic, identifying subtle issues, and applying rigorous engineering standards to AI-generated outputs.
The work involves reviewing and validating model-generated code, testing implementations, and assessing the quality of accompanying explanations. Precision, technical depth, and consistency are critical to success.
What You'll Do
- Evaluate AI-generated responses to software engineering and coding tasks for correctness and clarity
- Execute and validate code outputs using appropriate development tools
- Identify logical errors, inefficiencies, and edge cases in generated solutions
- Annotate responses with structured feedback on strengths and weaknesses
- Assess code quality, including readability, maintainability, and algorithmic soundness
- Verify factual accuracy using reliable technical references
- Apply standardized evaluation frameworks, benchmarks, and taxonomies
- Ensure outputs align with expected conversational and engineering standards
Requirements
- Bachelor’s degree or higher in Computer Science or a related field
- 5+ years of professional experience in software engineering or similar technical roles
- Expert-level proficiency in C++
- Ability to independently solve medium-to-hard algorithmic problems
- Strong analytical skills for debugging and evaluating complex systems
- Experience executing and testing code across development environments
- High attention to detail in reviewing technical outputs and identifying subtle flaws
- Strong written communication skills for structured technical feedback
- Fluency in English
Preferred Qualifications
- Experience with large language models in coding workflows and understanding of their limitations
- Prior involvement in open-source projects with accepted contributions
- Familiarity with model evaluation, annotation workflows, or RLHF processes
- Background in competitive programming or advanced problem-solving environments
- Experience reviewing production-level codebases
- Ability to explain technical concepts clearly to non-technical audiences