Senior C++ Software Engineer (AI Evaluation)
About the Role
This high-impact AI initiative focuses on improving the accuracy, reasoning quality, and reliability of conversational systems in software engineering contexts. The work centers on evaluating how advanced models generate, explain, and validate code across diverse technical scenarios.
This opportunity is ideal for experienced software engineers with deep expertise in C++ and strong problem-solving ability. It suits individuals who are comfortable analyzing complex logic, identifying subtle issues, and applying rigorous engineering standards to AI-generated outputs.
The work involves reviewing and validating model-generated code, testing implementations, and assessing the quality of accompanying explanations. Precision, technical depth, and consistency are critical to success.
What You'll Do
- Evaluate AI-generated responses to software engineering and coding tasks for correctness and clarity
- Execute and validate code outputs using appropriate development tools
- Identify logical errors, inefficiencies, and edge cases in generated solutions
- Annotate responses with structured feedback on strengths and weaknesses
- Assess code quality, including readability, maintainability, and algorithmic soundness
- Verify factual accuracy using reliable technical references
- Apply standardized evaluation frameworks, benchmarks, and taxonomies
- Ensure outputs align with expected conversational and engineering standards
Requirements
- Bachelor’s degree or higher in Computer Science or a related field
- 5+ years of professional experience in software engineering or similar technical roles
- Expert-level proficiency in C++
- Ability to independently solve medium-to-hard algorithmic problems
- Strong analytical skills for debugging and evaluating complex systems
- Experience executing and testing code across development environments
- High attention to detail in reviewing technical outputs and identifying subtle flaws
- Strong written communication skills for structured technical feedback
- Fluency in English
Preferred Qualifications
- Experience with large language models in coding workflows and understanding of their limitations
- Prior involvement in open-source projects with accepted contributions
- Familiarity with model evaluation, annotation workflows, or RLHF processes
- Background in competitive programming or advanced problem-solving environments
- Experience reviewing production-level codebases
- Ability to explain technical concepts clearly to non-technical audiences