Senior C# Software Engineer (AI Evaluation)
About the Role
This is a high-impact AI initiative focused on improving the reliability and performance of conversational systems in software engineering contexts. The work centers on evaluating how AI models reason about code, generate solutions, and communicate technical concepts across varying levels of complexity.
This opportunity is ideal for experienced software engineers with deep expertise in C# who are comfortable analyzing complex logic, identifying subtle issues, and validating technical outputs. Strong problem-solving ability and familiarity with real-world engineering standards are critical for success.
The work involves reviewing AI-generated code and explanations, executing and validating outputs, and applying structured evaluation frameworks. Precision, consistency, and the ability to assess both correctness and clarity are essential to ensuring high-quality model behavior.
What You'll Do
- Evaluate AI-generated responses to coding and software engineering tasks for accuracy and reasoning quality
- Execute code to validate outputs and verify functional correctness
- Analyze algorithm design, efficiency, and edge case handling
- Annotate responses with detailed feedback on strengths and deficiencies
- Identify logical errors, bugs, and inconsistencies in generated code
- Assess clarity and completeness of technical explanations
- Apply standardized evaluation frameworks, taxonomies, and benchmarks
- Ensure outputs align with expected conversational and engineering standards
Requirements
- Bachelor’s, Master’s, or PhD in Computer Science or a related field
- 5+ years of professional experience in software engineering or similar roles
- Advanced proficiency in C# and strong understanding of software design principles
- Ability to independently solve medium- to hard-level algorithmic problems
- Experience executing and debugging code across real-world scenarios
- Strong analytical skills for evaluating logic, performance, and correctness
- Familiarity with large language models and their practical limitations
- High attention to detail and structured evaluation approach
- Fluent English communication skills
- Experience contributing to open-source projects with accepted pull requests
- Exposure to model evaluation, RLHF, or data annotation workflows
- Background in competitive programming or technical assessments
- Experience reviewing production-level codebases
- Familiarity with multiple programming paradigms or ecosystems
- Ability to clearly explain technical concepts to non-technical audiences