Senior C Software Engineer (AI Evaluation)
About the Role
This role supports a high-impact AI initiative focused on improving the reliability and performance of conversational AI systems in software engineering and low-level programming contexts. The work centers on evaluating how AI models reason about C-based systems, generate efficient solutions, and communicate technical concepts with precision.
This opportunity is ideal for experienced engineers with strong C expertise, particularly in memory management, systems-level problem solving, and performance optimization. The role requires the ability to critically assess complex technical outputs and identify subtle flaws in logic or implementation.
The work involves reviewing AI-generated code and explanations, executing and validating outputs, and applying structured evaluation methodologies. Success depends on precision, consistency, and a deep understanding of systems-level programming principles.
What You'll Do
- Evaluate AI-generated responses to C programming and systems-level engineering tasks for correctness and reasoning quality
- Execute and test code to validate outputs and ensure functional accuracy
- Analyze memory usage, pointer safety, and performance characteristics
- Identify bugs, undefined behavior risks, and edge case failures (see the illustrative sketch after this list)
- Annotate responses with detailed feedback on accuracy and clarity
- Assess algorithmic efficiency and low-level implementation quality
- Ensure explanations align with best practices in systems programming
- Apply standardized evaluation frameworks, benchmarks, and taxonomies
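For illustration only, here is a minimal, hypothetical C sketch of the kind of subtle defect a reviewer in this role would be expected to flag: the function looks plausible at a glance, but it returns a pointer to a stack-allocated buffer, and using that pointer after the call returns is undefined behavior.

```c
#include <stdio.h>

/* Hypothetical example of a subtle defect: buf's storage duration ends
 * when make_greeting returns, so the returned pointer dangles and any
 * later use of it is undefined behavior. */
static const char *make_greeting(const char *name) {
    char buf[64];
    snprintf(buf, sizeof buf, "Hello, %s", name);
    return buf; /* dangling pointer: the flaw a reviewer should flag */
}

int main(void) {
    const char *msg = make_greeting("world");
    printf("%s\n", msg); /* reads memory whose lifetime has ended */
    return 0;
}
```

In evaluation terms, useful feedback would name the root cause (object lifetime) and suggest a fix such as caller-provided storage or heap allocation, not just report that the output is wrong.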
Requirements
- Bachelor’s, Master’s, or PhD in Computer Science or a related field
- 5+ years of professional experience in software engineering or systems programming
- Advanced proficiency in C, including memory management and pointer operations
- Strong understanding of operating systems concepts and low-level debugging
- Ability to independently solve medium- to hard-level algorithmic problems
- Experience compiling, running, and debugging code in real environments
- Strong analytical skills for evaluating correctness, efficiency, and safety
- Familiarity with large language models and their strengths and limitations
- High attention to detail and structured evaluation approach
- Fluent English communication skills
- Experience contributing to open-source projects with accepted pull requests
- Exposure to model evaluation, RLHF, or data annotation workflows
- Background in competitive programming or technical assessments
- Experience reviewing production-level or systems codebases
- Familiarity with multiple programming paradigms or ecosystems
- Ability to explain complex low-level concepts to non-technical audiences