Senior Rust Software Engineer (AI Evaluation)
About the Role
This role supports a high-impact AI initiative focused on improving the reliability and reasoning capabilities of conversational systems in software engineering contexts. The work centers on evaluating how AI models generate, explain, and reason about code across diverse technical scenarios.
This opportunity is ideal for experienced software engineers with deep Rust expertise who are comfortable analyzing complex logic, validating outputs, and identifying subtle flaws in generated code. Strong analytical thinking and practical engineering judgment are essential.
The work involves structured evaluation of model-generated responses, including code execution, fact-checking, and detailed annotation. Precision and consistency in technical assessment are critical to improving system performance.
What You'll Do
- Evaluate AI-generated code responses for correctness, clarity, and completeness
- Execute code and validate its output using appropriate development tools
- Identify logical errors, inefficiencies, and edge-case failures
- Annotate responses with structured feedback on strengths and weaknesses
- Assess code quality, readability, and adherence to best practices
- Verify technical claims using reliable public references
- Apply standardized evaluation frameworks and scoring guidelines
- Ensure responses align with expected conversational and engineering standards
Requirements
- 5+ years of experience in software engineering or related technical roles
- Strong expertise in Rust and systems-level programming
- Ability to solve medium-to-hard algorithmic problems independently
- Experience executing, debugging, and validating code across environments
- Familiarity with code quality standards and software engineering best practices
- Strong attention to detail in reviewing technical reasoning and outputs
- Fluent English communication skills (written and technical)
- Experience using LLMs in coding workflows and understanding their limitations
- Bachelor’s, Master’s, or PhD in Computer Science or a related field
- Experience contributing to open-source projects with accepted pull requests
- Familiarity with model evaluation, RLHF, or annotation workflows
- Background in competitive programming
- Experience reviewing production-level codebases
- Exposure to multiple programming languages or paradigms
- Ability to explain complex technical concepts clearly to varied audiences