About the Role

This role supports a high-impact AI initiative focused on improving the reliability and reasoning capabilities of conversational systems in software engineering contexts. The work centers on evaluating how AI models generate, explain, and reason about code across diverse technical scenarios.

This opportunity is ideal for experienced software engineers with deep Rust expertise who are comfortable analyzing complex logic, validating outputs, and identifying subtle flaws in generated code. Strong analytical thinking and practical engineering judgment are essential.

The work involves structured evaluation of model-generated responses, including code execution, fact-checking, and detailed annotation, where precision and consistency in technical assessment are critical to improving system performance.

What You'll Do

Evaluate AI-generated code responses for correctness, clarity, and completeness
Execute and validate code outputs using appropriate development tools
Identify logical errors, inefficiencies, and edge case failures
Annotate responses with structured feedback on strengths and weaknesses
Assess code quality, readability, and adherence to best practices
Verify technical claims using reliable public references
Apply standardized evaluation frameworks and scoring guidelines
Ensure responses meet expected conversational and engineering standards

Requirements

5+ years of experience in software engineering or related technical roles
Strong expertise in Rust and systems-level programming
Ability to solve medium to hard algorithmic problems independently
Experience executing, debugging, and validating code across environments
Familiarity with code quality standards and software engineering best practices
Strong attention to detail in reviewing technical reasoning and outputs
Fluent English communication skills (written and technical)
Experience using LLMs in coding workflows and understanding their limitations
Bachelor’s, Master’s, or PhD in Computer Science or a related field
Experience contributing to open-source projects with accepted pull requests
Familiarity with model evaluation, RLHF, or annotation workflows
Background in competitive programming
Experience reviewing production-level codebases
Exposure to multiple programming languages or paradigms
Ability to explain complex technical concepts clearly to varied audiences

Senior Rust Software Engineer (AI Evaluation)

