Senior PowerShell Software Engineer (AI Evaluation)
About the Role
This is a high-impact AI initiative focused on improving the reasoning accuracy and reliability of conversational systems in software engineering and automation contexts. The work centers on evaluating how models generate and explain scripts, system operations, and infrastructure-related logic.
This opportunity is ideal for experienced engineers with deep PowerShell expertise who can rigorously assess generated scripts, validate system-level behaviors, and identify subtle issues in logic, execution, and automation workflows. Strong analytical thinking and attention to detail are critical.
The work involves structured evaluation of AI-generated outputs, including script execution, fact-checking, and annotation, where consistency and technical precision directly drive improvements in model performance.
What You'll Do
- Evaluate AI-generated PowerShell scripts for correctness, clarity, and completeness
- Execute and validate scripts in controlled environments to verify outputs
- Identify logical flaws, inefficiencies, and edge case failures in automation workflows
- Annotate model responses with detailed, structured feedback
- Assess script quality, maintainability, and adherence to best practices
- Verify technical accuracy using trusted documentation and references
- Apply standardized evaluation frameworks and scoring methodologies
- Ensure alignment with expected system behavior and conversational guidelines
Requirements
- 5+ years of experience in software engineering, automation, or related technical roles
- Advanced expertise in PowerShell scripting and Windows-based systems
- Strong understanding of system administration, automation, and scripting best practices
- Ability to independently solve complex algorithmic and scripting challenges
- Experience executing, debugging, and validating scripts across environments
- High attention to detail in reviewing technical outputs and reasoning
- Fluent written English and strong technical communication skills
- Experience using LLMs in development workflows and understanding their limitations
- Bachelor’s, Master’s, or PhD in Computer Science or a related field
- Experience contributing to open-source projects with accepted pull requests
- Familiarity with model evaluation, RLHF, or annotation workflows
- Background in competitive programming or technical problem solving
- Experience reviewing scripts or code in production environments
- Exposure to multiple programming languages or automation ecosystems
- Ability to clearly explain technical concepts to non-technical audiences