Generative AI MLOps Engineer – JAX/PyTorch Stack
About the Role
A high-performance AI engineering initiative is seeking experienced MLOps professionals to support the development of next-generation large language models and distributed AI systems. The work focuses on training infrastructure, model optimization, kernel-level acceleration, and scalable ML systems within advanced Generative AI environments.
This opportunity is ideal for engineers with strong hands-on experience in JAX, PyTorch, distributed ML systems, and GPU kernel optimization. Candidates who have worked on production-scale ML infrastructure, performance tuning, or training pipeline engineering in fast-paced technical environments will be especially well suited to the role.
The work involves designing and evaluating ML systems tasks, refining infrastructure-focused reasoning workflows, and contributing to technically rigorous AI training environments where scalability, optimization, and engineering precision are critical.
What You'll Do
- Design and evaluate ML infrastructure and MLOps-focused technical tasks
- Develop structured solutions for distributed training and model optimization problems
- Guide engineering workflows related to AI training infrastructure and framework-level performance
- Evaluate technical outputs and provide detailed written engineering feedback
- Develop evaluation rubrics for ML systems design, kernel optimization, and infrastructure reasoning
- Optimize GPU workloads using Triton- or Pallas-based kernel programming approaches
- Collaborate with technical subject matter experts to improve consistency and quality across training datasets
- Support scalable training workflows involving JAX and PyTorch ecosystems
- Maintain high standards for system reliability, performance, and engineering clarity
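To give a flavor of the JAX-side work described above, here is a minimal, purely illustrative sketch of a jitted SGD training step. The linear model, parameter names, and learning rate are invented for the example and are not specific to this role's codebase:

```python
import jax
import jax.numpy as jnp

def loss_fn(params, x, y):
    # Simple linear model: predictions = x @ w + b, mean squared error loss.
    preds = x @ params["w"] + params["b"]
    return jnp.mean((preds - y) ** 2)

@jax.jit
def train_step(params, x, y, lr=0.1):
    # Differentiate the loss w.r.t. params and apply one SGD update.
    grads = jax.grad(loss_fn)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

# Toy data: 4 examples, 3 features, all ones.
params = {"w": jnp.zeros((3,)), "b": jnp.zeros(())}
x = jnp.ones((4, 3))
y = jnp.ones((4,))
new_params = train_step(params, x, y)
```

In production-scale systems this pattern is extended with sharded parameters and collective communication (e.g. `jax.shard_map` or `pjit`), but the jit-compile-then-update structure stays the same.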
Requirements
- 2+ years of professional experience in MLOps, ML infrastructure, or ML systems engineering
- Hands-on production experience with JAX and/or PyTorch at scale
- Experience developing or optimizing GPU kernels using Triton or Pallas
- Strong understanding of distributed training systems and model performance optimization
- Ability to explain complex technical decisions clearly through written communication
- Experience working with large-scale AI training pipelines and infrastructure workflows
- Demonstrated technical growth and increasing engineering responsibility over time
- Availability to work at least 30 hours per week during weekdays
- Must be located in India
- Comfort working independently in remote asynchronous environments
- Preferred: Experience supporting Generative AI or large language model training systems
- Preferred: Familiarity with infrastructure evaluation frameworks, benchmarking, or systems-level assessment workflows
- Preferred: Experience using AI-assisted engineering tools such as ChatGPT or similar developer systems