Artificial Intelligence / Machine Learning Infrastructure

Generative AI MLOps Engineer – JAX/PyTorch Stack

Remote (India) | Contract | Platform: Mercor

About the Role

A high-performance AI engineering initiative is seeking experienced MLOps professionals to support the development of next-generation large language models and distributed AI systems. The work focuses on training infrastructure, model optimization, kernel-level acceleration, and scalable ML systems within advanced Generative AI environments.

This opportunity is ideal for engineers with strong hands-on experience in JAX, PyTorch, distributed ML systems, and GPU kernel optimization. Candidates who have worked on production-scale ML infrastructure, performance tuning, or training pipeline engineering in fast-paced technical environments will be especially well suited to this role.

The work involves designing and evaluating ML systems tasks, improving infrastructure-level reasoning workflows, and contributing to technically rigorous AI training environments where scalability, optimization, and engineering precision are critical.

What You'll Do

  • Design and evaluate ML infrastructure and MLOps-focused technical tasks
  • Develop structured solutions for distributed training and model optimization problems
  • Guide engineering workflows related to AI training infrastructure and framework-level performance
  • Evaluate technical outputs and provide detailed written engineering feedback
  • Develop evaluation rubrics for ML systems design, kernel optimization, and infrastructure reasoning
  • Optimize GPU workloads using Triton- or Pallas-based kernel programming approaches
  • Collaborate with technical subject matter experts to improve consistency and quality across training datasets
  • Support scalable training workflows involving JAX and PyTorch ecosystems
  • Maintain high standards for system reliability, performance, and engineering clarity
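To give a concrete flavor of the JAX-side work described above, here is a minimal, self-contained sketch of a jit-compiled training step — the basic pattern that scalable JAX training workflows build on. It is purely illustrative (a toy linear model with hypothetical names like `train_step`), not code from the project itself.

```python
# Illustrative JAX training-step sketch (toy example, not project code).
# Demonstrates the jit-compiled loss/gradient/update pattern used in
# JAX training infrastructure.
import jax
import jax.numpy as jnp

def loss_fn(params, x, y):
    # Simple linear model: prediction = x @ w + b
    pred = x @ params["w"] + params["b"]
    return jnp.mean((pred - y) ** 2)

@jax.jit
def train_step(params, x, y, lr=0.1):
    # Compute loss and gradients in one pass, then apply a plain SGD update.
    loss, grads = jax.value_and_grad(loss_fn)(params, x, y)
    params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
    return params, loss

# Synthetic data: y = x @ [1, 2, 3, 4] + 0.5
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 4))
y = x @ jnp.arange(1.0, 5.0) + 0.5

params = {"w": jnp.zeros(4), "b": jnp.zeros(())}
for _ in range(200):
    params, loss = train_step(params, x, y)
```

In production the same structure scales out via sharding and `jax.pmap`/`jax.jit` with device meshes, with the SGD update replaced by an optimizer library; the compile-once, run-many-times shape of `train_step` stays the same.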

Requirements

  • 2+ years of professional experience in MLOps, ML infrastructure, or ML systems engineering
  • Hands-on production experience with JAX and/or PyTorch at scale
  • Experience developing or optimizing GPU kernels using Triton or Pallas
  • Strong understanding of distributed training systems and model performance optimization
  • Ability to explain complex technical decisions clearly through written communication
  • Experience working with large-scale AI training pipelines and infrastructure workflows
  • Demonstrated technical growth and increasing engineering responsibility over time
  • Availability to work at least 30 hours per week during weekdays
  • Must be located in India
  • Comfort working independently in remote asynchronous environments
  • Preferred: Experience supporting Generative AI or large language model training systems
  • Preferred: Familiarity with infrastructure evaluation frameworks, benchmarking, or systems-level assessment workflows
  • Preferred: Experience using AI-assisted engineering tools such as ChatGPT or related developer systems

Application Note: By submitting your profile for this partnered position, our team can quickly review your background and reach out to present you with this specific opportunity, or match you with similar AI training projects.