About this role

Call For Referral is focused on advancing next-generation AI systems and is seeking a Machine Learning Ops Engineer to support its AI research and engineering teams. The role involves improving ML infrastructure, designing advanced MLOps tasks, and contributing to large-scale model training performance.

Responsibilities:

Support AI research and engineering teams in improving ML infrastructure and training systems
Design advanced MLOps and ML systems tasks with accurate, structured technical solutions
Evaluate ML system outputs and provide detailed technical feedback
Develop evaluation rubrics and frameworks for distributed systems, training pipelines, and kernel-level optimization
Collaborate with domain experts to maintain consistency and quality across AI training workflows
Contribute to improvements in large-scale model training performance and infrastructure reliability

Requirements:

2+ years of professional experience in ML infrastructure, MLOps, or ML systems engineering
Hands-on production experience with JAX and/or PyTorch at scale
Experience writing or optimizing GPU kernels using Pallas or Triton
Strong understanding of ML training systems and distributed infrastructure
Demonstrated career progression in engineering or AI infrastructure roles
Ability to commit to a full-time 40-hour/week weekday schedule
Strong written communication and technical documentation skills

Machine Learning Ops Engineer | Remote | $90–$140/hr
