Wizard AI is building a high-performing AI Shopping Agent and is seeking a Senior MLOps Engineer to help run its machine learning systems reliably in production. The role involves owning the end-to-end lifecycle of ML systems, improving production ML pipelines, and collaborating with ML, Data, Product, and DevOps teams to improve system performance and scalability.

Responsibilities:

Build and improve production ML pipelines, making it easy to move models from experimentation to reliable production use
Help own and evolve our multi-engine inference platform (LLMs, embeddings, and extraction), improving how different workloads are served and scaled
Put strong foundations in place for model versioning, rollouts, and rollbacks so systems stay reproducible and safe to iterate on
Define and monitor key system metrics like latency, availability, and GPU utilization, and set clear expectations around performance
Improve overall system performance, whether that means reducing latency, increasing throughput, or making better use of GPU resources
Design systems that are resilient and cost-aware, with thoughtful approaches to autoscaling, failure isolation, and graceful degradation
Bring solid engineering practices (testing, CI/CD, observability) into ML workflows to help the team move faster without sacrificing reliability
Partner closely with ML, Data, Product, and DevOps to turn ideas into production-ready systems and help guide technical decisions

Requirements:

5–8+ years of experience in software, ML, platform, or infrastructure engineering, with hands-on ownership of production ML systems
Experience deploying and running LLMs or other deep learning models in real-world environments
Strong Python skills and a solid foundation in software engineering
Familiarity with cloud platforms (AWS, GCP, Azure) and common ML tooling (model registries, experiment tracking, etc.)
A good understanding of inference performance — batching, memory usage, quantization, and how systems behave across CPU and GPU
Experience working with (or curiosity about) systems that serve different types of models with different constraints
Ability to think through tradeoffs between speed, cost, and reliability in a practical way
Comfort working in a fast-moving environment where things evolve quickly
