Design and build scalable infrastructure supporting training and inference workflows.
Develop high-performance APIs and backend services for AI model serving.
Optimize GPU utilization, latency, and throughput for multimodal workloads.
Build distributed systems supporting large-scale generative models.
Improve observability, monitoring, and reliability of AI systems.
Partner closely with Applied Science teams to productionize research systems.
Drive improvements in deployment workflows, automation, and platform usability.

Degree in Computer Science, Engineering, or a comparable combination of education and practical experience.
Strong object-oriented programming skills (Python, C++, Java, Go, or similar).
Strong foundation in data structures and algorithms.
Experience building production backend or distributed systems.
Understanding of cloud infrastructure concepts and containerized systems.
Experience with Kubernetes, Docker, or container orchestration.
Familiarity with GPU-based ML workloads or distributed training/inference systems.
Experience with model serving frameworks (vLLM, Triton, Ray Serve, or similar).
Experience with observability tools and performance debugging.
Familiarity with PyTorch or ML workflows.
Interest in optimizing systems for efficiency, scalability, and developer velocity.

Software Engineer – AI Infrastructure, Training, Inference

Key skills