Odyssey is an AI lab pioneering general-purpose world models, and it is seeking an engineer to build the infrastructure that supports its groundbreaking research and products. The role involves developing a low-latency model inference platform, scaling data processing infrastructure, and collaborating with researchers to optimize their workflows.

Responsibilities:

Develop and operate our low-latency model inference platform, ensuring high availability, scalability, and efficient resource utilization for Odyssey’s world models
Engineer and scale our core data processing infrastructure (e.g., Flyte, Ray with k8s) to handle petabyte-scale datasets
Design, build, and maintain our large-scale, GPU-based training clusters for deep learning, focusing on usability, high throughput, and reliability
Automate infrastructure provisioning, configuration, monitoring, and alerting using Infrastructure as Code (IaC) principles
Drive performance tuning, cost optimization, and reliability improvements across the entire stack
Collaborate closely with researchers and product developers to understand their requirements, optimize their workflows, and improve platform usability

Member of Technical Staff, Infrastructure Engineer
