About this role

Orbifold AI is building foundational infrastructure for the next generation of physical AI, collaborating with leading robotics and world model research teams. They are seeking a Machine Learning Engineer to scale and optimize ML infrastructure that manages large volumes of multimodal data for advanced AI applications.

Responsibilities:

Architect, build, and optimize distributed ML pipelines on Ray (Ray Core, Ray Train, Ray Serve) and PyTorch, designed for the demands of multimodal video, image, and sensor data at scale
Profile and tune distributed training jobs and inference deployments to maximize GPU/CPU utilization and reduce latency
Build robust abstractions and internal tools that let our researchers and product engineers deploy PyTorch models onto our Ray clusters seamlessly
Design and maintain high-throughput video processing pipelines (e.g. FFmpeg, NVDEC/NVENC, frame-level indexing) that feed our curation, training, and evaluation workloads
Ensure the high availability, fault tolerance, and observability of our distributed compute systems
Build the serving infrastructure for our evaluation harnesses, verification models, and RL environments
Collaborate with research, data, and product engineering teams to translate modeling constraints into scalable infrastructure solutions

Machine Learning Engineer, Applied AI Infrastructure
