About this role

Figure is an AI robotics company developing autonomous general-purpose humanoid robots. The company is seeking a Helix AI Engineer, Video Pretraining, to lead the development of large-scale video foundation models that enable perception, prediction, and embodied reasoning.

Responsibilities:

Design and train large-scale video foundation models on diverse datasets spanning internet-scale video and robot-collected data
Develop pretraining strategies that capture temporal dynamics, motion, and object interaction from raw video sequences
Build models that learn transferable representations for downstream tasks such as perception, tracking, prediction, and control
Explore architectures for video understanding and generation, including transformer-based and diffusion-based approaches
Implement efficient data pipelines and training strategies for high-throughput video ingestion and large-scale distributed training
Optimize model performance across compute, memory, and training efficiency constraints
Collaborate closely with generative modeling, agent, and robot learning teams to integrate pretrained models into the autonomy stack
Design evaluation frameworks and benchmarks to measure temporal understanding, prediction quality, and generalization
