Thinking Machines Lab's mission is to empower humanity through advancing collaborative general intelligence. The company is seeking a researcher with vision expertise to advance the science of visual perception and multimodal learning, focusing on how vision and language interact at scale and on developing multimodal systems for real-world integration.

Responsibilities:

Own research projects on training and performance analysis of multimodal AI models
Curate and build large-scale datasets and evaluation benchmarks to advance vision capabilities
Work with our data infrastructure engineers, pretraining researchers and engineers, and product team to create frontier multimodal models and the products that leverage them
Publish and present research that moves the entire community forward, and share code, datasets, and insights that accelerate progress across industry and academia
