Thinking Machines Lab is focused on advancing collaborative general intelligence and creating multimodal AI systems. The role involves conducting research on visual perception and multimodal learning, and developing architectures, datasets, and evaluation methods that improve AI's understanding of, and interaction with, the physical world.
Responsibilities:
- Own research projects on training and performance analysis of multimodal AI models
- Curate and build large-scale datasets and evaluation benchmarks to advance vision capabilities
- Work with our data infrastructure engineers, pretraining researchers and engineers, and product team to create frontier multimodal models and the products that leverage them
- Publish and present research that moves the entire community forward, and share code, datasets, and insights that accelerate progress across industry and academia