Thinking Machines Lab's mission is to empower humanity through advancing collaborative general intelligence. The lab is seeking a researcher with vision expertise to advance the science of visual perception and multimodal learning, with a focus on how vision and language interact at scale and on developing multimodal systems for real-world integration.
Responsibilities:
- Own research projects on training and performance analysis of multimodal AI models
- Curate and build large-scale datasets and evaluation benchmarks to advance vision capabilities
- Work with our data infrastructure engineers, pretraining researchers and engineers, and product team to create frontier multimodal models and the products that leverage them
- Publish and present research that moves the entire community forward, and share code, datasets, and insights that accelerate progress across industry and academia