Luma AI is a company focused on leveraging data to enhance the advanced capabilities of its foundation models. It is seeking a Research Scientist / Engineer to address fundamental data challenges and develop innovative solutions in multimodal AI systems.
Responsibilities:
- Identify capability gaps and research solutions
- Design datasets and data-mixture ablations to systematically improve model capabilities across vision, audio, and language
- Develop evaluation frameworks and benchmarking approaches for multimodal AI capabilities
- Create prototypes and demonstrations that showcase new multimodal capabilities
Requirements:
- Strong programming skills in Python and PyTorch
- Experience with large-scale datasets
- Experience with multimodal data processing pipelines
- Understanding of computer vision, audio processing, and/or natural language processing techniques
- Expertise working with interleaved multimodal data
- Hands-on experience with Vision Language Models, Audio Language Models, or generative video models