Nuance Labs is an early-stage deep tech startup focused on building a real-time human foundation model that integrates text, speech, and vision. The Research Scientist in Speech Synthesis will work on developing and applying advanced models for speech synthesis and audio generation, contributing to innovative multimodal AI solutions.
Responsibilities:
- Have a PhD (or equivalent experience) in speech synthesis (text-to-speech, speech-to-speech, etc.), audio generation, or a related field, with hands-on experience training such models and a track record of pushing the research frontier
- Know deep learning inside out and run the whole ML pipeline, from data wrangling and rapid prototyping to large-scale training, benchmarking, and evaluation
- Love blank-page problems, chart your own course, and make progress without waiting for someone to hand you a task list
- Move quickly from research breakthroughs to practical, real-world applications
- Write code that’s clean enough that your future self will thank you for it
- Play well with other brilliant minds from different domains