Cartesia is on a mission to architect AI that learns from and interacts with the world like humans do. As a Senior Applied Researcher in Audio Understanding, you will tackle challenging problems in audio perception, leading high-impact projects critical to building truly aware AI.
Responsibilities:
- Architect and develop novel, large-scale models for complex audio understanding tasks, including multi-speaker ASR, diarization, and non-speech audio classification, and deploy them to production at scale
- Pioneer research in areas like self-supervised learning for audio, few-shot learning, and robust audio-visual perception
- Set new standards for how we evaluate and benchmark our audio understanding systems
- Build large-scale pre-training and fine-tuning datasets for audio understanding capabilities