Identify and correct audio artifacts, loudness inconsistencies, frequency imbalances, and sibilance issues across large-scale voice datasets.
Design and implement scalable audio processing pipelines for voice data.
Define and implement scalable audio processing pipelines (EQ, compression, de-essing, dynamic range optimization) and normalization strategies across inter- and intra-voice recordings.
Optimize audio quality across real and synthetic voices to ensure a consistent product experience across multiple use cases.
Lead audio quality decisions during on-site voice actor recording sessions, including microphone selection, placement, gain staging, and environment setup.
Define, document, and enforce audio quality standards for external vendors, including recording setup requirements, signal characteristics, and post-processing expectations, so that vendor-produced audio meets Deepgram’s training and product needs even when recordings are made off-site.
Convert expert-driven, manual audio workflows into automated, repeatable, code-based systems.
Collaborate closely with research to improve training data quality, especially for speaker-specific TTS fine-tuning.
Contribute to synthetic data pipelines by defining and validating acoustic characteristics, guiding how different “sound profiles” should be produced and evaluated.
Requirements
Professional audio engineering experience (studio, podcast, radio, live sound, or equivalent).
Deep understanding of EQ, compression, limiting, de-essing, and mastering techniques.
Strong familiarity with professional audio tools (Adobe Audition, Logic Pro, Pro Tools, or similar).
Hands-on experience with FFmpeg and command-line audio processing tools.
Solid understanding of microphone characteristics, placement, and acoustic principles.
A highly trained ear for subtle audio quality differences across voices and environments.
Programming ability (Python preferred) to automate and scale audio workflows.
Experience building custom audio plugins or DSP tools.
Open-source contributions to audio or signal-processing projects.
Background in batch or programmatic audio processing at scale.
Familiarity with ML audio preprocessing for ASR or TTS.
Experience managing large-scale audio datasets.
Comfort working in creative/audio communities and technical open-source ecosystems.