Natera is a global leader in cell-free DNA testing, dedicated to oncology, women’s health, and organ health. The company is seeking a Senior Software Engineer specializing in Voice AI to own the architecture and delivery of its Voice AI platform, which handles thousands of patient calls daily. The role involves designing and implementing complex voice AI systems that improve patient access to genetic testing results.
Responsibilities:
- Own the end-to-end voice AI architecture — from Twilio media streams through LLM orchestration to TTS output and call disposition
- Design and implement multi-agent systems using tool calling, agent handoffs, and shared conversation state for complex patient workflows
- Build and optimize real-time audio pipelines — WebSocket streaming, codec handling (mulaw/PCM), VAD configuration, and interruption management
- Architect analytics and observability infrastructure for voice-specific metrics: per-segment latency (STT/LLM/TTS), call efficacy, disposition accuracy, and ASR error rates
- Solve voice-specific challenges: turn-taking timing, silence detection thresholds, barge-in recovery, medical term recognition, and end-to-end latency optimization
- Integrate voice agents with internal services via secure authenticated APIs
- Drive platform reliability — eliminate single points of failure, implement multi-provider LLM failover, and design graceful degradation paths
- Collaborate with product and clinical operations to improve self-serve efficacy rates and reduce call escalations
- Mentor team members on voice AI best practices and contribute to architectural decisions
Requirements:
- 5+ years of software engineering experience, with at least 2 years building production voice AI or conversational AI systems
- Deep experience with voice AI pipelines — you understand the end-to-end flow from telephony through STT, LLM processing, TTS, and back to the caller, and you've solved real problems at each stage
- Production experience with agentic architectures — multi-agent orchestration, tool calling, agent handoffs, memory/state management, and LLM-driven decision making in real-time conversation contexts
- Strong understanding of voice-specific challenges: VAD tuning, turn-taking, interruption/barge-in handling, latency budgets, audio codec management, and the differences between voice and text-based AI UX
- Hands-on experience with telephony systems — Twilio (media streams, SIP, IVR), or equivalent platforms with WebSocket-based audio streaming
- Proficiency in TypeScript/Node.js with strong async programming patterns; experience with NestJS or similar frameworks
- Experience with STT/TTS providers (Deepgram, OpenAI, ElevenLabs, Azure Speech) and understanding of ASR accuracy challenges (domain-specific vocabulary, noise handling)
- Production experience with LLM APIs — OpenAI (especially Realtime API), Anthropic Claude, or equivalent; prompt engineering for conversational agents
- High agency and autonomy — you don't wait for permission, detailed specs, or hand-holding. You unblock yourself, seek out the highest-impact work, and drive it to completion
- Excellent communication — you can translate complex voice AI architecture decisions for product and clinical stakeholders
- Experience in healthcare, biotech, or regulated environments (HIPAA, PHI handling, zero-retention architectures, BAA compliance)
- AWS infrastructure experience — ECS Fargate, Lambda, DynamoDB, Bedrock, Kafka/MSK, API Gateway, CDK
- Background in real-time systems: WebSocket lifecycle management, connection resilience, streaming protocols
- Experience building analytics pipelines for voice/conversational metrics (call efficacy, disposition tracking, latency observability)
- Familiarity with RAG architectures (vector stores, embedding models, chunking strategies) for knowledge-grounded voice agents
- Track record of migrating or evaluating vendor platforms while maintaining production uptime
- Experience with Datadog APM, LLM Observability, or equivalent monitoring for AI systems
- Prior experience in a high-growth startup or zero-to-one product environment