Together AI is a research-driven artificial intelligence company focused on building advanced voice applications. They are seeking a Senior Platform Engineer to own the API and infrastructure layer for voice workloads, ensuring the reliability and performance of their Voice AI platform.

Responsibilities:

Own the real-time API layer (WebSocket + HTTP streaming) that powers Together's voice platform
Design autoscaling and orchestration for voice workloads running on tens of thousands of GPUs
Build the developer experience — APIs, observability, and tooling — for a fast-growing product area
Work with production voice customers (contact centers, AI agents, communication platforms) to ship what they actually need
Build and harden real-time WebSocket and HTTP streaming APIs for STT and TTS — including connection lifecycle management, backpressure, error handling, and reconnection, at the reliability bar needed for production voice agents
Design and ship autoscaling for voice model endpoints that handles bursty, real-time traffic patterns — accounting for concurrent connection limits, streaming state, and hard latency ceilings
Implement voice-specific API features: word-level alignment, real-time speaker diarization, audio format flexibility (G.711 μ-law for telephony, PCM, WebRTC formats), pronunciation controls, and multi-context WebSocket support
Build voice-specific observability — latency breakdowns, audio quality signals, and dashboards that help both the team and customers debug issues
Own multi-model normalization across our model partners (Cartesia, Deepgram, Rime, and others), ensuring consistent API behavior regardless of the underlying provider
Collaborate with the ML engineering side of the team on the interface between the API layer and the model serving stack, ensuring latency and reliability requirements are met end-to-end
Contribute to developer experience: API design, documentation, integration cookbooks, a playground, and examples that showcase how best-in-class voice agents are built
Lay the groundwork for multiple new products down the line

Requirements:

5+ years of experience building large-scale, real-time distributed systems and API services
Deep expertise in real-time streaming infrastructure — WebSocket server architecture, Server-Sent Events, bidirectional streaming, connection multiplexing, and stateful protocol design
Expert-level programming in TypeScript and Python; experience with Rust is a plus
Strong distributed systems fundamentals: load balancing, autoscaling, rate limiting, and traffic shaping for latency-sensitive workloads
Experience with Kubernetes — including custom autoscalers, resource management, and health checking for stateful services
Strong product sense — you care about API ergonomics and think about what developers building voice apps actually need
Comfort working on a small, early-stage team where you'll wear multiple hats and move fast
Experience with audio or media protocols (WebRTC, G.711, PCM encoding) is a strong plus
Familiarity with ML model serving infrastructure and how inference engines work is a plus — you'll interface with the serving layer regularly
Bachelor's or Master's degree in Computer Science, Computer Engineering, or related field, or equivalent practical experience
Full-stack experience (React, Next.js) is a nice-to-have for contributing to developer-facing tooling

Senior Platform Engineer, Voice AI