Zoom is a leading collaboration platform company dedicated to enhancing communication through innovative solutions. They are seeking a Product Engineer focused on customer deployments for Agentic Voice AI, responsible for leading the deployment of next-generation Zoom Virtual Agents and ensuring optimal performance of conversational AI systems.
Responsibilities:
- Leading multi-step conversational workflow design using agentic frameworks to enable LLM agents to execute complex multi-tool tasks, including database checks and transactions
- Fine-tuning the underlying Large Language Models (LLMs) to ensure optimal performance, low latency, and minimal hallucination in real-time voice conversations
- Developing and deploying custom full-stack Python/JS code to connect the agent's decision-making with customer back-end systems (CRM, ERP, knowledge bases) as external "tools"
- Crafting and iterating advanced LLM prompts (e.g., chain-of-thought, few-shot) to guide conversation, maintain context, and ensure accurate tool use in live voice interactions
- Prompt-engineering TTS components (e.g., ElevenLabs, Azure, Cartesia) to tailor tone, pacing, emphasis, and prosody to context and brand
- Conducting A/B testing and performance monitoring on TTS output, ensuring high acoustic quality, compliance with latency requirements, and emotional appropriateness
- Acting as the embedded technical expert, working directly with customer engineering and executive teams to map complex business goals to agentic solutions
- Managing end-to-end production deployments with continuous optimization and performance tuning driven by real-world voice traffic data
- Translating field challenges, LLM failure modes, and TTS needs into actionable input for Core Engineering and Product to drive platform innovation
Requirements:
- Have 3+ years in a technical, customer-facing role (FDE, Solutions Architect, Technical Lead) focused on AI/ML or highly technical SaaS products
- Possess proven hands-on experience designing and deploying LLM solutions using modern agentic frameworks (e.g., LangChain, LlamaIndex, internal tools)
- Have deep expertise in voice stacks and TTS integration, using prompt engineering to control prosody, tone, and pacing
- Possess a proven ability to write and manage iterative prompts guiding both generative text logic and speech synthesis quality (prosody and emotion)
- Have solid software development skills in Python for prototyping, API integration, and production deployment, coupled with familiarity with modern DevOps practices
- Have an exceptional ability to distill complex AI and engineering concepts into clear, actionable insights for both technical teams and C-level executives