Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. The Staff Machine Learning Engineer will lead the development of next-generation AI systems across high-impact initiatives, ensuring their reliability and performance in production.
Responsibilities:
- Define and lead the technical vision for Cresta’s next-generation Agentic AI systems, including Agentic Assist and enterprise AI Agents
- Architect scalable, production-grade LLM systems that integrate reasoning, retrieval, planning, tool use, and real-time decision-making into cohesive, intelligent workflows
- Design and evolve multi-agent orchestration frameworks that combine RAG, structured knowledge, domain-adapted models, and automated actions
- Establish best practices for building robust, reliable, and cost-efficient LLM-powered systems in high-scale production environments
- Own the evaluation strategy for complex, non-deterministic AI systems, including offline benchmarking, online experimentation, LLM-as-a-judge methodologies, and systematic failure analysis
- Proactively identify and mitigate agent failure modes such as hallucinations, tool misuse, retrieval errors, prompt brittleness, context drift, and multi-step reasoning breakdowns
- Define measurable quality standards (accuracy, faithfulness, task completion, latency, cost efficiency, robustness) and drive continuous system improvement
- Influence cross-team architecture decisions across ML, backend, and product engineering to ensure seamless integration of AI capabilities
- Mentor senior engineers, raise the technical bar, and contribute to long-term AI strategy and roadmap planning
- Translate cutting-edge research advances into practical, high-impact production systems
Requirements:
- Bachelor's degree in Computer Science, Mathematics, or a related field; Master's or Ph.D. strongly preferred
- 7+ years of experience building and deploying machine learning systems in production, including deep hands-on experience with LLMs at scale
- Demonstrated leadership in architecting complex AI systems, particularly agentic or multi-step LLM workflows
- Deep expertise in transformer-based models, embeddings, retrieval systems, and Retrieval-Augmented Generation (RAG) pipelines
- Experience designing evaluation frameworks for LLM systems beyond single-turn prompts, including robustness testing and production monitoring
- Strong systems thinking: ability to design for scalability, latency constraints, cost efficiency, security, and long-term maintainability
- Extensive experience with modern ML frameworks (e.g., PyTorch, TensorFlow, Hugging Face) and distributed/cloud-based infrastructure
- Proven ability to influence technical direction across teams as a senior individual contributor
- A strong bias toward action, with the ability to prototype rapidly while maintaining production rigor