Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. The Staff Machine Learning Engineer will lead the development of next-generation AI systems across high-impact initiatives, ensuring their reliability and performance in production.
Responsibilities:
- Define and lead the technical vision for Cresta’s next-generation Agentic AI systems, including Agentic Assist and enterprise AI Agents
- Architect scalable, production-grade LLM systems that integrate reasoning, retrieval, planning, tool use, and real-time decision-making into cohesive, intelligent workflows
- Design and evolve multi-agent orchestration frameworks that combine RAG, structured knowledge, domain-adapted models, and automated actions
- Establish best practices for building robust, reliable, and cost-efficient LLM-powered systems in high-scale production environments
- Own the evaluation strategy for complex, non-deterministic AI systems, including offline benchmarking, online experimentation, LLM-as-a-judge methodologies, and systematic failure analysis
- Proactively identify and mitigate agent failure modes such as hallucinations, tool misuse, retrieval errors, prompt brittleness, context drift, and multi-step reasoning breakdowns
- Define measurable quality standards (accuracy, faithfulness, task completion, latency, cost efficiency, robustness) and drive continuous system improvement
- Influence cross-team architecture decisions across ML, backend, and product engineering to ensure seamless integration of AI capabilities
- Mentor senior engineers, raise the technical bar, and contribute to long-term AI strategy and roadmap planning
- Translate cutting-edge research advances into practical, high-impact production systems
Requirements:
- Bachelor's degree in Computer Science, Mathematics, or a related field; Master's or Ph.D. strongly preferred
- 7+ years of experience building and deploying machine learning systems in production, including deep hands-on experience with LLMs at scale
- Demonstrated leadership in architecting complex AI systems, particularly agentic or multi-step LLM workflows
- Deep expertise in transformer-based models, embeddings, retrieval systems, and Retrieval-Augmented Generation (RAG) pipelines
- Experience designing evaluation frameworks for LLM systems beyond single-turn prompts, including robustness testing and production monitoring
- Strong systems thinking: ability to design for scalability, latency constraints, cost efficiency, security, and long-term maintainability
- Extensive experience with modern ML frameworks (e.g., PyTorch, TensorFlow, Hugging Face) and distributed/cloud-based infrastructure
- Proven ability to influence technical direction across teams as a senior individual contributor
- A strong bias toward action, with the ability to prototype rapidly while maintaining production rigor