DigitalOcean is a leading cloud infrastructure provider dedicated to simplifying cloud services for developers and businesses. They are seeking a Manager for Customer Success Engineering who will lead a team of Customer Success Engineers, focusing on delivering exceptional support experiences for strategic customers in AI and cloud environments. The role involves operational management, team development, and cross-functional collaboration to enhance customer satisfaction and product quality.
Responsibilities:
- Lead, hire, train, mentor, and develop a high-performing team of Customer Success Engineers (CSEs), driving accountability, performance, and career growth
- Establish performance metrics (KPIs/SLAs) and conduct regular 1:1s, performance reviews, and career development planning
- Own end-to-end support operations, including queue management, escalations, and shift planning to ensure consistent 24x7 coverage
- Drive improvements in key support metrics such as CSAT, response times, resolution times, and overall support quality
- Build and strengthen technical expertise within the team across core areas such as Kubernetes (DOKS), Databases, Compute, and AI/ML workloads
- Act as the ultimate point of technical escalation for our largest, most strategic enterprise customers across Cloud and AI/ML workloads, stepping in to manage critical incidents and high-severity (Sev1/Sev2) issues
- Design and implement customized support plans, SLAs, and escalation pathways tailored to the needs of strategic accounts
- Partner closely with Technical Account Managers (TAMs) and Growth Account Managers (GAMs) to conduct Executive Business Reviews (EBRs) and ensure customers are maximizing the value of our Cloud and AI/ML products
- Proactively identify risks and opportunities within strategic accounts to improve customer experience, adoption, and retention
- Serve as the Voice of the Customer (VoC) to Product and Engineering teams, synthesizing support data to advocate for bug fixes, feature requests, and UX improvements
- Own and continuously improve escalation protocols between AI/ML Support and CloudOps, Infrastructure Engineering, and Product — including Jira escalation routing, Sev1 bridge management, and post-incident documentation
- Own the development and maintenance of SOPs, escalation runbooks, HVC support playbooks, and knowledge base content — treating documentation infrastructure as a core operational lever for team scalability
- Contribute to the vision for AI and automation within support, building intelligent tooling and driving the team toward an automation-first model to improve efficiency, scalability, and customer experience
- Foster a culture of continuous learning, ensuring the team stays ahead of evolving cloud technologies, AI/ML frameworks, and industry trends
Requirements:
- 5+ years of experience in Technical Support, Customer Success, or Technical Account Management within B2B SaaS, Cloud, or AI/ML environments, ideally including experience supporting AI-native, high-growth companies with 24x7 production dependencies on GPU infrastructure
- 2+ years of people management experience leading technical, customer-facing teams, preferably in a high-growth, post-acquisition, or rapidly scaling environment
- Solid understanding of AI/ML concepts, including Generative AI, Large Language Models (LLMs), natural language processing (NLP), and MLOps
- Deep familiarity with GPU infrastructure (NVIDIA H100/H200, bare metal GPU provisioning) and AI inference workloads is strongly preferred
- Proficiency in reading and debugging code (Python preferred) and troubleshooting RESTful APIs and cloud architecture
- Excellent verbal and written communication skills, with the ability to translate complex technical or AI concepts for diverse audiences, from highly technical engineers to non-technical business executives
- Proven ability to remain calm under pressure and de-escalate high-stakes situations with enterprise clients
- Hands-on experience with ML frameworks (e.g., TensorFlow, PyTorch, Scikit-learn) and AI toolchains (e.g., LangChain, Hugging Face)
- Experience with major cloud platforms (AWS, Google Cloud, Azure) and their native AI/ML services
- Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related technical field
- ITIL or equivalent service management certification