Drive the design and development of distributed services within our AI Infrastructure ecosystem, including complex orchestration for LLM inference and hosting services.
Create, refine, and assess system design proposals for our high-scale, multi-tenant inference cloud ecosystem, ensuring they meet rigorous standards for availability and resiliency.
Lead the operational strategy for critical services, defining SLOs and leveraging advanced observability to maintain platform health in a high-scale environment.
Partner closely with Product Management, TPMs, and Engineering Management peers to align technical roadmaps with business priorities.
Work on new architecture initiatives that enable fleet optimization and help evolve DigitalOcean into a market leader for AI-native networking and infrastructure.
Requirements
Deep experience with distributed and cloud services, including messaging systems, databases, infrastructure as code, observability, and security.
Advanced knowledge of cloud networking (VPCs, Load Balancers), containerization (Kubernetes), and cloud storage (block, object, NFS).
Proven experience building AI/ML products, specifically focusing on Gen AI platforms, LLM hosting, and inference workflows.
Significant experience running customer-facing, high-availability services across multiple regions.
Experience integrating and building with open-source software, and a bias for technical ownership.
Expert proficiency in Go or Python, and familiarity with gRPC for service-to-service communication.
Tech Stack
Cloud
gRPC
Kubernetes
NFS
Python
Benefits
We provide employees with reimbursement for relevant conferences, training, and education.
All employees have access to LinkedIn Learning's 10,000+ courses to support their continued growth and development.
We offer a competitive array of benefits to support you, from our Employee Assistance Program to local employee meetups to a flexible time-off policy.
You may qualify for a bonus in addition to base salary; bonus amounts are determined based on company and individual performance.
We also provide equity compensation to eligible employees, including equity grants upon hire and the option to participate in our Employee Stock Purchase Program.