DigitalOcean is a cutting-edge technology company focused on simplifying cloud and AI for builders. They are seeking a Staff Forward Deployed Engineer to solve complex cloud infrastructure challenges, collaborate with strategic customers, and drive AI adoption through advanced application development and tooling.
Responsibilities:
- Partner deeply with strategic ANEs to manage complex migrations, build production-ready PoCs, and execute hands-on application builds on our GPU infrastructure
- Operationalize field learning by building migration planners, reusable systems, benchmarking frameworks, model optimization agents, and deployment automation scripts (e.g., Terraform/Pulumi) that standardize how AI workloads are deployed and tuned across DigitalOcean
- Create pre-built GenAI agents, solution templates, and notebooks (e.g., built with LangGraph or CrewAI) tailored to common and advanced business needs
- Act as the frontline technical voice for CPTO, surfacing architectural gaps and edge cases to inform core product development. Transition field-built tools into native product features
- Lead the early testing and integration of emerging AI frameworks to ensure a first-class developer experience on DigitalOcean
- Co-develop delivery frameworks with strategic and technical partners to enable repeatable, high-quality deployments at scale
- Produce field-tested demo kits and deployment guides that support new product releases with practical, validated assets
Requirements:
- Significant experience in the AI/ML lifecycle, specifically hosting large language or multimodal models using inference engines like vLLM, SGLang, or Modular
- Deep understanding of common LLM architectures and optimization techniques (e.g., continuous batching, quantization)
- Subject matter expertise in modern GPU families (NVIDIA/AMD) and their software stacks (CUDA, ROCm, TensorRT, OpenAI Triton)
- Expert proficiency in Kubernetes (K8s) and the design of distributed systems, including microservices, messaging systems, databases, and Infrastructure as Code
- Ability to seamlessly integrate AI workloads with core cloud services (Networking, VPC, Storage, Compute)
- Hands-on experience with distributed inference serving frameworks (e.g., llm-d, NVIDIA Dynamo, or Ray Serve)
- Understanding of GPU-level optimization and experience with interconnect technologies like NVLink, xGMI, or RoCE to maximize hardware efficiency
- Strong production coding skills in Python or Go with the ability to build high-quality tools, automation, and internal assets
- Proven ability to benchmark AI infrastructure and perform GPU utilization tuning to optimize customer ROI and workload performance
- Proven ability to establish instant technical credibility with CTOs and Lead Architects while managing high-stakes technical migrations
- Experience with emerging AI orchestration frameworks like LangGraph, CrewAI, or LlamaIndex
- A background in technical consulting or 'Forward Deployed' roles at high-growth AI or infrastructure companies
- Experience building and scaling 'Center of Excellence' models or partner delivery frameworks
- Active contributor to open-source AI projects or technical communities