About this role

DigitalOcean is a technology company focused on simplifying cloud and AI for builders. It is seeking a Senior Engineer 2 to lead the technical strategy for AI Inference Optimization, ensuring high performance and efficiency in its inference services.

Responsibilities:

Lead the technical strategy for benchmarking and performance optimizations at the inference engine and GPU kernel layers, ensuring our infrastructure extracts maximum value from every TFLOP
Engineer solutions for complex performance issues, including attention layer optimizations, memory and precision management, and advanced parallelization across multi-node GPU clusters
Proactively implement cutting-edge optimization techniques to keep DigitalOcean at the forefront of the generative AI landscape
Act as the subject matter expert on modern GPU families (NVIDIA/AMD) and their software stacks (CUDA, ROCm, TensorRT, OpenAI Triton), advising on hardware procurement and software integration
Lead by example through high-quality code and design reviews, elevating the technical bar for the team without the administrative overhead of direct management
Partner with Product Management and TPMs to translate 'theoretical hardware limits' into 'shippable product features,' ensuring our platform is both powerful and developer-friendly
Maintain a strong presence in the GPU infrastructure and model performance optimization communities, contributing to and integrating the best of open-source AI

Senior Engineer 2: Inference Optimizations
