Akamai Technologies, a leader in digital experience delivery and security, is seeking a Senior Engineering Manager to lead its ML engineering team within the Akamai Inference Cloud. The role involves overseeing model validation, optimization, and lifecycle management, driving technical strategy for responsible AI practices, and building a high-performing team.
Responsibilities:
- Building and scaling a team of ML engineers focused on model validation, quantization, safety systems, and lifecycle automation
- Leading technical strategy for model quality and safety including guardrails, content filtering, red-teaming, and compliance enforcement
- Driving model optimization initiatives spanning quantization, distillation, and inference performance tuning
- Owning the model onboarding pipeline from security scanning and validation through optimization and production deployment
- Building and managing the bring-your-own-model pipeline that enables customers to onboard, validate, optimize, and serve custom models with consistent quality and safety standards
- Establishing engineering standards and best practices for responsible AI, model evaluation frameworks, and fine-tuning infrastructure
- Collaborating with platform, runtime, and developer experience teams to deliver seamless model lifecycle capabilities
Requirements:
- 10 years of relevant experience and a Bachelor's degree or its equivalent, with experience building and scaling high-performing ML engineering teams
- Demonstrated experience leading teams that shipped model lifecycle, model safety, or ML optimization products in production
- Hands-on understanding of model quantization, inference optimization, and serving frameworks such as vLLM, TensorRT, or Triton
- Expertise in responsible AI practices including content safety, adversarial robustness, guardrail systems, and compliance frameworks
- Experience with LLM architectures, fine-tuning workflows, and model evaluation methodologies
- Proficiency with cloud-native technologies including Kubernetes and distributed systems at global scale