Provide executive oversight for OpenStack compute, storage, and networking services
Ensure scalable VM lifecycle management, resource optimization, and operational maturity
Own end‑to‑end reliability and performance of AI compute platforms, including model training/inference pipelines, GPU scheduling and autoscaling, and high‑performance compute environments
Partner with ML, Data, and Product to build next-gen AI compute platforms
Drive adoption of automation-first operations, GitOps, and infrastructure-as-code
Own the multi‑year platform roadmap across hybrid compute, Kubernetes, virtualization, AI, and networking
Drive cross‑org alignment and lead large‑scale modernization across CI/CD, observability, and infrastructure
Drive organizational strategy, prioritization, staffing plans, hiring, and budgeting
Build a high-performance, inclusive culture focused on ownership, excellence, and continuous improvement
Requirements
10+ years of experience in infrastructure, SRE, or platform engineering
5+ years managing engineering teams (including managers or tech leads)
Deep experience with Kubernetes, virtualization, and cloud/networking
Strong leadership, communication, and cross-functional alignment skills
Proven track record of improving platform uptime, performance, and reliability