NVIDIA has been transforming computer graphics and accelerated computing for more than 25 years. The company is seeking a product manager to lead its Kubernetes distribution strategy and manage the NKE product surface, ensuring optimal AI workload performance across environments.
Responsibilities:
- Own the NKE product surface: control plane lifecycle management, API server availability, component upgrades, and cluster provisioning and teardown
- Define our Kubernetes distribution strategy: packaging, conformance, version policy, and release cadence for NVIDIA-managed and on-premises environments
- Drive upstream Kubernetes alignment: feature adoption, contribution strategy, and release tracking that keeps us current without introducing instability
- Own developer and operator tooling for cluster management, diagnostics, and day-2 operations across environments
- Define and publish tooling that enables on-premises customers and partners to deploy, run, and upgrade NVIDIA Kubernetes clusters independently
- Drive service reliability, upgrade safety, and multi-tenant isolation at the provider layer
Requirements:
- Bachelor's degree in Computer Science, Engineering, or a similar area, or equivalent experience
- 8+ years of product management experience in Kubernetes infrastructure, Kubernetes services, or platform engineering
- Deep understanding of Kubernetes internals: control plane architecture, etcd, scheduling, networking, storage integration, and upgrade mechanics
- Experience shipping a Kubernetes distribution, Kubernetes service, or enterprise platform product
- Track record of balancing upstream open source alignment with production delivery constraints
- Experience with on-premises and hybrid deployment models, not just public cloud
- Experience building or operating EKS, AKS, GKE, OpenShift, Rancher, or similar Kubernetes platforms
- Hands-on experience with NVIDIA GPU infrastructure, DGX systems, or GPU-aware Kubernetes scheduling
- Experience shipping Kubernetes tooling used by operators in production (cluster management, diagnostics, lifecycle automation)
- Experience with Kubernetes conformance certification, CIS benchmarks, or security hardening for enterprise or government environments
- Contributions to upstream Kubernetes or CNCF ecosystem projects