NVIDIA is the platform on which every new AI‑powered application is built! We are seeking a deeply technical, hands‑on Senior Engineering Manager to lead the NVIDIA Inference Microservices (NIM) Factory team. You will build and scale a world‑class engineering organization that delivers day‑0 model launches and follows through with enterprise‑grade software, delighting customers with reliable, performant, and secure AI services at massive scale.
Responsibilities:
- Lead the NIM Factory engineering team (containers, orchestration, workflow, observability, platform APIs); attract, hire, onboard, and grow top talent
- Define vision, strategy, and roadmap for how we build, ship, and operate NIM from day‑0 launch through enterprise‑grade hardening (security, reliability, performance, compliance)
- Own end‑to‑end delivery of cross‑functional programs: align stakeholders, manage priorities, resourcing, schedules, and dependencies, and drive predictable delivery across multiple concurrent programs
- Establish engineering excellence: code health and reviews, documentation, CI/CD, testing
- Collaborate with research and platform teams on inference architecture and scalable deployment patterns
Requirements:
- 10+ years building and delivering production software systems, including 5+ years leading engineering teams as a manager; experience leading multiple teams or managing managers is a plus
- Proven track record driving complex, cross‑functional programs from inception to successful production launch and scale
- Strong foundation in cloud‑native engineering (containers, Kubernetes, microservices) and modern SDLC practices (CI/CD, testing, observability)
- Proficiency in languages commonly used in cloud services, such as Python; ability to read code, guide designs, and drive high‑quality engineering outcomes
- Demonstrated ability to hire, coach, and develop senior engineers/tech leads; build inclusive teams and a culture of ownership and excellence
- Excellent communication and stakeholder management; ability to influence across product, research, security, and operations
- A BS or MS in Computer Science, Computer Engineering, or a related field, or equivalent experience
Ways to stand out from the crowd:
- Led teams that built and operated large‑scale LLM inference or model‑serving platforms (e.g., Triton, TensorRT‑LLM, vLLM) in production
- Experience architecting next-generation container build systems or CI/CD platforms at scale
- Built organizations across multiple time zones; established durable engineering processes that improved quality and velocity
- Proven success building talent pipelines, mentoring managers/tech leads, and increasing team engagement and retention
- Contributions to open‑source ecosystems, technical publications, or talks in containers, Kubernetes, GPU, or inference communities