Netflix is a company dedicated to entertaining the world through innovative storytelling and technology. They are seeking a Technical Program Manager to lead critical initiatives across their Cloud Infrastructure and AI Platform, focusing on scalability, modernization, and internal partnerships.
Responsibilities:
- Lead initiatives that drive adoption and scalability of infrastructure platforms, including large-scale platform modernization, migrations, and feature rollouts
- Partner closely with leaders, engineers, PMs, and practitioners to decompose complex problem spaces into technical execution plans with well-defined milestones, ownership, and success criteria
- Drive execution rigor by tracking progress, managing dependencies, and proactively identifying risks, trade-offs, and mitigation strategies while keeping process to a minimum
- Communicate clearly and consistently with technical leadership and stakeholders on progress, risks, and decisions
- Use insights from program execution to influence platform evolution, architectural direction, and operational practices over time
Requirements:
- 7+ years of experience leading large-scale technical programs in infrastructure, internal platforms, or distributed systems, working directly with engineering teams
- Experience driving programs across infrastructure layers to develop scalable platforms that support diverse business needs
- Proven experience creating partnerships with cross-functional teams, driving large-scale technical strategies, debating technical approaches, and building long-term scalable solutions with engineers
- Proven ability to identify gaps in solutions and weigh in on product vs. technology trade-offs
- Excellent written and verbal communication skills, with the ability to articulate technical constraints, trade-offs, and business impact to diverse audiences
- Self-starter who enjoys quickly bringing organization and direction to ambiguous situations
- Proven ability to operate in 0→1 and evolving problem spaces, quickly bringing order and execution discipline without adding too much process
- Understanding of challenges in high-scale distributed systems, architectures, and data layers
- Experience with fleet-wide cloud efficiency and/or building a culture of cost efficiency
- Experience with machine learning training and experimentation
- Experience with machine learning pipelines
- Experience with inference at production scale
- Experience applying AWS cloud services at a large scale
- Experience with Kubernetes used natively in platform services
- Experience with third-party vendors and tools for training and inference experimentation