Role Overview

Lead SRE managers and senior individual contributors, develop leaders through coaching and delegation, and build a strong bench of technical and leadership talent.
Drive talent reviews, succession planning, organizational design discussions, performance management, and development planning for high-potential team members.
Foster a culture of operational excellence, innovation, and continuous learning across the organization.
Define multi-quarter roadmaps and OKRs aligned with business objectives, and balance investments across reliability, scalability, cost, and security priorities.
Develop strategies to reduce technical debt and operational toil, plan capacity and resources across teams, and lead initiatives that span multiple teams or value streams.
Make build versus buy versus partner decisions for platform capabilities and support technology investment decisions within the area of responsibility.
Own reliability across multiple critical systems or value streams, including SLO and SLI frameworks, operational metrics, incident management, on-call practices, and disaster recovery and business continuity capabilities.
Provide technical direction on major infrastructure decisions, participate in architecture reviews, and drive adoption of best practices across teams, including infrastructure as code, GitOps, observability, CI/CD, and zero-trust security architecture.
Partner with Engineering, Product Management, Security, Compliance, Finance, and executive stakeholders on reliability requirements, roadmap planning, infrastructure initiatives, and broader technology strategy.
Communicate vision, strategy, progress, incident summaries, and investment needs to stakeholders at all levels, including VP and Director audiences, while influencing technical and process decisions across the organization.

Requirements

3 to 5 years of management experience or equivalent, ideally including experience managing managers.
8+ years of hands-on technical experience in Site Reliability Engineering, DevOps, or Operational Excellence.
Proven track record of leading engineering teams through operational challenges.
Deep technical knowledge of AWS, Kubernetes, and modern infrastructure practices.
Strong understanding of SRE principles and experience applying them at scale.
Experience setting strategy and executing multi-quarter initiatives.
Demonstrated ability to develop managers and senior engineers.
Excellent communication skills with the ability to influence senior leadership.
Strong judgment in balancing competing priorities and making trade-offs.
Experience managing distributed or hybrid teams.

Tech Stack

AWS
Kubernetes

Benefits

Medical, dental, vision and life insurance
Retirement savings – 401(k) plan with generous company matching contributions (up to 6%), financial advisory services, potential company discretionary contribution, and a broad investment lineup
Tuition reimbursement up to $5,250/year
Business-casual environment that includes the option to wear jeans
Generous paid time off upon hire – including a paid time off program plus ten paid company holidays and three floating holidays each calendar year
Paid volunteer time – 16 hours per calendar year
Leave of absence programs – including paid parental leave, paid short- and long-term disability, and Family and Medical Leave (FMLA)
Business Resource Groups (BRGs) – BRGs facilitate inclusion and collaboration across our business internally and throughout the communities where we live, work and play. BRGs are open to all.

Senior Manager – Site Reliability Engineering