Empower is seeking a Senior Manager for Site Reliability Engineering to lead multiple teams and set the strategic direction for reliability across multiple value streams. The role involves developing talent, driving operational excellence, and partnering with stakeholders across the organization to improve reliability and scalability.
Responsibilities:
- Lead SRE managers and senior individual contributors, develop leaders through coaching and delegation, and build a strong bench of technical and leadership talent
- Drive talent reviews, succession planning, organizational design discussions, performance management, and development planning for high-potential team members
- Foster a culture of operational excellence, innovation, and continuous learning across the organization
- Define multi-quarter roadmaps and OKRs aligned with business objectives, and balance investments across reliability, scalability, cost, and security priorities
- Develop strategies to reduce technical debt and operational toil, plan capacity and resources across teams, and lead initiatives that span multiple teams or value streams
- Make build, buy, or partner decisions for platform capabilities and support technology investment decisions within the area of responsibility
- Own reliability across multiple critical systems or value streams, including SLO and SLI frameworks, operational metrics, incident management, on-call practices, and disaster recovery and business continuity capabilities
- Provide technical direction on major infrastructure decisions, participate in architecture reviews, and drive adoption of best practices across teams, including infrastructure as code, GitOps, observability, CI/CD, and zero-trust security architecture
- Partner with Engineering, Product Management, Security, Compliance, Finance, and executive stakeholders on reliability requirements, roadmap planning, infrastructure initiatives, and broader technology strategy
- Communicate vision, strategy, progress, incident summaries, and investment needs to stakeholders at all levels, including VP and Director audiences, while influencing technical and process decisions across the organization
Requirements:
- 3 to 5 years of management experience or equivalent, ideally including experience managing managers
- 8+ years of hands-on technical experience in Site Reliability Engineering, DevOps, or Operational Excellence
- Proven track record of leading engineering teams through operational challenges
- Deep technical knowledge of AWS, Kubernetes, and modern infrastructure practices
- Strong understanding of SRE principles and experience applying them at scale
- Experience setting strategy and executing multi-quarter initiatives
- Demonstrated ability to develop managers and senior engineers
- Excellent communication skills with the ability to influence senior leadership
- Strong judgment in balancing competing priorities and making trade-offs
- Experience managing distributed or hybrid teams
- Experience managing SRE organizations with multiple teams
- Financial services or other highly regulated industry experience
- Deep understanding of compliance frameworks such as SOC 2, PCI DSS, and FINRA
- Track record of leading transformational initiatives such as cloud migrations or platform rebuilds
- AWS certifications at the professional level
- Experience with large-scale Kubernetes implementations
- Background in Agile or Scrum practices and metrics-driven management
- Experience with FinOps and cloud cost optimization
- Previous experience in Fortune 500 companies or similarly complex organizations