Own the vision, strategy, and roadmap for Rootly’s infrastructure and developer platform
Build and lead a high-performing Platform Engineering organization that may include SRE, infrastructure, DevEx, and internal tooling
Establish a culture where reliability, performance, and developer experience are non-negotiable
Act like an owner: spot problems early, mobilize teams, and drive solutions from concept to completion
Architect a highly available, redundant, and scalable infrastructure foundation
Lead capacity planning, cost management, performance tuning, and long-term infrastructure scaling
Drive operational maturity through infrastructure as code, declarative infrastructure, configuration management, and repeatable automation
Enable product engineers to move extremely quickly by optimizing local dev environments, ephemeral cloud environments, fast CI/CD, and reliable canary deployments
Provide tooling that abstracts infrastructure complexity and removes friction from development
Ensure every engineer can ship confidently, frequently, and safely
Own platform-wide SLOs, SLIs, and error budgets, and use them to drive prioritization
Oversee observability tooling, monitoring, alerting, and incident response processes
Partner with product engineering teams to ensure services meet reliability and performance goals and to improve runbooks and postmortems
Drive high-quality execution with urgency while balancing long-term bets with tactical wins
Raise the bar and inspire engineers to think bigger, move faster, and deliver exceptional results
Collaborate closely with Product, Engineering, and leadership to align platform investments with company strategy
Recruit, mentor, and develop top-tier platform engineers and create a culture of excellence
Requirements
10+ years in platform, infrastructure, SRE, or DevOps roles, with increasing leadership responsibility
Experience leading platform or SRE teams, including hiring, mentoring, and building culture
Deep expertise with cloud infrastructure (AWS preferred), distributed systems, scaling, and redundancy
Proven experience designing or operating high-scale production systems and delivering operational maturity
Strong background in observability, performance tuning, and scaling strategies
Comfortable writing production-grade software to solve infrastructure problems; Ruby or Go is a plus
Strong architectural judgement and systems thinking that anticipates scaling pain before it becomes real
Tech Stack
AWS
Cloud
Distributed Systems
Ruby
Go
Benefits
Competitive compensation and early equity in a fast-growing, venture-backed company.
Comprehensive medical, dental, and vision coverage.
3 weeks of vacation, plus unlimited sick and mental health days, and a company-wide end-of-year shutdown to recharge.
$500 stipend for home office setup.
A fast-moving, high-impact environment where your leadership and ideas directly shape the future of the company.