About this role

Customer.io is a platform used by over 8,000 companies for automated communication. They are seeking a Site Reliability Engineer to scale their infrastructure, reduce operational toil, and improve reliability as the company grows.

Responsibilities:

Build and scale infrastructure to support billions of messages per day and real-time events
Automate deployments, alerting, and incident response
Improve on-call with clear alerts, solid documentation, and faster resolution
Tune the performance of MySQL and other datastores, and improve reliability across distributed systems
Collaborate across teams to debug, ship, and support systems in production
Raise the bar by sharing your progress publicly through short videos and thoughtful writing, and by mentoring others
Leverage AI tools to prototype, move faster, and make better decisions

Requirements:

7+ years in SRE or infrastructure roles, improving production systems at scale
Deep MySQL experience, including schema design, performance tuning, and operational tooling
Fluency in cloud-native tech (GCP a plus) and Terraform
Proficiency in Go and Bash for scripting and systems programming
Skill in observability, incident response, and debugging distributed systems
A preference for action over perfection, and pride in owning technical decisions

Senior Site Reliability Engineer
