Senior Site Reliability Engineer at Drata | JobVerse
Senior Site Reliability Engineer
Drata
San Francisco, California, United States of America
Full Time
2 hours ago
$166,900 - $225,900 USD
No Visa Sponsorship
Key skills
AWS
Cloud
Docker
Kubernetes
Linux
MySQL
Python
Terraform
Bash
GitHub Actions
ECS
Fargate
Datadog
Git
GitHub
CI/CD
About this role
Role Overview
You are the reliability expert for your aligned product team
Lead Production Readiness Reviews (PRRs) before new services launch
Partner with product engineering leads and staff engineers to define SLOs and SLIs for critical services
Participate in team planning and architecture reviews to provide proactive reliability guidance
Build reusable artifacts: SLO templates, observability checklists, alerting standards, and reference dashboards
Build and maintain Datadog monitors, dashboards, and alert routing
Handle infrastructure requests: ECS task management, secret rotations, Terraform changes, capacity adjustments
Identify repeated manual work and convert it into self-service tooling or runbooks
Design and build shared platform infrastructure: reusable Terraform modules, standardized observability stacks, and service templates
Participate in the on-call rotation and lead incident response when needed
Requirements
6+ years of experience in Site Reliability Engineering, Cloud Engineering, or building and maintaining scalable, resilient services
Robust knowledge of cloud computing technologies: Terraform, Docker, Git, and Linux
Hands-on experience with Datadog for monitoring, alerting, dashboards, SLO tracking, and distributed tracing
Experience building software systems as a software engineer
Experience developing tooling and automation in Python and/or Bash
Experience with CI/CD pipeline automation, specifically GitHub Actions
Experience with disaster recovery practices and incident management
Strong understanding of observability concepts (monitoring, logging, distributed tracing, and metrics) and how to apply them to production systems
Experience with container orchestration and deployment technologies including AWS ECS Fargate and/or Kubernetes
Experience working with relational databases (MySQL proficiency is a plus)
Benefits
Up to 100% employer-paid premiums for medical, dental, and vision coverage for employees and their dependents
Comprehensive wellness benefits and healthcare concierge services
401(k) plan
Company-paid life and disability insurance
Tax-advantaged spending accounts
Paid Parental Leave policy after six months of employment
Access to Kindbody fertility and family-building benefits
Paid time off and flexible vacation policy
Generous annual stipends for professional and personal development