You will set reliability targets and error budgets, define and measure SLIs, and drive continuous improvement of SLOs
You will participate in on-call rotations, triaging incidents and providing emergency response
You will define org-wide incident management processes, provide education on incident response, and track and improve on metrics such as MTTD and MTTR
You will analyze critical user journeys and dependencies, provide architectural consultations, and drive operational best practices including documentation and runbooks
You will perform cost engineering, identify optimization opportunities, and own capacity planning
You will develop solutions for change management, monitoring, and disaster recovery
Requirements
Bachelor's degree or equivalent industry experience in Computer Science, Electrical Engineering, or related fields
5+ years of SRE and/or DevOps experience on cloud-based systems
Experience in large-scale production systems with major cloud technologies, such as AWS, Istio, K8s, and Grafana
Experience with Terraform and Infrastructure as Code
Experience with multiple modern programming languages, such as Go, Rust, and Python
A track record of both independent and collaborative impact
Business-level verbal and written communication skills in English
Nice to Have
Experience leading organization-wide SRE initiatives
Professional experience with modern Continuous Integration/Continuous Delivery (CI/CD) tooling
Professional experience or familiarity with Bazel
Master's degree in Computer Science or related fields
10+ years of SRE and/or DevOps experience on cloud-based systems
Japanese language skills
Tech Stack
AWS
Cloud
Grafana
Kubernetes
Python
Rust
Terraform
Go
Benefits
Competitive Salary
Based on experience
Work Hours
Flexible working time
Paid Holiday
20 days per year (prorated)
Sick Leave
6 days per year (prorated)
Holiday
Saturdays, Sundays, Japanese national holidays, and other days designated by the company
Japanese Social Insurance
Health Insurance, Pension, Workers’ Compensation, Unemployment Insurance, and Long-term Care Insurance
Housing Allowance
Retirement Benefits
Rental Car Support
In-house Training Program (software study/language study)