Shopify is a leading commerce platform that empowers entrepreneurs and businesses worldwide. They are seeking experienced Site Reliability Engineers to help build and maintain resilient systems that support millions of merchants, ensuring high performance and reliability of their platform.
Responsibilities:
- Help Shopify run its planet-scale systems by enabling our engineering teams to create resilient systems
- Build and improve tools to keep our platform resilient and performant
- Ensure we never fail for the same reason twice
- Go on call, respond to automated alerts, and execute playbooks
- Directly impact the production systems that underpin commerce for millions of merchants, who rely on the businesses they’ve built on our platform to generate revenue for their livelihoods, their families, and their employees
- Identify gaps in our processes and build or improve tools to support incident management
- Develop production tooling and services to improve our platform’s resilience
- Clean up the noise in our signals so we can understand our platform and debug problems more efficiently
Requirements:
- Experience as a Site Reliability Engineer or software engineer
- Ability to build and scale robust and performant systems
- Experience with incident management and developing production tooling
- Ability to go on-call and respond to automated alerts
- Comfortable working in a collaborative and candid engineering culture
- Ability to thrive in a fast-paced and changing environment
- Strong critical thinking and problem-solving skills
- Ability to work digital-first
- Availability for on-call work from 08:00 to 14:00 UTC during on-call weeks