Motive empowers the people who run physical operations with tools to make their work safer, more productive, and more profitable. As a Staff Site Reliability Engineer on the Platform team, you will design, scale, and manage AWS-backed services for millions of connected IoT devices and ensure high availability and performance.

Responsibilities:

Collaborate with other engineering and product teams to design and build the infrastructure and services required to deliver new features to customers in a cloud-native and event-driven fashion
Use and advance our IaC (Terraform) and CM (Helm) code and strategies to support advanced scaling and self-service use by engineering teams
Identify and remove bottlenecks from production systems across AWS services and our Kubernetes platform
Ensure 99.99% customer-facing uptime
Continuously improve the monitoring and alerting capabilities of our platform, enabling us to be proactive instead of reactive

Requirements:

8+ years of professional SRE/DevOps experience and a demonstrated ability to work on high-volume production systems
Demonstrable systems architecture expertise, solving complex technical problems and implementing company-wide solutions
Advanced knowledge of AWS services and technologies (ALB/ELB, IAM permissions, DynamoDB, SNS, EKS/Fargate, etc.)
Experience with infrastructure as code and configuration management (Terraform and Helm charts especially) to design and provision new services
Knowledge of Python, Bash or other scripting languages. Knowledge of Ruby or Golang is a plus
High level of ownership and drive to work with others and see improvements through to production
