Collaborate with other engineering and product teams to design and build the infrastructure and services required to deliver new features to customers in a cloud-native and event-driven fashion.
Advance our IaC (Terraform) and configuration management (Helm) code and strategies to support advanced scaling and self-service use by engineering teams.
Identify and remove bottlenecks from production systems across AWS services and our Kubernetes platform.
Ensure 99.99% customer-facing uptime.
Continuously improve the monitoring and alerting capabilities of our platform, enabling us to be proactive instead of reactive.
Requirements
8+ years of professional SRE/DevOps experience and a demonstrated ability to work on high-volume production systems
Demonstrable systems architecture expertise in solving complex technical problems and implementing company-wide solutions
Advanced knowledge of AWS services and technologies (ALB/ELB, IAM permissions, DynamoDB, SNS, EKS/Fargate, etc.)
Experience with infrastructure as code and configuration management (Terraform and Helm charts especially) to design and provision new services
Knowledge of Python, Bash, or other scripting languages. Knowledge of Ruby or Go is a plus
High level of ownership and the drive to work with others and see improvements through to production
Tech Stack
AWS
Cloud
DynamoDB
Kubernetes
Python
Ruby
Terraform
Go
Benefits
Health, pharmacy, optical and dental care benefits