Help design, build, and maintain infrastructure systems that power our core SaaS platform, with a focus on performance, security, reliability, and developer experience
Collaborate cross-functionally with Product Engineering, SRE, and Security teams to drive infrastructure projects from concept through production
Evaluate existing systems and implement architectural improvements that increase scalability, resilience, and operational efficiency
Build and manage infrastructure using infrastructure-as-code tools such as Terraform, Pulumi, or similar technologies
Use configuration management tools such as Ansible, Chef, or similar tools to improve consistency and automation across environments
Support and evolve containerized platforms and orchestration systems, including technologies such as Docker, Kubernetes, ECS, or Nomad
Contribute to observability and operational excellence through monitoring, logging, alerting, and instrumentation improvements
Document system designs, platform standards, and operational best practices, and train developers to use them effectively
Mentor and guide junior engineers, contributing to a culture of technical excellence, ownership, and continuous learning
Participate in an on-call rotation and help maintain the health, availability, and performance of production systems
Requirements
Significant experience in backend, DevOps, or infrastructure engineering roles in large-scale SaaS, PaaS, or IaaS environments
Extensive experience working with Linux in production environments
Advanced knowledge of at least one major cloud provider, such as AWS, GCP, or Azure
Strong experience with infrastructure as code and automation using tools such as Terraform, Pulumi, Ansible, Chef, or similar
Expertise in containerization and orchestration technologies such as Docker, Kubernetes, ECS, or Nomad
Experience scripting and writing code in Bash, Python, Go, or similar languages
A proven track record of contributing to large-scale infrastructure migrations, high-availability initiatives, or performance optimization efforts
Familiarity with observability stacks such as Prometheus, Grafana, ELK, OpenTelemetry, or similar tools
Excellent communication skills, with a collaborative mindset and the ability to work effectively across engineering teams
A strong mentoring orientation and a desire to help raise the bar for engineering quality and operational excellence
Tech Stack
Ansible
AWS
Azure
Chef
Cloud
Docker
Google Cloud Platform
Grafana
Kubernetes
Linux
Prometheus
Python
Terraform
Go
Benefits
medical, dental, and vision benefits
life insurance
short-term and long-term disability
401(k) retirement plan
vacation and sick leave
equity-based (stock) compensation and/or variable pay programs based on performance relative to goals and targets