Role Overview
- Design, build, and maintain scalable, reliable infrastructure to support In Tandem’s technical platform
- Partner with Engineering teams as an internal consultant on CI/CD pipelines, and develop and uphold best practices for keeping those pipelines maintainable
- Architect and operate containerized environments, primarily using AWS ECS and RDS, codified in Terraform
- Manage and evolve cloud infrastructure across AWS services
- Troubleshoot and resolve complex system issues related to networking, security, performance, and reliability
- Partner closely with application engineers to enable safe, efficient, and repeatable software delivery
- Lead infrastructure initiatives that improve platform resilience
- Influence DevOps best practices, mentor teammates, and drive continuous improvement across the engineering organization
Requirements
- 5+ years of experience in DevOps, infrastructure, or site reliability engineering
- Experience working in AWS and cloud environments
- Strong experience with containerization and orchestration (Docker, Docker Swarm, Kubernetes, ECS, or similar)
- Experience using Terraform or a similar infrastructure-as-code tool to build new infrastructure and codify existing infrastructure
- Experience building and maintaining CI/CD pipelines (Bitbucket Pipelines, Bamboo, Jenkins, GitHub Actions, or similar tools)
- Strong troubleshooting skills across networking, security, performance, backups, patching, and system reliability
- Experience partnering closely with application engineers in a product-driven organization
- An inclination to make small improvements where possible and large improvements where necessary
- Fluency with AI tooling (MCP, code generation, automated PR review, etc.)
What would be great to have:
- Experience with centralized logging and monitoring tools (Splunk, New Relic, CloudWatch, or similar)
- Exposure to databases, data modeling, data architecture, and an understanding of application performance considerations
- Network design experience
- Experience with security compliance control systems (e.g., Vanta)
- Exposure to serverless technologies (e.g., AWS Lambda)
- Skills in cost optimization for AWS infrastructure
- Experience modernizing or migrating legacy infrastructure
Tech Stack
- AWS
- Cloud
- Docker
- Jenkins
- Kubernetes
- Splunk
- Terraform
Benefits
- Medical: In Tandem pays 100% of the premium for employees AND 99% for all additional family members
- 401k: Up to a 4% match with immediate vesting
- Paid leave for all new parents
- Learning & Development stipend for employees
- Paid Time Off: 11 Holidays + Winter Break (3 Days) + Volunteer Time Off (1 Day) + Floating Holiday (1 Day)
- Personal Time Off:
  - 15 days for 0-1 years of employment
  - 20 days for 1-3 years of employment
- Supportive and flexible working environment – work from anywhere!