Clay is a company focused on helping organizations turn growth ideas into reality through creativity and unique data solutions. They are seeking a Site Reliability Engineer to join their infrastructure team, responsible for building and maintaining robust infrastructure solutions while ensuring system reliability and performance.
Responsibilities:
- Architect, design, implement, and manage robust, scalable, and secure infrastructure solutions
- Develop, maintain, and enforce best practices for CI/CD, infrastructure as code, and automation
- Oversee the management and optimization of cloud infrastructure, ensuring high availability, performance, and cost efficiency
- Implement monitoring, logging, and alerting solutions to maintain system health and quickly resolve issues
- Lead incident response efforts, troubleshooting and resolving complex issues in a timely manner
- Participate in an on-call rotation
- Work with teams across the company to ensure we achieve the right balance of developer velocity, reliability and performance, and cost efficiency
Requirements:
- 5+ years of experience
- Experience with containerization and orchestration tools
- Strong understanding of CI/CD concepts and tools
- Knowledge of infrastructure automation tools
- Experience with on-call and incident response
- Proficiency in one or more programming languages
- Familiarity with our stack or ability to learn unfamiliar technologies quickly: Aurora Postgres RDS, ElastiCache Redis, Docker + ECS, Lambda, OpenSearch, Terraform and Atlantis, CircleCI, Netlify, Playwright, CloudWatch, Datadog, Mezmo, TypeScript, Python