Close is a bootstrapped, profitable company building a communication-focused CRM for small, scaling businesses. They are seeking a Site Reliability Engineer to join their Infrastructure Team. The role is responsible for building and maintaining robust systems that support mission-critical applications, and for ensuring high availability and performance of their services.
Responsibilities:
- Fully automating our databases' lifecycles with Argo Workflows
- Eliminating all static credentials, wherever they exist
- Reducing downtime and disruption caused by maintenance or disasters to new lows
- Improving our multi-region disaster recovery system
Requirements:
- For Senior 1 and Senior 2 level candidates: 5+ years of experience building modern infrastructure systems
- For Staff level candidates: 8+ years of experience building modern infrastructure systems
- You are respected as an expert on the systems you run
- You have been the final point of escalation in support of mission-critical production systems
- Familiarity with some of the following technologies: AWS, Terraform, Kubernetes, Ansible, MongoDB, PostgreSQL, Elasticsearch
- Strong grasp of common networking and data transfer protocols such as DNS, HTTP, TCP
- Able to speak and write in English
- Located in the USA (ET, CT, MT, PT)
- Contributions to open source code related to our tech stack
- Experience maintaining very large databases
- Experience with successful disaster response
- Experience with multi-region architectures
- Experience running MLOps systems
- Experience scaling Temporal