Alpha Consulting Corp. is seeking a Site Reliability Engineer to join its infrastructure team. In this role, you will bridge the gap between development and operations, keeping systems scalable, reliable, and efficient by automating system tasks and managing deployments.
Responsibilities:
- Write and maintain robust Shell scripts to automate system tasks
- Develop and enhance features within our core application using Python
- Take ownership of the production environment by managing deployments using Spinnaker
- Ensure that all code moving to production is stable and performant
- Conduct thorough code reviews for both infrastructure-as-code and application logic to maintain high standards of reliability and security
- Manage and provision cloud infrastructure using Pulumi
- Maintain and scale our containerized workloads using Kubernetes and Docker
- Utilize Git and GitHub for collaborative development, branching strategies, and CI/CD integration
Requirements:
- 5+ years of experience in an SRE, DevOps, or Systems Engineering role
- Proficiency in Shell scripting for system automation (required)
- Strong Python programming skills with the ability to contribute to application-level codebases
- Deep understanding of containerization (Docker) and orchestration (Kubernetes)
- Experience performing code reviews and managing high-stakes deployments in production environments
- A data-driven approach to troubleshooting and system performance optimization