About this role

Scalence L.L.C. is a technology company that develops environments, applications, and tools for clients. We are seeking a dedicated Site Reliability Engineer to bridge development and operations and to keep our systems scalable, reliable, and efficient by automating operational work and managing complex deployments.

Responsibilities:

Write and maintain robust Shell scripts to automate system tasks
Develop and enhance features within our core application using Python
Take ownership of the production environment by managing deployments using Spinnaker
Ensure that all code moving to production is stable and performant
Conduct thorough code reviews for both infrastructure-as-code and application logic to maintain high standards of reliability and security
Manage and provision cloud infrastructure using Pulumi
Maintain and scale our containerized workloads using Kubernetes and Docker
Utilize Git and GitHub for collaborative development, branching strategies, and CI/CD integration

Requirements:

5+ years in an SRE, DevOps, or Systems Engineering role
Proficiency in shell scripting for system automation (required)
Strong Python programming skills with the ability to contribute to application-level codebases
Deep understanding of containerization (Docker) and orchestration (Kubernetes)
Experience performing code reviews and managing high-stakes deployments in production environments
A data-driven approach to troubleshooting and system performance optimization

Site Reliability Engineer (with Python coding)
