Loadsmart is seeking a Senior Site Reliability Engineer to design infrastructure and software platform architecture. The role involves building and maintaining infrastructure automation, troubleshooting issues, and managing databases while adhering to DevOps methodologies.
Responsibilities:
- Design infrastructure, networking, and software platform architecture
- Define platform guidelines, requirements and processes while considering DevOps methodology
- Build and maintain:
  - Infrastructure automation using Infrastructure as Code tools
  - Auditable delivery of infrastructure definitions and changes
  - Automation of Continuous Integration and Continuous Deployment pipelines
  - Developer Experience and Productivity initiatives, service catalogs, and service maturity
  - The application platform used by all engineering teams
  - Multiple Kubernetes clusters
- Design, develop and maintain core systems using common programming languages
- Build and maintain internal tooling used by all engineering teams
- Troubleshoot infrastructure, internal applications, networking, and security issues
- Build and maintain an observability platform, guidelines, and standards
- Define SLIs, SLOs, and SLAs for the internal platform
- Manage backup policies and operations
- Maintain the fleet of databases, including upgrades, security patches, performance analysis, optimizations and troubleshooting
- Conduct security risk assessments, vulnerability scans, and tests, and manage VPNs
Requirements:
- Bachelor's degree or foreign equivalent in Computer Science, Computer Engineering, or Information Technology
- 2 years of experience in the job offered, or 2 years of experience as a Reliability Engineer, Cloud Engineer, Software Engineer, or in a related occupation