Zoom helps people stay connected through its collaboration platform. The company is seeking a DevOps Engineer to enable continuous availability for Team Chat and Zoom Events by managing critical applications and ensuring their stability and security.
Responsibilities:
- Designing and implementing zero-downtime solutions for highly available services (99.999% availability)
- Developing and maintaining disaster recovery (DR) strategies across datacenters in different regions
- Diagnosing and resolving complex production issues, including performance and functional challenges
- Collaborating with vendors, infrastructure teams, and engineering partners to enhance security and service availability
- Administering monitoring tools and infrastructure
- Providing troubleshooting support for outages across systems and Zoom backend services
- Developing and implementing CI/CD pipelines to streamline production deployments and configurations
- Participating in on-call shifts and incident management, and working after hours or on weekends for application releases and deployments
- Being the primary contact for your region, addressing program queries, providing support and updates, and managing relevant content and processes
Requirements:
- Bring 2-3 years of experience as a DevOps Engineer or Site Reliability Engineer (SRE)
- Have knowledge of CI/CD tools and integration
- Demonstrate experience with scripting languages such as Bash, Python, and Groovy
- Be able to perform configuration and deployment management using tools such as Terraform, Ansible, and Jenkins
- Have expertise with containerization tools such as Kubernetes, AWS EKS, and Docker
- Have analytical and troubleshooting skills
- Be willing to learn, be proactive, and think creatively