Zscaler is a pioneer and global leader in zero trust security. The company is seeking a Senior Site Reliability Engineer to own all aspects of Zscaler's production data center services, ensuring their availability, performance, and efficiency.
Responsibilities:
- Apply expertise in networking principles, firewalls, and load balancing solutions to ensure robust infrastructure performance
- Partner with Software Engineering and Infrastructure teams to design, implement, and deploy comprehensive end-to-end monitoring solutions
- Execute seamless patches and upgrades, ensuring all administrative tools and utilities remain current and high-performing
- Proactively monitor applications and services, participating in an on-call rotation to resolve issues and implement strategic prevention measures
- Troubleshoot complex technical challenges and provide clear, candid communication regarding issues and their resolutions
Requirements:
- 3+ years of experience in 24/7 SRE/NOC operations, production cloud platforms, and related automation workflows
- Proficiency with a programming language such as Python or Go, and scripting expertise in Bash
- Demonstrated ability to adapt to fast-paced environments with the flexibility to support after-hours and weekend deployments
- Solid understanding of standard networking protocols and concepts, including HTTP, DNS, TCP/IP, ICMP, the OSI model, subnetting, and load balancing
- U.S. citizenship is required due to the nature of the customers assigned to this role
- Prior experience specifically within a NOC or SRE on-call environment
- Proven track record in CI/CD and the development of automation frameworks
- Deep understanding of SRE fundamentals, including implementation of the four Golden Signals (latency, traffic, errors, and saturation)