CyberArk, a Palo Alto Networks company, is the global leader in identity security, trusted by organizations around the world to secure human and machine identities in the modern enterprise. The Senior Site Reliability Engineer will support CyberArk’s AWS infrastructure, manage reliability and performance of SaaS environments, and develop automation to enhance operations.
Responsibilities:
- Support CyberArk’s AWS infrastructure and software
- Measure and manage the reliability and performance of SaaS environments, and build automation to prevent problems from recurring
- Use cloud configuration management, deployment and compliance tools such as CloudFormation, Helm, Kubernetes, Terraform, Salt and Ansible across both Windows and Linux environments
- Ensure cloud-based architectures meet availability and recoverability requirements
- Implement AWS best practices and cloud-based monitoring, alerting, and observability (PagerDuty, CloudWatch, Grafana, Datadog, and OpenSearch)
- Support and guide tooling initiatives that enhance team output and reliability
- Develop and continuously improve automation of manual processes
- Work with engineering and product teams to identify areas of improvement
- Respond to production incidents and participate in on-call rotations
Requirements:
- B.S. in Computer Science or equivalent experience
- Minimum of 5 years of experience managing AWS infrastructure
- Minimum of 3 years in a senior, architect, or technical lead role in site reliability
- Advanced knowledge of Ansible automation, Kubernetes, and CI/CD
- Advanced knowledge of Infrastructure as Code (Terraform, CloudFormation)
- Solid understanding of and experience with web services, databases, and related infrastructure and architectures
- Proven track record of managing reliability and performance for large-scale, enterprise-level SaaS environments
- Strong scripting and automation expertise, using Python or an equivalent language
- Strong analytical and problem-solving abilities, with a proactive approach to identifying and mitigating issues
- Previous experience with FedRAMP or DoD compliance requirements and audits
- Deep understanding of site reliability, infrastructure, and cloud platforms
- Experience supporting an enterprise-level SaaS environment
- Previous experience using or managing CyberArk SaaS products is a plus