CyberArk, a Palo Alto Networks company, is the global leader in identity security, and they are seeking a FedRAMP Staff Site Reliability Engineer to join their team. The role involves architecting and designing automation for cloud-based infrastructure while providing guidance on reliability and performance in SaaS environments.
Responsibilities:
- Architect, lead, and design future deployment and management automation for CyberArk’s cloud-based infrastructure and software
- Provide guidance to Site Reliability Engineers on managing the reliability and performance of SaaS environments and building automation to prevent recurring issues
- Architect, develop, and guide the team in the use of cloud configuration management, deployment, and compliance tools such as CloudFormation, Helm, Kubernetes, Terraform, Salt, and Ansible across both Windows and Linux environments
- Ensure cloud-based architecture meets availability and recoverability requirements
- Implement best practices for cloud-based monitoring, alerting, and observability using tools such as PagerDuty, CloudWatch, Grafana, Datadog, and OpenSearch
- Support and guide tooling initiatives that enhance team output and reliability
- Develop and continuously improve automation of manual processes
- Collaborate with engineering and product teams to identify areas for improvement, prepare architecture roadmaps, and advocate for them to the Product Management group
- Respond to production incidents and participate in on-call rotations
Requirements:
- B.S. in Computer Science or equivalent experience
- Minimum of 5 years of experience managing AWS infrastructure
- Minimum of 7 years in a senior, architect, or technical lead role in site reliability, systems engineering, or software development
- A deep understanding of Site Reliability, infrastructure, and Cloud Platforms
- Solid understanding of, and experience with, web services, databases, and related infrastructure and architectures
- Previous experience with FedRAMP or DoD compliance requirements and audits
- Strong level of scripting and automation expertise, using Python or an equivalent language
- Proven track record of managing reliability and performance for large-scale, enterprise-level SaaS environments
- Strong analytical and problem-solving abilities, with a proactive approach to identifying and mitigating issues
- Extensive experience designing and managing AWS infrastructure components, including VPC, ELB/ALB, IAM, KMS, EC2, Route 53, AWS Config, CloudTrail, and CloudFormation, across both AWS commercial and GovCloud Regions
- Must be a U.S. citizen or Green Card holder to meet FedRAMP High authorized access requirements