Claroty, a company focused on securing mission-critical systems, is seeking a Site Reliability Engineer (SRE) to support its FedRAMP-compliant deployment in AWS GovCloud. The SRE will be responsible for ensuring the high availability, security, and compliance of cloud-based environments while driving automation and incident response best practices.
Responsibilities:
- Manage and optimize Claroty’s cloud-based infrastructure in AWS GovCloud, ensuring FedRAMP compliance and high availability
- Monitor and enhance system performance, scalability, and reliability through observability tools, automation, and best practices
- Implement and maintain security controls aligned with FedRAMP, NIST 800-53, and other federal cybersecurity standards
- Develop and manage infrastructure automation using Terraform and Ansible
- Enhance DevSecOps pipelines, automate deployments, and improve system resilience through tools like GitLab CI/CD, Jenkins, and Kubernetes
- Implement and manage monitoring solutions (Prometheus, Grafana, ELK Stack), respond to incidents, and conduct post-mortems
- Configure and maintain VPCs, VPNs, security groups, and firewalls in AWS GovCloud, ensuring compliance with FedRAMP requirements
- Manage the rollout strategy for new technologies and oversee its execution to minimize disruption to existing systems
- Act as the first line of response for critical incidents: assess and triage issues, coordinate with the team to contain the impact, and restore services swiftly
- Monitor system performance metrics closely and detect any degradation early to prevent outages and disruptions
- Perform regular infrastructure upgrades to keep pace with new technologies and changing requirements
- Oversee the release of updates and new functionality, ensuring a seamless transition and mitigating any negative impact on production
- Work closely with DevOps, security teams, developers, and federal stakeholders to maintain a compliant and secure cloud environment
Requirements:
- 6-8+ years of experience in SRE, DevOps, or Cloud Engineering roles
- Hands-on experience with AWS GovCloud, including EC2, EKS, MSK, S3, RDS, IAM, CloudTrail, and CloudWatch
- Strong expertise in Infrastructure as Code (Terraform, Ansible)
- Experience with FedRAMP, NIST 800-53, and cloud security best practices
- Proficiency in Docker and container orchestration with Kubernetes
- Knowledge of Linux system administration and scripting (Python, Bash)
- Experience with logging, monitoring, and observability tools in a cloud-native environment
- Strong troubleshooting and problem-solving skills, with an automation mindset
- U.S. citizenship (required for working in GovCloud environments)