Peraton is a next-generation national security company that drives missions of consequence spanning the globe. They are seeking a Senior AWS Cloud Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of their cloud infrastructure on AWS, collaborating with cross-functional teams for seamless software releases and continuous improvement.
Responsibilities:
- Design, implement, and manage infrastructure as code (IaC) solutions using tools like AWS CloudFormation, Terraform, or Helm Charts to automate continuous database deployment and scaling processes
- Implement robust monitoring and alerting systems to proactively identify and address potential issues before they impact system performance
- Conduct performance analysis and optimization of AWS infrastructure components to enhance system efficiency and reduce latency
- Participate in on-call rotations to respond to and resolve incidents promptly
- Work closely with security teams to implement and enforce best practices for securing AWS environments
- Facilitate clear communication across teams, providing updates on release status, known issues, and any potential impact on stakeholders
- Collaborate with development, QA, and operations teams to plan and coordinate database schema releases
- Develop and maintain automated deployment pipelines using industry-standard tools such as GitLab CI/CD, Liquibase, or similar
- Proactively identify areas for process improvement within the release management lifecycle
- Collaborate with QA teams to establish and execute release validation procedures
Requirements:
- Bachelor's degree and 8 years of experience, or a high school diploma and 12 years of experience
- Proven experience as a Site Reliability Engineer or in a similar role, with a strong emphasis on relational databases
- In-depth knowledge of AWS services like RDS and DynamoDB and expertise in managing cloud infrastructure
- Advanced-level programming and/or scripting in 3 or more of the following languages and tools: Python, Java, Chef, Helm, Playwright, Bash, JavaScript, Terraform
- Strong understanding of DevOps principles and continuous integration/continuous deployment (CI/CD) pipelines
- Proficiency in CI/CD tools such as GitLab CI/CD, Liquibase, or others
- Familiarity with infrastructure as code (IaC) tools like CloudFormation, Terraform, Helm Charts, or similar technologies
- Hands-on experience with version control systems (GitLab, GitHub, AWS CodeCommit) and branching strategies
- Experience with containerization and orchestration tools (e.g., Amazon Elastic Container Service (ECS), Amazon Elastic Kubernetes Service (EKS), Docker, Kubernetes)
- Familiarity with monitoring tools (e.g., CloudWatch, Prometheus, Grafana, Datadog) and log analysis
- Attention to detail, with a focus on maintaining high-quality software releases
- Solid understanding of Agile methodologies and their application in release management
- Excellent problem-solving and troubleshooting skills
- Strong communication and collaboration skills
- Must be a US Citizen
- Must be able to obtain and maintain the required agency clearance (6C Public Trust)
- Relevant certifications in DevOps or related fields are a plus
- High Risk Public Trust or Secret Clearance preferred
- 3 or more years in an SRE or Platform Engineering group supporting high-availability, critical platforms or applications
- 2 or more years managing relational databases