Parsons is looking for an amazingly talented Senior Site Reliability Engineer to join our team! In this position, you will play a crucial role in ensuring the reliability, scalability, and performance of our systems while collaborating closely with development and operations teams.
Responsibilities:
- Design, implement, and maintain scalable and reliable infrastructure solutions
- Develop and manage monitoring and alerting systems to ensure system health and performance
- Collaborate with development teams to improve system architecture and deployment processes
- Automate repetitive tasks to improve efficiency and reduce human error
- Troubleshoot and resolve complex system issues, ensuring minimal downtime
- Document processes, systems, and configurations to ensure knowledge sharing and continuity
Requirements:
- Active Secret clearance
- Bachelor's degree from an accredited college or university or equivalent experience
- A minimum of 6 years of experience in a DevOps or related role
- Current Security+ certification
- Proficient in Linux/Unix systems administration, cloud computing (AWS, Azure, GCP), automation (Python, Go, Bash, Ansible, Terraform), monitoring and alerting (Prometheus, Grafana, Datadog), and CI/CD pipelines
- Experience with incident management, performance tuning, capacity planning, and security best practices
- Strong problem-solving skills and attention to detail
- Excellent communication and collaboration skills, with the ability to work effectively in a team environment, and a willingness to continuously learn and adapt to the ever-evolving technology landscape
- Knowledge of database management and optimization
- Previous experience in a DevOps or Agile environment