Leidos, a leading technology company, is seeking a Principal Site Reliability Engineer to support its MDEC business. The role provides technical leadership and strategic guidance throughout the application lifecycle, ensuring that customer applications align with government objectives.
Responsibilities:
- Coordinate, lead, and support developers, operations staff, product owners, and key stakeholders in implementing Service Level Objectives, Service Level Indicators, and Incident Management processes
- Serve as a primary technical resource for DevOps strategy, automation, and deployment standards, guiding development and operations teams
- Design, build, and maintain robust and scalable CI/CD pipelines for effective and efficient code releases
- Implement Infrastructure as Code (IaC) using tools such as Terraform or Ansible
- Automate configuration management to promote mature change management and release strategies
- Establish relevant and actionable monitoring and alerting for microservice-based applications and infrastructures
- Collaborate with other teams to ensure smooth and secure build, release, and deployment processes
- Integrate DevSecOps principles and automated security tools into CI/CD pipelines
- Advise on and support the troubleshooting of infrastructure and deployment issues
- Champion a DevOps culture, evaluate new technologies, and stay updated on industry trends
- Support current and future complex and demanding customers with an established team, providing coaching and guidance on DevOps best practices
- Develop and integrate custom software solutions that leverage automated deployment technologies
- Develop, prototype, and deploy solutions within Commercial/Government Cloud Solutions leveraging Infrastructure platform services
- Coordinate closely with team members, Product Owners, and Scrum Masters to ensure User Stories align with, and are implemented against, customer use cases
- Analyze infrastructure needs driven by developed software to meet customer mission needs, utilizing proof-of-concept, performance, and end-to-end testing
- Support the Agile software development lifecycle following Program SAFe practices
- Document and maintain containerized images and deployment artifacts across different environments, leveraging GitFlow constructs
- Communicate key project data to team members and build team cohesion and effectiveness
- Use the Atlassian tool suite (Jira, Confluence) to track activities
- Identify and apply best practices and standard operating procedures
- Architect innovative solutions to meet the technical needs of customers
Requirements:
- Must be able to obtain multiple clearances
- Must be a US citizen
- Master's degree with 15+ years of prior relevant experience. Work experience will be considered in lieu of a degree
- Proficient with common Agile practices, service-oriented environments, and development practices
- 3-5 years of experience with at least one programming language: Python, Java, or C
- Must have experience in software architecture
- Expert in the software development lifecycle, with experience delivering within DevOps toolsets/practices
- Experience designing and managing scalable CI/CD pipelines using tools such as Jenkins or GitLab CI
- Direct experience with software performance testing tools
- Experience with scripting languages such as Python or Bash
- Experience working in an Agile development environment and tempo
- Solid knowledge of Databricks, Kubernetes/OpenShift, and AWS cloud services
- Experience with Docker, Terraform, and Helm
- Proficiency with Linux/Unix systems administration
- Working knowledge of monitoring and logging tools (Prometheus, Kibana, CloudWatch)
- Experience with database technologies (SQL Server, Postgres)
- Excellent communication skills (written and verbal)
- Well versed in version control systems (Git)
- Well versed in issue/problem tracking systems (Jira)
- AWS certification
- Experience with AWS data management services
- Experience developing with cloud data services (S3, RDS, EFS)
- Experience with RESTful APIs, JSON, and AJAX
- Active DOJ or DoD clearance (or ability to obtain)