Tria Federal is a company that delivers digital services and technology solutions to support the health and safety of veterans, service members, and civilians. They are seeking a highly skilled Lead DevOps Engineer to join their collaborative team, focusing on delivering virtual infrastructure for health IT solutions using AWS and various DevOps tools.
Responsibilities:
- Use the updates function in our company’s cloud-based performance management software to get weekly, bi-weekly, semi-monthly, or monthly updates from each direct report
- Manage team availability on your program
- Enforce the Cloud Practice SOP (standard operating procedures)
- Be the example by modeling what you want from your staff
- Work Jira/Kanban board tickets; at least 75% of your job is to be technically hands-on
- Help your staff grow; pick someone on the team to mentor and assign staff to mentor other staff
- Be available for all customer calls
- Follow up with your staff to make sure everything is going well and tasks are completed in a timely manner; address injections as they arise
- Make sure all work has a Jira ticket assigned to it
- Make sure all work you and your staff do is well documented, and PRs/commits are attached to Jira tickets
- Uphold the customer SLA; remember that, as people managers, we are responsible for the work and performance of those who report to us
- Respond to emails and Slack messages quickly; if you're in a meeting, have Slack reflect that and check your messages as soon as it's over
- Have a recurring, productive, and meaningful standup aligned with the Kanban workflow; record it and make sure everyone attends
- Represent your team to the customer
- Report performance issues to the practice leadership as soon as they happen
- Make sure all staff are on any all-hands calls
- Report all good news and wins back to the practice
- Report program issues and failures immediately to practice leadership
- Contribute to and help other teams, and encourage your staff to do the same
- Share your work with the practice
- Help grow and mature the practice
Requirements:
- Ability to obtain a U.S. Federal Position of Trust clearance designation
- Must reside in and be able to perform work in the United States
- Must have lived in the United States for 3 of the last 5 years
- Bachelor's degree is required
- Lead-level experience with at least 8 years of overall experience in all the required skills
- 6 or more years of experience working with Kanban and agile tools like Jira, Git, and Confluence
- 8 or more years of hands-on experience with orchestration tools such as Terraform
- Strong hands-on experience with Windows Server administration (2019/2022), including IIS configuration, Windows services, Active Directory integration, and Group Policy management
- Demonstrated experience building and maintaining CI/CD pipelines for .NET applications (ASP.NET, .NET 4.x, 8+) and Node.js applications deployed to AWS ECS and Lambda
- Proven experience with AWS ECS (Fargate and EC2 launch types), including service definitions, task definitions, load balancer integration, auto-scaling, and blue/green deployments
- Hands-on experience with AWS Lambda functions using Node.js and/or Go, including event-driven architectures, API Gateway integration, and Step Functions orchestration
- Experience deploying, configuring, and maintaining Apache NiFi data flow pipelines for healthcare data integration, including dataflow versioning with NiFi Registry
- Proficiency with Terraform for infrastructure-as-code across AWS services including ECS, Lambda, API Gateway, CloudFront, RDS, VPC networking, and IAM
- Strong PowerShell and Bash scripting skills for automation of Windows and Linux administration tasks, deployment workflows, and operational runbooks
- Demonstrated experience leading a DevOps or platform engineering team of at least 2 direct reports, with accountability for team deliverables, performance, and professional growth
- Proven ability to manage work using Kanban methodology, including maintaining Kanban boards, enforcing WIP limits, tracking cycle time and throughput metrics, and driving continuous improvement
- Experience mentoring and growing engineering staff, including establishing technical standards, conducting code reviews, and building a culture of accountability and continuous learning
- Strong customer-facing communication skills with experience representing technical teams to federal program stakeholders, managing expectations, and upholding SLAs
- Track record of balancing hands-on technical work (at least 75% of role) with people management responsibilities, including performance management and corrective action when needed
- Strong experience with Windows Server environments (2019/2022) and solid understanding of Linux systems (CentOS, Red Hat, Amazon Linux), hosts, networks, security, and applications, plus proficiency in shell scripting and PowerShell
- Strong experience with containerization, particularly with ECS (Fargate and EC2 launch types), including deploying .NET and Node.js workloads to ECS
- Strong understanding of Zero Trust Architecture
- Solid understanding of and proven experience with configuration management and automation tools like Ansible, Jenkins, and Terraform, with containerization, and with data integration tools such as Apache NiFi
- Belief in automation for consistent, scalable, and foolproof delivery of infrastructure and applications
- Availability to support production and high-severity issues on weekends or off-hours as required
- Experience managing and mentoring staff, managing expectations and prioritization of work
- Ability to balance many priorities and directions at once and keep the team focused and on schedule
- Experience handling day-to-day tasks while making sure the team not only supports the program but also actively innovates to make it better
- Experience working as a DevOps leader focused on CI/CD and CM tools and modern frameworks, including .NET, Node.js, and serverless architectures in the ecosystem
- Solid hands-on experience working with AWS
- Strong experience with serverless resources, particularly AWS Lambda (Node.js and .NET runtimes), API Gateway, and CloudFront
- Solid expertise troubleshooting and managing both Windows Server and Linux systems
- Hands-on experience in using Ansible, PowerShell, or Python
- Demonstrated ability to come up with system designs, architecture, process flows, and Concept of Operations for large, complex systems
- Should have an AWS Professional certification or be ready to obtain a certification within 60 days from the date of joining
- Please provide any code on GitHub or a similar platform that you have worked on personally or in the open-source community
- Experience with tools such as Jenkins to enable CI/CD
- Experience with Splunk is a plus
- Knowledge of healthcare, Medicare, and Medicaid systems and data is a plus