Automate and improve reliability while continuing to push the ARINCDirect infrastructure forward, ensuring it is resilient and reproducible.
Be responsible for service availability, performance, monitoring, incident response, and capacity planning.
Create, improve, and manage environments to ensure decisions on resource allocation, problem identification, and capacity planning are based on accurate data-driven insights.
Maintain physical infrastructure running Linux.
Help drive the move toward Kubernetes and declarative infrastructure.
Influence technology decisions and direction to grow and support the ARINCDirect platform.
Collaborate closely with fellow SREs on your team and extend your collaboration across other teams and disciplines to design dependable and scalable solutions and services.
Identify, implement, and champion process improvements to enhance productivity, collaboration, and delivery efficiency, while ensuring alignment with company goals and industry best practices.
Requirements
Typically requires a degree in Science, Technology, Engineering, or Mathematics (STEM) and a minimum of 8 years of prior relevant experience; or an advanced degree in a related field and a minimum of 5 years of experience; or, in the absence of a degree, 12 years of relevant experience.
Must be authorized to work in the U.S. without sponsorship now or in the future.
Experience as an SRE, Platform Engineer, or in a related position within a Linux or UNIX environment, working on large, complex infrastructures and/or projects using Docker and Kubernetes solutions.
Experience automating configuration and infrastructure with tools such as SaltStack, Ansible, Terraform, or other declarative languages.
Experience with hardware, including servers, network switches, and cabling.
Experience managing infrastructure using GitOps with continuous delivery (CD) pipelines.
Established proficiency in at least one (ideally more) of the following: Python, Linux Shell (bash, awk, sed).
Experience with PostgreSQL or an equivalent RDBMS, and with SQL in general.
Familiarity with cloud infrastructure, ideally AWS.
Understanding of SRE principles, including building observability solutions and exposing metrics to inform SLOs and KPIs.
Understanding of how IT infrastructure services work, including DNS, DHCP, LDAP, and NFS.
Understanding of network segmentation, routing, and VPNs.
Tech Stack
Ansible
AWS
Cloud
DNS
Docker
Kubernetes
Linux
NFS
Postgres
Python
RDBMS
SaltStack
SQL
Terraform
Unix
Benefits
Medical, dental, and vision insurance
Three weeks of vacation for newly hired employees
Generous 401(k) plan that includes employer matching funds and a separate employer retirement contribution, including a Lifetime Income Strategy option
Tuition reimbursement program
Student Loan Repayment Program
Life insurance and disability coverage
Optional coverages you can buy: pet insurance, home and auto insurance, additional life and accident insurance, critical illness insurance, group legal, ID theft protection
Birth, adoption, and parental leave benefits
Ovia Health, fertility, and family planning
Adoption Assistance
Autism Benefit
Employee Assistance Plan, including up to 10 free counseling sessions