OnTrac, a leading provider of same-day and next-day delivery services in the U.S., is seeking a Senior GCP Cloud Platform Engineer. This role involves transitioning the cloud environment to fully automated infrastructure, maturing the CI/CD environment, and standardizing Terraform code and Ansible playbooks to ensure a resilient and secure cloud infrastructure.
Responsibilities:
- Lead the transition from manual infrastructure management to a fully automated CI/CD lifecycle in Azure DevOps, moving from the current state to push-button deployments
- Own the Terraform and Ansible ecosystem to ensure changes are made through code, reducing manual tweaks and eliminating configuration drift
- Act as the senior voice within the engineering team to establish standards, share best practices, and guide decisions on adopting new tools
- Help architect the next phase of infrastructure by moving services into modern, elastic environments that support high availability and self-healing
- Lead container strategy by selecting workloads ready for orchestration, setting image standards, and ensuring consistent deployment patterns
- Define critical system health metrics with the team and participate in the on-call rotation to keep 24/7 delivery services operational and resilient
Requirements:
- 5+ years in Cloud Infrastructure, Systems Engineering, or Platform Operations
- Hands-on experience managing Google Cloud (GCP) environments, specifically Shared VPCs, IAM hierarchies, and Cloud SQL
- Strong Linux administration skills; a 'Linux-first' troubleshooter comfortable at the command line when deployments fail
- Proven experience building and managing automated release flows in Azure DevOps (e.g., Pipelines/Releases/Boards)
- High proficiency with Terraform and Ansible, including organizing existing code into a professional, modular library
- Strong understanding of the 'why' and 'how' behind containerization, including evaluating when workloads should move to a managed container platform vs. staying on VMs
- Experience designing and managing auto-scaling compute environments and container orchestration to handle variable traffic without manual intervention
- Ability to understand application dependencies and troubleshoot build/deployment errors to unblock developers in mixed environments