Oversee the architecture, engineering, and day-to-day operations of our OpenShift and Kubernetes clusters on GCP and Azure, ensuring high availability, security, and scalability.
Manage relationships with key technology vendors, negotiating contracts and ensuring optimal utilization of their products and services.
Manage budgets and resources effectively to ensure cost-effectiveness and optimal return on investment, and prioritize technical debt to mitigate risk and maintain infrastructure health.
Take full ownership of the platform's service delivery, including defining, monitoring, and meeting Service Level Agreements (SLAs), Service Level Objectives (SLOs), and mean-time (MTTx) metrics such as Mean Time to Resolution (MTTR), to ensure the platform meets the needs of our internal customers.
Drive automation initiatives to streamline operations, reduce manual tasks, and improve overall efficiency. This includes developing and enforcing best practices for Infrastructure as Code (IaC).
Collaborate with cross-functional teams to define the platform's strategic direction and roadmap, aligning its capabilities with business needs and user requirements.
Serve as an escalation point for complex technical issues, leading troubleshooting efforts and post-mortem analyses to prevent future occurrences.
Requirements
5+ years of proven experience in a platform engineering, DevOps, or site reliability engineering role.
Bachelor’s degree in Computer Science, Information Technology, or a related technical field.
Deep technical expertise with container orchestration platforms, specifically OpenShift and Kubernetes.
Hands-on experience managing and operating cloud infrastructure on GCP and/or Azure.
Strong understanding of Infrastructure as Code principles, tools, and practices (e.g., Terraform, GitOps).
Experience in managing relationships with enterprise software vendors.
A strong understanding of service management principles, including SLAs, SLOs, and incident management.
Excellent leadership, communication, and interpersonal skills, with the ability to collaborate effectively with both technical and non-technical stakeholders.
Tech Stack
Azure
Cloud
Google Cloud Platform
Kubernetes
OpenShift
Terraform
Benefits
Immediate medical, dental, vision and prescription drug coverage
Flexible family care days, paid parental leave, new parent ramp-up programs, subsidized back-up child care and more
Family building benefits including adoption and surrogacy expense reimbursement, fertility treatments, and more
Vehicle discount program for employees and family members, plus management leases
Tuition assistance
Established and active employee resource groups
Paid time off for individual and team community service
A generous schedule of paid holidays, including the week between Christmas and New Year’s Day
Paid time off and the option to purchase additional vacation time