managing the hybrid infrastructure roadmap, balancing in-house and managed service capabilities, and embedding automation-first, resilient design from the start.
Ensure operational reliability and service-aware operations
overseeing Site Reliability Engineering, uptime, performance, and proactive observability to reduce operational risk and toil and to enable self-healing platforms.
Develop people and partnerships
leading and coaching our high-performing teams, building our internal team’s capabilities, and managing relationships with strategic partners to enable service improvement.
Champion continuous improvement and transformation
driving DevOps, automation, and modern Cloud practices, identifying opportunities to reduce cost and complexity, and fostering a culture of learning, experimentation, and safe challenge for all.
Requirements
Degree in a technology or computing discipline
Extensive experience leading hybrid platform operations (cloud/on-prem)
Proven track record of leading SRE, DevOps and infrastructure automation practices
Experience managing network and platform engineering teams
Strong working knowledge of platform observability and telemetry tools
Experience managing outsourced and in-house technical teams
Understanding of ITIL processes (esp. Change, Incident, Problem)
Familiarity with containerised environments and Infrastructure as Code
Relevant professional certifications, e.g. Azure, AWS, ITIL, SRE (desirable but not essential)
Tech Stack
AWS
Azure
Cloud
Benefits
Full private healthcare with no excess
26 days' leave plus Bank Holidays, rising with service, with the option to swap Christmas and Easter holidays for those celebrated by your religion
A flexible working culture
Competitive pension scheme – we double-match your contributions up to 6%