Lead, mentor, and develop a team of L2/L3 Production Support Engineers.
Define, track, and optimise key operational metrics, including Mean Time to Resolution (MTTR) and Mean Time Between Failures (MTBF).
Lead the diagnosis and resolution of application-level issues using software development techniques and best practices.
Establish and refine incident management processes. Lead critical incident resolution and coordinate cross-functional response efforts.
Champion rigorous root cause analysis (RCA) practices across the team.
Identify opportunities to streamline support workflows, reduce manual effort through automation, and eliminate toil.
Serve as the primary technical contact for internal and external stakeholders.
Maintain oversight of production SaaS platforms, infrastructure stability, and system performance.

Minimum of 6 years in production support, DevOps, or Site Reliability Engineering roles, with at least 3 years leading or mentoring technical teams.
Proven experience troubleshooting application code issues using software development techniques: debuggers, profilers, log analysis, code review, and systematic problem-solving methodologies.
Demonstrated expertise building and scaling metrics-driven teams. Evidence of implementing or improving MTTR, MTBF, or similar KPIs with measurable results.
Strong background supporting SaaS/cloud-native production systems in high-availability, high-traffic environments.
Hands-on experience with incident management frameworks (ITIL, blameless post-mortems, RCA methodologies).
Excellent communication skills: ability to distil technical complexity for non-technical stakeholders and present data-backed insights to leadership.
Experience with containerised systems (Docker, Kubernetes) or cloud platforms (AWS, Azure, Google Cloud) is a plus.
Familiarity with observability tools (Datadog, New Relic, Splunk, Prometheus/Grafana) and APM instrumentation is a plus.
Background in software development or systems engineering (demonstrable coding ability in at least one language) is a plus.
Formal leadership or management certification, or Agile/Scrum experience, is a plus.
Located in or willing to work from the York, UK office.

To learn more about our values, mission, and the wide range of perks offered to employees at Comply, visit https://www.comply.com/careers/.

Production Support Manager
