Serve as a key operational owner for the health and stability of our commercial product portfolio
Ensure effective incident response, support operations, and continuous improvement across products in production
Act as the bridge between support, engineering, and product teams
Lead major incident response, coordinating efforts across L1/L2 support partners, Engineering, Platform Operations, and vendors
Own root cause analysis and problem management
Analyze incident trends, support metrics, and operational data to identify opportunities for improvement and increased stability
Define and enhance monitoring and alerting capabilities
Ensure production changes meet operational readiness and governance standards
Establish and maintain operational documentation, runbooks, and support processes
Communicate operational status, risks, and improvements clearly to stakeholders
Requirements
4+ years of experience in application support, product operations, SRE, DevOps, or similar roles
Experience managing production environments, incident response, and support models (L2/L3)
Strong understanding of modern application architectures, including APIs, cloud platforms (Azure, AWS, or GCP), and data systems (e.g., Snowflake, SQL-based platforms)
Experience working with monitoring and logging tools to diagnose and resolve issues
Familiarity with incident, change, and problem management processes
Ability to perform light technical work such as scripting, configuration updates, or database querying when needed
Tech Stack
AWS
Azure
Cloud
Google Cloud Platform
SQL
Benefits
We are passionate about developing our people through career development and progression