Lead the design and evolution of our ML platform, infrastructure and MLOps capability
Build and maintain scalable, reliable and secure systems for model training, testing, deployment, monitoring and lifecycle management
Develop the infrastructure and tooling that enable ML Engineers, Data Scientists and Researchers to work efficiently and ship models with confidence
Design robust workflows for CI/CD, model versioning, reproducibility, experimentation, feature management and release management
Own and improve the production environment for machine learning systems, ensuring strong standards for availability, performance, observability and resilience
Define and implement monitoring across model and platform layers, including system health, data quality, drift, latency, throughput and cost efficiency
Build or optimise internal self-service tooling and platform capabilities to reduce friction for teams working on ML use cases
Partner closely with ML, Data, Software and Platform Engineering teams to productionise models and improve the end-to-end ML development lifecycle
Support the scaling of infrastructure for both training and inference workloads, including high-throughput, real-time or compute-intensive use cases where relevant
Drive best practice in governance, security, compliance, auditability and operational rigour across the ML lifecycle
Improve the efficiency and cost-effectiveness of ML systems, including cloud resource usage, compute environments and deployment patterns
Mentor engineers and act as a technical leader across ML platform and operations topics
Help define the roadmap for ML enablement, ensuring the platform can support current needs while scaling for future growth
Requirements
Proven experience in a senior MLOps, ML Platform, ML Infrastructure, Platform Engineering or Machine Learning Systems role
Strong hands-on background in software engineering and cloud infrastructure, ideally with direct experience supporting production machine learning environments
Experience building and operating systems that support the full ML lifecycle, from experimentation and training through to deployment and monitoring
Strong knowledge of Python and sound engineering principles, including testing, automation and code quality
Strong experience with cloud platforms such as GCP
Experience with Docker, Kubernetes and modern containerised deployment patterns
Strong experience with CI/CD pipelines, infrastructure-as-code and workflow orchestration
Experience with tools such as Airflow or similar platform and orchestration technologies
Good understanding of model observability, data quality, feature pipelines, lineage and reproducibility
Experience designing scalable infrastructure for ML workloads, including training, batch inference and real-time serving
Strong appreciation of reliability, security, governance and operational excellence in customer-facing or production-critical systems
Ability to operate across both strategic and hands-on technical work
Strong communication skills and the ability to work effectively across engineering, product and data teams
Nice-to-haves
Experience supporting computer vision, deep learning, LLM or other compute-intensive ML workloads
Familiarity with feature stores, model registries and automated retraining pipelines
Experience building internal developer platforms or self-service ML tooling
Experience in regulated, high-security or high-availability environments
Experience leading or mentoring engineers in a scale-up or high-growth technology business
Familiarity with responsible AI, model governance or risk controls in production ML settings
Tech Stack
Airflow
Cloud
Docker
Google Cloud Platform
Kubernetes
Python
Benefits
25 days Annual Leave, plus 8 Bank Holidays (more holiday with service: up to an extra 5 days off per year based on your continuous service)
Growth Shares allocated after passing probation (6 months of service)
Salary sacrifice schemes including: Pension, Cycle To Work and Electric Car Scheme
Nursery Sacrifice Scheme
Work Overseas Perk: work globally for up to 2 weeks
Life Assurance
SmartHealth: access to a private GP, Psychologist and Nutritionist, along with tailored fitness plans for both you and your family
Personalised 1:1 career coaching with our in-house Occupational Psychologist
Award-winning L&D platform with personally allocated training budgets
Enhanced paid family leave
Pension: 5% employee, 3% employer
Flexible hybrid working environment
Free barista coffee and tea, biscuits and fruit in the WeWork office
Free access to WeWork discounts and free online well-being sessions
Vitality Health: a range of options is available, listed below. The Vitality Programme includes a number of reward benefits that all employees have access to as part of the plan, for example:
Private Health cover including Dental, Optical and Audiology
50% off monthly gym memberships
Apple Watches significantly discounted based on member Vitality status
Half price trainers with Runners Need
Weekly rewards – Free coffee with Caffè Nero
Monthly rewards – Free Cinema ticket
Discounts on travel with Expedia (hotels) and Mr & Mrs Smith, with discounts increasing throughout the year based on a member's Vitality status
Free months of Amazon Prime based on activity
Up to 25% cashback at Waitrose when buying healthy foods
75% off stays at Champneys Health Spas
Allen Carr’s £299 stop-smoking programme for free
Access to Vitality Healthy Mind with 30% off Headspace subscriptions and the ability to earn Vitality points for using Buddhify, Calm and Headspace
Discounts on Weight Watchers
50% to 80% off comprehensive private health screenings