Maintain and operate a cloud infrastructure platform to support a SaaS software product
Maintain a Machine Learning experiment and development environment infrastructure
Work with internal teams to monitor and deploy production, development, and research environments
Improve and maintain good practices and processes for deployment, security, and reliability at scale
Operate and maintain the cloud infrastructure deployed within our AWS environment
GitOps deployments of software
CI/CD pipelines
Maintaining and monitoring Kubernetes clusters
Security maintenance and best practice implementation
Support internal teams with solutions for their deployment and infrastructure requirements
Requirements
2+ years of industry experience in applied DevOps
Strong written and verbal communication skills
Strong proficiency with Linux operating systems
Experience with cloud providers (AWS preferred), including cost monitoring and management
Expert Git skills
Maintaining CI/CD pipelines
Monitoring and logging frameworks (CloudWatch/CloudTrail, Prometheus, Grafana, etc.)
Experience working with Kubernetes in a live production environment
Experience of GitOps working methodology and deployment strategies in a live production environment
Experience with IaC tooling at scale, Terraform/OpenTofu preferred
Ability to use scripting languages for complex operations, e.g. Bash
Understanding of software deployment and ability to read and understand Python code
Experience with Machine Learning model management e.g. model registries, packaging, and deployment
Experience with Machine Learning model training infrastructure and logging (MLflow preferred)
Experience working with validated software platforms with strict versioned release cycles
Experience producing maintainable, testable, documented, and production-grade code
Tech Stack
AWS
Cloud
Grafana
Kubernetes
Linux
Prometheus
Python
Terraform
Benefits
A comprehensive benefits package that includes an annual bonus plan, private medical insurance, life insurance, and a contributory pension scheme
25 days annual leave, plus bank holidays and enhanced maternity leave
A diverse work environment that brings together experts in many fields, including software engineering, DevOps, data science, machine learning, quality assurance, regulatory affairs, and clinical operations