Partnerize is on a mission to transform the way businesses grow with their partnership automation platform. They are seeking a Site Reliability Engineer to join their Technical Operations team, responsible for building infrastructure, delivering projects, and ensuring system availability, scalability, and security.

Responsibilities:

Provide primary operational support and engineering for multiple large, distributed software applications
Measure and optimise system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
Build software and systems to manage platform infrastructure and applications
Improve reliability, quality, and time-to-market of our suite of software solutions
Partner with development teams to improve services through rigorous testing and release procedures
Participate in system design consulting, platform management, and capacity planning
Work closely with development and tech teams to ensure technical issues and projects are correctly managed
Deliver small and large technical projects in a timely manner
Act as an escalation point for support incidents and assignments while maintaining a high level of quality
Be responsible for continuous improvement, continuous delivery and continuous integration
Participate in the on-call rotation

Requirements:

Understanding of databases (MySQL, PostgreSQL, Redis)
Understanding of scripting languages (Python, Bash, PHP)
Knowledge of automated deployment and configuration management for platforms or applications (Ansible, Docker, Terraform, etc.)
An awareness of AWS and other cloud infrastructure services
Linux system administration skills
Ability to troubleshoot, diagnose and solve issues independently
Self-learner with the ability to document learning as experience is gained
Experience as part of a team supporting and maintaining an infrastructure
A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
Ability to prioritise workload, including occasional incident and problem management
Knowledge of and experience with ITIL practices
An interest in development, new technologies and innovation
Experience supporting development teams in refactoring technical debt
Kubernetes, Docker, and containers
Experience with monitoring systems (e.g. Zabbix, Nagios)
Experience with JIRA and Confluence
Nginx or web server technologies
Gluster or storage technologies
Git and version control
Elasticsearch technologies, especially Kibana
Experience with Apache Kafka and Druid
