Partnerize is on a mission to transform the way businesses grow with their partnership automation platform. They are seeking a Site Reliability Engineer to join their Technical Operations team, responsible for building infrastructure, delivering projects, and ensuring system availability, scalability, and security.
Responsibilities:
- Provide primary operational support and engineering for multiple large, distributed software applications
- Measure and optimise system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
- Build software and systems to manage platform infrastructure and applications
- Improve reliability, quality, and time-to-market of our suite of software solutions
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Work closely with development and tech teams to ensure technical issues and projects are correctly managed
- Deliver small and large technical projects on time
- Act as an escalation point for support incidents and assignments while maintaining a high level of quality
- Be responsible for continuous improvement, continuous delivery and continuous integration
- Participate in the on-call rotation
Requirements:
- Understanding of databases (MySQL, PostgreSQL, Redis)
- Understanding of scripting languages (Python, Bash, PHP)
- Knowledge of automated deployment and configuration management for platforms or applications (Ansible, Docker, Terraform, etc.)
- An awareness of AWS and other cloud infrastructure services
- Linux system administration skills
- Ability to troubleshoot, diagnose and solve issues independently
- Self-learner, with the ability to document learning as experience is gained
- Experience as part of a team supporting and maintaining an infrastructure
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
- Ability to prioritise workload, including occasional incident and problem management
- Knowledge of and experience with ITIL practices
- An interest in development, new technologies and innovation
- Experience supporting development teams in refactoring technical debt
- Kubernetes, Docker and containers
- Experience with monitoring systems, e.g. Zabbix, Nagios
- Experience with JIRA and Confluence
- Nginx or web server technologies
- Gluster or storage technologies
- Git and version control
- Elasticsearch technologies, especially Kibana
- Experience with Apache Kafka and Druid