Ansible, AWS, Cloud, Docker, Elasticsearch, Firewalls, Grafana, Kubernetes, Linux, MySQL, Prometheus, Puppet, RabbitMQ, Redis, Terraform, GitLab CI, Helm, GitLab, Performance Optimization, CI/CD, Collaboration, Remote Work
About this role
Role Overview
Manage and enhance Kubernetes clusters, including configuration, upgrades, scaling, and deployment automation using Helm and Docker
Operate, maintain, and optimize Linux-based systems in an on-premise datacenter environment
Manage, monitor, and troubleshoot RabbitMQ clusters, ensuring message delivery reliability, scalability, and fault tolerance
Administer and optimize Redis, Elasticsearch, and MySQL databases for performance, stability, and data integrity
Support and execute database migration and infrastructure modernization projects within the on-prem environment
Implement and maintain infrastructure-as-code practices using Terraform, Ansible, GitLab CI/CD, and Puppet
Maintain and improve monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, Zabbix, ELK)
Collaborate with development and product teams to support deployment pipelines and performance optimization
Participate in incident response, root cause analysis, and on-call rotations to ensure system reliability
Ensure all systems meet security and data protection requirements
Requirements
5+ years of hands-on experience in system or infrastructure engineering, focused on on-premise environments
Strong expertise in Linux system administration (Debian/Ubuntu or RHEL/CentOS)
Deep understanding of RabbitMQ, including clustering, high availability, performance tuning, and troubleshooting
Experience managing and optimizing Redis and Elasticsearch in production
Solid practical knowledge of Kubernetes, Docker, and Helm, including cluster management, deployments, upgrades, and troubleshooting
Experience with infrastructure automation and configuration management using Terraform, Ansible, GitLab CI/CD, and Puppet
Strong understanding of networking, including routing, firewalls, VPNs, and secure access
Experience with monitoring and observability tools (Prometheus, Grafana, Zabbix, ELK)
Excellent problem-solving and analytical skills, with attention to performance, reliability, and maintainability
Familiarity with cloud providers such as AWS is a plus