Ansible, AWS, Azure, Cloud, Jenkins, Kafka, Kubernetes, Linux, Node.js, OpenShift, RabbitMQ, TCP/IP, Go, Amazon Web Services, GitHub Actions, Helm, TeamCity, GitHub, Atlassian, Jira, Confluence, Communication, Negotiation
About this role
Role Overview
You will serve as a primary point of contact responsible for the overall health, performance, and capacity of gaming platform services.
Troubleshoot issues across the entire stack: hardware, software, application, and network.
Identify and drive opportunities to improve automation for the company.
Gain deep application-level knowledge of the systems and contribute to their overall design.
Manage the timely resolution of all critical and complex problems in line with SLA requirements.
Develop, configure, and optimize service and application monitoring and telemetry.
Assist in the rollout and deployment of new product features and installations.
Develop tools to improve our ability to rapidly deploy and effectively monitor applications and services in a large-scale environment.
Work closely with development teams to ensure that platforms are designed with "operability" in mind.
Requirements
You have a technology or business graduate degree, or equivalent experience and knowledge of IT governance and operations.
Strong knowledge of current IT methodologies and systems technologies and standards.
Active contribution to SRE/DevOps best practices.
Passion for replacing manual work with code to enable self-running systems.