About this role
Role Overview
Helps lead projects focused on managing and maintaining optimal platform infrastructure performance, reliability, and security using SRE practices, observability tools, manual and automated procedures, documentation, people and processes, and continuous delivery (CI/CD) tools, processes, and designs
Develops complex services to automate monitoring activities and provide critical information to facilitate response and resolution of performance and availability issues and incidents
Troubleshoots and analyzes service disruptions to determine the root cause of issues and develop solutions for improved reliability
Keeps documentation and runbooks up to date to effectively deal with new incidents that might arise
Leads post-incident reviews and documents findings to inform future decision-making
Requirements
A Bachelor's degree in a quantitative or business field (e.g., statistics, mathematics, engineering, computer science)
4 to 6 years of related experience
Experience with Linux, Unix, and Windows operating systems
Experience with observability/monitoring tools such as Splunk, Dynatrace, Elastic, New Relic, Prometheus, Grafana
Experience with enterprise-level CI/CD tools such as Ansible, Jenkins, CloudBees, OpenShift
Experience working in public cloud platforms such as AWS and Azure
Experience with programming tools
Experience with building and operating highly scaled applications
Experience with MongoDB, MySQL, Oracle Database, PL/SQL, and SQL
Experience with code repositories, automated deployments, and branching using tools such as GitLab, Bitbucket, Subversion
Experience with IT service management tools such as ServiceNow, Atlassian, BMC
Tech Stack
Ansible
AWS
Azure
Cloud
Grafana
ITSM
Jenkins
Linux
MongoDB
MySQL
OpenShift
Oracle
Prometheus
Splunk
SQL
Subversion
Unix
Benefits
Health insurance
401(k) and stock purchase plans
Tuition reimbursement
Paid time off plus holidays
Flexible approach to work with remote, hybrid, field, or office work schedules