Centene Corporation is a diversified national organization dedicated to improving health outcomes through technology. The Lead Site Reliability Engineer will lead complex projects focused on managing platform infrastructure performance, reliability, and security, while also mentoring teams and driving continuous improvement initiatives.
Responsibilities:
- Leads the team in identifying problems with systems and services and drives regular deployment of new versions of the systems and their subcomponents
- Leads end-to-end projects focused on building and maintaining observability/monitoring for the application, monitoring key performance indicators, maintaining alerting, and continuously improving visibility
- Drives decisions around periodic system validation and testing, service monitoring, and standing up new services/tools
- Uses advanced knowledge and experience to identify strategies that increase system reliability and performance through on-call rotation and process optimization
- Leads post-incident reviews and documents findings to inform future decision-making
- Drives implementation of approved proposals to optimize the Software Development Life Cycle (SDLC) and boost service reliability
- Leads functional and development teams in investigating and documenting issues, and leads the internal team in developing solutions to mitigate them
- Leads root cause analysis and problem-solving initiatives
- Understands and adapts new technologies, tools, methods, and processes from Microsoft and the industry
- Coaches and mentors the team
- Designs and implements key performance indicators
- Contributes to engineering and organizational success by welcoming related, different, and new requests and by helping others accomplish job results
- Trains the engineering team on new systems, protocols, and best practices
- Drives and coaches others through reviews of design, code, and test cases
- Performs other duties as assigned
- Complies with all policies and standards
Requirements:
- A Bachelor's degree in a quantitative or business field (e.g., statistics, mathematics, engineering, computer science)
- 5 – 7 years of related experience
- Experience with Linux, Unix, and Windows operating systems
- Experience with observability/monitoring tools such as Splunk, Dynatrace, Elastic, New Relic, Prometheus, Grafana
- Experience with enterprise-level CI/CD tools such as Ansible, Jenkins, CloudBees, and OpenShift
- Experience working on public cloud platforms such as AWS and Azure
- Experience with programming tools
- Experience building and operating highly scaled applications
- Experience with MongoDB, MySQL, Oracle Database, PL/SQL, and SQL
- Experience with various code repositories, automated deployments, and branching using tools such as GitLab, Bitbucket, and Subversion
- Experience with IT service management tools such as ServiceNow, Atlassian, and BMC