CVS Health is dedicated to shaping a more connected and compassionate health experience. The Staff Software Development Engineer in Automation Production Support is a senior technical leader responsible for the stability and operational excellence of enterprise automation solutions. The role partners with development, infrastructure, application, and vendor teams to improve automation resilience and scalability.
Responsibilities:
- Serve as the technical owner for production support of automation and RPA solutions across critical business processes
- Lead incident triage, root cause analysis, and permanent remediation for high-severity automation failures
- Establish and enforce runbooks, support models, escalation paths, and on-call readiness for automation platforms
- Proactively identify systemic issues and implement stability, resiliency, and performance improvements
- Provide hands-on technical leadership for automation design, debugging, and optimization in production environments
- Review automation code and configurations to ensure adherence to standards and to security and reliability best practices
- Partner with development teams to ensure production readiness of new automations before release
- Guide architectural decisions that reduce operational complexity and technical debt
- Design and maintain monitoring, alerting, and health dashboards for automation platforms
- Drive adoption of AIOps, SRE, and automation-first support practices where applicable
- Improve observability by defining meaningful metrics such as uptime, failure rates, recovery times, and business impact
- Act as a key escalation point for automation-related production issues
- Collaborate with infrastructure, application, and vendor teams to resolve platform-level issues
- Communicate incidents, risks, and remediation plans clearly to technical and non-technical stakeholders
- Mentor engineers and support analysts in automation support best practices
Requirements:
- Extensive experience in software development and production support for enterprise systems
- Strong expertise in automation/RPA platforms, scripting, and debugging complex workflows
- Proven ability to lead incident response and root cause analysis in high-availability environments
- Deep understanding of SDLC, CI/CD, release management, and production readiness standards
- Bachelor's degree in Computer Science or Engineering, or equivalent practical experience
- Experience supporting mission-critical automation in regulated or high-volume environments
- Familiarity with monitoring tools, job schedulers, and orchestration platforms
- Experience implementing resiliency patterns, failover strategies, and automation governance
- Demonstrated leadership in driving operational maturity and continuous improvement