McKesson is an impact-driven, Fortune 10 company that touches virtually every aspect of healthcare. The Sr. Database Site Reliability Engineer (DB SRE) is responsible for owning the reliability, availability, and operational maturity of business-critical Azure PostgreSQL platforms, applying SRE principles to enhance database services in a cloud-native environment.
Responsibilities:
- Own and continuously improve the reliability, availability, and performance of Azure PostgreSQL platforms across dev, stage, and prod
- Design, build, and operate cloud database infrastructure using Infrastructure as Code (Terraform)
- Apply SRE principles to stateful systems, including environment isolation, blast-radius reduction, and automation-first operations
- Define and implement database observability (metrics, logs, dashboards, alerts) using enterprise monitoring tools (e.g., Datadog)
- Lead incident response for database-related production issues and participate in on-call rotations
- Troubleshoot complex issues across performance, replication, connectivity, failover, and permissions
- Define and validate high availability, backup, restore, disaster recovery, and point-in-time recovery (PITR) strategies
- Enforce least-privilege access, support audits, and ensure compliance with security and governance requirements
- Collaborate with platform, application, security, and network teams to design scalable, secure database architectures
- Provide senior technical leadership, set reliability standards, and mentor less-experienced engineers
Requirements:
- 7+ years of hands-on experience operating PostgreSQL databases in cloud environments (Azure strongly preferred)
- Strong production experience (7+ years) supporting high-availability, business-critical database platforms
- Deep expertise with Infrastructure as Code, particularly Terraform
- Experience owning or participating in on-call rotations and incident response
- Strong understanding of database operations, including performance tuning, replication, backup/restore, and recovery
- Experience designing and operating database observability and monitoring solutions
- Solid knowledge of cloud security principles, including least-privilege access and audit readiness
- Proven ability to communicate effectively with technical and non-technical stakeholders
- Background as an SRE with strong database depth (not a traditional DBA role)
- Experience with CI/CD pipelines and Git/GitOps workflows
- Familiarity with Kubernetes (AKS preferred), Helm, and Argo CD
- Experience operating stateful workloads in Azure cloud environments
- Exposure to regulated or highly controlled environments
- Broader cloud platform experience beyond databases
- Bachelor's degree preferred; relevant experience considered in lieu of degree
- Typically 7+ years of relevant experience in SRE, platform, or infrastructure engineering roles