Key skills

Cloud experience, Infrastructure as Code, Scripting/programming, CI/CD pipelines, Monitoring, alerting, Networking fundamentals, Performance engineering, Containers, orchestration, Data/analytics optimization, High-compliance environments, Python, Java, C++, C, Bash, Analytics, AWS, Azure, Terraform, Kubernetes, Docker, Jenkins, Ansible, Packer, Oracle, Git, GitLab, CI/CD

About this role

Oracle Health is seeking a Senior Site Reliability Engineer to build a modern, automated healthcare platform that millions rely on. The role involves designing, automating, and operating secure, highly available cloud services to drive reliability, speed, and efficiency across the platform.

Responsibilities:

Own service reliability end-to-end: architecture, production operations, and on-call excellence
Build automation and self-healing systems using IaC (e.g., Terraform) and CI/CD
Design, implement, and evolve observability (metrics, tracing, logging) and SLO/error budgets
Lead capacity planning, performance tuning, and cost/sustainability initiatives
Develop tooling and services to improve scalability, availability, and developer productivity
Partner with cross-functional teams to deliver features safely (canary/blue-green, progressive delivery)
Drive incident response, root-cause analysis, and prevention through automation
Prototype and standardize platform services and best practices across teams

Requirements:

US citizenship and the ability to obtain/maintain a federal security clearance
Experience operating large-scale, distributed, fault-tolerant systems in production
Strong scripting/programming (Python, Bash; Java/C++ a plus)
Infrastructure as Code and automation (Terraform; Ansible/Chef/Puppet/Packer a plus)
CI/CD pipelines and tooling (Git, GitLab/Jenkins/Rundeck)
Cloud experience (OCI, AWS, Azure or similar)
Deep knowledge of monitoring, alerting, incident management, and postmortems
Solid grasp of networking, security fundamentals, and performance engineering
Experience in regulated or high-compliance environments
Data/analytics and platform sustainability optimization
Containers and orchestration (Kubernetes, Docker)

Senior Site Reliability Engineer – Cloud Automation (Oracle Health Cloud, Remote US)
