Oracle is seeking a Principal Software Engineer for its Health Data Intelligence (HDI) team to enhance the reliability and performance of its analytics platform. The role involves developing and optimizing infrastructure and data pipelines while collaborating with a team to support efficient healthcare analytics globally.
Responsibilities:
- Work with Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas
- Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services
- Responsible for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance
- Authority for end-to-end performance and operability
- Partner with development teams in defining and implementing improvements in service architecture
- Articulate technical characteristics of services and technology areas and guide Development Teams to engineer and add premier capabilities to the Oracle Cloud service portfolio
- Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack
- Demonstrate clear understanding of automation and orchestration principles
- Act as ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs)
- Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations
- Understand and explain the effect of product architecture decisions on distributed systems
- Professional curiosity and a desire to develop a deep understanding of services and technologies
- Develop & Maintain: Implement and tune infrastructure components for the Oracle HDI Analytics Platform to ensure system stability and uptime
- Data Pipeline Execution: Build and refine scalable data pipelines, leveraging Vertica and ETL processes to ensure efficient data ingestion and transformation
- BI Support: Assist in the integration and optimization of BI and reporting tools to ensure seamless data visualization for healthcare leaders
- Operational Excellence: Apply DevOps and SRE principles to automate routine tasks, manage deployments via CI/CD, and monitor system health using Prometheus/Grafana
- Cloud Integration: Support platform-agnostic initiatives across Oracle Cloud and AWS, ensuring cost-efficient and compliant resource usage
- Incident Response: Participate in on-call rotations or troubleshooting sessions to resolve production issues and implement preventative fixes
- Collaboration: Work closely with senior engineers to execute technical roadmaps and provide peer reviews for code and infrastructure changes
Requirements:
- U.S. citizenship is required for this position, as the successful candidate will be required to obtain (and maintain) a U.S. government security clearance after hire
- Experience implementing and maintaining high-availability systems with a focus on performance monitoring and fault tolerance
- Proficiency in Data Warehousing platforms (e.g., Vertica, Snowflake) and ETL frameworks; understanding of columnar storage and large-scale data processing
- Practical experience integrating or supporting Business Intelligence tools (e.g., Tableau, Power BI, Oracle Analytics) to surface data-driven insights
- Competency in CI/CD pipelines (Jenkins, Kubernetes), Infrastructure as Code (Terraform), and observability tools (Prometheus, Grafana)
- Working knowledge of public cloud environments (OCI, AWS, or Azure) with an emphasis on deployment and resource management
- Strong ability to troubleshoot complex production issues, perform root-cause analysis, and document technical findings
- Solid foundation in Python, Java, or Go, along with containerization (Docker) and shell scripting
- 8+ years of software engineering experience, with 5+ years focused on cloud infrastructure, SRE, or DevOps
- Proven ownership of production system reliability and uptime in cloud environments
- Strong expertise in cloud infrastructure design and automation
- Strong expertise in distributed systems and performance optimization
- Strong expertise in data warehousing and ETL frameworks
- Strong expertise in columnar databases (e.g., Vertica)
- Hands-on experience with Infrastructure as Code (Terraform)
- Hands-on experience with containerization (Docker) and orchestration (Kubernetes)
- Hands-on experience with observability stacks (Prometheus, Grafana)
- Experience integrating BI/reporting tools (Tableau, Power BI, Oracle Analytics, etc.)
- Proficiency in Python, Java, or Go
- Strong problem-solving skills with a track record of improving system reliability, automation, and scalability
- Experience in healthcare or regulated environments (HIPAA, compliance frameworks)
- Familiarity with Oracle HDI or large-scale analytics platforms
- Experience working in environments requiring security clearance