Oracle is a leading company in AI and cloud solutions, focusing on innovative healthcare technologies. The role of a Senior Site Reliability Engineer involves engineering and supporting production systems and services, with an emphasis on reliability, scalability, and performance.
Responsibilities:
- Share full-stack ownership of services and/or technology areas with development partners
- Understand end-to-end configuration, dependencies, and behavioral characteristics of production services
- Ensure services are designed with a focus on security, resiliency, scale, and performance
- Provide on-call support for applications and infrastructure
- Deploy and monitor applications
- Act as liaison between US Federal Cloud support and development teams
- Partner with development teams to define and implement improvements in service architecture
- Articulate technical characteristics and dependencies across services
- Support federal project submission process and security compliance
- Understand and communicate scale, capacity, and performance characteristics
- Perform tuning, optimization, and resource utilization improvements
- Maintain instrumentation and metrics to describe service behavior
- Ensure resiliency, backup/restore, and disaster recovery capabilities
- Perform security patching and vulnerability remediation
- Ensure compliance with corporate and federal security standards
- Apply automation and orchestration principles to reduce manual effort
- Improve CI/CD and operational processes
- Identify root causes and prevent recurring issues
- Act as SME for complex or critical production issues
- Support major incident resolution when root cause is unclear
- Provide deep system and service troubleshooting
Requirements:
- U.S. citizenship is required, and the work must be performed on U.S. soil. This position requires eligibility to receive a federal security clearance
- At least 4 years of combined higher education and/or related work experience
- Bachelor's degree in Information Systems, Computer Science, Computer Engineering, Software Engineering or related field, or equivalent work experience
- Good knowledge and hands-on experience with Linux and SQL
- At least 1 year of experience in Application Support, CI/CD pipelines, or Infrastructure Support
- A BS or MS in Computer Science, or equivalent
- Knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, and technology security and compliance
- Experience running large-scale, customer-facing web services
- Understanding of load balancing technologies, and development experience with programming languages, databases and big data stores, and container technologies
- Work involves defining and documenting technical architecture of complex and highly scalable products
- A minimum of 5 years of experience running large-scale, customer-facing web services
- Proficiency in container platforms, relational databases, and big data processing (Hadoop ecosystem, MapReduce, stream processing, etc.)
- Experience with programming languages such as Java, C#, C/C++, or Ruby
- Experience with production operations and practices for deploying and supporting code
- Ability to collaborate effectively with team members and stakeholders
- Experience with cloud and data processing technologies