Oracle is a leading company in AI and cloud solutions, focused on delivering innovative services to customers. They are seeking a Senior Software Development Engineer to join their Automation for Sovereign Cloud team, where the role involves leading the design and architecture of microservices, developing scalable distributed systems, and mentoring junior engineers.
Responsibilities:
- Architect and Design: Lead the design and architecture of microservices that support distributed systems, handling data egress and integration between multiple realms. Drive operational readiness and excellence of your features and subsystems
- Build Distributed Systems: Develop and maintain highly available, scalable, and secure distributed systems to support real-time data processing for analytics platforms
- Coding: Apply industry best practices to write correct, secure, maintainable, and robust code with appropriate tests. Drive the design of your features and subsystems
- Data Pipeline Optimization: Create efficient and resilient data egress pipelines, ensuring seamless data flow and aggregation across OCI realms
- Collaboration: Work closely with cross-functional teams, including data engineers, product managers, and DevOps teams, to ensure alignment with business goals and infrastructure needs
- Problem Solving: Identify bottlenecks, optimize performance, and troubleshoot issues within distributed systems and data flows
- Mentorship and Leadership: Operate independently and mentor junior engineers, promoting best practices in software engineering, distributed systems, and cloud infrastructure
- Operations: Serve as a trusted Tier 2 or specialized escalation point for operations events. Lead deep dives into events during calls and in support of root cause analysis (CAPA). Serve as the primary point of contact for resolving complex operations issues. Develop new metrics and dashboards to improve situational awareness. Lead operational assessments for complex systems, ensuring operational issues and potential failure modes are accounted for. Participate in cross-organizational programs, including CAPA, ECAR, and region builds, and bring together guidance, standards, and best practices for operations, resiliency, and availability
- Innovation and Improvement: Stay updated on industry trends and innovations in cloud architecture, distributed systems, and data engineering to continuously improve our services
Requirements:
- Must be a U.S. citizen and must possess and maintain TS/SCI eligibility with polygraph
- BS or MS degree in Computer Science or relevant technical field involving coding or equivalent practical experience
- 4-8 years of total experience in software development
- Able to effectively communicate technical ideas verbally and in writing (technical proposals, design specs, architecture diagrams and presentations) up to organizational leadership
- Demonstrable programming skills in Python/Java/Go/Rust, with strong hands-on software development experience, including unit testing (code that is correct, secure, and maintainable, with appropriate tests)
- Understanding of CS fundamentals including data structures, algorithms, complexity analysis, SDLC, secure coding
- Ability to build major new features in existing systems and to drive the design and operational readiness of those features and subsystems
- Ability to improve team engineering and operations practices and development processes, work independently on projects and across teams, make nuanced trade-offs, and design for security, concurrency, availability, performance, scalability, and change
- Ability to identify and remediate issues, lead root cause analysis efforts and suggest possible solutions
- Able to develop new metrics and dashboards to improve situational awareness, reduce operational load, increase service availability, improve compliance with organizational standards, and reduce tech debt
- Master's degree in computer science or related engineering fields
- 2+ years of experience building on OCI
- 2+ years of experience as a technical lead/architect
- Experience working on large-scale, highly distributed, global cloud services infrastructure (compute instances, IAM, networking, storage, databases, etc.) for mission-critical tier-one services
- Knowledge of Computer Networking (OSI layers, HTTP, DNS, TCP/IP, DHCP, Routers, Gateways, Subnets, etc.)
- Knowledge of Linux internals, Linux/Unix troubleshooting skills, security
- Knowledge of host virtualization technologies (KVM, Containers, Docker, etc.)