Oracle is a leading company in AI and cloud solutions, striving for innovation in RDMA cluster networking. The Senior Optical Network Engineer role focuses on the design, deployment, and operations of large-scale global Oracle Cloud Infrastructure, specifically in high-speed fiber optic networks and systems.
Responsibilities:
- Collaborate with engineers from the L1 optical engineering, network design, delivery, AI Ops, DC Ops, and DC build teams, and with program/project managers, to develop milestones and deliverables for validating the build quality of optical cabling and optical transceivers in AI data center builds against OCI standards for RDMA backend networks. Primarily use existing procedures and tools to develop and safely execute DC network builds and changes, developing new procedures when needed
- Provide break-fix support for optical links to meet RDMA cluster performance criteria (pre-FEC BER, Rx power, FEC bin, BOL and EOL margins, etc.). Serve as the escalation point for event remediation and lead post-event root cause analysis
- Frequently develop MOPs or scripts to automate routine tasks for the team and business units and to improve build quality
- Support dashboard builds by providing requirements for representing data by L1 layer and device role, to help identify link-level issues and anomalies such as link flaps and link-down events
- Serve as SME on data center build standards for the DC build environment, including optical cabling and optical transceiver installation and troubleshooting
- Participate in AI DC deployment rotations at DC build sites, with up to 50% domestic travel, to validate optical links for new clusters and provide recommendations to various teams for improvement and enforcement
- Support Ops to stabilize RDMA networks after turn-up
Requirements:
- Deep understanding of the various types of optical cables (patch cords, shuffles, bulk/trunk cables, etc.)
- Understanding of high-speed optical transceivers used for interconnects in leaf-spine RDMA cluster networks, at the L0/L1 physical layer and the L2 protocol level
- Troubleshooting and automation/programming skills
- Experience in developing and supporting high-speed fiber optic network fabric links and systems
- Ability to collaborate with engineers from various teams including L1 optical engineering, network design, delivery, AI Ops, DC Ops, and DC build teams
- Experience in developing milestones and deliverables validating optical cabling and optical transceivers build quality
- Ability to provide break-fix support for optical links to meet RDMA cluster performance criteria
- Ability to lead post-event root cause analysis
- Ability to develop MOPs or scripts to automate routine tasks
- Ability to support dashboard builds by providing requirements for representing data by L1 layer and device role
- Ability to serve as SME on data center build standards for the DC build environment, including optical cabling and optical transceiver installation and troubleshooting
- Willingness to participate in AI DC deployment rotations at DC build sites, with up to 50% domestic travel
- Ability to support Ops in stabilizing RDMA networks after turn-up