Define the platform-level architecture of scale-up and scale-out fabrics across our accelerator product line, from single-node systems through rack-level deployments.
Specify the physical interconnect infrastructure, defining the architectural requirements and trade-offs that the AI Infrastructure Systems division executes against.
Help translate workload and performance requirements into concrete system-level architecture and interconnect specifications.
Define host integration, covering PCIe, CXL, and memory attachment, and the system-level memory organization across the platform.
Own platform performance, power, cost, and scalability.
Define operability and RAS (reliability, availability, serviceability) requirements.
Take a leading technical role in the system architecture team, interfacing directly with key partners and internal stakeholders.
Requirements
Significant experience (5+ years) in system, platform, or hardware architecture, with a strong track record at server and/or rack scale.
Deep, hands-on understanding of Ethernet, UALink, optical, and switched-fabric technologies.
Ability to connect workload characteristics to hardware architecture and to quantify the impact of design choices on end-to-end performance.
Demonstrated ability to lead and collaborate across multidisciplinary teams and to interface effectively with partners and senior stakeholders.
Strong problem-solving skills, a collaborative mindset, and a passion for building systems at scale.
Bonus: Experience architecting AI/HPC accelerator systems, familiarity with distributed training/inference workloads, and exposure to data-center-scale deployment.