Celestica, a leader in design and manufacturing solutions, is seeking a visionary Team Lead for its Network Automation Infrastructure. The role involves leading a team that designs and deploys a next-generation automation platform integrating cloud-native software development with physical hardware, with a focus on SONiC and AI-driven operations.
Responsibilities:
- Architect a CI/CD Pipeline: Design the integration between Git-based workflows and physical hardware labs, ensuring code changes trigger automated builds and deployments to SONiC-based switches
- Cloud-to-On-Prem Connectivity: Lead the development of a cloud-hosted GUI and backend services that securely manage and command on-premise physical test beds
- Hardware Abstraction: Oversee the management of physical test beds, ensuring consistent state and availability for automated testing
- Framework Leadership: Standardize automated testing using SPyTest, ensuring robust coverage for NOS (Network Operating System) features
- Traffic Emulation: Integrate IXIA traffic generators into the automated suite to perform high-scale performance, stress, and regression testing
- Regression Management: Own the final validation gate, ensuring that no code reaches production without passing a rigorous battery of automated tests on physical hardware
- Failure Analysis Agents: Build and deploy AI/LLM-based agents that parse complex log files and SPyTest results to automatically identify the root cause of test failures
- Self-Healing Test Beds: Develop agents capable of recovering failed test beds (e.g., automatically power-cycling hung devices via PDUs, re-flashing corrupted ONIE images, or re-seating virtual links)
- Quality Insights: Leverage AI to analyze long-term software quality trends and predict potential regressions before they occur
- Lead a cross-functional team of Network, Software, and DevOps engineers
- Define the technical roadmap and drive the adoption of Platform Engineering Ops culture across the organization
Requirements:
- 12 to 18 years of experience
- Bachelor's degree, or an equivalent combination of education and experience
- Deep expertise in SONiC, SAI (Switch Abstraction Interface), and standard protocols (BGP, EVPN, VXLAN)
- Expert-level knowledge of SPyTest and Python-based automation
- Experience with IXIA (IxNetwork/IxLoad) and physical switch hardware (Mellanox/NVIDIA, Broadcom-based whitebox)
- Strong proficiency in Python, C/C++, Rust, or Java; experience building RESTful APIs and cloud-native backends (GCP/Azure)
- Familiarity with integrating LLM APIs (like Google Gemini) for text/log analysis
- Advanced experience with GitHub Actions, Azure DevOps, or Jenkins, and with containerization (Docker/Kubernetes)