Celestica, a leading provider of technology solutions, is seeking an experienced full-stack software developer to design and develop next-generation AI-enabled automation infrastructure. The role involves building a robust control center for managing network automation test infrastructure, and it requires collaboration with various teams and technical leadership on complex projects.
Responsibilities:
- Lead the design, development and implementation of technical solutions for complex projects involving multiple domains; participate in project planning and scheduling
- Serve as a global subject-matter expert (SME) with comprehensive knowledge and industry recognition; provide technical leadership and direction to a global team of engineers
- Take responsibility for non-technical elements of an engineering project (people, financials, etc.)
- Review and interpret customer specifications; may act as primary customer contact
- Analyze trade-offs in complex systems and recommend solutions; develop deployment strategies and plans
- Lead the deployment of strategic complex programs and coordinate site-wide deployment efforts
- May manage relationships with key vendors/partners
- Analyze, design and develop tests and test-automation suites
- Design and develop a processing platform using various configuration management technologies
- Apply test software development methodology (may be in an agile environment)
- Provide ongoing maintenance, support and enhancements in existing systems and platforms
- Collaborate cross-functionally with customers, users, project managers and other engineers, including through peer reviews, to achieve elegant solutions
- Provide recommendations for continuous improvement
- Work alongside other engineers on the team to elevate technology and consistently apply best practices
- Keep up to date with relevant industry knowledge and regulations
Requirements:
- 12 to 18 years of experience
- Bachelor's degree, or an equivalent combination of education and experience
- Deep expertise in SONiC, SAI (Switch Abstraction Interface), and standard protocols (BGP, EVPN, VXLAN)
- Expert-level knowledge of SPyTest and Python-based automation
- Experience with IXIA (IxNetwork/IxLoad) and physical switch hardware (Mellanox/NVIDIA, Broadcom-based whitebox)
- Strong proficiency in Python, C/C++, Rust, or Java; experience building RESTful APIs and cloud-native backends (GCP/Azure)
- Familiarity with integrating LLM APIs (like Google Gemini) for text/log analysis
- Advanced experience with GitHub Actions, Azure DevOps or Jenkins, and containerization (Docker/Kubernetes)
- Architect a CI/CD pipeline: design the integration between Git-based workflows and physical hardware labs, ensuring code changes trigger automated builds and deployments to SONiC-based switches
- Lead the development of a cloud-hosted GUI and backend services that securely manage and command on-premise physical test beds
- Oversee the management of physical test beds, ensuring consistent state and availability for automated testing
- Standardize automated testing using SPyTest, ensuring robust coverage for NOS (Network Operating System) features
- Integrate IXIA traffic generators into the automated suite to perform high-scale performance, stress, and regression testing
- Own the final validation gate, ensuring that no code reaches production without passing a rigorous, automated battery of tests on physical hardware
- Build and deploy AI/LLM-based agents to parse complex log files and SPyTest results to identify the 'root cause' of test failures automatically
- Develop agents capable of test bed failure recovery (e.g., automatically power-cycling hung PDUs, re-flashing corrupted ONIE images, or re-seating virtual links)
- Leverage AI to analyze long-term software quality trends and predict potential regressions before they occur
- Active contributor to the Azure/SONiC open-source community
- Experience building custom dashboards using React or Vue.js
- Knowledge of deploying and operating software within GCP
- Background in developing 'Self-Healing' infrastructure or AIOps