Cornelis Networks delivers high-performance scale-out networking solutions for AI and HPC datacenters. They are seeking a Senior Network Linux Kernel Engineer to design, develop, and optimize the software stack for their innovative AI networking fabric, working alongside industry experts.
Responsibilities:
- Design and develop high-performance kernel drivers and user-space libraries for our networking hardware
- Build and optimize networking protocols at Layer 2 (Ethernet), Layer 3 (IP), and Layer 4 (TCP/UDP), tailored for AI/ML workloads
- Conduct deep-dive performance analysis and software optimization across the entire stack, identifying and eliminating bottlenecks
- Collaborate with the hardware team to influence ASIC design and ensure software/hardware co-design principles are met
- Develop robust testing, validation, and debugging tools for our networking stack
- Contribute to a culture of technical excellence, continuous improvement, and collaborative problem-solving
Requirements:
- Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field
- Proven experience in low-level systems programming with C/C++
- Strong understanding of Linux kernel driver development and internal architecture
- Deep knowledge of networking fundamentals and L2, L3, and L4 protocols
- Demonstrated experience in software optimization, profiling, and performance tuning
- A self-motivated and proactive mindset with a strong sense of ownership and the ability to work effectively in a dynamic, fast-paced startup culture
- Excellent teamwork and communication skills
- Working knowledge of BSD kernel internals (e.g., FreeBSD), including networking and driver subsystems, or prior experience developing or debugging kernel-level code on BSD-based systems
- Hands-on experience with DPDK or similar high-performance packet-processing frameworks (e.g., VPP, XDP)
- Experience developing software for high-performance NICs or SmartNICs
- Understanding of the networking requirements of distributed AI/ML training workloads (e.g., NCCL, MPI)
- Familiarity with RoCE (RDMA over Converged Ethernet) or other RDMA protocols
- Experience working with Ethernet/Switch ASICs or network processor silicon (e.g., Broadcom, Marvell, NVIDIA, Intel)