LinuxPythonGoC++CPerformance OptimizationRemote Work
About this role
Role Overview
Focus on understanding system behavior across multiple layers, identifying performance bottlenecks, and driving improvements that shape how our clusters are built, operated, tuned, and validated.
Investigate and troubleshoot performance issues of GPU cluster under real workloads (training and inference).
Evaluate and integrate new hardware, system configurations and tuning approaches through software stack.
Support complex performance-related escalations from internal teams and customers.
Work closely with infrastructure, software engineering and hardware vendor teams (e.g. NVIDIA, Mellanox, Intel).
Contribute to hardware and cluster qualification (acceptance), ensuring systems meet performance expectations.
Requirements
5+ years of professional experience in system-level software development (focused on performance optimization, low-level programming).
3+ years of hands-on experience with Linux systems (administration, troubleshooting, and performance tuning).
In-depth understanding of server architecture, including PCIe devices, NICs, Linux OS/Kernel, and high-performance computing (HPC) systems.
Strong proficiency in one or more performance-oriented programming languages (C/C++, Go, Python).
Tech Stack
Linux
Python
Go
Benefits
Health insurance: 100% company-paid medical, dental and vision coverage for employees and families.
401(k) plan: Up to 4% company match with immediate vesting.
Parental leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers.
Remote work reimbursement: Up to $85/month for mobile and internet.
Disability & life insurance: Company-paid short-term, long-term and life insurance coverage.