Nscale is a GPU cloud provider focused on AI, delivering high-performance infrastructure for AI startups and enterprises. The Infrastructure Engineer (OpenStack Neutron Specialist) designs, implements, and manages scalable OpenStack networking platforms, focusing on Neutron and its associated technologies to ensure network availability, performance, and security.
Responsibilities:
- Designing, implementing, and operating scalable, resilient, and secure OpenStack networking platforms, with a strong focus on Neutron and OVN/OVS
- Owning the architecture and day-to-day operation of virtual networking services, including L2/L3 networking, DHCP, metadata, floating IPs, NAT, security groups, and tenant segmentation
- Troubleshooting complex control plane and data plane issues across OpenStack networking components and the underlying Linux networking stack
- Driving continuous improvement in network automation, provisioning, validation, monitoring, and recovery using infrastructure-as-code and configuration management tools
- Working closely with compute, storage, and platform engineering teams to ensure seamless integration between Neutron and the wider OpenStack ecosystem
- Leading performance tuning, scalability planning, and resilience improvements for network-heavy and latency-sensitive cloud workloads
- Acting as a third- and fourth-line escalation point for advanced networking incidents, conducting root cause analysis, and driving permanent fixes
- Supporting upgrades, lifecycle management, and change execution across OpenStack networking services with a strong focus on service continuity and operational excellence
- Contributing specialist input to infrastructure roadmap planning, platform standards, and solution design for customer and internal environments
- Supporting pre-sales and solution design activities by providing expert guidance on cloud networking capabilities, constraints, and best practices
- Contributing to upstream OpenStack networking communities, particularly Neutron and related projects such as OVN, through bug reports, code contributions, design discussions, testing, and reviews where appropriate
- Tracking upstream roadmaps, release changes, and community direction to help shape Nscale's networking strategy, upgrade planning, and platform standards
- Representing Nscale's operational requirements and real-world use cases in upstream discussions to help drive improvements that benefit both the business and the broader community
- Ensuring OpenStack networking platforms adhere to security, compliance, and operational standards
- Participating in on-call rotations and incident response activities for critical infrastructure services
Requirements:
- Strong Linux systems administration and troubleshooting experience
- Deep hands-on experience deploying, operating, upgrading, and troubleshooting large-scale OpenStack environments
- Strong specialist knowledge of Neutron, including ML2, OVN, Open vSwitch, routing, DHCP, metadata, provider networks, tenant networks, VLAN/VXLAN/Geneve, and security groups
- Strong understanding of Linux networking concepts including routing, bridging, namespaces, iptables/nftables, bonding, MTU, and packet flow analysis
- Strong experience investigating complex network behaviour using diagnostic and observability tools such as tcpdump, iproute2, OVS/OVN tooling, logs, and metrics
- Strong experience designing and building automation for cloud infrastructure using tools such as Ansible
- Strong Python and Bash skills
- Experience working with highly available production platforms and change management in mission-critical environments
- Ability to collaborate across infrastructure, support, and architecture teams to solve complex technical problems
- Ability to evaluate upstream changes, influence technical direction, and translate community developments into practical outcomes for production platforms
- Experience contributing to or working closely with upstream open-source communities is highly desirable, particularly within OpenStack, Neutron, OVN, Open vSwitch, or related networking projects
- Experience with OpenStack networking at scale in service provider, private cloud, or high-performance infrastructure environments is highly desirable
- Experience with BGP, EVPN, load balancing, or network services integration in OpenStack environments would be beneficial