NVIDIA is known as the AI computing company, focused on technological advancement in computing. The role involves designing and implementing microcontroller firmware for GPU Server platforms and collaborating with security and hardware teams to ensure code aligns with security goals.
Responsibilities:
- Design and implement microcontroller firmware for GPU Server platforms, focusing on, but not limited to, ARM M-class microcontrollers
- Develop C/C++ server manageability features in an embedded-optimized RTOS environment
- Perform hands-on work with microcontroller firmware bring-up, debugging, performance analysis, and coding manageability features for NVIDIA’s Server platforms
- Develop embedded management software to enable reporting and connectivity between server management devices
- Implement register-based communication and DMTF standard messaging protocols for seamless interaction between BMC, GPUs, switches, memory, I/O expanders, sensors, and local microcontroller peripherals
- Design a highly portable microcontroller framework that will be implemented across a wide variety of server management subsystems
- Develop and review code, write and review design documents, and collaborate with team members to meet product requirements
- Instrument code for maximum coverage, automate unit tests, maintain detailed test case reports, and provide software quality reports based on static analysis, code coverage, and microcontroller load
- Collaborate with security and hardware teams to ensure code aligns with security goals, and influence hardware design and architecture reviews
- Develop performance-optimized active monitoring BMC solutions using DMTF standards such as MCTP, Redfish, SPDM, and PLDM
Requirements:
- A Bachelor of Science degree (or higher) in Electrical Engineering or Computer Science, or equivalent experience
- 12+ years of experience in low-level firmware development on embedded microcontrollers using Zephyr or FreeRTOS
- Demonstrated experience in developing BMC and/or microcontroller firmware for managing CPU, GPU, network, and storage devices
- Experience with the following embedded interfaces: USB and I3C
- Sound experience working with ARM Integrated Development Environments (IDE), debuggers, logic and protocol analyzers, and oscilloscopes
- A deep understanding of interrupt schemes, multi-threading, DMA, memory management, and working in resource-restricted embedded environments
- Strong embedded programming and scripting skills using C/C++, Bash, Python, Go, etc.
- Experience reviewing and using hardware schematics, reference manuals, and datasheets for embedded development
- Expertise working with server manageability protocols such as MCTP, PLDM, SPDM, SMBus, and OCP Recovery
- Solid understanding of Linux fundamentals, various distributions, packages, upgrade mechanisms, and image building/deployment
- Hands-on background in embedded microcontroller firmware development and out-of-band (OOB) management
- Hands-on experience implementing an MCTP stack in embedded environments or on FPGAs
- Contributions to industry groups such as Open Compute, OpenBMC, and DMTF, and to open source projects
- Expertise in system software and platform security for x86/ARM-based rack/blade server systems