Microsoft is at the forefront of redefining how software is built and experienced in the AI era. The Senior Research Software Development Engineer will focus on advancing language model engine-level capabilities through applied research and production engineering, integrating in-house innovations and state-of-the-art techniques into various Microsoft and third-party engines.
Responsibilities:
- Advance language model engine capabilities through applied research and production engineering, integrating in‑house innovations and state‑of‑the‑art techniques to improve model accuracy, speed, reliability, and expressivity across first‑party and third‑party engines
- Design, implement, and review performance‑critical engine code (primarily in Python and Rust), ensuring high standards for correctness, test coverage, security, diagnosability, and maintainability, while coaching peers through rigorous and timely code reviews
- Apply AI‑native development practices across the full SDLC, using AI tools responsibly for design, coding, testing, and analysis, and taking ownership of the quality and correctness of AI‑assisted outputs while helping establish best practices across the team
- Develop and evolve advanced inference techniques (e.g., speculative decoding, constrained decoding, structured generation), validating design choices through experimentation, benchmarking, and production telemetry
- Own engine‑level design and integration decisions, producing clear design documents, evaluating trade‑offs across multiple architectural options, and collaborating across teams to ensure solutions meet requirements for performance, scalability, reliability, security, and cost
- Drive engineering excellence in production environments, including comprehensive testing strategies, observability, live‑site readiness, incident response, and post‑incident learning, with a focus on reducing operational risk in multi‑tenant inference systems
- Contribute to and leverage open‑source LM infrastructure where appropriate, responsibly reusing and extending external code, sharing learnings with the broader community, and continuously staying current with emerging research, tools, and engine‑level techniques
Requirements:
- Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, Rust or C++, and Python OR equivalent experience
- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. The successful candidate must pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Preferred qualifications:
- Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, Rust or C++, and Python OR Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, Rust or C++, and Python OR equivalent experience
- 5+ years of professional software engineering experience, including ownership of complex, production‑quality systems
- Strong proficiency in Python and at least one systems programming language (e.g., Rust, C++, or equivalent), with experience writing and maintaining performance‑critical code
- Open‑source contributions or industry experience in language model infrastructure (e.g., vLLM, SGLang, llguidance, or comparable LM libraries), including work on core engine logic rather than application layers
- Hands‑on familiarity with advanced inference techniques, such as speculative decoding, constrained decoding, or related inference‑time capabilities