C++, Cloud, Distributed Systems, Open Source, Python, PyTorch, C, AI, Machine Learning, ML, Claude, Llama, LangChain, Kubeflow, LangGraph, GitHub Actions, GitHub, CI/CD, Product Management, Remote Work
About this role
Role Overview
Collaborate with Staff Engineers, Engineering, Product Management, and User Experience to define customer needs and use cases.
Develop and implement comprehensive unit, integration, and end-to-end tests to ensure reliability and performance in the upstream project, maintain CI/CD workflows in GitHub, and ensure downstream quality.
Create robust AI/ML software tools to enable AI Application development and contribute to a healthy open source community.
Participate in AI-assisted code reviews, using tools that provide real-time feedback, flag potential bugs and security vulnerabilities, and check adherence to coding standards, for a more thorough and efficient review process.
Proactively utilize AI-assisted development tools (e.g., GitHub Copilot, Cursor, Claude Code) for code generation, auto-completion, and intelligent suggestions to accelerate development cycles and enhance code quality.
Create and maintain clear, concise upstream technical documentation, including API references and user guides, and collaborate with our internal tech writers to create robust downstream documentation.
Evaluate and integrate the latest advancements in AI/ML technologies and toolkits to improve existing systems and develop new innovative solutions.
Requirements
7 years of advanced Python development experience as a Software Engineer in Open Source communities, with a focus on DevOps or CI and experience in AI/ML.
Advanced knowledge of developing unit, functional, and end-to-end (E2E) test cases and automation.
Advanced knowledge of designing and exercising robust, scalable APIs used in large-scale, high-performance distributed systems.
Advanced knowledge of creating automation for GitHub, using GitHub Actions or related continuous integration tools.
Experience with AI and Machine Learning platforms, tools, and frameworks, such as LlamaStack, LangChain, PyTorch, LLaMA.cpp, vLLM, LangGraph, and Kubeflow.
Experience developing, deploying, or maintaining on-prem or cloud infrastructure.
Ability to quickly learn and use new tools and technologies.