About this role
Role Overview
Contribute high-quality, maintainable code to open-source AI/ML projects and internal tooling
Develop and optimize scalable toolkits for synthetic data generation, model training, and inference-time scaling
Document system designs, API specifications, and model performance metrics to ensure transparency and reproducibility
Evaluate existing product offerings and iterate on improvements based on telemetry metrics and direct user feedback
Use AI-assisted development tools proficiently to accelerate coding, testing, and documentation workflows
Serve as the Technical Lead for your assigned components, defining technical standards and providing architectural guidance to the wider team
Influence the architectural direction of the Red Hat AI platform to ensure readiness for cutting-edge ML algorithms
Work across multiple squads to align technical priorities, contribute to sprint planning, and translate high-level requirements into actionable engineering tasks
Lead comprehensive code reviews and enforce best practices in testing (CI/CD), security, and maintainability
Mentor and upskill junior engineers, fostering a culture of technical excellence and continuous learning
Collaborate closely with Research Scientists and Product Managers to operationalize complex algorithms
Requirements
Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
Must be able to work a hybrid schedule in Boston
Proficiency in at least one modern backend programming language (e.g., Python, Go, Rust, Java) with a strong grasp of distributed systems patterns
Solid experience designing and deploying microservices on containerized platforms (e.g., Kubernetes, OpenShift) at large scale
Experience with large language models and model customization techniques
Prior experience specifically building or optimizing developer tooling for ML/AI workflows (MLOps)
Demonstrated experience with rigorous testing methodologies, including unit, integration, and performance testing
Proficiency in integrating AI tools into your daily development workflow to enhance productivity and efficiency
Demonstrated interest in Artificial Intelligence/Machine Learning with a self-motivated drive to understand and navigate ambiguity in fast-paced, AI research-oriented environments
8+ years of software development experience, with a track record of delivering complex systems in cloud environments
Proven ability to lead technical initiatives
Ability to manage multiple complex projects concurrently, balancing immediate delivery with long-term architectural health
Excellent written and verbal communication skills, with the ability to articulate complex technical concepts to non-technical stakeholders
Tech Stack
Cloud
Distributed Systems
Java
Kubernetes
Microservices
OpenShift
Python
Rust
Go
Benefits
Comprehensive medical, dental, and vision coverage
Flexible Spending Account (healthcare and dependent care)
Health Savings Account (high deductible medical plan)
Retirement 401(k) with employer match
Paid time off and holidays
Paid parental leave plans for all new parents
Leave benefits including disability, paid family medical leave, and paid military leave
Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!