Red Hat is the world’s leading provider of enterprise open source software solutions and is seeking a Senior Machine Learning Engineer focused on model optimization algorithms. In this role, you will collaborate with product and research teams to develop state-of-the-art deep learning software and optimize LLM training and deployment pipelines.
Responsibilities:
- Contribute to the design, development, and testing of inference optimization algorithms in vLLM and related projects such as llm-d
- Create and manage inference serving deployment pipelines
- Benchmark, profile, and evaluate different parallelization, quantization, and sparsification approaches to determine the best performance for specific hardware and models
- Stay up to date with the latest advancements in open source LLM architectures, LLM inference parallelization and optimization techniques, and quantization research
- Stay up to date with the latest CPU and GPU hardware architectures and features to boost AI inference performance
- Give thoughtful and prompt code reviews
- Collaborate continuously with internal and external open source committers and contributors while contributing to vLLM and related projects
Requirements:
- Strong understanding of machine learning and deep learning fundamentals, with experience in one or more of the following: LLM inference optimization, computer vision, NLP, or reinforcement learning
- Experience with tensor math libraries such as PyTorch and NumPy
- Strong programming skills with proven experience implementing Python-based machine learning solutions
- Ability to develop and implement research ideas and algorithms
- Experience with mathematical software, especially linear algebra
- Understanding of linear algebra, gradients, probability, and graph theory
- BS, MS, or PhD in computer science, computer engineering, or a related field