Drive innovation in reinforcement learning approaches for advanced models
Optimize decision-making and adaptive behavior for enhanced intelligence
Work across a broad spectrum of systems, including resource-efficient models and complex multi-modal architectures
Develop and implement state-of-the-art reinforcement learning algorithms
Establish clear performance targets and track key performance indicators
Collaborate with cross-functional teams to integrate reinforcement learning agents into production systems
Requirements
A degree in Computer Science or a related field
Ideally a PhD in NLP, Machine Learning, or a related field, complemented by a solid track record in AI R&D (with strong publications at A* conferences)
Proven experience with large-scale reinforcement learning experiments, including online RL techniques such as Group Relative Policy Optimization (GRPO)
Deep understanding of reinforcement learning algorithms, including state-of-the-art online RL methods
Strong expertise in PyTorch and relevant reinforcement learning frameworks
Practical experience in developing RL pipelines
Demonstrated ability to apply empirical research to overcome RL challenges