About this role

Hark is an artificial intelligence company focused on developing advanced, personalized intelligence that interacts with the world through various modalities. The Member of Technical Staff, Post-training will lead the development of post-training strategies for coding agents, using reinforcement learning and simulation to improve model capabilities at scale.

Responsibilities:

Design and implement post-training strategies, primarily RL-based, to develop strong coding agents capable of multi-step reasoning, tool use, and long-horizon task completion
Build and scale simulation and scaffolding environments for agentic RL: code execution sandboxes, computer use environments, tool-calling harnesses, and verifiable reward signals
Develop reward modeling pipelines, including outcome-based, execution-based, and process-based reward signals, and iterate on them based on training dynamics
Scale synthetic data generation and trajectory distillation pipelines that feed RL training and improve sample efficiency
Design and run rigorous ablations to understand how algorithm choice, data mixture, reward shaping, and scale interact in the agentic setting
Build evaluation frameworks grounded in real agent tasks (code correctness, execution success, multi-step tool use) to measure progress and guide iteration
Collaborate with mid-training, infrastructure, and product teams to translate research insights into durable improvements to the model

Member of Technical Staff, Post-training
