AI, Machine Learning, ML, Generative AI, LLM, Agentic, Leadership, Mentoring, Communication, Collaboration, Remote Work
About this role
Role Overview
Define and lead applied AI initiatives across agent systems, LLM evaluation, and model optimization
Own ambiguous problem spaces end-to-end: from framing and experimentation to production impact
Design and implement evaluation and benchmarking frameworks, leveraging and challenging industry standards and, where needed, defining new ones
Drive innovation in agentic systems, including topics like routing, memory, and context engineering
Prototype and validate new approaches (e.g., model combinations, fine-tuning strategies, or open-weight models)
Translate research into production-ready solutions, working closely with engineering teams
Act as a technical authority and multiplier, mentoring others and shaping best practices
Establish and lead the Applied Science guild, fostering knowledge sharing and raising the bar across teams
Influence product and technical strategy, identifying where applied science can unlock the most value
Requirements
12+ years of experience in Applied Machine Learning, Applied Science, or a related field, with a proven track record of delivering ML/AI systems in production
Experience with agent-based systems or multi-step LLM workflows
Strong recent experience working with LLMs and Generative AI systems, ideally in production environments
Demonstrated ability to own and drive ambiguous problem spaces end-to-end, from problem framing to measurable impact
Deep understanding of evaluation methodologies, benchmarking, and model performance analysis, including human-in-the-loop approaches
Hands-on technical skills, with the ability to prototype, experiment, and ship solutions in collaboration with engineering teams
Experience working on production systems, balancing speed, quality, and scalability
Proven technical leadership and influence, with experience shaping direction, mentoring others, or driving cross-team initiatives
Ability to operate at both strategic and execution levels, connecting long-term vision with day-to-day decisions
Strong communication skills, with the ability to clearly articulate ideas and influence technical and non-technical stakeholders
Evidence of broader impact (e.g., published work, open-source contributions, internal initiatives, or industry influence) is a plus