Role Overview

Own the Agent Framework Roadmap: Define how AI agents within the Tricentis portfolio are structured — their tooling, reasoning patterns, memory architecture, and interaction models — ensuring a consistent and extensible foundation across product lines.
Build the Evaluation Infrastructure: Own the systems and processes that validate agent behavior before deployment, including ground truth datasets, automated evaluation pipelines, and quality gates.
Bridge AI Research and Product Reality: Translate emerging agentic capabilities into well-scoped, implementable product requirements that engineering teams can execute against with confidence.
Drive Execution: This is a hands-on Lead IC role. You will write detailed technical specs, groom backlogs with engineering, and be accountable for the quality bar of every agentic feature we ship.
Own the Agent Quality Bar: Define what 'good' looks like for agentic behavior — establishing the success metrics (Task Success Rate, Goal Completion, Steps-to-Solution, Recovery Rate, Hallucination Rate) that the team builds toward, and holding stakeholders accountable to them before any capability ships to customers.
Prioritize Agent Capabilities: Maintain a clear, evidence-based prioritization of which agent capabilities to build and in what order, balancing customer impact, technical feasibility, and risk. Make the trade-offs explicit and get alignment across engineering, design, and leadership.
Own the Evaluation Strategy: Ensure the team has the processes, tooling, and resources in place to validate agent behavior rigorously before deployment.
Represent the Customer in AI Design Decisions: Be the voice of the user when engineering teams make decisions about reasoning depth, latency, and failure modes.
Drive Cross-Team Alignment: Ensure that agent framework decisions made by your team are understood and adopted consistently across other Tricentis product lines.
Communicate Progress and Risk: Keep leadership and cross-functional stakeholders informed on what is shipping, what is at risk, and what trade-offs have been made — with enough clarity that decisions can be made quickly at the right level.

Requirements

5–8+ years of Product Management experience, including at least 2 years in Technical Product Management or AI/Data products.
AI/ML Fluency: Demonstrated experience shipping AI-powered products, with hands-on exposure to LLMs, agentic systems, or intelligent automation.
Technical Background: Bachelor's degree in Computer Science, Engineering, Data Science, or equivalent technical work experience.
Evaluation Mindset: Proven ability to define quality metrics for AI systems and build processes that enforce them — not just measure them after the fact.
Enterprise Experience: Experience designing AI features for complex enterprise environments where reliability, auditability, and security are non-negotiable.
Agentic Product Experience: Prior experience building and shipping products with autonomous or semi-autonomous AI agents. (Nice to have)
Developer or QA Tooling: Experience building products for developers or QA engineers with an understanding of the SDLC. (Nice to have)
Hands-on Tech: Previous experience as a developer or data scientist is a strong plus. (Nice to have)
Global Collaboration: Experience working with distributed teams across time zones. (Nice to have)

Tech Stack

SDLC

Benefits

Flexible working schedule (no core hours)
25 days of paid time off
3 sick days
2 days of paid Volunteering Leave per year to get involved in your local community or in a cause that matters to you
Hybrid work environment, with home-office allowance
Meal allowance
Pension Contribution
Life & Disability Insurance
A team of passionate professionals who are experts in their fields
Events for employees to learn, celebrate and socialize (training sessions, hackathons, parties, sports events, board game gatherings, BBQs) and much more

Lead Product Manager – Agent Framework, Evals
