Pro Talent Crafter is seeking a highly skilled QA professional to build and scale a next-generation Agentic AI Quality Engineering function. This role focuses on validating autonomous AI systems, designing evaluation frameworks, and ensuring high-quality outputs across multiple AI-driven products.
Responsibilities:
- Design and scale an agentic QA model for autonomous AI systems
- Move QA from human-driven validation to AI-led evaluation and continuous quality monitoring
- Establish best practices for testing AI agents across lifecycle stages
- Own QA for three core AI products: AI Contact Center solutions, AI Chat & Form-based interaction systems, and AI Assistants (autonomous / semi-autonomous agents)
- Define quality benchmarks, SLAs, and success metrics for each product
- Proactively identify quality gaps before they impact customers
- Define and track performance metrics for agentic systems (accuracy, latency, resolution quality, hallucination rate, etc.)
- Build frameworks for: Evals & graders (LLM evaluation pipelines), Output scoring and benchmarking, Continuous feedback loops
- Leverage tools like Langfuse for: LLM observability and tracing, Prompt monitoring and performance analysis, Debugging agent behavior in production
- Analyze: Downstream issues, Production tickets, Failure patterns
- Build and scale automation across: Regression testing, Smoke testing, End-to-end agent workflows
- Develop and maintain Playwright-based automation scripts
- Integrate QA into CI/CD pipelines for continuous validation
- Design testing approaches for: Multi-step agent workflows, Context retention and reasoning, Tool usage by agents
- Work with orchestration frameworks like Temporal to: Validate long-running workflows, Test retries, state transitions, and failure handling in agent pipelines
- Account for non-deterministic behavior in AI systems
- Invest additional effort in agentic validation, recognizing its higher complexity relative to traditional QA
- Define frameworks to predict and prevent failures before customer exposure
- Continuously improve QA processes using AI and automation
- Partner with Product, Engineering, and AI teams to improve system quality
Requirements:
- 5–10+ years in QA / Quality Engineering, with strong automation experience
- Hands-on experience with test automation tools (Playwright preferred)
- Experience with API and system testing
- Strong understanding of AI/ML systems (LLMs and conversational AI preferred)
- Experience designing evaluation frameworks and benchmarks
- Experience with Temporal (workflow orchestration, stateful systems testing)
- Experience with Langfuse (LLM observability, tracing, and evaluation)
- Experience building QA frameworks from scratch
- Experience working with production data, logs, and issue triaging
- Experience with LLM eval frameworks, prompt testing, or AI red-teaming
- Familiarity with agentic architectures / autonomous systems
- Exposure to observability and analytics platforms