Pro Talent Crafter is seeking a highly skilled QA professional to build and scale a next-generation Agentic AI Quality Engineering function. This role focuses on validating autonomous AI systems, designing evaluation frameworks, and ensuring high-quality outputs across multiple AI-driven products.
Responsibilities:
- Design and scale an agentic QA model for autonomous AI systems
- Move QA from human-driven validation to AI-led evaluation and continuous quality monitoring
- Establish best practices for testing AI agents across lifecycle stages
- Own QA for three core AI products: AI Contact Center solutions, AI Chat & Form-based interaction systems, and AI Assistants (autonomous / semi-autonomous agents)
- Define quality benchmarks, SLAs, and success metrics for each product
- Proactively identify quality gaps before they impact customers
- Define and track performance metrics for agentic systems (accuracy, latency, resolution quality, hallucination rate, etc.)
- Build frameworks for: Evals & graders (LLM evaluation pipelines), Output scoring and benchmarking, Continuous feedback loops
- Leverage tools like Langfuse for: LLM observability and tracing, Prompt monitoring and performance analysis, Debugging agent behavior in production
- Analyze: Downstream issues, Production tickets, Failure patterns
- Build and scale automation across: Regression testing, Smoke testing, End-to-end agent workflows
- Develop and maintain Playwright-based automation scripts
- Integrate QA into CI/CD pipelines for continuous validation
- Design testing approaches for: Multi-step agent workflows, Context retention and reasoning, Tool usage by agents
- Work with orchestration frameworks like Temporal to: Validate long-running workflows, Test retries, state transitions, and failure handling in agent pipelines
- Account for non-deterministic behavior in AI systems
- Invest additional effort in agentic validation, recognizing its higher complexity relative to traditional QA
- Define frameworks to predict and prevent failures before customer exposure
- Continuously improve QA processes using AI and automation
- Partner with Product, Engineering, and AI teams to improve system quality
Requirements:
- 5–10+ years in QA / Quality Engineering, with strong automation experience
- Hands-on experience with test automation tools (Playwright preferred)
- Experience with API and system testing
- Strong understanding of AI/ML systems (LLMs and conversational AI preferred)
- Experience designing evaluation frameworks and benchmarks
- Experience with Temporal (workflow orchestration, stateful systems testing)
- Experience with Langfuse (LLM observability, tracing, and evaluation)
- Experience building QA frameworks from scratch
- Experience working with production data, logs, and issue triaging
- Experience with LLM eval frameworks, prompt testing, or AI red-teaming
- Familiarity with agentic architectures / autonomous systems
- Exposure to observability and analytics platforms