IntelePeer delivers rapidly deployable communications solutions for an always-connected world. The AI QA Engineer II/III leads the end-to-end quality validation of AI models and conversational agents to ensure they deliver accurate, safe, and contextually relevant user experiences.
Responsibilities:
- Build prompt‑based test cases to evaluate LLM outputs for correctness, stability, safety, and non‑functional targets (latency, determinism, cost)
- Execute scripted model tests to validate agent behaviors across intents, flows, and edge cases
- Maintain voice‑to‑voice and API regression tests to detect model drift or unintended degradation
- Use Smart Analytics/Analysis Agents to review real user interactions and identify issues (misclassification, routing errors, hallucinations)
- Summarize patterns and provide example‑driven insights to internal teams
- Apply human-in-the-loop (HITL) policies for low‑confidence model predictions, unknown intents, or out‑of‑scope cases
- Document customer experience (CX) and accuracy impact for model improvement cycles
- Publish evaluation results in TestRail with structured evidence
- Use the nuanced pass taxonomy (not binary pass/fail) to communicate model readiness
Requirements:
- 2-4 years of experience in QA, LLM prompt testing, and conversational AI for Level II; 4-7 years for Level III
- Bachelor's degree in Computer Science, Data Science, Linguistics, Cognitive Science, HCI, or a related AI-adjacent field
- Solid understanding of LLM behavior (hallucination patterns, determinism, prompt sensitivity)
- Progressive experience testing APIs, conversational UX, and/or machine learning
- Strong analytical skills for pattern recognition in model outputs
- Clear, concise communication skills
- Ability to work in a fast-paced environment and adapt to change
- Strong initiative; self-motivated, proactive, and resourceful
- Team player who is willing to go above and beyond to help others
- Coursework or projects involving NLP, LLM prompt engineering, model evaluation, or automated testing frameworks