QualGent is seeking an AI Reliability QA Engineer to harden their autonomous AI systems for enterprise production environments. The role focuses on engineering reliability into AI-driven workflows, ensuring stability, reproducibility, and trust at scale.

Responsibilities:

Deterministic AI Execution
Identify and eliminate flakiness in AI-generated workflows
Improve reproducibility across CI, staging, and production environments
Design validation layers and guardrails for AI agent behavior
Reduce regression escapes through structured reliability metrics
Evaluate AI-generated test cases for correctness and coverage gaps
Design stress-testing frameworks for AI workflows
Improve system resilience under concurrency and load
Define SLAs and reliability standards for autonomous execution
Instrument execution traces across AI decision paths
Build monitoring dashboards for reliability metrics
Reduce time-to-diagnosis for complex failures
Lead incident reviews focused on systemic improvements

Requirements:

2–10+ years of experience in QA, SDET, automation engineering, or reliability engineering
Proven experience reducing flakiness in CI/CD pipelines
Strong debugging capabilities across frontend, backend, and infrastructure layers
Experience supporting and improving production releases
Systems-level thinking with a focus on failure modes and edge cases
Mobile testing expertise (iOS/Android, emulators, device farms)
Experience with distributed systems
Observability tooling experience (Datadog, Prometheus, OpenTelemetry, Sentry, etc.)
Cloud infrastructure experience (AWS, GCP)
Exposure to LLM-based or agent-driven systems
