Build the CI/CD quality gates for backend and AI workflows so engineers get faster, more trustworthy signals before changes reach production.
Create reusable validation infrastructure for AI-powered backend features, including scenario-based evals, staging validation flows, generated test profiles, and higher-signal E2E checks.
Improve the feedback loop between development and production by connecting eval results, CI outcomes, runtime telemetry, and member-facing signals into one operational picture.
Help standardize the shared LLM delivery surface in practice: unified client patterns, trace capture, environment setup, and operational guardrails that feature squads can adopt without bespoke platform work.
Partner closely with backend engineers, AI engineers, PMs, and adjacent platform teams to turn release-confidence needs into reusable workflows that scale across the platform, rather than into one-off fixes for a single squad.
Requirements
Strong hands-on experience building and operating production backend platforms, developer infrastructure, CI/CD systems, or internal engineering tooling used by real product teams.
Strong software engineering depth in a backend language such as Python, plus practical experience with cloud infrastructure, production debugging, and maintainable system design.
Experience designing validation or quality systems that teams actually trust: CI gates, eval flows, test infrastructure, telemetry pipelines, or release automation.
A track record of improving delivery confidence for other engineers, not just operating infrastructure manually.
Comfort supporting modern AI or LLM-backed systems in production through some combination of evals, tracing, rollout safety, review signals, or operational feedback loops.
Strong collaboration and communication skills, especially when translating vague platform pain points into practical systems that many teams can use.