Design multi-agent architectures with robust state management, memory, and routing.
Select and implement agent frameworks such as LangGraph/LangChain Agents, Microsoft AutoGen, CrewAI, LlamaIndex Agents, Semantic Kernel, or Haystack Agents, and justify the trade-offs of each choice.
Build modular components (planners, tool registries, policy guards, evaluators) that are reusable across clients and domains.
Integrate enterprise tools and data sources via function/tool calling, webhooks, and event-driven flows (Queues/Service Bus/Functions).
Implement retrieval-augmented generation (RAG) patterns with vector stores (Azure AI Search, pgvector, MongoDB Atlas, Pinecone, Weaviate, Milvus) and structured knowledge (SQL/Graph).
Add deterministic fallbacks, circuit breakers, and caching to keep latency and cost predictable.
Define SLIs/SLOs for agent runs; implement tracing, metrics, and logging (e.g., Langfuse + OpenTelemetry) and build dashboards for run-level analytics.
Create evaluation harnesses (automatic + human-in-the-loop) using tools such as Ragas, DeepEval, and promptfoo to measure groundedness, task success, safety, and cost.
Productionize with CI/CD, environment promotion, feature flags, and canary strategies; optimize cost-per-task and time-to-success.
Enforce content and safety policies (redaction, classification, guardrails) with policy-as-code; implement role/tenant isolation and data minimization.
Collaborate with security teams to align with ISO 27001, SOC 2, NIST, HIPAA, and GDPR requirements; deliver audit-ready evidence for agentic workflows.
Build privacy-first patterns (no data exfiltration by default, least-privilege tool access, secure prompt/trace storage).
Work directly with enterprise client teams to translate business processes into agentic designs; present trade-offs and proofs-of-value that lead to production.
Partner with solution leads to create domain-specific agents (e.g., RFP assist, incident RCA drafting, knowledge ops) and reusable templates.
Requirements
5–8+ years in software or platform engineering, including recent experience running LLM applications in production (not just prototypes).
Hands-on expertise with agentic frameworks (one or more of: LangGraph/LangChain Agents, AutoGen, CrewAI, LlamaIndex Agents, Semantic Kernel, Haystack Agents) and tool/function-calling patterns.
Strong RAG engineering across vector DBs, chunking/embedding strategies, metadata/search ranking, and grounding techniques.
Proven track record building observable, cost-aware, and secure LLM systems (tracing, evals, guardrails, secrets/IAM, PII handling).