Design & ship mission-grade GenAI: Build agentic workflows and RAG systems tailored to mission data and environments; target low hallucination, tight p95 latency, and predictable cost.
Agent frameworks & orchestration: Apply patterns from LangChain/LlamaIndex/Semantic Kernel; design task decomposition, tool use, guardrails, and recovery/fallback strategies.
Platform integration (no model training): Implement with AWS Bedrock, Azure OpenAI, Google Vertex AI, Amazon Kendra, and managed services (e.g., Document AI, Gemini, Gemma).
LLM selection & evaluation: Compare models for quality, safety, latency, cost; author/test prompts & policies; deploy with observability and safe rollback/fallback.
RAG done right: Build retrieval pipelines & vector search (Pinecone, Weaviate, OpenSearch, pgvector, FAISS/Chroma); handle data prep, chunking, metadata, and IR-style evals (e.g., NDCG) to maximize signal-to-noise.
Production rigor: Instrument metrics/logs/traces; run A/B experiments; maintain incident playbooks; and implement safety & compliance guardrails.
SRE & FinOps for AI: Define SLIs/SLOs (quality/latency/safety/cost), run on-call and postmortems, reduce MTTR; meter usage and optimize token spend.
Reusable platform components: Ship SDKs, CI/CD templates, Terraform/IaC modules, and evaluation harnesses that accelerate multiple mission teams, not one-off projects.
Operate under real-world constraints: Deliver into hybrid, restricted, or air-gapped environments with Zero Trust principles and audit-ready controls.
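The IR-style evaluation mentioned above (e.g., NDCG) can be sketched in a few lines. This is a minimal illustration, not any specific evaluation harness; the function names are illustrative:

```python
import math

def dcg(relevances):
    """Discounted cumulative gain for a ranked list of graded relevances.

    Each result's relevance is discounted by log2 of its rank (rank+2
    because ranks are 0-indexed and the first position divides by log2(2)=1).
    """
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(ranked_relevances, k=10):
    """NDCG@k: DCG of the system's ranking divided by the ideal (sorted) DCG."""
    actual = dcg(ranked_relevances[:k])
    ideal = dcg(sorted(ranked_relevances, reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0
```

A retriever that returns documents with graded relevance `[3, 2, 1, 0]` is already ideally ordered and scores 1.0; burying the most relevant chunk lowers the score, which is exactly the signal used to tune chunking and metadata.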
Requirements
End-to-end ownership of production systems: integration → deployment → observability → incident response.
Hands-on experience with LLMs, transformer based apps, and RAG in production.
Strong Python proficiency.
Experience with vector search and retrieval (Pinecone, Weaviate, OpenSearch, pgvector, FAISS/Chroma) and grounding AI in enterprise/mission data.