Design and implement production-grade multi-agent systems using modern agent frameworks and techniques (e.g., Pydantic AI, agent harnesses, tool calling, code execution)
Build agent workflows that integrate context retrieval, reasoning, tool execution, validation, and compliance checks
Develop distributed services for agent execution with strong observability, monitoring, and failure handling
Establish evaluation frameworks for multi-step reasoning accuracy, groundedness, hallucination mitigation, and financial correctness
Implement memory management, context handling, and agent state persistence strategies
Partner with product, design, and engineering teams to translate business requirements into robust agent architectures
Optimize systems for latency, cost efficiency, and reliability in production
Contribute to infrastructure decisions around model serving, vector databases, caching, and orchestration layers
Requirements
3+ years of experience building and shipping Generative AI and LLM applications into production
6+ years of ML experience
Demonstrated experience designing and deploying multi-agent systems with a variety of architectures
Strong experience with multimodal LLMs, knowledge graphs, data synthesis, LLM fine-tuning, reinforcement learning, agent harnesses, and agent memory
Deep proficiency in Python and modern AI frameworks
Experience with distributed systems, cloud infrastructure (AWS/GCP/Azure), and containerized deployments
Experience implementing monitoring, evaluation, and reliability safeguards for AI systems
Strong systems thinking: the ability to design beyond single-model solutions toward coordinated, multi-component architectures
Resilience and adaptability
Experience working at early-stage startups is a plus