RAG Pipeline Development: Design and implement backend services in Python (FastAPI) to ingest, chunk, embed, and retrieve unstructured data such as PDFs, network schematics, and documentation.
Agent Orchestration: Implement agentic workflows using frameworks such as LangChain, LangGraph, or Vertex AI Agent Builder, enabling multi-step reasoning and tool execution.
Prompt Engineering & Optimisation: Design, test, and refine system prompts to ensure agents follow strict engineering rules while optimising for latency, reliability, and cost.
System Integration: Integrate AI agents with external APIs and enterprise systems such as Jira, ServiceNow, inventory systems, and internal platforms.
Cloud Deployment: Deploy and operate services on GCP using Cloud Run, Pub/Sub, and related managed services.
Quality & Reliability: Write clean, testable code and ensure solutions meet enterprise standards for security, observability, and operational readiness.
Collaboration: Work closely with cross-functional teams to refine requirements, review designs, and deliver production-grade solutions.
Requirements
Expert-level Python development skills, including experience with FastAPI or Flask and asynchronous programming.
Hands-on experience building GenAI applications using LangChain, LlamaIndex, LangGraph, or Google Vertex AI.
Practical experience working with vector databases such as Pinecone, Milvus, Weaviate, or pgvector.
Strong understanding of embeddings, similarity search, and RAG design patterns.
Experience deploying applications on Google Cloud Platform, including Cloud Run and Pub/Sub.
Solid understanding of REST APIs, system integration, and secure service design.
Experience working in Agile, cross-functional engineering teams.