ZS is a management consulting and technology firm focused on improving life and how we live it. The firm is seeking a Lead AI Engineer to design and implement LLM-powered applications, optimize RAG pipelines, and contribute to the integration of AI systems into production software environments.

Responsibilities:

Design and implement LLM-powered applications using state-of-the-art transformer models
Build and optimize RAG pipelines using embeddings, chunking strategies, and vector search
Experiment with prompt engineering, structured outputs (JSON schemas/function calling), and tool-augmented LLMs (agents/workflows)
Fine-tune models using techniques such as LoRA, PEFT, and instruction tuning
Develop and evaluate embedding models for similarity search and semantic retrieval
Conduct LLM evaluation using automated and human-in-the-loop techniques, both offline and online
Optimize inference workflows for latency, GPU utilization, and cost efficiency (quantization, batching, caching)
Build and maintain REST API services (e.g., FastAPI) to deploy LLM/RAG endpoints, integrate with product systems, and support scalable inference
Contribute to the integration of AI systems into production software environments (CI/CD, monitoring, reliability)
Research and prototype cutting-edge approaches in Generative AI and share learnings with the team
