ZS is a management consulting and technology firm focused on improving life and how we live it. The firm is seeking a Lead AI Engineer to design and implement LLM-powered applications, optimize RAG pipelines, and help integrate AI systems into production software environments.
Responsibilities:
- Design and implement LLM-powered applications using state-of-the-art transformer models
- Build and optimize RAG pipelines using embeddings, chunking strategies, and vector search
- Experiment with prompt engineering, structured outputs (JSON schemas/function calling), and tool-augmented LLMs (agents/workflows)
- Fine-tune models using parameter-efficient fine-tuning (PEFT) methods such as LoRA, as well as instruction tuning
- Develop and evaluate embedding models for similarity search and semantic retrieval
- Conduct LLM evaluation using automated and human-in-the-loop techniques, both offline and online
- Optimize inference workflows for latency, GPU utilization, and cost efficiency (quantization, batching, caching)
- Build and maintain REST API services (e.g., FastAPI) to deploy LLM/RAG endpoints, integrate with product systems, and support scalable inference
- Contribute to the integration of AI systems into production software environments (CI/CD, monitoring, reliability)
- Research and prototype cutting-edge approaches in Generative AI and share learnings with the team