Responsibilities
Bridge the gap between theoretical AI research and practical business applications by building end-to-end LLM-powered features.
Optimize retrieval-augmented generation (RAG) pipelines and ensure the reliability of AI services in a cloud-native environment.
Design and deliver production applications using model APIs (OpenAI, Anthropic, Gemini) and orchestration frameworks like LangChain, LlamaIndex, or LangGraph.
Build and optimize retrieval systems over proprietary data using vector databases such as Pinecone, Weaviate, or Milvus, combined with hybrid search techniques.
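As an illustration of the hybrid search mentioned above, a minimal sketch that blends dense (embedding) similarity with a keyword score — using plain Python in place of a real vector database, with all names and weights illustrative:

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two equal-length dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def keyword_score(query: str, doc: str) -> float:
    # Fraction of query terms present in the document (a crude stand-in for BM25).
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def hybrid_search(query: str, query_vec: list[float],
                  corpus: list[tuple[str, list[float]]],
                  alpha: float = 0.5) -> list[str]:
    # corpus: (text, embedding) pairs; alpha blends dense vs. keyword relevance.
    scored = []
    for text, vec in corpus:
        score = alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_score(query, text)
        scored.append((score, text))
    return [text for _, text in sorted(scored, reverse=True)]
```

In production the embeddings would come from a model API and the scoring would run inside the vector database; the blending idea is the same.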
Develop autonomous agents and multi-step reasoning workflows that call external tools and maintain state to solve complex automation tasks.
Establish evaluation pipelines (using frameworks like DeepEval) to measure model drift, accuracy, and latency, ensuring safe and ethical AI outputs.
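A minimal sketch of such an evaluation pipeline — measuring exact-match accuracy and latency over a fixed test set; the `predict` callable is a placeholder for a real model client, and dedicated frameworks like DeepEval add richer metrics on top of this shape:

```python
import time


def evaluate(predict, test_cases: list[tuple[str, str]]) -> dict:
    # predict: callable taking a prompt and returning a string answer.
    # test_cases: (prompt, expected_answer) pairs; rerun periodically to catch drift.
    correct, latencies = 0, []
    for prompt, expected in test_cases:
        start = time.perf_counter()
        answer = predict(prompt)
        latencies.append(time.perf_counter() - start)
        if answer.strip().lower() == expected.strip().lower():
            correct += 1
    return {
        "accuracy": correct / len(test_cases),
        "avg_latency_s": sum(latencies) / len(latencies),
    }
```

Tracking these numbers per model version turns "the model got worse" from an anecdote into a measurable regression.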
Package AI applications in Docker containers and manage scalable deployments on cloud platforms (AWS, Azure, or GCP) using CI/CD pipelines.
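For the containerization above, a minimal Dockerfile sketch for a FastAPI service — the module path `app.main:app` and port are assumptions to adjust for the actual project layout:

```dockerfile
# Minimal sketch; assumes a FastAPI app exposed as app.main:app.
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

A CI/CD pipeline would build this image, run the evaluation suite against it, and promote it to the cloud environment only on a passing run.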
Design pipelines for data ingestion, cleaning, and chunking to support retrieval and model fine-tuning.
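The chunking step above can be sketched as fixed-size splitting with overlap, so that context spanning a chunk boundary is not lost — sizes here are illustrative; production pipelines often chunk by tokens or semantic boundaries instead of characters:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    # Fixed-size character chunking with overlap between consecutive chunks.
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Advance by less than chunk_size so adjacent chunks share context.
        start += chunk_size - overlap
    return chunks
```

Each chunk would then be cleaned, embedded, and written to the vector store for retrieval.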
Requirements
Bachelor’s or Master’s degree in Computer Science, AI, Mathematics, or a related technical field.
Expert-level proficiency in Python (3.10+) and familiarity with backend frameworks like FastAPI or Flask.
Strong understanding of machine learning, deep learning architectures (Transformers), and NLP fundamentals.
Hands-on experience with Docker, Kubernetes, and cloud-native AI tools (e.g., AWS Bedrock, Azure AI Search).
Proficiency in SQL and experience with Vector Databases for semantic search.
Strong problem-solving acumen and the ability to explain complex AI behavior to non-technical stakeholders.
Experience fine-tuning open-source models (e.g., Llama 3, Mistral) for specific domains is a plus.
Knowledge of AI ethics, bias mitigation, and responsible AI governance is a plus.
Relevant certifications (e.g., Microsoft Azure AI Engineer Associate, AWS Certified Machine Learning – Specialty) are a plus.