Design and develop end-to-end NLP pipelines — from classical text processing to state-of-the-art LLM-powered architectures
Build and maintain systems for intent detection, NER, entity extraction, and text classification, both standalone and as components feeding into larger LLM workflows
Design and optimize Retrieval-Augmented Generation (RAG) systems — chunking strategies, vector store architecture, hybrid search (dense + sparse), and re-ranking pipelines
Work with embedding models for semantic search, document retrieval, and intent classification in contact center contexts
Design and implement agentic architectures — tool use, function calling, multi-step reasoning, and orchestration with frameworks like LangChain, LlamaIndex, or custom-built solutions
Develop memory and context management strategies — short-term conversation memory, long-term user context, and context window optimization for multi-turn interactions
Evaluate and benchmark models rigorously: hallucination detection, faithfulness scoring, latency/token cost tradeoffs, and continuous performance monitoring
Integrate AI components into scalable, production-ready microservices with a focus on low-latency inference pipelines
Collaborate with product and engineering to design new AI-powered features and drive innovation across the platform
Conduct practical research with a scientific mindset and a focus on delivery — publications are possible and encouraged
Requirements
1-3 years of experience in a Data Science, AI, or NLP Engineer role
Strong programming skills in Python and core Data Science & ML libraries (Pandas, scikit-learn, NLTK, spaCy, Gensim)
Solid understanding of NLP fundamentals — word embeddings, NER, information extraction, intent classification, text similarity
Experience building and delivering ML products in production environments
Hands-on experience with LLMs in production (OpenAI, Anthropic, Mistral, LLaMA, Gemini, or equivalent)
Familiarity with RAG pipelines
Experience with vector databases (Pinecone, Weaviate, Qdrant, pgvector, etc.) and modern embedding models
Understanding of context window management, token budgeting, and prompt design for multi-turn conversations
Experience with LLM observability and monitoring
Experience with LLM frameworks such as LangChain, LlamaIndex, or Hugging Face Transformers