ChatGPT Jobs is seeking a Senior AI Engineer to join Dr.Evidence, an AI-powered insights platform for life sciences. The role involves researching and prototyping ML/NLP architectures, developing scalable training pipelines, and collaborating cross-functionally to enhance AI capabilities.
Responsibilities:
- Research & Prototyping
- Rapidly prototype state-of-the-art ML/NLP/LLM architectures using frameworks like Python, PyTorch, Hugging Face, and LangChain/LangGraph
- Assess open-source and commercial models (e.g., Llama 3, Mistral, GPT-series, Qwen, and other modern LLM architectures) for accuracy, latency, cost, hallucination risk, and compliance
- Run structured model evaluations (accuracy, relevance, precision/recall, hallucination checks)
- Model Development & Training
- Build scalable training pipelines for supervised and self-supervised learning paradigms
- Implement efficient techniques such as LoRA/QLoRA, instruction tuning, and quantization when needed
- Develop agentic workflows (tool-calling, iterative reasoning, ranking, and multi-step pipelines) for real-world use cases
- Improve and maintain existing ML pipelines
- Optimize inference performance with batching, quantization, and efficient serving frameworks
- Retrieval-Augmented Generation (RAG) & Search
- Design and optimize RAG systems that integrate with internal data sources
- Build hybrid retrieval using Elasticsearch/BM25 + vector search (PGVector or other vector DBs)
- Tune chunking, embeddings, reranking, and hallucination-prevention strategies for long documents
- Data Engineering for AI
- Prepare and refine text datasets for training, evaluation, and fine-tuning
- Generate synthetic training data using LLMs to improve extraction accuracy, classification, and reasoning
- Build small, targeted datasets for fine-tuning domain-specific models
- LLM Specialization
- Fine-tune and align open-source LLMs for domain-specific tasks (RAG, agents, tool-calling, reasoning)
- Implement safety layers: prompt guards, output filters, adversarial robustness testing
- Productionization & MLOps
- Containerize models with Docker; orchestrate with Docker Swarm
- Build monitoring for model quality, latency, hallucinations, and regressions
- Collaborate with DevOps and Architecture teams on cost, performance, and scalability decisions
- Implement observability around prompts, retrieval, and model outputs
- Cross-Functional Impact
- Collaborate with Product to define AI roadmaps and success metrics
- Partner with Product and Engineering to design new AI capabilities across modules
Requirements:
- Education: MS/PhD in CS, ML, Data Science, or related field; or BS + 5+ years of intensive AI industry experience
- Core Technical Skills: Languages: Python (expert); familiarity with modern web stacks (e.g., PERN) is a plus
- Experience with PyTorch, Hugging Face Transformers, and common LLM tooling (PEFT, LoRA, QLoRA)
- Strong understanding of NLP: tokenization, embeddings, chunking, NER, classification, summarization
- Experience with RAG pipelines, vector databases, retrieval/ranking strategies
- Experience designing and running model evaluation pipelines
- Comfortable with Docker-based deployments and cloud environments
- Knowledge of Elasticsearch
- Proven Track Record: Shipped at least 3 production LLM-powered features (e.g., chatbots, summarizers, agents)
- Experience working with modern large-scale models
- Experience designing clean, maintainable architectures for LLM features
- Strong documentation and communication skills