Design robust pipelines for processing Electronic Health Records (EHR) and medical literature.
Implement rigorous multi-stage validation frameworks (e.g., sensitivity/specificity analysis) to ensure clinical safety and model reliability.
Adapt Large Language Models using SFT, DPO, or PEFT (LoRA/QLoRA) for specialised medical domains and complex clinical diagnostic reasoning.
Architect hybrid retrieval systems that combine vector databases with Knowledge Graphs to reduce hallucinations and improve factual grounding.
Develop interpretability methods that make model outputs transparent to clinicians.

Expert knowledge of Transformer architectures and hands-on experience fine-tuning LLMs (e.g., Llama 3, Mistral).
Hands-on experience with Knowledge Graphs, Triple-stores, or Graph Databases (Neo4j, ArangoDB) and Graph Neural Networks (GNNs).
Proficiency in LangChain or LlamaIndex and vector search engines (Pinecone, Milvus, or Weaviate).
Practical experience with SHAP, LIME, or custom attention-mapping techniques for model interpretability.
Strong background in statistical validation for high-stakes environments and in handling imbalanced, messy data.

Meaningful social impact: your work directly contributes to better patient outcomes, faster recovery, and a significant reduction in diagnostic errors.
Cutting-edge stack: work at the absolute forefront of AI, combining LLMs with structured Knowledge Graphs (Graph RAG).
Solve the "why": you won't just build a model, you'll build a system that clinicians can trust because they understand its logic.

Senior Data Scientist

Key skills