Shift Technology delivers AI agents that transform critical work for insurers. The company is seeking a Data Scientist/Engineer to contribute to its US Health roadmap, working across varied data types and developing LLM-based solutions for claims handling and document understanding.
Responsibilities:
- Actively contribute to the US Health roadmap and clients, working with various data types such as structured claims data, free text, documents, and images
- Build and productionize data pipelines (structured, text, documents, images) optimized for LLMs and multi-modal models
- Design, develop and deploy LLM-based solutions (RAG, embeddings, instruction tuning) for claims handling, document understanding, and related use cases
- Experiment with the latest Agentic AI technologies (LangChain/LangGraph, OpenAI Agents SDK, MCP, A2A) and develop MVPs for the next generation of payment integrity solutions
- Create custom "tools" for the agent, allowing it to query internal databases, call external weather APIs, or calculate impact force based on telemetry data
- Establish rigorous evaluation frameworks (LLM-as-a-judge) to ensure the agent’s decisions are unbiased, legally sound, and explainable
- Ensure responsible-AI practices: privacy, hallucination mitigation, explainability and compliance
- Lead client workshops, present prototypes, gather feedback and help define roadmap priorities
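The custom-tool responsibility above can be sketched with a minimal, hypothetical example. The function name, telemetry fields, and schema below are illustrative (not Shift internals); it shows the common pattern of pairing a plain Python function with a JSON-schema description in the function-calling style, plus a small dispatcher that routes a model-emitted tool call back to the function:

```python
import json

# Hypothetical telemetry tool: impulse-momentum estimate of average impact
# force, F = m * delta_v / delta_t. All names here are illustrative.
def impact_force(mass_kg: float, delta_v_ms: float, delta_t_s: float) -> float:
    """Average impact force in newtons from telemetry readings."""
    return mass_kg * delta_v_ms / delta_t_s

# JSON-schema description an agent framework would hand to the LLM so it
# knows when and how to call the tool (function-calling convention).
IMPACT_FORCE_TOOL = {
    "type": "function",
    "function": {
        "name": "impact_force",
        "description": "Estimate average impact force (N) from vehicle telemetry.",
        "parameters": {
            "type": "object",
            "properties": {
                "mass_kg": {"type": "number", "description": "Vehicle mass in kg"},
                "delta_v_ms": {"type": "number", "description": "Velocity change in m/s"},
                "delta_t_s": {"type": "number", "description": "Collision duration in s"},
            },
            "required": ["mass_kg", "delta_v_ms", "delta_t_s"],
        },
    },
}

# Registry plus dispatcher: routes a model-emitted tool call (name + JSON
# arguments string) to the matching Python function and returns JSON.
TOOLS = {"impact_force": impact_force}

def dispatch(tool_call: dict) -> str:
    fn = TOOLS[tool_call["name"]]
    args = json.loads(tool_call["arguments"])
    return json.dumps({"result": fn(**args)})

# Example: 1500 kg vehicle, 20 m/s velocity change over 0.1 s.
print(dispatch({"name": "impact_force",
                "arguments": '{"mass_kg": 1500, "delta_v_ms": 20, "delta_t_s": 0.1}'}))
```

The same registry-plus-schema shape extends naturally to the other tools mentioned (internal database queries, external weather APIs), with the framework handling the LLM round trip.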
Requirements:
- 3+ years of experience in data science and healthcare insurance payment integrity
- Expert proficiency in production-level object-oriented programming (OOP) for building scalable and reliable systems
- Proven hands-on experience with Large Language Models (LLMs) and generative AI techniques (including RAG, embeddings, prompt engineering, and model tuning), leveraging frameworks such as OpenAI/Anthropic or open-source variants
- Solid foundation in ML fundamentals with practical experience in the full machine learning lifecycle, including model evaluation, monitoring, versioning, and deployment in production environments
- Experience designing and implementing robust data pipelines for document, OCR, and multi-modal data workflows
- Experience integrating frameworks and protocols such as LangChain/LangGraph, OpenAI Agents SDK, CrewAI, and A2A/MCP with Databricks- or Azure-hosted models (e.g., DBRX, OpenAI GPT models, Anthropic Claude, Google Gemini)
- Demonstrated ability to effectively engage with clients, translate complex business needs into clear, actionable technical solutions, and manage stakeholder expectations
- Deep expertise in Mosaic AI (formerly MosaicML), Unity Catalog, and Delta Lake
- Good understanding of Spark data architecture
- Experience using MLflow for the full lifecycle: from experiment tracking and prompt engineering in the AI Playground to model evaluation
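The RAG and embedding experience listed above can be illustrated with a toy, framework-free retrieval sketch. The corpus snippets are invented examples, and the term-frequency "embedding" is a deliberate stand-in for a real embedding model; a production system would use a vector store and a served model, but the ranking logic is the same:

```python
import math
from collections import Counter

# Toy "embedding": a term-frequency vector. Illustrative stand-in for a
# real embedding model (e.g. an OpenAI or open-source encoder).
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical claims snippets standing in for an indexed document corpus.
CORPUS = [
    "claimant reports water damage to kitchen ceiling",
    "invoice for physical therapy sessions following knee surgery",
    "duplicate billing detected for the same radiology procedure",
]

def retrieve(query: str, k: int = 1) -> list:
    """Rank corpus passages by similarity to the query (the R in RAG)."""
    q = embed(query)
    ranked = sorted(CORPUS, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

print(retrieve("was the radiology procedure billed twice"))
```

The retrieved passages would then be stuffed into the LLM prompt as grounding context, which is where hallucination mitigation and evaluation frameworks such as LLM-as-a-judge come in.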