Backend for LLMs – Architect and implement scalable, low-latency APIs and services that wrap, orchestrate, and optimize LLMs for healthcare use cases.
Data & Retrieval Pipelines – Build ingestion, preprocessing, and retrieval-augmented generation (RAG) pipelines to ground LLMs in clinical and revenue-cycle data.
LLMOps & Observability – Design systems for model monitoring, evaluation, cost tracking, and guardrails, ensuring reliability and responsible use.
Performance & Optimization – Engineer solutions for caching, batching, load balancing, and scaling LLM workloads across cloud and containerized environments.
Security & Compliance – Implement HIPAA-ready infrastructure, data governance, and auditability for LLM-powered applications.
Cross-Functional Collaboration – Partner with product, ML engineers, and healthcare experts to translate business workflows into robust backend systems.
Technical Leadership – Drive end-to-end delivery of LLM backend projects, establish engineering best practices, and mentor peers in LLM system design.
Requirements
5+ years of backend or full-stack software engineering experience, including 3+ years working on ML/LLM-enabled applications.
Strong coding skills in Python, and ideally in one statically typed language such as Go, Java, or TypeScript.