Build, run, and evolve production ML and LLM systems by implementing feature pipelines, training and retraining workflows, and batch and real-time inference on top of Qventus’ data platform
Monitor and optimize model performance across hospitals, improving accuracy, latency, cost, and reliability
Build and maintain model-level feature pipelines and feature management systems on top of curated datasets to support training, inference, and replay
Collaborate with Data Science leaders to establish best practices for applied ML at Qventus, setting standards for feature design, evaluation, and production readiness through iteration and retraining
Requirements
3+ years building and running machine learning models in production using Python and SQL in modern cloud-based ML environments (AWS & Databricks preferred)
Demonstrated ability to design and run feature engineering, training, and inference workflows in applied ML systems
Familiarity with operationalizing LLMs or retrieval-augmented generation (RAG) systems; exposure to LLM frameworks and libraries (LangChain, LlamaIndex, Hugging Face, etc.)
Strong understanding of software engineering principles and the ability to write maintainable, modular code
Strong collaboration and communication skills, with the ability to partner closely with product, clinical, and engineering stakeholders