Swoop, a market leader in privacy-safe healthcare marketing, is seeking a Staff AI/ML Engineer to enhance the patient experience through AI-driven technology. The role involves building end-to-end ML/LLM features, developing LLM applications, and leading architectural direction for applied AI across the organization.
Responsibilities:
- Build end-to-end ML/LLM features from problem definition → data → modeling → evaluation → deployment → monitoring
- Develop LLM applications with retrieval and tool use (e.g., RAG, orchestration/workflows, structured extraction) to deliver trustworthy consumer health experiences
- Convert unstructured text (posts, comments, messages, search queries) into structured signals (topics, entities, intent, sentiment, safety flags) using a mix of classical NLP and modern LLMs
- Create and maintain data pipelines for training, inference, evaluation, and analytics (batch and/or streaming as needed)
- Design evaluation systems that measure quality and safety: offline metrics, golden datasets, human review workflows, and alignment with online A/B tests
- Implement production guardrails to reduce harm and misinformation risk (policy constraints, refusal behavior, citations/attribution when appropriate, red-teaming, monitoring, and incident response)
- Set up monitoring for model and system health (latency, cost, drift, regressions, quality metrics)
- Partner closely with the Product, Engineering, and Data teams and clinical/subject-matter experts to validate outputs and define what 'correct' means for sensitive, health-adjacent use cases
- (Staff scope) Lead architecture and technical direction for applied AI across the organization; mentor engineers; establish best practices and reusable platforms
Requirements:
- 8+ years building and shipping production ML systems (or equivalent experience with demonstrable impact)
- Strong Python skills and experience with ML/LLM libraries and tooling (e.g., Hugging Face ecosystem, LangChain/LangGraph, or equivalent)
- Proven ability to design production-grade pipelines (training/inference/eval) and operate models in real systems (monitoring, rollbacks, incident handling)
- Solid grounding in ML fundamentals (NLP, deep learning, statistical reasoning, evaluation)
- Experience with MLOps best practices: versioning, reproducibility, CI/CD, model registry patterns, feature/data management, and infrastructure collaboration
- Experience working with large-scale data using Databricks/Spark or equivalent distributed processing
- Strong product and stakeholder instincts: you can translate ambiguous business needs into measurable ML outcomes
- Experience building RAG and retrieval systems: vector databases, hybrid search, ranking, recommendation, query understanding
- Experience in healthcare or regulated environments, including privacy-by-design, auditability, and safety reviews (HIPAA/PHI familiarity a plus)
- Experience with streaming/clickstream data, experimentation platforms, and causal/measurement thinking
- Ability to prototype end-to-end experiences (e.g., Streamlit, Gradio, lightweight frontends)
- Experience designing LLM safety systems: red-teaming, adversarial testing, prompt injection mitigation, output filtering, human-in-the-loop review