Interwell Health is a kidney care management company focused on reimagining healthcare. As a Machine Learning Engineer, you will develop and deliver end-to-end machine learning solutions, collaborate with cross-functional teams, and lead the design of MLOps frameworks to enhance healthcare systems.
Responsibilities:
- Develop and deliver end‑to‑end machine learning solutions, including defining technical requirements, architecting scalable systems, and implementing monitoring, logging, and maintenance workflows
- Collaborate closely with engineers, product managers, clinicians, and cross‑functional partners to build new ML products and enhance existing systems
- Lead the design and implementation of MLOps frameworks, including pipeline development, CI/CD integration, drift detection, retraining workflows, and rollback strategies
- Monitor model performance in production, identify issues, propose remediation steps, and ensure strong test coverage and system reliability
- Apply modern software engineering practices to implement scalable, secure, and maintainable AI/ML systems
- Develop and customize API integrations to enable seamless connectivity between cloud‑based systems and ML services
- Participate in architectural discussions to ensure ML platforms meet compliance, performance, and scalability standards
Requirements:
- Bachelor's degree in Computer Science, Data Analytics, Software/Computer Engineering, Computational Statistics, Mathematics, or a related discipline
- 3+ years of end‑to‑end ML development in production (data prep, feature engineering, modeling, calibration, deployment, monitoring, maintenance)
- 3+ years of MLOps experience building production pipelines (CI/CD, model registry, feature store), implementing monitoring & drift detection, and automating retraining
- 3+ years of Python for production ML (testing, packaging, type hints, linting) and SQL for analytical and production workloads; Scala a plus
- 2+ years working with distributed compute and cloud ML environments (e.g., Spark/Databricks on Azure/AWS/GCP) and modern data ecosystems (data lakes, DBMS)
- Strong debugging and optimization skills across data and ML workflows
- Track record of ownership and problem solving, driving measurable impact and quality under ambiguity and evolving requirements
- Ability to communicate technical decisions clearly and contribute to documentation and design discussions
- Demonstrated system design & architecture skills for scalable, high‑performance ML services and batch/streaming workflows; familiarity with API design and service integration patterns
- Proven understanding of tradeoffs in latency, cost, performance, and compliance
- 1+ years of Databricks experience, plus some experience in infrastructure/networking
- 1+ years implementing LLM‑based solutions in production (prompt/response design, evaluation frameworks, guardrails/safety, latency/cost optimization)
- 1+ years designing compliant ML platforms (e.g., HIPAA, SOC 2) and working with PHI/PII governance, access controls, and auditability