Cleerly is a healthcare company revolutionizing heart disease diagnosis and treatment. They are seeking a Staff Machine Learning Engineer to architect and advance machine learning platforms that enhance care pathways for heart disease diagnosis and prognosis.
Responsibilities:
- Architect and develop scalable AI/ML platforms and end-to-end pipelines covering data ingestion, preprocessing, model training, evaluation, deployment, monitoring, drift detection, and automated retraining, while ensuring reproducibility, compliance with FDA and HIPAA requirements, and alignment with organizational and regulatory goals
- Optimize and operationalize production ML systems, including monitoring, drift detection, automated retraining, and workflow execution, to achieve high performance, reliability, scalability, and regulatory adherence
- Evolve the ML stack through integration and refinement of frameworks, libraries, and infrastructure, improving system efficiency, maintainability, and the ability to support clinical ML workflows
- Ensure operational readiness of ML pipelines and platforms, verifying data quality, throughput, reproducibility, and compliance across production workflows
- Drive improvements in processes, tooling, and collaboration to streamline the transition of ML models from research to production, enhancing efficiency, reproducibility, and compliance across the platform
Requirements:
- 12+ years of experience with a Bachelor's degree (8+ with a Master's, 5+ with a PhD) designing, implementing, and optimizing AI and ML systems, ideally in regulated healthcare or clinical domains
- Deep technical expertise in ML pipelines, distributed model serving architectures, and production ML lifecycle management, with a track record of solving high-impact system challenges
- Proficiency in Python, Java, or a similar language, with extensive experience establishing reproducible ML workflows, coding standards, and software engineering best practices for AI/ML applications
- Proficiency with ML infrastructure and orchestration tools (Kubernetes, Helm, Airflow) and data platforms (Snowflake, PostgreSQL, Airbyte), and building scalable pipelines that support ML data processing and model workflows
- Advanced experience with AWS (including SageMaker and S3) and with ML and infrastructure tooling such as MLflow and Terraform, plus exposure to platforms like Databricks, with a proven track record of implementing end-to-end ML systems and optimizing platform performance, scalability, and operational efficiency
- Proven ability to influence technical approaches and operational practices in AI/ML workflows, elevating system efficiency, reproducibility, and reliability
- Strong expertise in regulatory and compliance requirements for AI/ML (FDA, HIPAA), with the ability to design systems that are compliant, reproducible, and auditable by design