Cleerly is a healthcare company focused on transforming the diagnosis and treatment of heart disease. The company is seeking an experienced Staff Machine Learning Engineer to architect and scale machine learning platforms that support AI innovation in regulated healthcare while keeping ML pipelines compliant and efficient.
Responsibilities:
- Architect and develop scalable AI/ML platforms and end-to-end pipelines, covering data ingestion, preprocessing, model training, evaluation, deployment, monitoring, drift detection, and automated retraining, while ensuring reproducibility, compliance with FDA/HIPAA, and alignment with organizational and regulatory goals
- Optimize and operate production ML systems, including their monitoring, drift detection, automated retraining, and workflow execution, for high performance, reliability, scalability, and regulatory adherence
- Evolve the ML stack through integration and refinement of frameworks, libraries, and infrastructure, improving system efficiency, maintainability, and the ability to support clinical ML workflows
- Ensure operational readiness of ML pipelines and platforms, verifying data quality, throughput, reproducibility, and compliance across production workflows
- Drive improvements in processes, tooling, and collaboration to streamline the transition of ML models from research to production, enhancing efficiency, reproducibility, and compliance across the platform
Requirements:
- 12+ years of experience with a Bachelor's degree (8+ with a Master's; 5+ with a PhD) designing, implementing, and optimizing AI and ML systems, ideally in regulated healthcare or clinical domains
- Deep technical expertise in ML pipelines, distributed model serving architectures, and production ML lifecycle management, with a track record of solving high-impact system challenges
- Proficiency in Python, Java, or a similar language, with extensive experience establishing reproducible ML workflows, coding standards, and software engineering best practices for AI/ML applications
- Proficiency with ML infrastructure and orchestration tools (Kubernetes, Helm, Airflow) and data platforms (Snowflake, PostgreSQL, Airbyte), plus experience building scalable pipelines that support ML data processing and model workflows
- Advanced experience with AWS (including SageMaker and S3) and with ML and infrastructure tooling such as MLflow and Terraform, plus exposure to platforms like Databricks, with a proven track record of implementing end-to-end ML systems and optimizing platform performance, scalability, and operational efficiency
- Proven ability to influence technical approaches and operational practices in AI/ML workflows, elevating system efficiency, reproducibility, and reliability
- Strong expertise in regulatory and compliance requirements for AI/ML (FDA, HIPAA), with the ability to design systems that are inherently compliant, reproducible, and auditable