IMO Health is a clinical data intelligence company improving how data is used across the healthcare ecosystem. As a Sr. Data Engineer, you will design, build, and operate scalable data platforms that support the company's products, analytics, and AI initiatives, while collaborating with various engineering teams to deliver reliable data systems.
Responsibilities:
- Build and operate production-grade data platforms that support IMO’s terminology-driven products, analytics, and machine learning use cases
- Design, develop, and maintain data pipelines for batch and incremental processing using modern lakehouse and cloud-native patterns
- Work extensively with cloud data platforms (AWS + Databricks) to ingest, transform, and serve structured and semi-structured data at scale
- Model data intentionally, developing well-documented, analytics- and product-ready data models that balance usability, performance, and correctness
- Apply strong software engineering practices to data work, including version control, testing, CI/CD, and infrastructure-as-code
- Collaborate directly with product, analytics, and AI teams to translate requirements into scalable technical solutions
- Improve reliability, performance, and cost-efficiency of data systems through monitoring, observability, and continuous optimization
- Design for data quality and trust, implementing automated checks, validation frameworks, and lineage-aware workflows
- Contribute to platform evolution, helping shape standards around orchestration, data modeling, environments, and deployment
- Operate in an Agile environment, taking ownership of deliverables and proactively identifying risks and opportunities
- Mentor and support other engineers, leading by example in code quality, problem decomposition, and technical decision-making
- Continuously learn and apply industry best practices in data engineering, analytics engineering, and AI data foundations
Requirements:
- Bachelor's degree in a relevant technical field and 5+ years of professional experience, or 7+ years of equivalent hands-on experience
- Demonstrated experience building and supporting end-to-end data platforms in a production environment
- Strong programming experience in Python and SQL, with an engineering mindset toward maintainability and testing
- Deep experience with cloud-based data platforms, especially AWS (e.g., S3, EC2, RDS, IAM) and Databricks / Spark-based processing
- Strong SQL skills, including complex transformations and performance-aware query design
- Hands-on experience with data orchestration frameworks (e.g., Airflow or equivalent)
- Experience designing and optimizing data models for analytics, reporting, and downstream applications
- Familiarity with CI/CD practices and infrastructure-as-code (e.g., Git, Terraform)
- Comfort working with large, complex, and evolving datasets, including managing schema change and metadata
- Strong analytical, debugging, and root-cause analysis skills
- Clear written and verbal communication skills, including documenting designs and tradeoffs
- A proactive, ownership-oriented mindset and the ability to work effectively across teams
- Experience with analytics engineering tools and patterns (e.g., dbt or similar transformation frameworks)
- Familiarity with data observability, monitoring, and cost management in cloud environments
- Experience supporting AI / ML data pipelines or feature engineering workflows
- Exposure to streaming or near–real-time data processing concepts
- Experience with healthcare, clinical, or regulated data domains
- Familiarity with metadata management, data catalogs, and lineage concepts
- AWS certifications (Data Engineer, Solutions Architect, or AI/ML)