Design and implement a scalable and reliable data warehouse architecture, including data models, storage strategies, and processing layers.
Develop and maintain a robust role-based access control (RBAC) model to ensure secure and appropriate access to data across the organization.
Design, implement, and maintain inbound and outbound data pipelines, ensuring reliability, observability, and data integrity.
Optimize SQL queries, data models, and compute/storage resource usage to ensure cost-efficiency and high performance of the data platform.
Implement and enforce data protection and security best practices, including handling of sensitive and regulated data.
Work closely with analytics and business teams to understand their needs and deliver efficient, scalable, and reliable data solutions.
Improve and automate data governance practices, including data quality controls, metadata management, lineage, and compliance-related processes.
Monitor and troubleshoot data pipelines and platform issues, ensuring stability and timely resolution of incidents.
Continuously improve the data platform architecture, tooling, and development practices.
Requirements
Advanced SQL expertise, including deep understanding of query optimization and execution.
Ability to analyze query plans and understand query execution at the physical level.
Strong commercial experience working with real-world data.
Understanding that production data is often messy, incomplete, delayed, or inconsistent, and ability to design systems that tolerate and prevent data quality issues.
Solid understanding of data flows and data lifecycle.
Ability to detect, anticipate, and prevent downstream problems caused by unexpected or malformed data.
Proven experience designing and building data warehouses.
Understanding of common data warehouse architectural patterns.
Experience implementing ETL/ELT pipelines in production environments (preferably using Airflow).
Good understanding of cloud platforms and distributed data systems (preferably AWS).
Strong Python skills for data processing, orchestration, and automation.
Experience working with cloud data warehouses (preferably Snowflake).
Familiarity with BI and analytical tools (preferably Looker or Apache Superset).
Ability to work with large-scale datasets and production-grade data platforms.
Tech Stack
Airflow
Apache
AWS
Cloud
ETL
Python
SQL
Benefits
Open-minded teams, a welcoming and inclusive company culture, plus the opportunity to make a real difference with a game-changing health tech product;
A competitive salary package based on your unique expertise, skillset, and impact on the product plus stock options;
In-office, remote and hybrid work opportunities;
Whatever equipment you need to be happy and productive;
A premium SIMPLE subscription;
21 days annual leave, plus bank holidays (those observed where you live);
Flexible hours. We focus on your results, not how long you spend at your desk.