Theoria Medical is a comprehensive medical group and technology company dedicated to serving patients across the care continuum. The company is seeking a Senior Data Engineer to design, build, and maintain scalable data pipelines and platforms that support enterprise analytics and reporting. The role collaborates with clinical, operational, and technical teams to ensure healthcare data is well modeled and governed.
Responsibilities:
- Design, build, and maintain scalable data pipelines using Microsoft Fabric and Apache Airflow
- Ingest, transform, and integrate data from a variety of sources, including relational systems, APIs, and MongoDB
- Implement and manage data solutions aligned to Medallion architecture principles (Bronze, Silver, Gold)
- Design and maintain analytical data models, including fact and dimension tables, to support reporting and analytics
- Optimize data storage, performance, and reliability across lakehouse and warehouse environments
- Ensure data quality, observability, and lineage through validation, monitoring, and documentation
- Collaborate with data analysts and BI developers to enable performant, well-modeled datasets for Power BI
- Partner with clinical, operational, and technical stakeholders to understand data requirements and constraints
- Support data governance, security, and compliance efforts, including HIPAA-related controls
- Mentor junior data engineers and contribute to engineering standards and best practices
Requirements:
- 5+ years of experience as a Data Engineer, Senior Data Engineer, or similar role
- Strong experience with Microsoft Fabric (e.g., Lakehouse, Data Warehouse, pipelines, notebooks)
- Hands-on experience with Apache Airflow for workflow orchestration and scheduling
- Experience working with MongoDB and integrating NoSQL data sources into analytical platforms
- Strong SQL skills and experience building performant analytical queries and transformations
- Deep understanding of data modeling concepts, including fact and dimension tables
- Practical experience implementing Medallion architecture in a data lake or lakehouse environment
- Experience working with healthcare data (e.g., EHR/EMR, claims, clinical, revenue cycle, or operational data)
- Strong understanding of data engineering best practices around scalability, reliability, and maintainability
- Experience in a healthcare provider, payer, or health technology organization
- Familiarity with HIPAA and healthcare data privacy and security requirements
- Experience with CI/CD for data pipelines and infrastructure-as-code concepts
- Exposure to streaming or near-real-time data processing
- Experience supporting enterprise BI platforms such as Power BI