
Title: Data Engineer
100% Remote
Duration: 12-month contract-to-hire
Only- and USC
Candidates need one of the following (PLUS FABRIC):
Description.
Project Overview
Funding Source: Texas Legislature
Platform: Microsoft Fabric, configured in an Azure environment
Architecture: Already defined and in place
Current Status:
Several datasets have been onboarded
Expanding scope to support the UT Real Health AI Initiative
Transitioning to ingest data from all Health-Related Institutions (HRIs) in the UT system
________________________________________
Data Ingestion Strategy
Objective: Ingest Epic Caboodle data from multiple UT Health institutions
Considerations:
Each institution uses a different version of Epic Caboodle, although all are based on SQL databases
Each HRI has unique data structures
A flexible ingestion process is needed to accommodate these differences
Timeline: Targeting December for initial data ingestion
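As an illustration of what a flexible ingestion process could look like, here is a minimal sketch of a config-driven column mapping that normalizes rows from different institutions into one shared schema. It is written in plain Python so it runs anywhere; on the project itself this would more likely be done in Spark. The institution keys, source column names, and target columns are all hypothetical placeholders and are not taken from Epic Caboodle or from the project.

```python
from typing import Any

# Hypothetical per-institution mappings from source column names to a shared
# target schema. These names are placeholders, not real Epic Caboodle columns.
COLUMN_MAPS: dict[str, dict[str, str]] = {
    "institution_a": {"PatientKey": "patient_id", "EncounterDate": "encounter_date"},
    "institution_b": {"PAT_KEY": "patient_id", "ENC_DT": "encounter_date"},
}

# Hypothetical shared schema that every institution's rows are normalized to.
TARGET_COLUMNS = ("patient_id", "encounter_date")


def normalize_row(institution: str, row: dict[str, Any]) -> dict[str, Any]:
    """Rename mapped source columns, drop unmapped ones, fill gaps with None."""
    mapping = COLUMN_MAPS[institution]
    renamed = {mapping[src]: value for src, value in row.items() if src in mapping}
    normalized = {col: renamed.get(col) for col in TARGET_COLUMNS}
    # Keep track of where the row came from for downstream lineage.
    normalized["source_institution"] = institution
    return normalized


row_a = normalize_row(
    "institution_a",
    {"PatientKey": 1, "EncounterDate": "2024-01-01", "Extra": "ignored"},
)
row_b = normalize_row("institution_b", {"PAT_KEY": 2})
```

With this approach, onboarding another institution means adding one mapping entry rather than writing a new pipeline, which is one way to absorb version and structure differences between sources.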
________________________________________
Team & Hiring Plans
Current Setup: Existing infrastructure and architecture are in place
Hiring Needs:
6 Data Engineers (to be hired in stages)
DevOps Engineers
ML/AI Engineer (to support post-ingestion analytics)
2 Data Analysts (focused on Power BI and Power Automate)
Preferred Experience:
Familiarity with the Azure stack
Experience with Epic and SQL Server
Background in healthcare data
________________________________________
Technical Stack
Languages & Tools:
Python
Spark
Notebooks
Apache Airflow
Data Architecture:
Delta tables
Parquet format
Medallion architecture (bronze, silver, gold layers)
Lakehouses and warehouses within the Fabric ecosystem
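For candidates less familiar with the medallion pattern, the sketch below shows the idea of the three layers using plain Python lists in place of Delta tables. It is an assumption-laden illustration only: the field names (patient_id, institution) and the cleaning and aggregation rules are hypothetical, and a real implementation on this stack would typically use Spark and Delta tables instead.

```python
from collections import Counter
from typing import Any


def to_silver(bronze_rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Silver layer: clean raw bronze rows (hypothetical rules).

    Drops rows without a patient_id, normalizes the institution label,
    and removes duplicates on (patient_id, institution).
    """
    seen: set[tuple[Any, str]] = set()
    silver: list[dict[str, Any]] = []
    for row in bronze_rows:
        patient_id = row.get("patient_id")
        if patient_id is None:
            continue
        institution = str(row.get("institution", "")).strip().lower()
        key = (patient_id, institution)
        if key in seen:
            continue
        seen.add(key)
        silver.append({"patient_id": patient_id, "institution": institution})
    return silver


def to_gold(silver_rows: list[dict[str, Any]]) -> dict[str, int]:
    """Gold layer: consumption-ready aggregate, here patients per institution."""
    return dict(Counter(row["institution"] for row in silver_rows))


# Bronze layer: raw rows kept as ingested, including messy and duplicate data.
bronze = [
    {"patient_id": 1, "institution": " A "},
    {"patient_id": 1, "institution": "a"},
    {"patient_id": None, "institution": "a"},
    {"patient_id": 2, "institution": "b"},
]
silver = to_silver(bronze)
gold = to_gold(silver)
```

Bronze keeps the data as ingested, silver holds the cleaned and de-duplicated version, and gold holds the aggregates that analysts would consume in tools such as Power BI.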
Visualization & Automation:
Power BI and Power Automate (used by analysts)
Preferred Platforms:
Databricks preferred over Azure Data Factory
________________________________________
Roles & Responsibilities
Team members are expected to:
Manage data ingestion
Maintain medallion architecture
Support data consumption through gold layer outputs
Collaborate across ingestion, transformation, and analytics
Top Skills Details
Full-stack data engineers with a focus on data ingestion
Spark, Python, Delta
Microsoft Fabric
Azure Databricks