UT MD Anderson is a leading institution focused on cancer care, research, education, and prevention. They are seeking a Data Engineer to support the design, build, and operationalization of integrated data pipelines and analytics solutions that enable the institution's digital business initiatives.
Responsibilities:
- Participate in end-to-end solution delivery that increases information capabilities and realizes data value across the institution
- Build and test end-to-end data pipelines across ingestion, curation, transformation, modeling, and consumption within the Context Engine framework
- Integrate data governance processes across data provenance, security, data quality, ontology, and metadata management
- Participate in planning, architecture, analysis, design, and build of data pipelines in partnership with IS, Data Offices, and Data Governance teams
- Contribute to existing data pipelines spanning acquisition, integration, and consumption for defined use cases
- Build data curation pipelines including profiling, specification creation, cleansing, transforming, standardizing, mastering, harmonizing, validating, and aggregating data
- Monitor and support data quality across the Context Engine
- Incorporate repeatable solution designs and data models to support reuse and scalability
- Promote effective data management practices and understanding of analytics across the enterprise
- Adhere to IS division standard operating procedures and all MD Anderson policies
- Maintain build standards and obtain governance sign-off aligned with the institutional data strategy
- Participate in documentation preparation for enhancements or new technology
- Perform quality control, testing, and peer review of analytics builds
- Support system updates, releases, change control processes, and after-hours support as required
- Train data scientists, analysts, end users, and data consumers on data pipelining and preparation techniques
- Assist in establishing training plans and curricula for Context Engine tools
- Provide institutional, department, and one-on-one training on EDEA deliverables
- Support liaison relationships with customers and OneIS partners to deliver effective technical solutions
- Explore and promote modern tools, techniques, and architectures to automate data preparation and integration tasks
- Improve productivity by reducing manual and error-prone processes
- Model OneIS values through integrity, partnership, quality, and continuous improvement
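The curation stages named above (profiling, cleansing, standardizing, validating, aggregating) can be sketched as a minimal plain-Python pipeline. All record fields, function names, and values here are illustrative assumptions for the sketch, not MD Anderson's actual Context Engine interfaces.

```python
from collections import Counter

# Illustrative raw records; the field names (mrn, dept, los_days) are hypothetical.
RAW = [
    {"mrn": "001", "dept": "Oncology ", "los_days": "3"},
    {"mrn": "002", "dept": "oncology", "los_days": "5"},
    {"mrn": "003", "dept": "Radiology", "los_days": ""},
]

def profile(records):
    """Profiling: count non-empty values per field."""
    counts = Counter()
    for rec in records:
        for field, value in rec.items():
            if str(value).strip():
                counts[field] += 1
    return dict(counts)

def cleanse_and_standardize(records):
    """Cleansing/standardizing: trim whitespace, normalize department casing."""
    return [
        {
            "mrn": rec["mrn"].strip(),
            "dept": rec["dept"].strip().title(),
            "los_days": rec["los_days"].strip(),
        }
        for rec in records
    ]

def validate(records):
    """Validating: keep only records with a numeric length of stay."""
    return [r for r in records if r["los_days"].isdigit()]

def aggregate(records):
    """Aggregating: average length of stay per department."""
    totals, counts = {}, {}
    for r in records:
        totals[r["dept"]] = totals.get(r["dept"], 0) + int(r["los_days"])
        counts[r["dept"]] = counts.get(r["dept"], 0) + 1
    return {dept: totals[dept] / counts[dept] for dept in totals}

if __name__ == "__main__":
    print(profile(RAW))
    clean = validate(cleanse_and_standardize(RAW))
    print(aggregate(clean))
```

In a production pipeline each stage would typically be a separate, monitored job (e.g. a Spark task) rather than in-process functions, but the stage boundaries stay the same.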
Requirements:
- Bachelor's degree
- Two years of clinical, relevant healthcare information technology, or relevant business experience
- With a preferred degree, no experience is required
- Epic certification: must obtain at least one Epic Data Model certification (Clinical, Access, or Revenue) issued by Epic within 180 days
- Must pass pre-employment skills test as required and administered by Human Resources
Preferred Qualifications:
- Bachelor's or Master's degree in Business Analytics, Computer Science, Information Technology, or Data Science
- 3-5 years creating data pipelines in a healthcare research environment
- Experience building and maintaining analytical reports and dashboards
- Problem solving skills and ability to translate business/clinical requirements into reliable data models
- Analytics and reporting: experience with cloud data management solutions such as Foundry and Fabric
- Data pipeline and ETL development: hands-on experience designing, building, and maintaining pipelines using Python/Spark
- Hands-on use of Large Language Models (LLMs) in real-world projects, such as integrating generative AI solutions into applications, workflows, or analytics platforms
- Familiarity with prompt engineering, model evaluation, and responsible AI practices
- Experience collaborating with cross-functional teams to deploy and scale LLM-powered features
- Epic Cogito certification
- Epic Clarity certification
- Epic Caboodle certification
- Clinical Data Model certification
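The prompt-engineering and model-evaluation skills listed above can be illustrated with a minimal harness. The model call below is a stub standing in for a real LLM API, and every name (template, cases, keywords) is a hypothetical example, not part of any specific platform.

```python
# Minimal sketch of templated prompting plus keyword-based evaluation.
# fake_llm is a stub; a real deployment would call an actual LLM endpoint.

PROMPT_TEMPLATE = (
    "You are a clinical data assistant.\n"
    "Summarize the following note in one sentence:\n{note}"
)

def build_prompt(note: str) -> str:
    """Prompt engineering: fill a reviewed template instead of ad-hoc strings."""
    return PROMPT_TEMPLATE.format(note=note)

def fake_llm(prompt: str) -> str:
    """Stub model: echoes the final prompt line, truncated to 40 characters."""
    return prompt.splitlines()[-1][:40]

def evaluate(cases):
    """Model evaluation: fraction of outputs containing a required keyword."""
    hits = 0
    for note, keyword in cases:
        output = fake_llm(build_prompt(note))
        if keyword.lower() in output.lower():
            hits += 1
    return hits / len(cases)

cases = [
    ("Patient stable after chemo.", "chemo"),
    ("Follow-up imaging scheduled.", "imaging"),
]
score = evaluate(cases)
```

Keeping prompts in versioned templates and scoring outputs against explicit criteria is one common pattern for making LLM-powered features testable before they are scaled out.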