UT MD Anderson is a leading institution focused on cancer care, research, education, and prevention. They are seeking a Data Engineer to support the design, build, and operationalization of integrated data pipelines and analytics solutions that enable the institution's digital business initiatives.
Responsibilities:
- Participate in end-to-end solution delivery that increases information capabilities and realizes data value across the institution
- Build and test end-to-end data pipelines across ingestion, curation, transformation, modeling, and consumption within the Context Engine framework
- Integrate data governance processes across data provenance, security, data quality, ontology, and metadata management
- Participate in planning, architecture, analysis, design, and build of data pipelines in partnership with IS, Data Offices, and Data Governance teams
- Contribute to existing data pipelines spanning acquisition, integration, and consumption for defined use cases
- Build data curation pipelines including profiling, specification creation, cleansing, transforming, standardizing, mastering, harmonizing, validating, and aggregating data
- Monitor and support data quality across the Context Engine
- Incorporate repeatable solution designs and data models to support reuse and scalability
- Promote effective data management practices and understanding of analytics across the enterprise
- Adhere to IS division standard operating procedures and all MD Anderson policies
- Maintain build standards and obtain governance sign-off aligned with the institutional data strategy
- Participate in documentation preparation for enhancements or new technology
- Perform quality control, testing, and peer review of analytics builds
- Support system updates, releases, change control processes, and after-hours support as required
- Train data scientists, analysts, end users, and data consumers on data pipelining and preparation techniques
- Assist in establishing training plans and curricula for Context Engine tools
- Provide institutional, department, and one-on-one training on EDEA deliverables
- Support liaison relationships with customers and OneIS partners to deliver effective technical solutions
- Explore and promote modern tools, techniques, and architectures to automate data preparation and integration tasks
- Improve productivity by reducing manual and error-prone processes
- Model OneIS values through integrity, partnership, quality, and continuous improvement
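The curation stages named above (profiling, cleansing, standardizing, validating, aggregating) can be sketched as a minimal plain-Python pipeline. All record fields, function names, and values here are illustrative assumptions for the sketch, not MD Anderson's actual Context Engine interfaces.

```python
from collections import Counter

# Illustrative raw records; the field names (mrn, dept, los_days) are hypothetical.
RAW = [
    {"mrn": "001", "dept": "Oncology ", "los_days": "3"},
    {"mrn": "002", "dept": "oncology", "los_days": "5"},
    {"mrn": "003", "dept": "Radiology", "los_days": ""},
]

def profile(records):
    """Profiling: count non-empty values per field."""
    counts = Counter()
    for rec in records:
        for field, value in rec.items():
            if str(value).strip():
                counts[field] += 1
    return dict(counts)

def cleanse_and_standardize(records):
    """Cleansing/standardizing: trim whitespace, normalize department casing."""
    return [
        {
            "mrn": rec["mrn"].strip(),
            "dept": rec["dept"].strip().title(),
            "los_days": rec["los_days"].strip(),
        }
        for rec in records
    ]

def validate(records):
    """Validating: keep only records with a numeric length of stay."""
    return [r for r in records if r["los_days"].isdigit()]

def aggregate(records):
    """Aggregating: average length of stay per department."""
    totals, counts = {}, {}
    for r in records:
        totals[r["dept"]] = totals.get(r["dept"], 0) + int(r["los_days"])
        counts[r["dept"]] = counts.get(r["dept"], 0) + 1
    return {dept: totals[dept] / counts[dept] for dept in totals}

if __name__ == "__main__":
    print(profile(RAW))
    clean = validate(cleanse_and_standardize(RAW))
    print(aggregate(clean))
```

In a production pipeline each stage would typically be a separate, monitored job (e.g. a Spark task) rather than in-process functions, but the stage boundaries stay the same.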
Requirements:
- Bachelor's degree
- Two years of clinical, relevant healthcare information technology, or relevant business experience
- With a preferred degree, no experience is required
- Epic certification: must obtain at least one Epic Data Model certification (Clinical, Access, or Revenue) issued by Epic within 180 days
- Must pass pre-employment skills test as required and administered by Human Resources
Preferred Qualifications:
- Bachelor's or Master's degree in Business Analytics, Computer Science, Information Technology, or Data Science
- 3-5 years creating data pipelines in a healthcare research environment
- Experience building and maintaining analytical reports and dashboards
- Problem solving skills and ability to translate business/clinical requirements into reliable data models
- Analytics and reporting: experience with cloud data management solutions such as Foundry and Fabric
- Data pipeline and ETL development: hands-on experience designing, building, and maintaining pipelines using Python/Spark
- Hands-on use of Large Language Models (LLMs) in real-world projects, such as integrating generative AI solutions into applications, workflows, or analytics platforms
- Familiarity with prompt engineering, model evaluation, and responsible AI practices
- Experience collaborating with cross-functional teams to deploy and scale LLM-powered features
- Epic Cogito certification
- Epic Clarity certification
- Epic Caboodle certification
- Clinical Data Model certification
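The prompt-engineering and model-evaluation skills listed above can be illustrated with a minimal harness. The model call below is a stub standing in for a real LLM API, and every name (template, cases, keywords) is a hypothetical example, not part of any specific platform.

```python
# Minimal sketch of templated prompting plus keyword-based evaluation.
# fake_llm is a stub; a real deployment would call an actual LLM endpoint.

PROMPT_TEMPLATE = (
    "You are a clinical data assistant.\n"
    "Summarize the following note in one sentence:\n{note}"
)

def build_prompt(note: str) -> str:
    """Prompt engineering: fill a reviewed template instead of ad-hoc strings."""
    return PROMPT_TEMPLATE.format(note=note)

def fake_llm(prompt: str) -> str:
    """Stub model: echoes the final prompt line, truncated to 40 characters."""
    return prompt.splitlines()[-1][:40]

def evaluate(cases):
    """Model evaluation: fraction of outputs containing a required keyword."""
    hits = 0
    for note, keyword in cases:
        output = fake_llm(build_prompt(note))
        if keyword.lower() in output.lower():
            hits += 1
    return hits / len(cases)

cases = [
    ("Patient stable after chemo.", "chemo"),
    ("Follow-up imaging scheduled.", "imaging"),
]
score = evaluate(cases)
```

Keeping prompts in versioned templates and scoring outputs against explicit criteria is one common pattern for making LLM-powered features testable before they are scaled out.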