Mayo Clinic is a top-ranked healthcare provider dedicated to patient needs and employee growth. It is seeking a skilled Data Engineer to join its Advanced Data Lake (ADL) team, which builds and optimizes data solutions that support analytics and machine learning across the enterprise.
Responsibilities:
- Design, develop, and maintain data pipelines for ingestion, transformation, and integration of large-scale datasets
- Build and optimize data models to support advanced analytics and machine learning applications
- Collaborate with product teams and stakeholders to deliver data solutions aligned with ADL architecture
- Ensure data quality, security, and compliance throughout all processes
- Implement automation and monitoring for data workflows to improve reliability and performance
- Support cloud-based data platforms, primarily Google Cloud Platform (GCP), and integrate with enterprise systems
- Assemble large, complex data sets that meet functional and non-functional business requirements
- Build processes supporting data transformation, data structures, metadata, dependency and workload management
- Apply a strong knowledge of SQL to efficiently process large volumes of data and troubleshoot SQL queries
- Proactively identify improvement opportunities (error detection, error correction, root cause analysis)
- Build automation to aid in verification and testing of data
- Effectively participate in multiple, concurrent projects
- Research new and existing data sources in order to contribute to new development, improve data management processes, and make recommendations for data quality initiatives
- Perform periodic data quality reviews for internal and external data
- Ensure timely resolution of queries and data issues
- Work with data and analytics experts to strive for greater functionality in our data systems
Requirements:
- Bachelor's degree in Computer Science, Engineering, or related field from an accredited University or College; OR an Associate's degree in Computer Science, Engineering, or related field from an accredited University or College with 2 years of experience
- Demonstrated ability to analyze and profile data to address business problems, using advanced data modeling, source system databases, or data mining techniques
- Demonstrated application of problem-solving methodologies, planning techniques, continuous improvement methods, and analytical tools and methods (e.g. data analysis, data profiling, modeling)
- Ability to manage a varied workload of projects with multiple priorities and stay current on healthcare trends and enterprise changes
- Interpersonal skills and time management skills are required
- Strong analytical skills and the ability to identify and recommend solutions
- Advanced computer application skills and a commitment to customer service
- Experience with data analysis, quality, and profiling, using data exploration tools including but not limited to Rapid SQL, AQT, Information Analyzer, and Informatics
- Strong experience with Google Cloud Platform (BigQuery, Dataflow, Pub/Sub); Azure knowledge is a plus
- Familiarity with ETL tools, orchestration frameworks, and CI/CD pipelines
- Proficiency in Python, SQL, and other open-source programming languages
- Experience using Terraform to manage infrastructure as code
- GCP Professional Data Engineer certification preferred
- Experience building scalable data solutions for analytics and machine learning
- Experience working with structured and unstructured data in enterprise environments
- Strong problem-solving and communication skills
- Ability to work in cross-functional teams and manage multiple priorities