CVS Health is building a more connected and compassionate health experience. The company is seeking a Data Engineer to design and implement data pipelines that enhance analytical capabilities, and to collaborate with a dedicated team to optimize data models and ensure data quality.
Responsibilities:
- Data Pipeline Development: Design and build ETL/ELT data pipelines to ingest, process, and transform datasets from multiple sources
- Performance Optimization: Implement best practices for performance tuning, partitioning, and clustering to optimize data queries
- Data Quality & Governance: Follow data quality standards, data governance frameworks, and security policies for data storage and access
- Data Modeling & Architecture: Develop and optimize data models and schemas to support analytics, reporting, and machine learning requirements
- Data Integration & Transformation: Collaborate with data scientists and analysts to design data solutions that integrate with BI tools and machine learning models
- Documentation & Knowledge Sharing: Create comprehensive documentation for data pipelines, workflows, and processes
Requirements:
- 2+ years of applicable work experience
- Proficiency in Python, particularly for building ETL pipelines
- Strong proficiency in SQL and experience in developing complex queries
- Experience deploying data pipelines in a cloud environment (Azure, AWS, or GCP)
- Excellent communication and interpersonal skills, with the ability to collaborate effectively with data scientists and analysts
- Experience working with healthcare data, especially Epic
- Experience using GCP's BigQuery
- Knowledge of data governance best practices in a cloud environment
- Experience working with Machine Learning processes