CVS Health is a company dedicated to shaping a more connected and compassionate health experience. They are seeking a Sr. Data Engineer to develop and manage large-scale data structures and pipelines, while collaborating with the Data Science team to optimize data for business applications.
Responsibilities:
- Develop large-scale data structures and pipelines to organize, collect, and standardize data to generate insights and address reporting needs
- Write ETL (Extract/Transform/Load) processes, design database systems, and develop tools for real-time and offline analytic processing that improve existing systems and expand capabilities
- Collaborate with the Data Science team to transform data and integrate algorithms and models into automated processes
- Test and maintain systems and troubleshoot malfunctions
- Conduct data requirements assessment and design sessions with customers to analyze and establish clearly defined business requirements
- Perform detailed qualitative and quantitative data analysis on source systems based in data warehouses and legacy data marts using SQL and Python
- Build and deliver the Enterprise Logical and Physical data models that align with the organizational Data Governance policies for various data domains and subdomains within the healthcare business
- Leverage knowledge of Google Cloud Big Data architecture, Google Cloud BigQuery commands, and query design and optimization to build data pipelines
- Utilize programming skills in Python or similar languages to build robust data pipelines as needed
- Build data marts and data models to support Data Science and other internal customers
- Integrate data from a variety of sources and ensure adherence to data quality and accessibility standards
- Analyze current information technology environments to identify and assess critical capabilities and recommend solutions to complex business problems
- Experiment with available tools and advise on new tools to provide optimal solutions that meet the requirements dictated by the model/use case
Requirements:
- Master's degree (or foreign equivalent) in Computer Science, Data Science, Analytics, Statistics, Mathematics, Engineering, or a related field
- Two (2) years of experience in the job offered or related occupation
- Two (2) years of experience in analyzing large data sets from multiple data sources
- Two (2) years of experience in JIRA, Rally, or Confluence
- Two (2) years of experience in Software Development Life Cycle (SDLC) and best practices
- Two (2) years of experience in relational database concepts
- Two (2) years of experience in designing data models and solutions for analytical and reporting use cases
- Two (2) years of experience in data warehousing
- Two (2) years of experience in supporting large data, analytics, and technology modernization initiatives
- Two (2) years of experience in data analysis for retail and/or healthcare industries