CVS Health is a company dedicated to building a connected and compassionate health experience. They are seeking a Sr. Data Engineer to develop and manage large-scale data structures and ETL workflows, collaborating with the Data Science team to enhance business applications and data processing capabilities.
Responsibilities:
- Develop large-scale data structures and pipelines to collect, organize, and standardize data that generates insights and addresses reporting needs
- Write ETL (Extract/Transform/Load) processes, design database systems, and develop tools for real-time and offline analytic processing that improve existing systems and expand capabilities
- Collaborate with the Data Science team to transform data and integrate algorithms and models into automated processes
- Test and maintain systems and troubleshoot malfunctions
- Leverage knowledge of Hadoop architecture, HDFS commands, and query design and optimization to build data pipelines
- Utilize programming skills in Python, Java, or similar languages to build robust data pipelines and dynamic systems
- Build data marts and data models to support Data Science and other internal customers
- Integrate data from a variety of sources and ensure adherence to data quality and accessibility standards
- Analyze current information technology environments to identify and assess critical capabilities and recommend solutions to complex business problems
- Experiment with available tools and advise on new tools to provide optimal solutions that meet the requirements dictated by the model/use case
Requirements:
- Bachelor's degree (or foreign equivalent) in Computer Science, Data Science, Statistics, Mathematics, Analytics, Electronics Engineering, or a related field
- Five (5) years of progressive, post-baccalaureate experience in the job offered or a related occupation
- Five (5) years of experience in Java, Python, or R
- Five (5) years of experience in SAS or SQL
- Five (5) years of experience in Hadoop, Hive, and HDFS
- Five (5) years of experience in Airflow, Kafka, HBase, Pig, MySQL, or NoSQL
- Five (5) years of experience in Spark, PySpark, or Scala
- Five (5) years of experience in Unix, Linux, or shell scripting
- Five (5) years of experience with version control systems (Git)
- Five (5) years of experience in Big Data
- Five (5) years of experience in relational and non-relational databases
- Five (5) years of experience in writing unit test cases and performing unit testing of developed software
- Five (5) years of experience in product Scrum meetings, including sprint planning, backlog prioritization and grooming, standups, and retrospectives