CVS Health is dedicated to creating a more connected and compassionate health experience. They are seeking a Staff Data Engineer to communicate with business leaders, analyze data, and develop scalable systems to manage data effectively.
Responsibilities:
- Communicate with business leaders to help translate requirements into functional specifications
- Develop broad understanding of business logic and functionality of current systems
- Analyze and manipulate data by writing and running SQL queries
- Analyze logs to identify and prevent potential issues from occurring
- Deliver clean and functional code in accordance with business requirements
- Consume data from any source, such as flat files, streaming systems, or RESTful APIs
- Engineer scalable, reliable, and performant systems to manage data
- Collaborate closely with other engineers, QA, the Scrum Master, and the Product Manager on your team, as well as across the organization
- Build quality systems while expanding offerings to dependent teams
- Work comfortably in multiple roles, from design and development to code deployment to monitoring and investigating production systems
Requirements:
- 7+ years of relevant experience
- Proven ability to complete projects in a timely manner while clearly measuring progress
- Ability to use AI-driven tooling
- Strong software engineering fundamentals (data structures, algorithms, async programming patterns, object-oriented design, parallel programming)
- Strong understanding and demonstrated experience with at least one popular programming language (.NET or Java) and SQL constructs
- Experience with cloud-based systems (Azure / AWS / GCP)
- High-level understanding of big data design (data lake, data mesh, data warehouse) and data normalization patterns
- Strong communication skills
- Ability to collaborate closely with product and data analytics teams
- Proficiency in building scalable data pipelines on Big Data Platforms and ETL/ELT workflows
- 3+ years in senior or lead roles
- Expert-level skills in writing and optimizing complex queries (preferably across multiple databases – e.g., SQL Server, Snowflake, Postgres)
- Solid knowledge of Apache Kafka (Kafka Streams, Schema Registry, Avro/JSON serialization)
- Advanced experience building distributed data pipelines, ETL/ELT workflows, and real-time/streaming jobs (Spark SQL, PySpark, Spark Structured Streaming)
- Strong understanding of data modeling, data quality, and governance in regulated environments (HIPAA, HITRUST)
- Familiarity with Epic Interconnect APIs, Bridges, or other healthcare interoperability frameworks
- Experience with Azure (preferred), AWS, or GCP for data engineering workloads
- Experience working on the Databricks platform
- Experience with dbt and Dagster
- Strong Python programming experience for data ingestion, transformation, and automation workflows
- Strong experience with revision control (Git)
- Demonstrated experience with metrics, logging, monitoring, and alerting tools
- Strong experience using RESTful APIs
- High-level understanding of system deployment tasks and technologies (CI/CD pipelines, Kubernetes, Terraform)
- Experience designing real-time streaming pipelines and batch workflows