CVS Health is focused on transforming healthcare through advanced analytics and clinical informatics. The Lead Data Scientist is responsible for activating clinical data to improve health outcomes, serving as a bridge between clinical data assets and the stakeholders who rely on them, and leading a team of data professionals.
Responsibilities:
- Serve as a subject matter expert in clinical data, including claims, pharmacy, lab results, and clinical documentation, with deep understanding of how to structure and apply this data to solve healthcare problems
- Design and maintain clinical data models, taxonomies, and classification frameworks that enable consistent interpretation and use of clinical data across the organization
- Develop and govern the claims data feature store, establishing standards, documentation, and best practices that accelerate adoption of clinical data for downstream analytics, reporting, and AI/ML use cases
- Enable self-service analytics by building well-documented, validated, and reusable data assets (tables, views, features) that empower analysts and data scientists to work independently with clinical data
- Create and maintain comprehensive data documentation, including data dictionaries, lineage, business logic, known limitations, and appropriate use guidelines for clinical datasets
- Partner with clinical, operational, and business stakeholders to understand their data needs, translate requirements into data solutions, and ensure clinical data assets meet their analytical objectives
- Lead and mentor data scientists, data analysts, and data engineers, providing guidance on clinical data interpretation, appropriate use, and best practices for working with healthcare data
- Establish data quality frameworks for clinical data, including validation rules, anomaly detection, and monitoring processes to ensure data integrity and reliability
- Translate clinical concepts into analytical frameworks, ensuring that business partners understand the capabilities and limitations of available clinical data
- Collaborate with data engineering teams to inform data pipeline development, ensuring clinical data is ingested, transformed, and stored in ways that support downstream analytics needs
- Contribute to data governance initiatives, including compliance with HIPAA, data privacy regulations, and internal data stewardship policies
- Develop and deliver training, presentations, and consultations to existing and prospective data consumers on clinical data assets, appropriate use, and analytics opportunities
- Stay current with clinical data standards (HL7, FHIR, ICD-10, SNOMED-CT, LOINC, CPT, NDC, RxNorm) and industry best practices in clinical informatics
Requirements:
- 7+ years of relevant experience in clinical informatics, healthcare analytics, or clinical data management
- Deep expertise in clinical data types and structures, including medical claims, pharmacy claims, lab results, clinical notes, and administrative healthcare data
- Knowledge of clinical coding systems and terminologies, such as ICD-10, CPT, HCPCS, SNOMED-CT, LOINC, NDC, and RxNorm
- Experience designing and documenting data models, taxonomies, or classification frameworks for clinical or healthcare data
- Proven ability to enable and support downstream data consumers (analysts, data scientists, business users) through documentation, training, and consultative support
- Experience leading cross-functional projects from concept to delivery by coordinating across clinical, technical, and business stakeholders
- Proficiency with SQL and experience working with large-scale healthcare datasets
- Experience using cloud-based data platforms, preferably Google Cloud Platform (GCP) tools including BigQuery, for querying, transforming, and managing data
- Strong understanding of data quality principles, including validation, profiling, and monitoring of healthcare data
- Excellent written and verbal communication skills, including the ability to explain complex clinical data concepts to both technical and non-technical audiences
- Ability to anticipate and resolve roadblocks throughout a project lifecycle, balancing competing priorities across multiple stakeholders
- Strong experience with medical claims (professional and institutional), pharmacy claims, and eligibility/enrollment data, including understanding of adjudication, adjustments, and claims completeness considerations
- Familiarity with claims-based analytics, including total cost of care, utilization metrics, risk adjustment (HCC), and episode groupers
- Strong understanding of interoperability and large-scale data harmonization across administrative sources (e.g., medical and pharmacy claims, enrollment/eligibility files, provider files) and across common standards such as X12, NCPDP, FHIR, and OMOP
- Expertise in the claims lifecycle and payer workflows, including claim submission, adjudication, pricing, remittance, utilization management, and benefits configuration
- Experience working with standardized administrative code systems (e.g., ICD-10-CM, CPT/HCPCS, DRG, NDC)
- Hands-on experience with ETL pipelines from payer sources into normalized data standards, preferably OMOP CDM with cost and payer domains