IntegriChain is the data and application backbone for market access departments of life sciences manufacturers. The company is seeking a Senior Data Engineer to lead enterprise data initiatives, design and optimize data pipelines, and collaborate with cross-functional teams to enhance data integration and analytics capabilities.
Responsibilities:
- Partner with Data Science leadership to rationalize and consolidate the enterprise data landscape across products, platforms, and acquired capabilities
- Define reusable data integration patterns for batch, micro-batch, near-real-time, and application-to-application data exchange
- Collaborate with cross-functional teams to understand business data needs, source-system realities, and enterprise application integration requirements
- Design scalable patterns for ingesting, transforming, mastering, and publishing data across operational and analytical use cases
- Help establish standards for data contracts, schema evolution, data quality, lineage, and data ownership
- Design and build data pipelines that load source data into Reltio MDM and extract mastered outputs from Reltio for downstream Snowflake, analytics, AI, and operational use cases
- Partner with MDM configuration and Product Management teams to translate healthcare organization (HCO) mastering requirements into data pipeline, mapping, validation, reconciliation, and publishing patterns
- Work with Reltio APIs, exports, crosswalks/XREFs, event-based integration patterns, and bulk load/extract mechanisms as needed to support inbound and outbound data flows
- Engineer integration patterns for HCO Master data, including party/entity, address, identifier, hierarchy, relationship, match/merge, survivorship, and golden record outputs
- Support source ingestion and reference data integration involving datasets such as HIN, DEA, NPI, NCPDP, 340B/PHS, channel outlet data, customer/account data, and other life sciences master/reference sources
- Develop validation and reconciliation processes to compare source data, Reltio mastered data, Snowflake curated data, and downstream consumption layers
- Help operationalize MDM outputs for business-facing data products, semantic models, reporting tables, APIs, and AI-ready datasets
- Design Snowflake database, schema, table, view, and semantic-layer patterns that support performance, governance, and maintainability
- Optimize Snowflake workloads using clustering, micro-partition awareness, warehouse sizing, query profiling, caching behavior, and workload isolation
- Implement Snowflake cost tracking and optimization practices, including warehouse utilization monitoring, inefficient query identification, and cost allocation by workload, team, or use case
- Build scalable SQL and Snowflake stored procedure logic for large-volume data processing and analytical workloads
- Apply secure Snowflake design patterns including RBAC, masking, access isolation, auditing, and environment separation
- Design, build, and maintain reliable ELT pipelines using dbt or comparable modern data transformation tooling
- Develop Python-based automation for API integration, file processing, metadata management, validation, orchestration support, and operational tooling
- Develop modular, tested, and reusable transformation models for raw, curated, mastered, and business-ready data layers
- Implement automated data quality checks, source freshness checks, reconciliation, logging, and exception-handling patterns
- Build orchestration-ready pipelines that support dependency management, restartability, incremental loads, and operational monitoring
- Collaborate with DevOps/SRE teams on CI/CD, deployment automation, environment promotion, and operational runbooks for data pipelines
- Spearhead logical and physical data modeling efforts for enterprise analytical, operational, MDM, and AI-ready datasets
- Design models that balance normalization, dimensional modeling, medallion/lakehouse concepts, and application-specific consumption needs
- Create denormalized reporting and semantic-model-ready structures that simplify business consumption and reduce ambiguity for AI/LLM use cases
- Process and optimize large data volumes in Snowflake using efficient SQL, PL/SQL-style procedural logic, Snowflake Scripting, and performance-aware design
- Create reusable patterns for historical tracking, snapshots, audit columns, data versioning, and lifecycle management
- Ensure data models support downstream BI, AI/ML, semantic models, data apps, MDM Explorer/Entity 360 use cases, and enterprise reporting
Requirements:
- 10+ years of experience in data engineering, database engineering, analytics engineering, or data platform development in production environments
- Strong hands-on experience with Snowflake, including architecture, performance tuning, security design, cost optimization, and cost tracking
- Thorough understanding of Snowflake design patterns for analytical workloads, high-volume data processing, data sharing, and multi-environment deployments
- Hands-on experience with ETL/ELT tools; dbt experience is strongly preferred
- Strong SQL and PL/SQL-style development experience, including complex transformations, stored procedures, performance tuning, and large-scale data processing
- Python experience for data automation, API integration, file handling, data validation, metadata processing, or operational tooling
- Experience designing and implementing enterprise data models, curated data layers, semantic layers, and reusable data products
- Experience with data integration patterns across enterprise applications, APIs, files, cloud storage, operational systems, MDM platforms, and analytical platforms
- Working understanding of Master Data Management concepts such as golden records, crosswalks/XREFs, match/merge, survivorship, hierarchies, entity relationships, stewardship, and data quality
- Experience partnering with MDM, Product, or business teams to translate mastering requirements into source-to-target mappings, transformation logic, validations, and downstream data consumption patterns
- Ability to work directly with cross-functional stakeholders to gather requirements, explain design tradeoffs, and drive alignment
- Experience implementing data quality, lineage, auditability, observability, and operational monitoring within data pipelines
- Ability to operate as a hands-on senior individual contributor who can also influence strategy and engineering standards
- Experience with Reltio MDM, including inbound data loads, outbound exports, Reltio APIs, crosswalks, match/merge outputs, survivorship outputs, and operational troubleshooting
- Experience in life sciences, healthcare, pharma commercialization, HCO/HCP mastering, patient data, channel data, customer master, or commercial data platforms
- Experience with life sciences reference and commercial datasets such as HIN, DEA, NPI, NCPDP, 340B/PHS, 844, 852, 867, chargebacks, gross-to-net, government pricing, PBR, or UBR
- Experience with orchestration frameworks such as Airflow, Dagster, dbt Cloud jobs, cloud-native schedulers, or similar tools
- Experience with cloud platforms and storage patterns, especially Azure or AWS object storage integrated with Snowflake
- Exposure to AI-ready data architecture, feature stores, ML datasets, semantic models, or AI/ML pipeline enablement
- Experience with Terraform, CI/CD, Git-based development, and infrastructure-as-code practices
- Snowflake SnowPro, Reltio, dbt, or equivalent cloud/data engineering certifications