Role Overview

Understand the manufacturing process and how process data is generated, consumed, and used in analytics, then lead the design, build, and operation of end-to-end data pipelines (ingest → transform → publish) using SQL, Python, and dbt, ensuring reliability, freshness, observability, and performance.
Build a data strategy to ingest data from both structured and standardized data source systems (e.g., SAP, MES, LIMS) as well as from spreadsheets, paper documents, or other file-based systems.
Build and own data integrity specifications, and incorporate controls into IT solutions to enforce data quality and integrity.
Advise on data transformation and on how data should be stored, retrieved, and used.
Collaborate with internal site process SMEs across R&D and CMC, internal and external manufacturing process SMEs, Digital Services data engineering and data lake teams, manufacturing application software development teams, and suppliers (e.g., Seeq).
Establish and enforce data engineering standards, reusable pipeline patterns, and data modeling conventions (star/snowflake) aligned with Manufacturing Process Intelligence solution governance.
Ensure compliance with SDLC, documentation, testing, deployment, and runbook expectations, including GMP/GxP data usage considerations.
Define and monitor SLAs, data quality thresholds, and freshness metrics for prioritized data products.
Partner with Technical Product Managers and stakeholders to refine backlogs, prioritize investments, and balance speed, quality, and scalability.
Mentor and develop data engineers, fostering strong engineering fundamentals, domain understanding, and consistent ways of working.
Champion data literacy through training, office hours, templates, and job aids to enable governed self-service analytics.

Requirements

Bachelor’s degree in Engineering, Computer Science, or a related field.
Minimum of 8-10 years of relevant experience.
Strong expertise in SQL and data modeling with working knowledge of Python.
Experience designing and operating ETL/ELT pipelines, orchestration, and data quality monitoring.
Understanding of data lakes, data marts, and data ingestion patterns.
Hands-on experience with relational schema design, data warehousing concepts, and performance optimization.
Knowledge of biopharmaceutical manufacturing processes, the discrete and time-series process data they generate, the manufacturing systems that generate that data (e.g., MES, LIMS, PI, SAP), and the scientific TechOps processes that consume it.

Tech Stack

ETL
Python
SDLC
SQL

Benefits

Medical, dental, and vision healthcare and other insurance benefits (for employee and family)
Retirement benefits, including 401(k)
Paid holidays
Vacation
Compassionate and sick days

Associate Director, Data Engineering
