Understand the manufacturing process, process data generation, process data consumption, and analytics usage, and then lead the design, build, and operation of end-to-end data pipelines (ingest → transform → publish) using SQL, Python, and dbt, ensuring reliability, freshness, observability, and performance.
Build a data strategy to ingest data from both structured and standardized data source systems (e.g., SAP, MES, LIMS) as well as from spreadsheets, paper documents, or other file-based systems.
Build and own data integrity specifications, and incorporate controls into IT solutions to enforce data quality and integrity.
Advise on data transformation and on the context of how data shall be stored, retrieved, and used.
Collaborate with internal site process SMEs across R&D and CMC, internal and external manufacturing process SMEs, Digital Services data engineering and data lake teams, manufacturing application software development teams, and suppliers (e.g., Seeq).
Establish and enforce data engineering standards, reusable pipeline patterns, and data modeling conventions (star/snowflake) aligned with Manufacturing Process Intelligence solution governance.
Ensure compliance with SDLC, documentation, testing, deployment, and runbook expectations, including GMP/GxP data usage considerations.
Define and monitor SLAs, data quality thresholds, and freshness metrics for prioritized data products.
Partner with Technical Product Managers and stakeholders to refine backlogs, prioritize investments, and balance speed, quality, and scalability.
Mentor and develop data engineers, fostering strong engineering fundamentals, domain understanding, and consistent ways of working.
Champion data literacy through training sessions, office hours, templates, and job aids to enable governed self-service analytics.
Requirements
Bachelor’s degree in Engineering, Computer Science, or a related field.
Minimum of 8-10 years of relevant experience in this field.
Strong expertise in SQL and data modeling with working knowledge of Python.
Experience designing and operating ETL/ELT pipelines, orchestration, and data quality monitoring.
Understanding of data lakes, data marts, and data ingestion patterns.
Hands-on experience with relational schema design, data warehousing concepts, and performance optimization.
Knowledge of biopharmaceutical manufacturing processes, the discrete and time-series process data they generate, the manufacturing systems (e.g., MES, LIMS, PI, SAP) that generate the data, and the scientific TechOps processes that consume it.
Tech Stack
ETL
Python
SDLC
SQL
Benefits
Medical, dental, and vision healthcare and other insurance benefits (for employee and family).