Design, build, and operate robust, scalable, and secure data pipelines and infrastructure, meeting defined performance, availability, and governance standards across the data stack
Take hands-on ownership of data pipeline development, including batch and streaming workloads, from ingestion through to consumption
Ensure consistently high standards of data quality, reliability, and availability across all data platforms
Enable analytics and AI use cases through well-designed, well-governed data foundations and reusable data patterns
Partner closely with Analytics Engineers and Data Analysts to support the creation of reliable, business-aligned datasets and models
Collaborate with Product, Engineering, and Architecture teams to ensure data solutions align with platform strategy and product needs
Contribute to the evolution of the company’s data architecture in collaboration with the Architecture Review Board, ensuring pipelines and data solutions align with established platform patterns and the company’s AWS-first approach
Maintain and support existing data pipelines, including troubleshooting, bug fixing, and incremental improvements to ensure reliability and performance
Ensure data operations comply with privacy, security, and regulatory requirements, embedding governance and access controls into data pipelines
Monitor, maintain, and continuously improve data pipelines, infrastructure, and platform performance, including observability and alerting
Maintain strong collaboration with Product, Engineering, and Domain Leadership (EHS, ESG, Chemical Safety), contributing to quarterly reporting and ensuring data initiatives remain aligned with company objectives
Requirements
Typically 6+ years of experience in Data Engineering or similar roles
Strong hands-on delivery in cloud-based environments
Experience in SaaS, Health & Safety, ESG, or Chemical Safety domains is a plus
Experience working in Agile environments with DevOps, CI/CD pipelines
Strong delivery mindset with the ability to take data initiatives from design through production
Advanced proficiency in Python and SQL
Strong experience designing and operating ETL / ELT data pipelines at scale
Hands-on experience with AWS-based data platforms, including S3, Glue, Redshift, Athena, and Kinesis
Strong experience implementing Change Data Capture (CDC) patterns using Kafka and/or Amazon Kinesis
Strong experience working with large-scale structured and unstructured data
Experience building and operating streaming and event-driven data pipelines
Experience embedding security, access control, and compliance into data platforms (e.g. GDPR, enterprise data security best practices)
Proven ability to design scalable and efficient data architectures, optimising pipelines and storage
Experience with LLMs and AI-enabled data pipelines, including preparing, governing, and serving data for LLM, RAG, and agentic workflows
Familiarity with agentic development patterns and event-driven architectures that support AI-driven automation and decision-making
Working knowledge of BI and analytics tools such as QuickSight (preferred), Power BI, Tableau, or Looker
Experience with distributed systems and modern data platforms (e.g. Spark, Databricks)