Brainlabs is a media agency focused on driving profit through data-driven insights. The Data Engineering Manager will be responsible for designing, building, and managing scalable data solutions, focusing on data pipeline development and AI/GenAI process development.

Responsibilities:

Design, develop, and maintain ETL/ELT pipelines using GCP tools like Cloud Functions, Cloud Run, Dataflow, Dataproc, or Cloud Data Fusion
Ensure data pipelines are scalable, efficient, and optimised for performance
Build and manage data pipelines that support LLM and GenAI applications, including Retrieval-Augmented Generation (RAG) architectures, vector data stores, and prompt context assembly workflows
Curate and prepare datasets for AI/ML model training, covering feature engineering, labeling pipeline oversight, and data versioning using tools like Vertex AI Feature Store or DVC
Integrate data from various sources into GCP services such as BigQuery, Cloud Storage, and Cloud SQL
Design and implement data warehouse/mart solutions using BigQuery for analytics and reporting
Build transformation logic using SQL, Python, or Spark for preparing clean and structured data
Optimise query performance and storage costs in BigQuery or other GCP storage systems
Develop processes to ensure data quality, integrity, and consistency across the pipeline
Implement monitoring and logging systems using tools like Cloud Monitoring and Cloud Logging (formerly Stackdriver) or Looker
Understand and interpret business and technical requirements to support data development tasks
Assist in building, testing, and maintaining data pipelines while ensuring alignment with project objectives and stakeholder needs
Work closely with cross-functional teams, including data analysts, data scientists, and business stakeholders, to understand requirements
Provide technical guidance on GCP best practices and tools
Maintain clear documentation of processes, workflows, and data architecture
Ensure regular maintenance and version control of pipelines and scripts

Requirements:

2 to 5 years of experience in designing, building, and managing scalable data solutions on Google Cloud Platform (GCP)
Strong background in data engineering and cloud-based architectures
Proficiency in implementing data pipelines to transform raw data into actionable insights
Hands-on experience with GCP services like Cloud Functions, Cloud Run, Cloud Scheduler, BigQuery, Dataflow, Pub/Sub, and Cloud Storage
Strong programming skills in Python and SQL
Knowledge of data modelling, schema design, and query optimisation techniques
Experience in building batch and streaming data pipelines
Excellent communication and collaboration skills
Ability to work in a fast-paced and dynamic environment
Must be legally entitled to work in the United States
Familiarity with orchestration tools like Apache Airflow, Cloud Composer, or similar
Working experience with ETL on another cloud stack (AWS or Azure) is a plus
Experience with GCP's AI/ML platform (Vertex AI, BigQuery ML, or AutoML) for building, evaluating, or serving models is a strong advantage
Hands-on experience building or supporting LLM/GenAI pipelines using frameworks such as LangChain, LlamaIndex, or Vertex AI Agent Builder
Familiarity with AI/ML data preparation practices, including feature engineering, dataset curation, and data versioning for model training workflows
Knowledge of CI/CD practices and tools like Git, Jenkins, or Terraform for pipeline deployments
Understanding of data security, governance, and compliance practices on GCP
GCP Professional Data Engineer or Associate Cloud Engineer certification (preferred but not mandatory)
