Guidehouse is seeking a Data Engineer to join its Technology AI & Data practice, supporting public sector and health sector clients. This role involves designing, building, and maintaining scalable data pipelines, ensuring data quality, and collaborating with various teams to deliver data products for analytics.
Responsibilities:
- Design, build, test, and maintain scalable data pipelines (batch and/or streaming as applicable) with increasing independence
- Integrate data from multiple sources, resolve inconsistencies, and deliver curated datasets for analytics and operational use
- Own data quality for assigned domains by implementing validation checks, reconciliation, and monitoring/alerting patterns
- Build, maintain, and deploy data products for analytics and data science teams on cloud platforms (e.g., AWS, Azure, GCP)
- Optimize performance of pipelines and queries (tuning, partitioning patterns, efficient compute usage)
- Collaborate cross-functionally with analysts, data scientists, and stakeholders to translate requirements into technical designs and delivery plans
- Produce and maintain technical documentation for data flows, data models, and operational procedures
- Contribute to governance and compliance practices (access controls, lineage awareness, controlled data handling) within your scope
Requirements:
- Bachelor's degree from an accredited college/university
- Based on our contractual obligations, candidates must be located within the United States and must be US Citizens
- Must be able to obtain and maintain a Federal or DoD Public Trust clearance
- Advanced SQL and Python skills and experience with relational databases and database design
- Experience with data ingestion tools such as AWS Lambda, AWS Database Migration Service (DMS), and SFTP
- Experience building dashboards with data visualization tools (e.g., Tableau, Power BI)
- Experience integrating data from disparate systems and technologies (e.g., IBM mainframe; structured, semi-structured, and unstructured sources)
- Proficiency designing and deploying data solutions on one or more cloud platforms (e.g., AWS, Azure, GCP)
- Hands-on experience with cloud services and REST API integrations
- Proficiency with modern data tools (e.g., Spark/Databricks, Airflow, dbt, Kafka) is a plus
- Experience with distributed data processing tools such as PySpark and AWS Glue
- Databricks and/or Snowflake Data Engineer Associate or Professional certification
- Proficiency with workflow management systems (Nextflow, Snakemake, Airflow)
- Experience with regulated environments (GxP, 21 CFR Part 11) and data governance