Guidehouse is a consulting firm seeking multiple Data Engineers to join its Technology AI & Data practice. The role supports public sector and health sector clients by building modern data foundations that improve outcomes and enable better decision-making.
Responsibilities:
Junior Data Engineer:
- Assist in developing and maintaining data pipelines and ETL/ELT processes under the guidance of more senior engineers
- Write Python and SQL to extract, transform, validate, and load data from common sources
- Perform data quality checks (validation, reconciliation, basic monitoring) and help troubleshoot data issues
- Develop dashboards and analytic products using data visualization tools (e.g., Power BI, Tableau)
- Support cloud-based data workloads (e.g., Azure/AWS/GCP basics) and learn platform-native services and patterns
- Document pipeline steps and technical processes to support maintainability and knowledge transfer
- Participate in team delivery rhythms (standups, sprint ceremonies) and contribute to reviews with a learning mindset
Data Engineer:
- Design, build, test, and maintain scalable data pipelines (batch and/or streaming as applicable) with increasing independence
- Integrate data from multiple sources, resolve inconsistencies, and deliver curated datasets for analytics and operational use
- Own data quality for assigned domains by implementing validation checks, reconciliation, and monitoring/alerting patterns
- Build, maintain, and deploy data products for analytics and data science teams on cloud platforms (e.g., AWS, Azure, GCP)
- Optimize performance of pipelines and queries (tuning, partitioning patterns, efficient compute usage)
- Collaborate cross-functionally with analysts, data scientists, and stakeholders to translate requirements into technical designs and delivery plans
- Produce and maintain technical documentation for data flows, data models, and operational procedures
- Contribute to governance and compliance practices (access controls, lineage awareness, controlled data handling) within your scope
Senior Data Engineer:
- Lead the design and build of scalable data pipeline architectures and tools, including patterns for reliability, security, and maintainability
- Drive ETL/ELT and data quality strategy (frameworks, standards, repeatable testing/monitoring approaches) and raise engineering maturity across the team
- Architect solutions in cloud data platforms (e.g., Azure + Databricks, Snowflake) and guide implementation tradeoffs (cost, performance, scalability, governance)
- Design data stores and interactions across storage types (relational, warehouse, lake/lakehouse, and NoSQL where needed) aligned to use cases
- Enable data science / ML readiness by delivering well-modeled, reliable, well-documented datasets and features
- Lead requirements gathering and technical planning; translate ambiguous problem statements into actionable architectures, backlogs, and delivery increments
- Champion data quality and governance standards by developing data quality frameworks, dashboards, and feedback loops that give partners and researchers transparency into data completeness, consistency, and quality
- Own client and stakeholder engagement for your workstream, including organizing/leading meetings, producing clear written outputs, and tracking follow-through
- Mentor and review: provide strong code/design reviews, coach engineers, and help remove technical blockers
Requirements:
- Bachelor's degree from an accredited college/university
- Based on our contractual obligations, candidates must be located within the United States and be U.S. citizens
- Must be able to obtain and maintain a Federal or DoD Public Trust clearance
- Strong communication and collaboration skills, with the ability to work independently in a remote environment
- 1+ years of relevant software engineering/data experience for the Junior Data Engineer role; 3+ years for the Data Engineer role; 8+ years for the Senior Data Engineer role
- Advanced SQL and Python skills, with experience in relational databases and database design
- Experience with data ingestion tools such as AWS Lambda, AWS Database Migration Service (DMS), and SFTP
- Experience building dashboards with data visualization tools (e.g., Tableau, Power BI)
- Experience integrating data from disparate systems and technologies (e.g., IBM mainframe; structured, semi-structured, and unstructured sources)
- Proficiency designing and deploying data solutions on one or more cloud platforms (e.g., AWS, Azure, GCP)
- Hands-on experience with cloud services and REST API integrations
- Proficiency with modern data tools (e.g., Spark/Databricks, Airflow, dbt, Kafka) is a plus
- Experience with distributed data processing tools such as PySpark and AWS Glue
- Databricks and/or Snowflake Data Engineer Associate or Professional certification
- Proficiency with workflow management systems (e.g., Nextflow, Snakemake, Airflow)
- Experience with regulated environments (GxP, 21 CFR Part 11) and data governance