Guidehouse is a consulting firm focused on providing innovative solutions to complex challenges. The firm is seeking a Data Engineer to design and maintain data pipelines, optimize data architectures, and ensure data quality for analytics and reporting.
Responsibilities:
- Design, develop, and maintain robust data pipelines and ETL/ELT processes to ingest, transform, and load data from diverse structured and unstructured sources
- Build and optimize data models and data architectures to support analytics, reporting, and operational use cases
- Implement and maintain CI/CD pipelines for data engineering workflows, including data pipelines and scheduled jobs, using version control and automation tools
- Develop and deploy cloud-based data solutions, leveraging services such as object storage, serverless compute, and messaging/queueing services
- Collaborate with database administrators, analysts, and application teams to integrate data sources, design schemas, and support downstream data consumers
- Ensure data quality, integrity, and reliability through validation, monitoring, logging, and alerting
- Support data migration, integration, and modernization initiatives, including legacy system upgrades and cloud adoption efforts
- Troubleshoot and resolve issues in development and production environments to maintain stable and reliable data pipelines
- Document data flows, pipelines, and technical solutions to support knowledge sharing and compliance requirements
- Stay current with emerging tools, technologies, and best practices in data engineering and cloud platforms
Requirements:
- US Citizenship or a Green Card is required
- A Bachelor's degree is required
- A minimum of 4 years of experience in data engineering, software development, or a closely related role is required
- Proficiency in one or more programming languages commonly used in data engineering (e.g., Python, SQL, R, PySpark, or Java)
- Hands-on experience designing and building data pipelines and ETL/ELT processes leveraging commonly used ETL/ELT tools (e.g., SSIS, Databricks, Azure Data Factory)
- Strong knowledge of relational database design and of data warehouse and data lake modeling using star and snowflake schemas
- Experience working with relational and/or NoSQL databases (e.g., Oracle, Postgres, SQL Server, MongoDB)
- Experience working in a cloud environment (AWS or Azure) supporting data solutions
- Familiarity with version control and CI/CD practices for data workflows (e.g., Git, Jenkins)
- Exposure to containerization technologies (e.g., Docker; Kubernetes is a plus)
- Experience using monitoring and logging tools to support data pipeline reliability (e.g., CloudWatch, Splunk, Kibana, Elasticsearch)
- Ability to work effectively in an Agile development environment
- Strong analytical and troubleshooting skills, and the ability to communicate technical concepts clearly to clients, engineers, and business stakeholders
- Ability to work independently in a fast-paced, client-facing environment
- AWS, Azure, Databricks, or other data engineering–related certifications
- Experience with data visualization or analytics tools (e.g., Kibana, Tableau, Power BI)
- Familiarity with security, compliance, and data governance best practices in healthcare
- Exposure to microservices-based architectures or AI/ML-enabled data pipelines
- Prior consulting experience