Global Healthcare Exchange (GHX) is a healthcare business and data automation company that enables better patient care and greater industry savings. The Data Engineer III (Enterprise BI) will design, build, and support data solutions for enterprise reporting and analytics, collaborating with stakeholders across the business to deliver reliable data pipelines and actionable insights.
Responsibilities:
- Design and build ETL/ELT pipelines and dimensional data models using dbt, Airflow, Python, PySpark, and AWS services (S3, Glue, Lambda); a minimal orchestration sketch follows this list
- Create executive dashboards and perform complex SQL analysis to drive strategic decisions (Tableau, Sigma, SAP BO)
- Optimize SQL queries, data structures, and warehouse resources for performance and cost efficiency at scale (Snowflake, Redshift)
- Partner with stakeholders to translate business requirements into self-service analytics capabilities
- Implement infrastructure-as-code (CloudFormation/CDK) and contribute to CI/CD automation
- Troubleshoot production issues across data pipelines, queries, and APIs; perform root cause analysis
- Provide technical mentorship, establish development standards, and drive data engineering best practices
- Document solutions and communicate designs to cross-functional teams in Confluence/JIRA
- Apply data governance, security, and monitoring/alerting best practices
- Leverage AI-assisted development tools (GitHub Copilot, Claude, etc.) to increase productivity and accelerate delivery
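To make the pipeline responsibilities above concrete, here is a minimal sketch of the kind of daily ELT job this role owns, written against Airflow's TaskFlow API. The bucket, schema, and table names are hypothetical, and the extract/load bodies are stubs, not GHX's actual pipeline.

```python
# Illustrative daily ELT DAG (hypothetical names throughout).
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def orders_elt():
    @task
    def extract_to_s3(ds=None):
        # Land the day's source extract in S3 (stubbed); `ds` is the
        # logical date Airflow injects for the run being processed.
        return f"s3://example-bucket/raw/orders/{ds}/"

    @task
    def load_to_warehouse(s3_prefix: str):
        # A real task would COPY these files into a staging table in
        # Snowflake/Redshift, with dbt models running downstream.
        print(f"Loading {s3_prefix} into staging.orders")

    load_to_warehouse(extract_to_s3())


orders_elt()
```

In production, the same DAG would carry retries, SLAs, and failure alerting in line with the monitoring responsibilities above.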
Requirements:
- Bachelor's degree in Computer Science, Data Science, Mathematics, Statistics, or related quantitative field
- 6+ years of data engineering experience building BI applications and data platforms
- 5+ years of ETL/ELT development in cloud data warehouses (AWS, Snowflake, Redshift, or similar)
- 4+ years creating dashboards and visualizations in enterprise BI tools (Tableau, Sigma, SAP BO, Power BI, or Looker)
- Proven track record delivering production data solutions in Agile environments (Scrum/Kanban)
- Expert-level SQL and Python proficiency
- Proven experience designing dimensional data models (star/snowflake schema) optimized for analytics (see the PySpark sketch after this list)
- Demonstrated SQL optimization and performance tuning in large-scale production environments
- Strong business acumen with ability to translate technical solutions into business value
- Excellent communication skills for presenting to executive and non-technical audiences
- Deep analytical and troubleshooting skills with root cause analysis capabilities
- Must be located in the United States (remote position)
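As a concrete reference point for the dimensional-modeling requirement, here is an illustrative PySpark sketch that derives a customer dimension and an orders fact table from a raw feed. Every table and column name is invented for the example.

```python
# Illustrative star-schema build in PySpark (hypothetical names).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_star_schema").getOrCreate()

raw = spark.table("staging.orders")  # assumed raw/staging input

# Dimension: one row per customer, with a surrogate key for fact joins.
# (monotonically_increasing_id is not stable across runs; durable keys
# would come from a hash or a key-management table in practice.)
dim_customer = (
    raw.select("customer_id", "customer_name", "region")
       .dropDuplicates(["customer_id"])
       .withColumn("customer_key", F.monotonically_increasing_id())
)

# Fact: one row per order line, carrying the surrogate key plus
# additive measures that BI tools can aggregate freely.
fact_orders = (
    raw.join(dim_customer.select("customer_id", "customer_key"), "customer_id")
       .select("order_id", "customer_key", "order_date", "quantity", "amount")
)

dim_customer.write.mode("overwrite").saveAsTable("mart.dim_customer")
fact_orders.write.mode("overwrite").saveAsTable("mart.fact_orders")
```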
Preferred qualifications:
- Advanced Snowflake experience (streams, tasks, dynamic tables, Snowpipe, Time Travel); a stream-and-task sketch follows at the end of this list
- Hands-on experience with dbt for analytics engineering and data quality testing
- Apache Airflow (or Prefect, Dagster) for workflow orchestration
- Deep AWS experience (Glue, Lambda, Step Functions, SNS/SQS, API Gateway, EventBridge)
- PySpark for distributed data processing and large-scale transformations
- Streaming data platforms (Kafka, AWS Kinesis, Spark Streaming) for real-time analytics
- Alteryx Designer Cloud (formerly Trifacta)
- Infrastructure-as-code using CloudFormation or CDK
- Modern Angular (17+) for data-driven web applications; AngularJS modernization experience a plus
- Version control (Git), CI/CD workflows (GitHub Actions, GitLab CI), and containerization (Docker)
- Python data science libraries (pandas, NumPy, SciPy) and statistical analysis
- AI-assisted development tools (Claude Code, GitHub Copilot, OpenAI Codex) and LLM integration
- Data governance frameworks and cataloging tools (Alation, Collibra) with metadata management experience
- Data quality and observability tools (Great Expectations, dbt tests, Soda, Monte Carlo, Datadog)
- Modern BI semantic layers (Cube, Metriql, dbt Semantic Layer) or headless BI architectures
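For the Snowflake stream/task item above, here is a hedged sketch of simple change-data capture: a stream tracks changes on a staging table and a scheduled task merges them forward, issued through the snowflake-connector-python package. The account, credentials, and object names are placeholders only.

```python
# Illustrative Snowflake stream + task setup (placeholder names/credentials).
import snowflake.connector

conn = snowflake.connector.connect(
    account="example_account", user="example_user", password="...",
    warehouse="ANALYTICS_WH", database="ANALYTICS", schema="STAGING",
)
cur = conn.cursor()

# Stream records inserts/updates/deletes on the staging table.
cur.execute("CREATE OR REPLACE STREAM orders_stream ON TABLE orders")

# Task wakes every five minutes but only runs when the stream has rows;
# omitting a WAREHOUSE makes it a serverless task.
cur.execute("""
    CREATE OR REPLACE TASK merge_orders
      SCHEDULE = '5 MINUTE'
      WHEN SYSTEM$STREAM_HAS_DATA('ORDERS_STREAM')
    AS
      INSERT INTO MART.ORDERS_CURRENT (ORDER_ID, AMOUNT, ORDER_DATE)
      SELECT ORDER_ID, AMOUNT, ORDER_DATE
      FROM orders_stream
      WHERE METADATA$ACTION = 'INSERT'
""")
cur.execute("ALTER TASK merge_orders RESUME")  # tasks start suspended
```

Consuming the stream inside the task's DML advances its offset, so each run processes only new changes.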