Healthy Together is a fast-growing GovTech & healthcare platform seeking an experienced Senior Data Engineer. The role involves designing, building, and maintaining data pipelines and integrations to support analytics, reporting, and operational workflows while ensuring compliance with data governance standards.
Responsibilities:
- Architect, develop, and operate scalable data pipelines in Python using frameworks such as Apache Airflow, AWS Glue, or similar
- Ingest and transform data from internal sources, microservices, and third-party APIs (REST, streaming, webhooks)
- Design and maintain dimensional and normalized schemas in cloud data warehouses (AWS Redshift, Snowflake, or equivalent)
- Optimize table structures, partitioning, and indexing for performance and cost efficiency
- Build and manage robust, fault-tolerant integrations with external systems (payment gateways, identity providers, data vendors)
- Develop monitoring, retries, and alerting to ensure integration reliability
- Implement data validation, anomaly detection, and reconciliation processes to guarantee accuracy
- Collaborate with Security and Compliance teams to enforce data governance, encryption, and access controls
- Partner with Analytics, ML, and Product teams to translate requirements into data solutions
- Provide self-service data access (views, dashboards) and documentation for stakeholders
- Monitor and tune pipeline and warehouse performance; identify opportunities to reduce AWS spend
- Introduce caching, batching, and parallelism as appropriate for large-scale workloads
- Evaluate and prototype emerging data technologies (Spark, Kafka, dbt, data mesh patterns)
- Leverage AI/ML tools to automate repetitive data tasks and support anomaly detection
Requirements:
- 7+ years in data engineering or analytics engineering roles
- Expert-level Python for ETL scripting, API clients, and automation
- Hands-on experience with AWS data services (S3, Glue, Redshift, EMR, Lambda) and infrastructure-as-code (Terraform or CloudFormation)
- Deep expertise designing relational schemas, writing complex SQL, and building data marts
- Proven experience with Apache Airflow, AWS Glue, or equivalent orchestration tools
- Solid background integrating and transforming data from third-party APIs, streaming platforms, and message queues
- Familiarity with data handling requirements in HIPAA, FedRAMP, and SOC 2 environments
- Strong communication skills; able to partner effectively with cross-functional teams
Preferred Qualifications:
- Experience with Apache Spark, Kafka, Kinesis, or similar
- Proficiency with dbt, Delta Lake, or Iceberg for versioned tables and transformations
- Knowledge of Docker and Kubernetes for data workloads
- Exposure to MLOps frameworks and feature stores
- Prior work on healthcare analytics or government data projects
- Engagement with data-engineering or analytics OSS communities