Huron is a company that helps healthcare organizations drive growth and enhance performance. They are seeking a Data Engineer to build and maintain AI/context data capabilities, focusing on transforming information into reusable components and ensuring operational excellence in their AI context platform.
Responsibilities:
- Build and contribute to the AI context platform
- Implement end-to-end pipelines: ingestion → parsing/chunking → enrichment → embeddings → vector indexing → retrieval/serving
- Build and maintain patterns for incremental refresh, backfills, re-embeddings, deduplication, and lineage across unstructured sources
- Contribute to retrieval quality improvements (query strategies, hybrid search, metadata filtering) in partnership with AI engineers
- Deliver semantic and governed data products
- Implement semantic layers (metrics/entities) that consistently power BI and agent reasoning
- Apply established data contracts and context contracts for AI inputs (schemas, metadata requirements, freshness, citation expectations)
- Ensure datasets and indexes are documented and reusable
- Support reliability and performance across assigned workstreams: monitoring, alerting, runbooks, and incident response
- Contribute to cost and latency optimization across Snowflake and vector infrastructure
- Apply security-by-design patterns: RBAC/ABAC, PII redaction, retention controls, and audit logging
- Follow established guardrails for AI access to enterprise knowledge in coordination with Security/Legal/Compliance
Requirements:
- Bachelor's degree in computer science, engineering, or a related field of study
- 3–6 years of experience in data engineering or data platform roles, with strong hands-on delivery
- Strong SQL and Python (or Scala/Java); solid production engineering habits
- Hands-on experience with Snowflake, including pipeline design, data modeling, and operating at scale in a production environment
- Experience designing and operating cloud data pipelines at scale
- Experience working with unstructured data processing and search/retrieval concepts
- Clear communicator who can work effectively across technical and functional teams
- Hands-on experience with vector search and embeddings (pgvector/Pinecone/Weaviate/OpenSearch/Elastic) and retrieval patterns (semantic retrieval, hybrid search, reranking)
- Experience supporting LLM applications (RAG, agent tool interfaces, evaluation/observability)
- Familiarity with knowledge graphs, semantic modeling, or metrics layers
- Experience in regulated environments and data governance programs
- Exposure to dbt, Iceberg, or other lakehouse/semantic layer tooling alongside Snowflake