Dr. Berg Nutritionals is one of the largest health education and supplement companies in the world, built around Dr. Eric Berg's YouTube channel. The Senior Data Engineer will be responsible for building and maintaining ingestion pipelines to ensure reliable and accurate data flow into the company's data warehouse.
Responsibilities:
- Partner with the Head of Data (or the CIO directly) to complete a technical audit of every existing data source — what's flowing, what's broken, what's missing
- Replace our current manual CSV-based Klaviyo ingestion with a direct API pipeline
- Stand up the first production pipelines for Amazon SP-API, Shopify, and NetSuite, with proper monitoring and alerting
- Establish our infrastructure-as-code practice (Bicep or Terraform) and CI/CD pipeline for data engineering changes
- Document everything — pipeline architecture, runbooks, on-call procedures
- Build and maintain ingestion pipelines. You will own the end-to-end pipelines from source systems into our warehouse. This includes Amazon Selling Partner API, Shopify Admin API, NetSuite (SuiteAnalytics Connect), Klaviyo, Recharge, YouTube Data and Analytics APIs, GA4 (via BigQuery export), Google Ads, Meta Ads, Triple Whale, and approximately 15 additional sources across our Layer 1–5 data model. For each pipeline you will design the ingestion approach, build it with proper error handling and idempotency, establish incremental-load patterns where appropriate, and monitor it in production.
- Own orchestration and scheduling. You decide what runs when, in what order, and with what dependencies. Financial data needs to be fresh before finance's morning reconciliation. YouTube analytics need to respect daily API quotas across 7,000+ videos. Klaviyo events need to stream continuously. This is your call to make — and your responsibility to get right.
- Monitoring, alerting, and on-call. Every pipeline you build needs health checks: row counts within expected ranges, schema validation, freshness SLAs, and data quality gates. You will configure Azure Monitor alerts, decide what pages someone overnight versus what can wait, and lead post-incident reviews. You will take part in a one-in-four weekly on-call rotation once the team is fully staffed.
- Performance and cost optimization. Our data volumes are substantial — YouTube analytics alone is 7,000+ videos × daily metrics × multiple channels. You will own partitioning strategy, query tuning, incremental processing patterns, and monthly cost reviews. At our scale, this work directly saves tens of thousands of dollars per year in warehouse compute.
- Source system and vendor API management. When Shopify deprecates an endpoint, when Amazon changes reporting structure, when NetSuite releases a new ODBC driver — you're the person who reads the release notes, tests the change, and adapts the pipelines. You will own API keys, service accounts, rate-limit tracking, and vendor support escalations for data-source APIs.
- Enforce data contracts. You define and enforce the contracts between source systems and downstream consumers — what fields exist, what's never null, what ranges are valid. When a source system violates its contract, your pipelines stop and alert rather than passing bad data downstream to our AI analyzer. This is what structurally prevents hallucinations.
- Infrastructure-as-code and CI/CD. Pipelines are defined as code (Bicep, Terraform, or ARM templates) and deployed through peer review. You will own this practice, along with the dev/staging/production environment separation that lets us move fast without breaking the weekly brief.
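To give a flavor of the incremental-load and idempotency patterns these responsibilities call for, here is a minimal sketch: a watermark-based load where re-running the same batch cannot duplicate rows. The table names, the `orders`/`watermarks` schema, and the timestamps are illustrative assumptions, not a reference to our actual stack; SQLite stands in for the warehouse.

```python
import sqlite3

# Hypothetical sketch: an idempotent, watermark-based incremental load.
# Re-running a batch must not duplicate rows, so we upsert on the source's
# primary key and only advance the watermark after the load commits.

def ensure_schema(conn: sqlite3.Connection) -> None:
    conn.execute("""CREATE TABLE IF NOT EXISTS orders (
        order_id TEXT PRIMARY KEY,
        updated_at TEXT NOT NULL,
        total_cents INTEGER NOT NULL)""")
    conn.execute("""CREATE TABLE IF NOT EXISTS watermarks (
        source TEXT PRIMARY KEY,
        high_water TEXT NOT NULL)""")

def load_batch(conn: sqlite3.Connection, source: str, rows: list[dict]) -> int:
    """Upsert rows newer than the stored watermark, then advance it."""
    cur = conn.execute(
        "SELECT high_water FROM watermarks WHERE source = ?", (source,))
    found = cur.fetchone()
    high_water = found[0] if found else ""
    # Strictly-newer filter: same-timestamp re-runs are no-ops by design.
    fresh = [r for r in rows if r["updated_at"] > high_water]
    conn.executemany(
        "INSERT INTO orders VALUES (:order_id, :updated_at, :total_cents) "
        "ON CONFLICT(order_id) DO UPDATE SET "
        "updated_at = excluded.updated_at, total_cents = excluded.total_cents",
        fresh)
    if fresh:
        new_mark = max(r["updated_at"] for r in fresh)
        conn.execute(
            "INSERT INTO watermarks VALUES (?, ?) "
            "ON CONFLICT(source) DO UPDATE SET high_water = excluded.high_water",
            (source, new_mark))
    conn.commit()
    return len(fresh)
```

The idempotency property is the point: loading the same batch twice inserts once and then loads zero new rows, so a retried or replayed run is safe.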
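The data-contract bullet above (stop and alert rather than pass bad data downstream) can be sketched roughly like this; the field names, rule shapes, and `ContractViolation` type are hypothetical illustrations of the idea, not our actual contract framework.

```python
from dataclasses import dataclass

# Hypothetical sketch of a data contract: declared fields, null rules,
# and valid ranges. A violating batch raises instead of flowing downstream.

class ContractViolation(Exception):
    pass

@dataclass(frozen=True)
class FieldRule:
    name: str
    nullable: bool = False
    min_value: float | None = None
    max_value: float | None = None

def enforce_contract(rows: list[dict], rules: list[FieldRule]) -> list[dict]:
    for i, row in enumerate(rows):
        for rule in rules:
            if rule.name not in row:
                raise ContractViolation(f"row {i}: missing field {rule.name!r}")
            value = row[rule.name]
            if value is None:
                if not rule.nullable:
                    raise ContractViolation(f"row {i}: {rule.name!r} must not be null")
                continue
            if rule.min_value is not None and value < rule.min_value:
                raise ContractViolation(
                    f"row {i}: {rule.name!r}={value} below {rule.min_value}")
            if rule.max_value is not None and value > rule.max_value:
                raise ContractViolation(
                    f"row {i}: {rule.name!r}={value} above {rule.max_value}")
    return rows  # only a fully clean batch reaches downstream consumers
```

Failing loudly at the ingestion boundary is the design choice: a halted pipeline plus an alert is recoverable, while silently loaded bad rows are not.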
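Likewise, the health checks described in the monitoring bullet (row counts within expected ranges, freshness SLAs) reduce to a simple shape; this is a sketch under assumed names and thresholds, with alert routing (e.g. to Azure Monitor) left out.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical sketch of a post-run health check: a pipeline run is healthy
# only if its row count sits inside an expected band and its newest record
# is within the freshness SLA. Thresholds here are illustrative.

def check_health(row_count: int,
                 latest_record: datetime,
                 expected_range: tuple[int, int],
                 freshness_sla: timedelta,
                 now: datetime | None = None) -> list[str]:
    """Return failure messages; an empty list means the run is healthy."""
    now = now or datetime.now(timezone.utc)
    failures: list[str] = []
    low, high = expected_range
    if not (low <= row_count <= high):
        failures.append(f"row count {row_count} outside expected [{low}, {high}]")
    if now - latest_record > freshness_sla:
        failures.append(f"stale data: newest record at {latest_record.isoformat()}")
    return failures
```

In practice each failure message would feed an alert rule, which is where the "what pages someone overnight versus what can wait" decision gets encoded.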
Requirements:
- 5–8+ years of professional data engineering experience, with at least 2–3 years working primarily in Azure (Data Factory, Synapse, Fabric, or comparable)
- Strong SQL — not just query-writing, but query tuning, execution plan analysis, and indexing strategy
- Production-level proficiency in C# and/or Python for custom connector work
- Demonstrated experience building pipelines against messy real-world APIs — ideally Amazon SP-API, NetSuite, or similarly difficult commerce/ERP sources. This is non-negotiable. Experience with only 'clean' SaaS APIs like Stripe or Salesforce is not a substitute.
- Infrastructure-as-code experience using Bicep, Terraform, or ARM templates
- Real on-call experience — you know what good runbooks and alerting look like because you've been paged at 2am and you know what made the difference between a five-minute fix and a five-hour fire
- Strong written communication — because much of your work is documenting decisions and runbooks that others will rely on
- Direct experience with dbt or a comparable transformation framework
- Experience with Microsoft Fabric specifically, or a strong point of view on Fabric vs. Synapse vs. Snowflake
- Familiarity with Microsoft Agent Framework (Semantic Kernel, AutoGen) or comparable agent orchestration systems
- E-commerce or direct-to-consumer industry experience, particularly at multi-channel scale
- Experience with vector databases (Azure AI Search, pgvector, Pinecone) for AI-retrieval use cases
- Prior experience as the first or founding data engineer at a growing company