and reverse-ETL patterns to deliver data to relevant stakeholders.
Develop scalable patterns in the transformation layer to ensure repeatable integrations with BI tools across various business verticals.
Expand and maintain the constantly evolving components of the Alpaca Data Lakehouse architecture.
Collaborate closely with sales, marketing, product, and operations teams to address key data flow needs.
Operate the platform and resolve production issues promptly.
Requirements
7+ years of experience in data engineering, including 2+ years of building scalable, low-latency data platforms capable of handling >100M events/day.
Proficiency in at least one programming language, with strong working knowledge of Python and SQL.
Experience with cloud-native technologies like Docker, Kubernetes, and Helm.
Strong hands-on experience with relational database systems and with object-storage-backed open table formats such as Apache Iceberg.
Strong hands-on experience with Google Cloud Platform and its various data-related services (Composer, Dataproc, Datastream, etc.).
Experience in building scalable transformation layers, preferably through formalized SQL models (e.g., dbt).
Ability to work in a fast-paced environment and adapt solutions to changing business needs.
Experience with ETL orchestrators and frameworks such as Apache Airflow and Airbyte.
Production experience with streaming systems like Kafka.
Exposure to infrastructure, DevOps, and Infrastructure as Code (IaC) tooling such as Terraform.
Deep knowledge of distributed systems, storage, transactions, and query processing using open-source distributed query engines such as Trino (formerly PrestoSQL).
Tech Stack
Airflow
Distributed Systems
Docker
ETL
Google Cloud Platform
Kafka
Kubernetes
Python
SQL
Terraform
Benefits
Competitive Salary & Stock Options
Health Benefits
New Hire Home-Office Setup: one-time USD 500
Monthly Stipend: USD 150 via a Brex Card