and reverse-ETL patterns to deliver data to relevant stakeholders.
Develop scalable patterns in the transformation layer to ensure repeatable integrations with BI tools across various business verticals.
Expand and maintain the constantly evolving components of the Alpaca Data Lakehouse architecture.
Collaborate closely with sales, marketing, product, and operations teams to address key data flow needs.
Operate the platform and resolve production issues promptly.
Requirements
7+ years of experience in data engineering, including 2+ years of building scalable, low-latency data platforms capable of handling >100M events/day.
Proficiency in at least one programming language, with strong working knowledge of Python and SQL.
Experience with cloud-native technologies like Docker, Kubernetes, and Helm.
Strong hands-on experience with relational database systems and with object-storage-backed open table formats such as Apache Iceberg.
Strong hands-on experience with Google Cloud Platform and its various data-related services (Composer, Dataproc, Datastream, etc.).
Experience in building scalable transformation layers, preferably through formalized SQL models (e.g., dbt).
Ability to work in a fast-paced environment and adapt solutions to changing business needs.
Experience with ETL orchestrators and frameworks such as Apache Airflow and Airbyte.
Production experience with streaming systems like Kafka.
Exposure to infrastructure, DevOps, and Infrastructure as Code (IaC) tooling such as Terraform.
Deep knowledge of distributed systems, storage, transactions, and query processing using open-source distributed query engines such as Trino (formerly PrestoSQL).
Tech Stack
Airflow
Distributed Systems
Docker
ETL
Google Cloud Platform
Kafka
Kubernetes
Python
SQL
Terraform
Benefits
Competitive Salary & Stock Options
Health Benefits
New Hire Home-Office Setup: one-time USD 500
Monthly Stipend: USD 150 via a Brex Card