Design and implement scalable, high-performance data pipelines to ingest and transform data from a variety of sources, ensuring reliability, observability, and maintainability.
Build and maintain APIs that enable flexible, secure, and tenant-aware data integrations with external systems.
Work with event-driven and batch processing architectures, ensuring data freshness and consistency at scale.
Drive clean API design and integration patterns that support both real-time and batch ingestion while handling diverse authentication mechanisms (OAuth, API keys, etc.).
Implement observability, monitoring, and alerting to track data freshness, failures, and performance issues, ensuring transparency and reliability.
Optimize data flows and transformations, balancing cost, efficiency, and rapid development cycles in a cloud-native environment.
Collaborate with data engineering, infrastructure, and product teams to create an integration platform that is flexible, extensible, and makes onboarding new data sources easy.
Requirements
5+ years of experience in data engineering, software engineering, or integration engineering, with a focus on ETL, APIs, and data pipeline orchestration.
Strong proficiency in Python.
Experience with API-based ETL, including REST, GraphQL, and webhook-based sources.
Experience implementing authentication flows (e.g., OAuth, API keys).
Proficiency in SQL and BigQuery.
Experience with orchestration frameworks (e.g., Airflow) to manage and monitor complex data workflows.
Familiarity with containerization (Docker, Kubernetes) to deploy and scale workloads.
Ability to drive rapid development while ensuring maintainability, balancing short-term delivery needs with long-term platform stability.
Tech Stack
Airflow
BigQuery
Cloud
Docker
ETL
GraphQL
Kubernetes
Python
SQL
Benefits
Equity package
Comprehensive healthcare benefits (medical, dental, and vision)