Support the design and delivery of data ingestion pipelines and infrastructure
Assist in the successful migration of legacy data lifecycle management to the new platform without disrupting existing data consumers
Establish and maintain SLIs and SLOs for the new ingestion and data platform, with dashboards and alerting to track performance
Build a flexible data storage layer supporting a variety of use cases, e.g., transactional, analytic, and machine learning workloads
Implement comprehensive monitoring, observability, and incident response practices for all event data pipelines and services
Collaborate with product engineering, analytics, and machine learning teams to define contracts, functional requirements, and standards
Design and implement a semantic metadata layer that classifies and labels data assets across both products, enabling consistent data discovery, exposure policies, and identification of cross-product data reuse opportunities
Architect and deliver a multi-tenant data model that supports secure data sharing and isolation across clients, with controls designed to meet regulatory and government compliance requirements
Requirements
Deep understanding of data governance principles and data lifecycle management, including data quality, lineage, retention, and access control
Strong familiarity with database backend technologies (OLAP vs. OLTP) and when to apply each, with hands-on experience using data warehouse technologies (Databricks, ClickHouse, Redshift, etc.) at production scale
Strong SQL skills, with the ability to write and optimize queries
Skill at building complex systems, identifying core primitives, and applying them to meet changing business needs and future roadmap requirements
Experience designing and operating data systems on a major cloud provider (AWS or GCP) and working in a multi-cloud deployment environment
Proficiency with containerization and orchestration technologies, including Docker and Kubernetes
Proven ability to integrate systems, connecting legacy platforms with modern architectures through well-designed interfaces and migration strategies
Effective communicator who can facilitate system design discussions, document architectural decisions, and work across team boundaries
(Nice to have) Familiarity with Elixir for building concurrent, fault-tolerant data services
(Nice to have) Experience with data pipeline and streaming technologies such as Airflow, Kafka, Apache Flink, or similar
Experience with Infrastructure-as-Code (IaC) tools such as Terraform, CloudFormation, or similar
Tech Stack
Airflow
Amazon Redshift
AWS
Docker
Elixir
Google Cloud Platform
Kafka
Kubernetes
SQL
Terraform
Benefits
Full range of medical, financial, and/or other benefits