Lead end-to-end technical design for CloudZero's next-generation data platform, from event ingestion and stream processing through hot/cold storage and the query layer to the API surface
Document architectural decisions, tradeoffs, and migration strategies with the rigor of an RFC-driven process
Shape and drive every layer of the new architecture: event ingestion, stream processing and enrichment, real-time serving, analytical storage, query layer, and API
Design and deliver CloudZero's real-time data pipeline from ingestion through enrichment to serving
Establish SLOs for throughput, latency, and correctness, and build the operational playbooks that make this system trustworthy enough to replace the batch pipelines our entire product currently depends on
Tackle real-time streaming at scale across thousands of customers simultaneously, with fault tolerance, backpressure awareness, and correctness as non-negotiables
Redesign CloudZero's dimensional cost model to support high-cardinality, multi-dimensional cost attribution without runaway materialization costs
Drive incremental, delta-based materialization strategies using modern open table formats, dramatically reducing expensive full-rebuild jobs and unlocking millions in annual infrastructure savings
Assess CloudZero's current query infrastructure, drive in-flight migrations to completion, and lead the evolution of the query engine layer going forward
Own performance optimization across partition pruning, predicate pushdown, and query planning, and set the vision for how the query layer grows as data volumes scale 10x
Evolve CloudZero's proprietary cost attribution engine from a batch-oriented model to one that attributes costs across complex dimensions such as team, feature, and customer within seconds of resource usage
Rethink enrichment, data lineage, and correctness guarantees in a streaming context
Partner with product, infrastructure, and analytics engineering to define a multi-year data platform roadmap
Build consensus across engineering leadership on foundational investments including table formats, streaming frameworks, query engines, and schema management
Participate in architecture reviews, contribute to design patterns and best practices, and mentor senior and staff engineers through code review, pairing, and structured feedback
Make everyone around you better, not by directing, but by raising the collective craft
Requirements
10+ years in data engineering with a clear trajectory of staff- or principal-level architecture work
Built and operated large-scale data platforms serving tens of millions of events per day in production
Deep experience with streaming systems such as Kafka, Kinesis, Flink, or Spark Streaming at real production throughput
Strong hands-on fluency with modern open table formats such as Apache Iceberg, Delta Lake, or Hudi, including compaction, partitioning strategy, and time-travel queries
Designed hot/cold storage architectures with explicit latency SLOs per tier
Proven ability to drive a data platform end to end, not just a single layer