RedCloud Consulting is a business and IT consulting company that supports local clients in the Puget Sound area. The company is seeking a Senior Data Engineer to design, build, and operate large-scale data pipelines that enhance advertising systems and support experimentation and measurement.
Responsibilities:
- Design, build, and maintain large-scale data pipelines supporting bidding, targeting, experimentation, and measurement
- Optimize workflows under real-world constraints: sampling limits, narrow data windows, service ceilings, and high-throughput requirements
- Ensure reliability, performance, and cost-efficiency across batch and streaming workloads
- Build scalable solutions that deliver value today while creating a path for evolution over time
- Own pipelines end-to-end, from design to deployment to maintenance
- Diagnose, resolve, and eliminate recurring issues to strengthen system stability
- Reduce duplicated workflows, wasted compute, and operational overhead
- Establish engineering best practices around quality, testing, deployment, and observability
- Partner closely with platform, infrastructure, and other data/ML teams to resolve scaling constraints
- Work directly with Data Science, ML Engineering, and Product to deliver features, unblock analysis, and support experimentation
- Translate ambiguous requirements into well-structured engineering solutions
- Operate with urgency in an environment where priorities shift quickly
- Drive initiatives from zero to one with minimal guidance
- Innovate within constraints and deliver solutions that work in the real world, not just on paper
Requirements:
- Deep expertise in at least one major big data ecosystem (e.g., Spark, Flink, EMR, Hadoop-style systems)
- Strong proficiency with distributed compute frameworks (Spark strongly preferred)
- Experience building and operating pipelines at massive scale, including challenges such as:
  - Streaming ingestion
  - High-volume batch ETL
  - Strict SLAs or constrained processing windows
  - Sampling and stateful processing
- Hands-on experience with cloud-based data infrastructure (AWS or equivalent), including:
  - Object storage (e.g., S3)
  - Data lake table formats (Iceberg/Hudi-style tables)
  - SQL engines (Redshift, Athena, Snowflake, etc.)
  - Streaming frameworks (Kinesis or Kafka-style systems)
- Strong coding skills in Python, Java, and/or Scala
- Experience with infrastructure-as-code frameworks (e.g., Terraform)
- Strong engineering discipline with Git, CI/CD, testing frameworks, and automated deployment processes
- Exposure to advertising, experimentation, or ML-driven systems is highly beneficial, including:
  - Real-time bidding / auction systems
  - Feature pipelines or feature stores
  - Measurement and attribution systems
  - Online/offline ML model pipelines
- High ownership mindset: 'I will figure it out and ship it.'
- Bias for action and comfort moving quickly with incomplete requirements
- Strong ability to operate independently and deliver end-to-end solutions
- Adaptability in dynamic environments with fast-changing priorities
- Curiosity, learning speed, and enjoyment of solving hard scaling problems
Tech Stack:
- Cloud & Compute: AWS, Kubernetes, Lambda, Step Functions
- Data Lake: S3, Iceberg/Hudi-like table formats
- Analytics: Redshift, Athena, EMR/Spark
- Batch & Orchestration: Spark, Airflow-style orchestrators
- ML/Modeling: SageMaker or custom ML platforms
- Languages: Python, Java, Scala, SQL
- Infra & DevOps: Terraform, Git, CI/CD