Kunai builds full-stack technology solutions for banks and financial services companies. They are seeking a Senior Data Engineer to lead the migration of data workloads from GCP to AWS and to design the scalable data pipelines that support their data-dependent teams.
Responsibilities:
- Own the technical strategy and execution of migrating large-scale data workloads from GCP to AWS, ensuring continuity, data integrity, and minimal disruption
- Design migration playbooks and serve as the go-to expert for decisions across compute, storage, and orchestration layers during the transition
- Architect and implement scalable batch and streaming data pipelines using Apache Spark, Delta Lake, and the medallion architecture (a brief sketch follows this list)
- Establish standards for pipeline design, data quality, and observability that the broader engineering organization can build on
- Take accountability for the reliability, performance, and cost-efficiency of production ETL jobs running on AWS (EMR, Glue) against terabyte-scale datasets
- Proactively identify and address bottlenecks, technical debt, and opportunities to improve throughput and resilience
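To make the pipeline responsibilities above concrete, here is a minimal Scala sketch of a Bronze / Silver / Gold batch job on Spark with Delta Lake. It is an illustration only, not part of the role description: the S3 paths, the transactions schema, and columns such as transaction_id and account_id are hypothetical, and a production job would add schema enforcement, data-quality checks, and incremental processing.

```scala
// Minimal medallion-style batch job: Bronze (raw) -> Silver (clean) -> Gold (aggregates).
// Assumes the delta-spark dependency is on the classpath; all paths/columns are hypothetical.
import org.apache.spark.sql.{SparkSession, functions => F}

object MedallionBatchJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("medallion-batch")
      .getOrCreate()

    // Bronze: land raw source data as-is, tagging each record with ingestion time.
    val bronze = spark.read.json("s3://example-bucket/raw/transactions/")
      .withColumn("_ingested_at", F.current_timestamp())
    bronze.write.format("delta").mode("append")
      .save("s3://example-bucket/bronze/transactions")

    // Silver: deduplicate on the business key and drop malformed records.
    val silver = spark.read.format("delta")
      .load("s3://example-bucket/bronze/transactions")
      .dropDuplicates("transaction_id")
      .filter(F.col("amount").isNotNull)
    silver.write.format("delta").mode("overwrite")
      .save("s3://example-bucket/silver/transactions")

    // Gold: business-level aggregates for downstream consumers.
    val gold = silver.groupBy("account_id")
      .agg(F.sum("amount").as("total_amount"), F.count(F.lit(1)).as("txn_count"))
    gold.write.format("delta").mode("overwrite")
      .save("s3://example-bucket/gold/account_totals")

    spark.stop()
  }
}
```

The layering is the point of the pattern: Bronze keeps raw data replayable, Silver keeps validated data queryable, and Gold keeps business aggregates cheap to serve.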
Requirements:
- Strong, hands-on Scala expertise with solid Python proficiency; you're comfortable switching between both and know when each is the right tool
- Deep experience with Apache Spark for both streaming and batch data processing at scale (see the streaming sketch after this list)
- Proven track record running production ETL workloads on AWS (EMR, Glue) against terabytes of data
- Experience designing and operating data architectures using Delta Lake and the medallion (Bronze / Silver / Gold) pattern
- 8+ years of data engineering experience, with a track record of owning critical infrastructure end-to-end
- Bachelor's degree required; in lieu of a degree, three years of specialized training and/or progressively responsible work experience in technology for each missing year of college, in addition to the minimum years of experience required for the role
- Familiarity with GCP data services and/or hands-on experience migrating data workloads from GCP to AWS
- Experience with related frameworks and platforms such as Apache Flink, Apache Beam, Airflow, or Databricks
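On the streaming side, here is a minimal Structured Streaming sketch in the same vein, reading from Kafka and appending to a Bronze Delta table with checkpointing. Again purely illustrative: the broker address, topic name, and paths are hypothetical, and the job assumes the spark-sql-kafka and delta-spark dependencies are available.

```scala
// Minimal streaming ingest: Kafka source -> Bronze Delta table.
// Broker, topic, and S3 paths are hypothetical placeholders.
import org.apache.spark.sql.SparkSession

object StreamingIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("streaming-ingest")
      .getOrCreate()

    // Read the raw event stream from a Kafka topic.
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "transactions")
      .load()
      .selectExpr("CAST(value AS STRING) AS payload", "timestamp")

    // Append to a Bronze Delta table; the checkpoint plus Delta's
    // transactional commits make restarts idempotent.
    val query = events.writeStream
      .format("delta")
      .option("checkpointLocation", "s3://example-bucket/checkpoints/transactions")
      .outputMode("append")
      .start("s3://example-bucket/bronze/transactions_stream")

    query.awaitTermination()
  }
}
```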