Architect, implement, and deliver next-generation features for Cloudera’s Data Engineering Experience, operating at massive scale across thousands of production nodes.
Be a core contributor to Apache Spark, directly shaping the future of distributed data processing in the open-source community.
Develop high-performance features using Scala, Java, and Python on modern data platforms.
Gain and apply expert-level knowledge of core distributed data processing concepts, including SQL planners and optimizers; data layout and modern file and table formats such as Apache Parquet and Apache Iceberg; and fault tolerance and resilience in large-scale distributed systems.
Develop a deep technical understanding of components across the Cloudera Data Engineering Experience, with a focus on Iceberg and Spark, applying this knowledge to your daily tasks.
Work hands-on with massive distributed systems, scaling from hundreds to thousands of nodes in live production clusters.
Conduct thorough root cause analysis, debug complex system-level deployment issues, and resolve failures to maintain high system quality.
Improve internal infrastructure and tooling to streamline development, testing, and deployment processes.
Work closely with a high-impact, distributed team and stakeholders to drive product vision and delivery.
Requirements
5-7+ years of experience in professional software development
Strong proficiency in Java, Scala, or another JVM-based language
Solid experience in the design and development of distributed systems
Passion for clean coding, attention to detail, and a focus on software quality and maintainability
Strong oral and written communication skills for effective collaboration across a distributed team
Demonstrated ability to research, solve problems, and work independently
An open-minded approach with a desire to learn new technologies and an unwavering passion for building exceptional products