Cloudera is a company that empowers people to transform complex data into clear and actionable insights. They are seeking a Staff Software Engineer with deep expertise in distributed systems to join the Apache Spark Team, where the engineer will architect and deliver next-generation features for Cloudera’s Data Engineering Experience at a massive scale.
Responsibilities:
- Pioneer Scalable Solutions: Architect, implement, and deliver next-generation features for Cloudera’s Data Engineering Experience, operating at a massive scale on thousands of production nodes
- Drive Open-Source Innovation: Be a core contributor to Apache Spark, directly shaping the future of distributed data processing in the open-source community
- Build with Modern Stacks: Develop high-performance features using Scala, Java, and Python on modern data platforms
- Deepen Technical Mastery: Gain and apply expert-level knowledge in core distributed data processing concepts, including:
  - SQL Planners and Optimizers
  - Data layout and modern table formats like Apache Parquet and Iceberg
  - Fault tolerance and resilience in large-scale distributed systems
- Own the Technology Stack: Develop a deep technical understanding of components across the Cloudera Data Engineering Experience, with a focus on Iceberg and Spark, applying this knowledge to your daily tasks
- Conquer Large-Scale Challenges: Work hands-on with massive distributed systems, scaling from hundreds to thousands of nodes in live production clusters
- Ensure System Integrity: Conduct thorough root cause analysis, debug complex system-level deployment issues, and resolve failures to maintain high system quality
- Enhance Engineering Velocity: Improve internal infrastructure and tooling to streamline development, testing, and deployment processes
- Collaborate and Influence: Work closely with a high-impact, distributed team and stakeholders to drive product vision and delivery
Requirements:
- 5-7+ years of experience in professional software development
- Proven experience leading technical initiatives and delivering complex product enhancements from concept to production
- Strong proficiency in Java, Scala, or another JVM-based language
- Solid experience in the design and development of distributed systems
- Passion for clean coding, attention to detail, and a focus on software quality and maintainability
- Strong oral and written communication skills for effective collaboration across a distributed team
- Demonstrated ability to research, problem-solve, and operate independently without constant supervision
- An open-minded approach with a desire to learn new technologies and an unwavering passion for building exceptional products
- Experience using or developing Apache Spark, Apache Iceberg, or related technologies
- Deep experience with large-scale, distributed systems design and development, including a strong understanding of scaling, performance optimization, and scheduling
- Experience with SQL Planners and Optimizers
- Prior experience as a contributor to open-source projects