Cloudera is a leading data partner for top companies, empowering them to transform complex data into actionable insights. The Staff Software Engineer, Airflow, will lead the technical vision and architecture of the Apache Airflow-based workflow orchestration platform, solving complex technical problems and collaborating with cross-functional teams to enhance the data platform.
Responsibilities:
- Drive the multi-quarter technical roadmap and architecture for the Airflow platform, ensuring it is secure, highly scalable, reliable, and cost-efficient for enterprise-grade workloads
- Design and implement solutions for the most challenging technical issues, such as extreme scale, multi-tenancy isolation, complex scheduling, and hybrid/multi-cloud deployment models
- Collaborate closely with product management, principal engineers, and other platform teams (e.g., Spark, Kubernetes) to define and deliver core orchestration capabilities that influence the entire data platform
- Define and champion best practices, performance optimization, and quality standards (observability, testing, and fault tolerance) for the Airflow service and its integrations
- Mentor senior and junior engineers on complex technical design, best practices, and execution, elevating the overall technical capacity of the team and organization
- Maintain significant contributions and influence within the Apache Airflow open-source community, aligning the project’s roadmap with product strategy
Requirements:
- Bachelor's degree in Computer Science or equivalent, and 6+ years of relevant experience
- Deep, hands-on knowledge of Apache Airflow internals (scheduler, executor, serialization, REST APIs) and complex DAG authoring/optimization
- Mastery of Python, some Java experience, and extensive experience with core data platform technologies and cloud-native deployments (e.g., Kubernetes, Cloud Composer, AWS/GCP/Azure)
- Demonstrated ability to drive design and architectural decisions with a focus on non-functional requirements (security, performance, high availability, fault tolerance)
- Proven ability to lead and drive technical projects across multiple teams without direct reporting authority
- Experience defining the architecture for multi-tenant, service-oriented data platforms
- Significant contributions to Apache Airflow or related open-source projects
- Background in performance tuning and profiling large-scale Python and distributed applications
- Familiarity with data governance and security frameworks (e.g., Ranger, Kerberos) and their integration with workflow orchestration