Cloudera is a leading company in data management and analytics, empowering organizations to transform complex data into actionable insights. As a Staff Software Engineer specializing in Airflow, you will lead the architecture and evolution of the workflow orchestration platform, tackling complex technical challenges and collaborating with cross-functional teams to enhance the data platform's capabilities.
Responsibilities:
- Drive the multi-quarter technical roadmap and architecture for the Airflow platform, ensuring it is secure, highly scalable, reliable, and cost-efficient for enterprise-grade workloads
- Design and implement solutions for the most challenging technical issues, such as extreme scale, multi-tenancy isolation, complex scheduling, and hybrid/multi-cloud deployment models
- Collaborate closely with product management, principal engineers, and other platform teams (e.g., Spark, Kubernetes) to define and deliver core orchestration capabilities that influence the entire data platform
- Define and champion best practices, performance optimization, and quality standards (observability, testing, and fault tolerance) for the Airflow service and its integrations
- Mentor senior and junior engineers on complex technical design, best practices, and execution, elevating the overall technical capacity of the team and organization
- Maintain significant contributions and influence within the Apache Airflow open-source community, aligning the project’s roadmap with product strategy
Requirements:
- Bachelor's degree in Computer Science or equivalent, and 6+ years of relevant experience
- Deep, hands-on knowledge of Apache Airflow internals (scheduler, executor, serialization, REST APIs) and complex DAG authoring/optimization
- Proficiency in both Python and Java, along with experience in core data platform technologies and cloud-native deployments (e.g., Kubernetes, Cloud Composer, AWS/GCP/Azure)
- Demonstrated ability to drive design and architectural decisions with a focus on non-functional requirements (security, performance, high availability, fault tolerance)
- Proven ability to lead and drive technical projects across multiple teams without direct reporting authority
- Experience defining the architecture for multi-tenant, service-oriented data platforms
- Significant contributions to Apache Airflow or related open-source projects
- Background in performance tuning and profiling large-scale Python and distributed applications
- Familiarity with data governance and security frameworks (e.g., Ranger, Kerberos) and their integration with workflow orchestration