Storable is on a mission to power the future of storage with an innovative platform for managing self-storage operations. The Data Engineering Manager will oversee data operations, lead a team, and enhance data quality and accessibility across the organization.
Responsibilities:
- Lead Data Management Strategy: Define and execute the data management vision, strategy, and best practices, ensuring alignment with Storable's business goals and objectives
- Oversee Data Pipelines: Design, implement, and maintain scalable data pipelines using industry-standard tools to efficiently process and manage large-scale datasets
- Ensure Data Quality & Governance: Implement data governance policies and frameworks to ensure data accuracy, consistency, and compliance across the organization
- ETL Development: Build, optimize, and maintain ETL pipelines for ingesting, transforming, and delivering large datasets from multiple sources
- Workflow Orchestration: Manage and schedule complex workflows using Apache Airflow
- Query Engines & Processing Frameworks: Leverage Trino (Presto), Apache Spark, and other distributed query engines and processing frameworks for large-scale analytics
- Manage Cross-Functional Collaboration: Partner with engineering, product, and business teams to make data accessible and actionable, ensuring it drives informed decision-making
- Optimize Data Infrastructure: Leverage modern data tools and platforms (e.g., AWS, Apache Airflow, Apache Iceberg) to create an efficient, reliable, and scalable data infrastructure
- Monitor & Improve Performance: Proactively monitor data processes and workflows, troubleshoot issues, and optimize performance to ensure high reliability and data integrity
- Mentorship & Leadership: Lead and develop a team of data engineers and analysts, fostering a collaborative environment where innovation and continuous improvement are valued
Requirements:
- Proven Expertise in Data Management: Significant experience in managing data infrastructure, data governance, and optimizing data pipelines at scale
- Technical Proficiency: Strong hands-on experience with data tools and platforms such as Apache Airflow, Apache Iceberg, and AWS services (S3, Lambda, Redshift, Glue, Athena)
- Data Pipeline Mastery: Proven experience designing, implementing, and optimizing data pipelines and workflows in Python or other data-processing languages
- Distributed Processing: Hands-on experience with Trino/Presto and Apache Spark for distributed data processing
- Data Modeling: Solid understanding of data modeling, data warehousing concepts, and schema design
- Data Governance Expertise: Solid understanding of data privacy, quality control, and governance best practices
- Leadership Skills: Ability to lead and mentor teams, influence stakeholders, and drive data initiatives across the organization
- Analytical Mindset: Strong problem-solving abilities and a data-driven approach to improving business operations
- Excellent Communication: Ability to communicate complex data concepts to both technical and non-technical stakeholders effectively
- Visualization & Reporting: Experience with visualization tools (e.g., Looker, Tableau) and reporting frameworks to provide actionable insights