Storable is on a mission to power the future of storage with its innovative platform. The company is seeking a Sr. Data Engineer to build and maintain scalable data systems that support analytics, reporting, and product innovation, collaborating closely with cross-functional teams to ensure data reliability and accessibility.
Responsibilities:
- Build & Maintain Data Pipelines: Design, implement, and maintain scalable data pipelines using modern data tools to process and manage large datasets efficiently
- ETL Development: Develop and optimize ETL/ELT pipelines to ingest, transform, and deliver data from multiple internal and external sources
- Workflow Orchestration: Build and manage workflows using Apache Airflow to ensure reliable scheduling and monitoring of data processes
- Query Engines & Processing Frameworks: Leverage tools such as Trino (Presto), Apache Spark, and related distributed processing technologies to support analytics and data applications
- Data Modeling & Warehousing: Contribute to schema design and data modeling efforts to ensure clean, well-structured, and scalable data architecture
- Data Quality & Governance Support: Implement monitoring, validation checks, and best practices to ensure data accuracy, consistency, and reliability
- Optimize Data Infrastructure: Utilize AWS services (S3, Redshift, Glue, Athena, Lambda) and modern data technologies (e.g., Apache Iceberg) to support a scalable and efficient data platform
- Cross-Functional Collaboration: Partner with engineering, product, analytics, and business teams to understand requirements and deliver high-quality data solutions
- Monitor & Improve Performance: Proactively monitor pipelines and workflows, troubleshoot issues, and continuously improve performance and reliability
Requirements:
- 5+ years of experience in data engineering or related roles, building and maintaining data pipelines
- Hands-on experience with data tools such as Apache Airflow, Apache Spark, Apache Iceberg, Trino (Presto), and AWS services (S3, Redshift, Glue, Athena, Lambda)
- Proficiency in Python (or similar language) for data processing and pipeline development
- Solid understanding of data warehousing concepts, schema design, and data modeling best practices
- Experience deploying and supporting data pipelines in production environments
- Strong analytical skills and ability to diagnose and resolve data-related issues
- Ability to communicate effectively with both technical and non-technical stakeholders and work in cross-functional teams
- Experience with visualization tools such as Looker or Tableau
- Exposure to data governance, privacy, and quality frameworks
- Familiarity with CI/CD practices and version control for data workflows