Shutterfly is looking for a Principal Data Engineer to join their team, focusing on enhancing the Data Warehouse on AWS to support various analytic needs. The role involves designing and developing data engineering solutions, providing technical leadership, and improving data processing environments.
Responsibilities:
- Own the design, development, testing, deployment, maintenance, and enhancement of full-stack data engineering solutions for the data pipelines and data marts that make up the Data Warehouse
- Provide technical leadership both to the internal Data Warehouse team and to publishers and subscribers of Shutterfly's Enterprise Data Lake
- Identify, evaluate, and use data-based evidence to evangelize improvements to the Data Lake as well as the data processing environment, thereby influencing the data strategy
- Apply your technical expertise to own and manage project priorities, deadlines, and deliverables
- With a constant customer focus, evangelize the benefits of existing solutions and new technologies to drive adoption and push the Data Warehouse's technology forward
- Work closely with Data Operations to improve CI/CD pipelines and to continually improve the operations, monitoring, and performance of the Data Warehouse
- Work across multiple teams in high visibility roles and own solutions end-to-end
Requirements:
- Expert knowledge of Python, Spark, and SQL; experience with large-scale data processing and distributed systems
- 10+ years of hands-on experience building data platforms, including data pipelines, data warehousing, and feature engineering systems
- Proven experience championing the adoption of AI-powered tools to increase team productivity, reduce manual effort, and improve operational efficiency
- Strong foundation in data structures, algorithms, and system design for large-scale data and AI systems
- Experience with AWS ecosystem (e.g., S3, EMR, Glue, Lambda, SageMaker) or equivalent cloud platforms (GCP, Azure)
- Hands-on experience with Databricks and modern lakehouse architectures
- Familiarity with real-time data streaming frameworks
- Experience supporting Machine Learning and AI workloads, including feature engineering, model data pipelines, and training data preparation
- Bachelor's or Master's degree in Computer Science, or equivalent