Lead the design, development, and maintenance of scalable and high-performance data architectures and models that support both batch and streaming data processing.
Define and drive the execution of a strategic Data Engineering Roadmap that unlocks the full potential of the organization's data assets, ensuring alignment with long-term business objectives.
Own, enhance, and evolve critical datasets, including high-leverage streaming and batch data pipelines, as well as advanced data discovery and search experiences and enterprise-wide reporting and data visualization tools.
Oversee complex data extraction, transformation, and loading (ETL/ELT) processes from diverse sources, driving critical analyses and manipulating large datasets.
Spearhead the administration, tuning, and optimization of database systems to ensure robust performance, reliability, and scalability, with a focus on both existing and emerging data technologies.
Lead efforts to identify inefficiencies and potential areas for optimization in current data engineering processes, workflows, and technologies.
Drive the adoption of industry-leading best practices and cutting-edge tools to enhance operational efficiency, reduce technical debt, and scale data engineering capabilities across the organization.
Develop and maintain MLOps pipelines that automate the deployment, monitoring, and management of machine learning models in production.
Perform other related duties as assigned.
Requirements
BA/BS in Computer Science, Data Engineering, or a related field, or equivalent professional experience. Advanced degree preferred.
4-6 years of experience in data engineering or related roles, with a strong track record of manipulating and managing large datasets using Python, AWS, and SQL.
Proven experience in designing, building, and optimizing data warehouses (Redshift) and strong expertise in data warehouse design and dimensional modeling.
Extensive experience with ETL/ELT processes and tools, particularly Airflow and dbt, for orchestrating complex data pipelines.
Proficiency with Infrastructure as Code tools, particularly AWS CDK, for managing cloud infrastructure and automating deployments.
Demonstrated ability to solve complex data challenges and deliver scalable data solutions in a fast-paced, evolving environment.
Strong analytical and problem-solving skills, with a focus on optimizing data processes for performance and efficiency.
Demonstrated experience in managing Google Looker.
Familiarity with CircleCI, Airflow, or similar tools.
Tech Stack
Airflow
Amazon Redshift
AWS
Cloud
ETL
Python
SQL
Benefits
Comprehensive Medical, Dental, and Vision plan coverage from day one
Pre-tax benefits: HSA/FSA
401k Retirement Savings Program with matching up to 4%
Voluntary benefits, including disability, basic life, and pet insurance
Monthly Wellness Stipend to promote mental and physical self-care
Flexible PTO and Remote First Environment
Regular team events, including Wellness Workshops and Team Building Events
Free access to Talkspace products for you and one household member, as well as access to a friends and family discount!