Design, develop, and optimize ETL/ELT pipelines for both structured and unstructured data
Mentor junior team members and engage in communities of practice to deliver high-quality data and AI solutions while promoting best practices, standards, and adoption of reusable patterns
Partner with architects and stakeholders to influence and implement the vision of the AI and data pipelines while safeguarding the integrity and scalability of the environment
Ingest and process large-scale datasets (e.g., telematics data) into the Enterprise Data Lake and downstream systems
Curate and publish Data Products to support analytics, visualization, and machine learning use cases
Collaborate with data analysts, data scientists, and BI teams to build data models and pipelines for research, reporting, and advanced analytics
Apply best practices for data modeling, governance, and security across all solutions
Partner with cross-functional teams to ensure alignment and delivery of high-value outcomes
Monitor and fine-tune data pipelines for performance, scalability, and reliability
Automate auditing, balancing, reconciliation, and data quality checks to maintain high data integrity
Develop self-healing pipelines with robust restartability mechanisms for resilience
Schedule and orchestrate complex, dependent workflows using tools like MWAA, Autosys, or Control-M (a minimal sketch follows this list)
Leverage CI/CD pipelines to enable automated integration, testing, and deployment processes
Lead proofs of concept (POCs) and technology evaluations to drive innovation
Develop AI-driven systems to improve data capabilities, ensuring compliance with industry best practices
Implement efficient Retrieval-Augmented Generation (RAG) architectures and integrate with enterprise data infrastructure
Implement data observability practices to proactively monitor data health, lineage, and quality across pipelines, ensuring transparency and trust in data assets
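As a concrete illustration of the orchestration, data quality, and restartability responsibilities above, here is a minimal sketch of an Airflow DAG of the kind MWAA runs; the DAG name, schedule, and task bodies are hypothetical placeholders, not a prescribed implementation.

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest_trips():
    # Placeholder: load the day's telematics trip files into the data lake
    pass

def run_quality_checks():
    # Placeholder: reconcile row counts and balances; raising an exception here
    # fails the task, and the retry policy below re-runs it automatically
    pass

default_args = {
    "retries": 3,                          # self-healing: automatic re-runs on failure
    "retry_delay": timedelta(minutes=10),  # back off before restarting the task
}

with DAG(
    dag_id="telematics_daily_load",        # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    ingest = PythonOperator(task_id="ingest_trips", python_callable=ingest_trips)
    quality = PythonOperator(task_id="quality_checks", python_callable=run_quality_checks)
    ingest >> quality                      # quality gate runs only after ingestion succeeds

Task-level retries plus idempotent task logic give each run a safe restart point without manual intervention, which is the essence of the self-healing pattern described above.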
Requirements
Bachelor’s or Master’s degree in Computer Science or a related discipline
5+ years of experience in data analysis, transformation, and development, ideally including 2+ years in the insurance industry
Demonstrated expertise with event-level and trip-level telematics driving data
3+ years of experience developing and deploying large-scale data and analytics applications on cloud platforms such as AWS and Snowflake
Strong proficiency in SQL, Python, and ETL tools such as Informatica IDMC for data integration and transformation (3+ years)
Experience designing and optimizing data models for Data Warehouses, Data Marts, and Data Fabric, including dimensional modeling, semantic layers, metadata management, and integration for scalable, governed, and high-performance analytics (3+ years)
3+ years of hands-on experience processing large-scale structured and unstructured data in both batch and near-real-time environments, leveraging distributed computing frameworks and streaming technologies for high-performance data pipelines (see the sketch after this list)
Strong technical knowledge of AI solutions leveraging cloud platforms and modern tooling
3+ years of experience in Agile methodologies, including Scrum and Kanban frameworks
2+ years of experience in leveraging DevOps pipelines for automated testing and deployment, ensuring continuous integration and delivery of data solutions
Proficient in data visualization tools such as Tableau and Power BI, with expertise in creating interactive dashboards, reports, and visual analytics to support data-driven decision-making
Ability to analyze source systems, provide business solutions, and translate these solutions into actionable steps
Candidate must be authorized to work in the US without company sponsorship
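For the batch-plus-streaming requirement above, the following is a minimal PySpark sketch under stated assumptions: the S3 paths, Kafka broker, topic name, and event schema are all hypothetical.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import DoubleType, StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("telematics-pipeline").getOrCreate()

# Batch: read historical trip-level data from the data lake (hypothetical path)
trips = spark.read.parquet("s3://example-bucket/telematics/trips/")
trip_counts = trips.groupBy("device_id").count()  # the same DataFrame API serves batch workloads

# Near-real-time: consume event-level telematics from Kafka (hypothetical broker/topic)
event_schema = StructType([
    StructField("device_id", StringType()),
    StructField("speed_mph", DoubleType()),
    StructField("event_time", TimestampType()),
])
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "telematics-events")
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Windowed aggregation with a watermark to bound late-arriving events
avg_speed = (
    events.withWatermark("event_time", "10 minutes")
    .groupBy(F.window("event_time", "5 minutes"), "device_id")
    .agg(F.avg("speed_mph").alias("avg_speed_mph"))
)

# Land results in the lake; the checkpoint makes the stream restartable
query = (
    avg_speed.writeStream.outputMode("append")
    .format("parquet")
    .option("path", "s3://example-bucket/telematics/avg_speed/")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/avg_speed/")
    .start()
)

Using one engine and one API for both batch and streaming keeps the transformation logic reusable across the two modes, which is why distributed frameworks of this kind are called for in the requirement above.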
Tech Stack
AWS
Cloud
ETL
Informatica
Python
SQL
Tableau
Benefits
Other rewards may include short-term or annual bonuses