EMPLOYERS is a dynamic provider of workers' compensation insurance and services, seeking a Senior Data Engineer to design, develop, and maintain scalable data platforms. The role focuses on building production-grade data pipelines and optimizing data workflows while collaborating with cross-functional teams to meet organizational needs.
Responsibilities:
- Design, develop, and maintain scalable, performant data pipelines using Python, PySpark, and Databricks, following Medallion architecture patterns (Bronze, Silver, Gold)
- Build and optimize Delta Lake tables, manage schema evolution, and implement data quality checks across ingestion and transformation layers
- Develop and maintain Airflow DAGs for pipeline orchestration, scheduling, and monitoring
- Collaborate with product management and business stakeholders to interpret requirements and define data specifications
- Create and maintain semantic data layers and curated datasets for BI platforms (e.g., Power BI, QuickSight)
- Analyze trade-offs between commercial and custom-built data solutions, offering technical guidance to decision-makers
- Troubleshoot and resolve issues across data workflows, pipeline failures, and infrastructure
- Support machine learning initiatives by building feature pipelines, data preparation workflows, and serving layers within the Databricks platform
- Ensure compliance with legal, regulatory, and internal governance requirements across data solutions
- Automate operational processes using CI/CD pipelines and infrastructure-as-code principles
- Establish and promote best practices for data engineering, pipeline development, and monitoring
- Partner with teams across the organization to translate business needs into technical data solutions
- Create and maintain detailed technical documentation including data lineage diagrams, pipeline architecture, and workflow documentation
- Serve as a subject matter expert in data engineering, pipeline optimization, Databricks platform capabilities, and data architecture
- Provide mentorship to junior team members, fostering professional growth and collaboration
- Communicate effectively with leadership and stakeholders regarding project status, risks, and milestones
- Stay current on emerging data engineering technologies and industry trends to drive continuous improvement and innovation
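To give candidates a concrete picture, the Medallion pattern and data-quality checks named above can be sketched in plain Python. In practice these layers would be PySpark DataFrames backed by Delta tables; the claim records and field names here are illustrative assumptions, not the company's actual schema.

```python
# Plain-Python sketch of a Medallion-style flow (Bronze -> Silver -> Gold).
# Real pipelines would use PySpark/Delta; only the layering idea is shown.

def bronze_ingest(raw_rows):
    """Bronze: land raw records as-is, tagged with a layer marker."""
    return [dict(row, _layer="bronze") for row in raw_rows]

def silver_clean(bronze_rows):
    """Silver: enforce basic data-quality rules before downstream use."""
    cleaned = []
    for row in bronze_rows:
        if row.get("claim_id") is None:   # quality check: key must exist
            continue
        if row.get("amount", 0) < 0:      # quality check: no negative amounts
            continue
        cleaned.append({
            "claim_id": row["claim_id"],
            "state": str(row.get("state", "")).upper(),  # normalize
            "amount": float(row["amount"]),
        })
    return cleaned

def gold_aggregate(silver_rows):
    """Gold: curated aggregate ready for a BI semantic layer."""
    totals = {}
    for row in silver_rows:
        totals[row["state"]] = totals.get(row["state"], 0.0) + row["amount"]
    return totals
```

For example, `gold_aggregate(silver_clean(bronze_ingest(raw)))` drops records missing a key or carrying negative amounts, then rolls up totals by state, which is the shape of the curated datasets the role feeds into BI platforms.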
Requirements:
- 7+ years of experience in data engineering or related roles
- Strong proficiency in Python and PySpark for data pipeline development and scripting
- Deep hands-on experience with Databricks, including Delta Live Tables (DLT), Databricks Workflows, Unity Catalog, and SQL Warehouses
- Strong knowledge of Medallion architecture (Bronze/Silver/Gold) within Databricks
- Solid understanding of Lakehouse and Delta Lake concepts, including ACID transactions, time travel, and schema enforcement
- Proficiency in Apache Airflow for workflow orchestration and scheduling
- Strong knowledge of relational databases, SQL query optimization, and multiple database systems
- Experience creating semantic data layers for BI platforms (e.g., Power BI, QuickSight)
- Experience combining internal and external data sources for in-depth industry and business analyses
- Understanding of cloud computing platforms and their integration with on-premises solutions
- Familiarity with CI/CD pipelines, version control (Git), and infrastructure-as-code practices
- Strong understanding of testing methodologies, including data validation, pipeline testing, and automation
- Experience working in regulated industries (e.g., insurance, banking, finance, healthcare)
- Outstanding analytical and problem-solving skills with the ability to simplify complex tasks and absorb new information rapidly
- Strong written and verbal communication skills, with the ability to convey technical information clearly to both technical and non-technical audiences
- Highly self-motivated, able to take initiative and manage time effectively to meet deadlines and deliverables
- Strong attention to detail with the ability to work independently with minimal supervision
- Ability to manage multiple projects simultaneously in a fast-paced, evolving environment
- Exceptional research skills with the ability to gather, analyze, and synthesize complex information quickly
- Passion for delivering exceptional quality and continuous learning of new technologies and frameworks
- Strong listening skills and the ability to build positive relationships with colleagues, stakeholders, and customers
- Commitment to creating an inclusive, team-oriented environment that aligns with company culture and values diversity
- Experience supporting ML pipelines and feature engineering within a data platform context is a plus
- Databricks certification preferred (e.g., Databricks Certified Data Engineer Associate or Professional)
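Among the Lakehouse concepts listed above, schema enforcement is the one most easily shown in miniature. Delta Lake applies this at the storage layer when writing; the snippet below is only a plain-Python sketch of the idea, with an illustrative schema, not Delta's actual API.

```python
# Conceptual sketch of schema enforcement: writes whose columns or types
# do not match the declared schema are quarantined instead of committed.
# (Delta Lake does this natively on write; this only mirrors the concept.)

EXPECTED_SCHEMA = {"claim_id": int, "state": str, "amount": float}  # illustrative

def enforce_schema(rows, schema):
    """Split rows into (valid, rejected) against a declared schema."""
    valid, rejected = [], []
    for row in rows:
        conforms = (
            set(row) == set(schema)
            and all(isinstance(row[col], typ) for col, typ in schema.items())
        )
        (valid if conforms else rejected).append(row)
    return valid, rejected
```

A conforming row lands in `valid`; a row with an extra column or a mistyped field lands in `rejected`, analogous to routing bad records to a quarantine table rather than corrupting the Silver layer.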