EMPLOYERS, a dynamic provider of workers' compensation insurance and services, is seeking a Senior Data Engineer to design, develop, and maintain scalable data platforms. The role involves collaborating across teams to create data solutions, optimizing data workflows, and supporting the organization's analytics capabilities.
Responsibilities:
- Design, develop, and maintain scalable, performant data pipelines using Python, PySpark, and Databricks, following Medallion architecture patterns (Bronze, Silver, Gold)
- Build and optimize Delta Lake tables, manage schema evolution, and implement data quality checks across ingestion and transformation layers
- Develop and maintain Airflow DAGs for pipeline orchestration, scheduling, and monitoring
- Collaborate with product management and business stakeholders to interpret requirements and define data specifications
- Create and maintain semantic data layers and curated datasets for BI platforms (e.g., Power BI, QuickSight)
- Analyze trade-offs between commercial and custom-built data solutions, offering technical guidance to decision-makers
- Troubleshoot and resolve issues across data workflows, pipeline failures, and infrastructure
- Support machine learning initiatives by building feature pipelines, data preparation workflows, and serving layers within the Databricks platform
- Ensure compliance with legal, regulatory, and internal governance requirements across data solutions
- Automate operational processes using CI/CD pipelines and infrastructure-as-code principles
- Establish and promote best practices for data engineering, pipeline development, and monitoring
- Partner with teams across the organization to translate business needs into technical data solutions
- Create and maintain detailed technical documentation including data lineage diagrams, pipeline architecture, and workflow documentation
- Serve as a subject matter expert in data engineering, pipeline optimization, Databricks platform capabilities, and data architecture
- Provide mentorship to junior team members, fostering professional growth and collaboration
- Communicate effectively with leadership and stakeholders regarding project status, risks, and milestones
- Stay current on emerging data engineering technologies and industry trends to drive continuous improvement and innovation
Requirements:
- 7+ years of experience in data engineering or related roles
- Strong proficiency in Python and PySpark for data pipeline development and scripting
- Deep hands-on experience with Databricks, including Delta Live Tables (DLT), Databricks Workflows, Unity Catalog, and SQL Warehouses
- Strong knowledge of Medallion architecture (Bronze/Silver/Gold) within Databricks
- Solid understanding of Lakehouse and Delta Lake concepts, including ACID transactions, time travel, and schema enforcement
- Proficiency in Apache Airflow for workflow orchestration and scheduling
- Strong knowledge of relational databases, SQL query optimization, and multiple database systems
- Experience creating semantic data layers for BI platforms (e.g., Power BI, QuickSight)
- Experience combining internal and external data sources for in-depth industry and business analyses
- Understanding of cloud computing platforms and their integration with on-premises solutions
- Familiarity with CI/CD pipelines, version control (Git), and infrastructure-as-code practices
- Strong understanding of testing methodologies, including data validation, pipeline testing, and automation
- Experience working in regulated industries (e.g., insurance, banking, finance, healthcare)
- Outstanding analytical and problem-solving skills with the ability to simplify complex tasks and absorb new information rapidly
- Strong written and verbal communication skills, with the ability to convey technical information clearly to both technical and non-technical audiences
- Highly self-motivated, able to take initiative and manage time effectively to meet deadlines and deliverables
- Strong attention to detail with the ability to work independently with minimal supervision
- Ability to manage multiple projects simultaneously in a fast-paced, evolving environment
- Exceptional research skills with the ability to gather, analyze, and synthesize complex information quickly
- Passion for delivering exceptional quality and continuous learning of new technologies and frameworks
- Strong listening skills and the ability to build positive relationships with colleagues, stakeholders, and customers
- Commitment to creating an inclusive, team-oriented environment that aligns with company culture and values diversity
- Experience supporting ML pipelines and feature engineering within a data platform context is a plus
- Databricks certification preferred (e.g., Databricks Certified Data Engineer Associate or Professional)