Architect, automate, and maintain production-grade data infrastructure on AWS (e.g., S3, EMR, Glue, Lambda, Redshift) using Terraform or CDK, with a focus on high availability, security, and consistent environments across the SDLC.
Integrate Claude Code and other LLM-based agents into the engineering workflow to accelerate infrastructure provisioning, refactoring, and generation of technical documentation, embedding AI into daily development practices.
Design, build, and optimize CI/CD pipelines that test, deploy, and monitor dbt models and AWS Glue/Spark jobs, ensuring reliable, repeatable delivery of governed data assets.
Implement agentic operations for DataOps: configure AI agents to triage pipeline failures and perform root-cause analysis, surface cost-optimization signals, and proactively detect schema drift or data quality regressions.
Engineer scalable, well-governed data pipelines and tables using Apache Iceberg, Airflow (MWAA), and Redshift, emphasizing simplicity, reusability, and clear ownership of data products.
Operationalize security and compliance best practices in a regulated insurance environment, including IAM automation, encryption, audit-ready logging, and alignment with enterprise RBAC/MFA standards.
Partner with Product Strategy, PDO, and data science teams to ensure data platforms and features can support AI-heavy products like the Agentic AI Platform, Claim Summary, and Underwriting Assistant at scale.
Requirements
5+ years of experience in Data Engineering, Data Operations, or Platform Engineering, building and operating cloud data infrastructure
Deep proficiency with AWS (e.g., S3, EMR, Glue, Lambda, Redshift) and infrastructure-as-code (Terraform strongly preferred; CDK a plus)
Strong experience with dbt in production (modeling, testing, documentation, deployment)
Advanced SQL skills (performance tuning, complex joins and window functions)
Solid Python experience for automation, orchestration, and data engineering tasks
Hands-on experience with Apache Spark for large-scale batch or streaming workloads, ideally on AWS EMR or Glue
Proven track record of building or maintaining CI/CD pipelines (Git-based workflows, automated testing, deployment, and monitoring) for data and analytics workloads
Strong systems thinking and data modeling skills (e.g., Kimball, Data Vault)
Clear, collaborative communication style with the ability to work across product, security, and business stakeholders in a distributed environment
Tech Stack
Airflow (MWAA)
Amazon Redshift
Apache Iceberg
Apache Spark
AWS (S3, EMR, Glue, Lambda)
dbt
Python
SQL
Terraform
Benefits
Flexible work environment
Health and Wellness benefits
Paid time off programs, including volunteer time off
Market-competitive pay and incentive programs
Continual development and internal career growth opportunities