Gradient AI is revolutionizing Group Health and P&C insurance with AI-powered solutions that help insurers predict risk more accurately. The Senior Data Engineer will lead improvements to our scalable data infrastructure and data architecture, with a focus on building reliable data pipelines and orchestration frameworks that support predictive analytics solutions.
Responsibilities:
- Design, build, and implement data systems to support ML and AI models for our health insurance clients, ensuring strict compliance with healthcare data privacy and security regulations (e.g., HIPAA)
- Develop tools for extracting, processing, and profiling diverse healthcare data sources, including EHRs, medical claims, pharmacy data, and genomic data
- Collaborate with data scientists to transform large volumes of health-related and bioinformatics data into modeling-ready formats, prioritizing data quality, integrity, and reliability in healthcare applications
- Build and maintain infrastructure for the extraction, transformation, and loading (ETL) of data from a variety of sources using SQL, AWS, and healthcare-specific big data technologies and analytics platforms
- Ensure data pipelines meet the unique requirements of health, medical, and bioinformatics data processing, which includes translating complex medical and biological concepts into actionable data requirements
Requirements:
- BS in Computer Science, Bioinformatics, or another quantitative discipline, plus 4+ years of relevant work experience
- Deep expertise in health, medical, and bioinformatics data, including real-world healthcare datasets, with a strong understanding of the complexities and challenges of processing medical and biological information
- Professional proficiency in Python and SQL
- Hands-on experience with big data tools such as Apache Spark (PySpark), Databricks, Snowflake, or similar platforms
- Experience with data orchestration frameworks such as Airflow, Dagster, or Prefect
- Experience with modern DevOps practices, including CI/CD, IaC (Terraform), containerization (Docker/Kubernetes), and cloud environments (AWS preferred)
- Knowledge of data transformation tools, such as dbt, is a plus