Experian is a global data and technology company that helps redefine lending practices, uncover fraud, and simplify healthcare. The Data Engineer will design, build, and maintain scalable data platforms on AWS, partnering with data science and engineering teams to turn business requirements into reliable data solutions.
Responsibilities:
- Collaborate with data scientists, analysts, and engineering teams to translate business and AI requirements into scalable data solutions, while ensuring data quality, performance, and cost efficiency across the platform
- Work with large-scale datasets to build and optimize data pipelines using AWS services such as EMR (Spark, Trino), S3, Glue, Athena, and Airflow
- Design and manage lakehouse architectures, using technologies like Apache Iceberg and Glue Catalog to support ACID transactions, schema evolution, and performant analytics
- Support machine learning and LLM projects by preparing and delivering datasets for use in Amazon SageMaker and Amazon Bedrock
Requirements:
- 3+ years of experience in data engineering or related roles
- 3+ years of experience with Python and SQL
- 3+ years of hands-on experience with cloud platforms (AWS, Azure, or GCP)
- Experience building and maintaining scalable batch data pipelines
- Experience working with data lakes or lakehouse architectures
- Experience with Apache Spark and EMR
- Bachelor's degree in Computer Science, Engineering, Mathematics, or related discipline