Experian is a global data and technology company that helps redefine lending practices, uncover fraud, and simplify healthcare. The Data Engineer will design, build, and maintain scalable data platforms on AWS, partnering with data science and engineering teams to turn business requirements into reliable data solutions.
Responsibilities:
- Collaborate with data scientists, analysts, and engineering teams to translate business and AI requirements into scalable data solutions, while ensuring data quality, performance, and cost efficiency across the platform
- Work with large-scale datasets to build and optimize data pipelines using AWS services such as EMR (Spark, Trino), S3, Glue, Athena, and Airflow
- Design and manage lakehouse architectures, using technologies like Apache Iceberg and Glue Catalog to support ACID transactions, schema evolution, and performant analytics
- Support machine learning and LLM projects by preparing and delivering datasets for use in Amazon SageMaker and Amazon Bedrock
Requirements:
- 3+ years of experience in data engineering or related roles
- 3+ years of experience with Python and SQL
- 3+ years of hands-on experience with cloud platforms (AWS, Azure, or GCP)
- Experience building and maintaining scalable batch data pipelines
- Experience working with data lakes or lakehouse architectures
- Experience with Apache Spark and EMR
- Bachelor's degree in Computer Science, Engineering, Mathematics, or related discipline