Develop and maintain data pipelines using Amazon EMR or AWS Glue.
Create data models and support end-user querying using Amazon Redshift or Snowflake, Amazon Athena, and Presto.
Build and maintain data pipeline orchestration using Airflow.
Collaborate with other teams to understand their data needs and help design solutions.
Troubleshoot and optimize data pipelines and data models.
Write and maintain PySpark and SQL scripts to extract, transform, and load data (see the illustrative sketch after this list).
Document and communicate technical solutions to both technical and non-technical audiences.
Stay up-to-date with new AWS data technologies and evaluate their impact on our existing systems.
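For illustration only, the following is a minimal sketch of the kind of PySpark ETL script this role involves. The S3 paths, column names, and schema are hypothetical placeholders, not references to any real pipeline.

# Illustrative PySpark ETL sketch; all paths, columns, and table names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_daily_etl").getOrCreate()

# Extract: read raw JSON events from a (hypothetical) S3 landing zone.
raw = spark.read.json("s3://example-landing-zone/orders/2024-01-01/")

# Transform: drop incomplete records, normalize types, and derive a partition column.
orders = (
    raw
    .filter(F.col("order_id").isNotNull())
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("order_date", F.to_date("order_ts"))
    .withColumn("amount", F.col("amount").cast("decimal(12,2)"))
)

# Load: write partitioned Parquet to a curated zone for downstream querying
# (for example, via Athena or Redshift Spectrum).
(
    orders.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-curated-zone/orders/")
)

spark.stop()

In practice a script like this would be packaged as a step on EMR or a Glue job and scheduled from an Airflow DAG, as described in the responsibilities above.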
Requirements
Bachelor's degree in Computer Science, Engineering, or a related field.
3+ years of experience working with PySpark and SQL.
2+ years of experience building and maintaining data pipelines using Amazon EMR or AWS Glue.
2+ years of experience with data modeling and end-user querying using Amazon Redshift or Snowflake, Amazon Athena, and Presto.
1+ years of experience building and maintaining data pipeline orchestration using Airflow.
Strong problem-solving and troubleshooting skills.
Excellent communication and collaboration skills.
Ability to work independently and within a team environment.
AWS Certified Data Analytics - Specialty certification.
Experience with Agile development methodologies.
Tech Stack
Airflow
Amazon Redshift
AWS
PySpark
SQL
Benefits
Equal opportunity in all employment practices.
Non-discrimination on the basis of any protected characteristic.
Recruitment, compensation, promotion, transfer, disciplinary action, layoff, training, and social and recreational programs are all governed by this policy.