Design and build scalable data pipelines, data models, and feature stores to support analytics and ML workloads.
Deploy and manage cloud-native data applications on AWS.
Ensure the technical quality, performance, and reliability of production-grade data pipelines through strong observability and engineering best practices.
Requirements
Strong experience in Python and SQL.
Experience building data pipelines for downstream ML workloads.
Skilled in data modelling and building optimised and efficient data marts and warehouses in the cloud.
Experience with Spark, Airflow, AWS, or similar tooling.
Comfortable with both real-time and batch data workloads, applying modern data transformation and orchestration patterns.
Experience with Infrastructure as Code (Terraform) and containerization (Docker).
Experience contributing to or maintaining CI/CD pipelines (Jenkins, GitHub Actions) as part of production-grade data systems.
Enjoy solving complex data problems and collaborating in a fast-moving environment.