Tags: Amazon Redshift, AWS, Azure, BigQuery, Cloud, ETL, PySpark, Python, Spark, SQL, Data Warehousing, Analytics, Snowflake, Redshift, Databricks, Google Cloud, Lambda, S3, Glue, Cloud Storage, Collaboration, Remote Work
About this role
Role Overview
Pipeline Architect: Design, build, and maintain scalable end-to-end data pipelines using Databricks, Spark, and related technologies (a representative sketch follows this list).
Transformation Titan: Develop efficient data processing and transformation workflows to support analytics and reporting needs.
Integration Hero: Integrate diverse data sources, including APIs, databases, and cloud storage, into unified datasets.
Collaboration Champion: Work closely with cross-functional teams (data science, analytics, business units) to design and implement data solutions that align with business goals.
Quality Guardian: Implement robust validation, monitoring, and observability processes to ensure data accuracy, completeness, and reliability.
Automation Avenger: Contribute to data governance, security, and automation initiatives within the data ecosystem.
Cloud Commander: Leverage AWS services (e.g., S3, Glue, Lambda, Redshift) to build and deploy data solutions in a cloud-native environment.
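To give candidates a concrete feel for the pipeline work described above, here is a minimal PySpark sketch of such a job: it reads raw JSON events from S3, normalizes and deduplicates them, applies a simple quality gate, and writes partitioned Parquet back. The bucket names, paths, and schema (order_id, order_ts, amount) are hypothetical, and a real job on this team would take these from configuration rather than hard-coding them.

```python
# Minimal sketch of an end-to-end PySpark pipeline. Bucket names, paths,
# and the schema (order_id, order_ts, amount) are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-curation").getOrCreate()

# Extract: raw JSON events landed in S3 (assumes the s3a:// connector is configured)
raw = spark.read.json("s3a://example-raw-bucket/orders/")

# Transform: normalize types, derive a partition column, drop duplicate events
curated = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .withColumn("amount", F.col("amount").cast("decimal(12,2)"))
       .dropDuplicates(["order_id"])
)

# Validate: fail fast if required fields are missing, before anything is written
bad_rows = curated.filter(F.col("order_id").isNull() | F.col("amount").isNull()).count()
if bad_rows:
    raise ValueError(f"{bad_rows} rows failed validation; aborting write")

# Load: partitioned Parquet that downstream warehouses (e.g., Redshift Spectrum) can query
(curated.write
        .mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3a://example-curated-bucket/orders/"))
```

Failing fast on null keys before the write is one common pattern for the validation and observability responsibilities listed above; in practice this gate would likely feed a monitoring system rather than simply raising.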
Requirements
Experience with cloud-based ETL services (e.g., AWS Glue, Google Cloud Dataflow, Azure Data Factory)
Experience with cloud data warehousing technologies (e.g., Amazon Redshift, Google BigQuery, Snowflake)
Experience with Python, SQL, Spark, and PySpark
Experience with data platforms like Databricks, Palantir, and Snowflake
Familiarity with data orchestration and data quality processes (see the orchestration sketch below)
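As a rough illustration of the orchestration and ETL-service familiarity these requirements call for, the sketch below starts an AWS Glue job run with boto3 and polls it to a terminal state. The job name and region are placeholders, not details from this posting, and a production setup would more likely delegate this loop to an orchestrator such as Airflow or Step Functions.

```python
# Hypothetical orchestration sketch: kick off a Glue ETL job and wait for it.
# The job name and region are placeholders, not part of this posting.
import time

import boto3

glue = boto3.client("glue", region_name="us-east-1")

# Start the (hypothetical) job and capture its run ID
run_id = glue.start_job_run(JobName="orders-curation-job")["JobRunId"]

# Poll until the run reaches a terminal state
while True:
    state = glue.get_job_run(JobName="orders-curation-job", RunId=run_id)["JobRun"]["JobRunState"]
    if state in ("SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"):
        break
    time.sleep(30)

if state != "SUCCEEDED":
    raise RuntimeError(f"Glue job run {run_id} ended in state {state}")
```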