Randstad Digital Americas is seeking an experienced Databricks Data Engineer with expertise in data engineering, big data, and cloud solutions. The role involves designing and building scalable data products and pipelines to transform large datasets into actionable insights while ensuring the reliability and quality of data workflows.
Responsibilities:
- In this role, you will design and build scalable, secure, and reusable data products and pipelines to collect, process, manage, analyze, and visualize large datasets across multiple platforms, transforming data into actionable insights. You will also conduct testing across development, staging, and pre-production environments to ensure the reliability and quality of data workflows
Requirements:
- Bachelor's degree in Computer Science, Information Systems, or a related field, or equivalent experience
- Proficiency in Collibra, Starburst, relational and NoSQL data stores, and expertise in data modeling techniques like star and dimensional modeling
- Databricks AND ((R OR sparklyr) OR (PySpark OR Python))
- Databricks AND (Starburst OR Tableau)
- Databricks AND (Collibra OR Alteryx)
- Hands-on experience with the following: Databricks with PySpark, data quality frameworks, building custom dashboards
- R/sparklyr, as a data analyst
- Databricks, on-prem to AWS cloud migrations, Databricks PVC to SaaS migration, big data, and modern data stack components such as S3, Spark, Airflow, Lakehouse architectures, Collibra, Starburst, data modeling, data ingestion, CI/CD, and testing of data and workflows
- Strong expertise in data platform ecosystems with proven experience in Databricks engineering and pipeline architecture
- Experience with Databricks PVC to SaaS migration, ensuring minimal disruption and high performance
- Strong background in cloud migration projects, moving on-premises data systems to a private cloud architecture on AWS while ensuring minimal downtime and optimal performance
- Provide strategic input on scaling Databricks infrastructure for performance and cost optimization
- Design, develop, and maintain robust and efficient data pipelines to ingest, transform, catalog, and deliver curated, trusted, high-quality data from disparate data sources into the Common Data Platform
- Deliver high-quality data products and services following SAFe Agile practices
- Good understanding of data architecture, information security, and data governance; develops processes
- Proactively identifying and resolving issues with data pipelines and analytical data stores
- Deploying monitoring and alerting for data pipelines, data stores and implementing auto remediation wherever possible
- Employing a security-, testing-, and automation-first strategy and adhering to data engineering best practices
- Collaborating with cross-functional teams, including product management, data scientists, analysts, and business stakeholders
- Keeping up with the latest trends and technologies, including evaluating and recommending new tools, frameworks, and technologies to improve data engineering processes and efficiency
- Overall data engineering experience across traditional ETL & Big Data, either on-prem or Cloud
- Data engineering experience in AWS (any CFS2/EDS) highlighting the services/tools used
- Experience building end-to-end data pipelines to ingest and process unstructured and semi-structured data using a Spark architecture