Design, implement, and maintain scalable data architectures to meet client data processing and analytics needs.
Collaborate with cross-functional teams to understand data requirements and translate them into effective data pipeline solutions.
Develop, optimize, and maintain ETL processes to ensure the timely and accurate movement of data across systems.
Implement best practices for data pipeline orchestration and automation using tools like Azure Data Factory or Apache Airflow.
Leverage cloud platforms (Azure and/or AWS) to build and optimize data solutions, using services such as Azure Synapse Analytics, Azure Blob Storage, Amazon Redshift, Amazon S3, or AWS Glue.
Utilize Databricks for big data processing, analytics, and machine learning workflows.
Lead Unity Catalog migration efforts, ensuring seamless transition and optimal data organization for governance and access control.
Establish and enforce data governance policies and procedures, ensuring data quality, integrity, and accuracy.
Optimize data processing and query performance for large-scale datasets within Databricks and cloud environments.
Document data engineering processes, architecture, and configurations to support future scalability.
Collaborate with data scientists, analysts, and other stakeholders to provide the data infrastructure their work requires.
Requirements
Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
Minimum of 5 years of experience in data engineering roles.
Proven expertise in Databricks, including Unity Catalog migration for data governance and organization.