MediaRadar, Inc. is an industry leader in marketing intelligence, providing critical insights that drive marketing and sales decisions. They are seeking an experienced Data Engineer to build and maintain scalable data solutions, collaborating with cross-functional teams to support analytics and operational data needs.
Responsibilities:
- Design, develop, and maintain scalable ETL/ELT pipelines on Azure Databricks using Apache Spark (PySpark/Spark SQL); see the batch sketch after this list
- Design and implement both batch and real-time data ingestion and transformation processes (a streaming sketch also follows this list)
- Build and manage Delta Lake tables, schemas, and data models to support efficient querying and analytics
- Consolidate and process large-scale datasets from various structured and semi-structured sources (e.g., JSON, Parquet, Avro)
- Write optimized SQL queries for large datasets using Spark SQL and PostgreSQL
- Develop, schedule, and monitor workflows using Databricks Workflows, Airflow, or similar orchestration tools (see the orchestration sketch after this list)
- Design, build, and deploy cloud-native, containerized applications on Azure Kubernetes Service (AKS) and integrate with Azure services
- Ensure data quality, governance, and compliance through validation, documentation, and secure practices
- Collaborate with data analysts, data architects, and business stakeholders to translate requirements into technical solutions
- Contribute to and enforce best practices in data engineering, including version control (Git), CI/CD pipelines, and coding standards
- Continuously enhance data systems for improved performance, reliability, and scalability
- Mentor junior engineers and help evolve team practices and documentation
- Stay up to date on emerging trends, technologies, and best practices in the data engineering space
- Work effectively within an agile, cross-functional project team
Requirements:
- Experience building and maintaining scalable, high-performance data solutions using Azure Databricks, Apache Spark, AKS, Airflow, PostgreSQL, and modern data lakehouse architectures
- A Bachelor's degree (or equivalent) in computer science, information technology, engineering, or a related discipline
- 5+ years of experience working as a Data Engineer
- 3-5 years of experience with Azure Databricks
- Proven experience as a Data Engineer, with a strong focus on Azure Databricks and Apache Spark
- Proficiency in Python, PySpark, Spark SQL, and working with large-scale datasets in different data formats
- Strong experience designing and building ETL/ELT workflows in both batch and streaming environments
- Solid understanding of data lakehouse architectures and Delta Lake
- Experience in Azure Kubernetes Service (AKS) is desired
- Proficiency in SQL and experience with PostgreSQL or similar relational databases
- Experience with workflow orchestration tools (e.g., Databricks Workflows, Airflow, Azure Data Factory)
- Familiarity with data governance, quality control, and security best practices
- Strong problem-solving skills and attention to detail
- Excellent communication and collaboration skills, with a track record of working cross-functionally
- Comfortable working in agile development environments and using tools like Git, CI/CD, and issue trackers (e.g., Jira)