Design and implement scalable and efficient data pipelines to support various data-driven initiatives.
Designs and maintains Databricks Lakehouse pipelines across Bronze/Silver/Gold (Delta) layers, producing governed, ML-ready datasets with built-in data quality checks and lineage.
Collaborate with cross-functional teams to understand data requirements and contribute to the development of data architectures.
Work on data integration projects, ensuring seamless and optimized data flow between systems.
Implement best practices for data engineering, ensuring data quality, reliability, and performance.
Contribute to data modernization efforts by leveraging cloud solutions and optimizing data processing workflows.
Demonstrate technical leadership by staying abreast of emerging data engineering technologies and implementing industry best practices.
Provide technical leadership in enabling AI/ML initiatives by designing scalable, reliable, and well-governed data engineering solutions.
Effectively communicate technical concepts to both technical and non-technical stakeholders.
Promotions to different environments using GitHub CICD with GitHub Actions / Liquibase.
Participate in the evaluation and identification of new technologies.
Requirements
Deep technical expertise in building and optimizing data pipelines and large-scale processing systems.
Deep technical expertise with Azure Cloud and Databricks.
Experience working with cloud solutions and contributing to data modernization efforts.
Strong programming skills ( e.g., SQL , Python, Pyspark or Scala) for data manipulation and transformation.
Excellent understanding of data engineering principles, data architecture, and database management.
Excellent understanding of data modeling concepts and data structures.
Excellent understanding of source to target data mappings.
Strong experience building AI/ML data pipelines in Databricks.
Proficient in leveraging GenAI and agentic frameworks to develop data engineering solutions .
Strong problem-solving skills and attention to detail.
Excellent communication skills, with the ability to convey technical concepts to both technical and non-technical stakeholders.
Knowledge of healthcare or clinical research industries is a plus.
Tech Stack
Azure
Cloud
PySpark
Python
Scala
SQL
Benefits
comprehensive benefits to support physical, mental, and financial well-being