Quantiphi is an award-winning Applied AI and Big Data software and services company, driven by a deep desire to solve transformational problems at the heart of businesses. In this role, you will architect and build scalable, production-grade ETL pipelines while collaborating with data scientists and business teams to deliver data solutions.
Responsibilities:
- Design, develop, and maintain data pipelines using Databricks (Apache Spark-based); a minimal PySpark sketch follows this list
- Build scalable ETL/ELT workflows for structured and unstructured data
- Optimize data processing jobs for performance and cost efficiency
- Work with data lakes, Delta Lake architecture, and lakehouse implementations
- Collaborate with data scientists, analysts, and business teams to deliver data solutions
- Implement data quality checks, monitoring, and error handling mechanisms
- Manage and optimize Databricks clusters and workloads
- Integrate Databricks with cloud services (AWS, Azure, or GCP)
- Ensure data security, governance, and compliance standards are met
- Document data workflows, architectures, and processes
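As a concrete illustration of the pipeline and data-quality work described above, here is a minimal PySpark sketch of the kind of ETL job this role involves. The paths, schema, and the 1% quality threshold are hypothetical, not part of the posting.

```python
# Minimal illustrative ETL job: raw JSON -> cleaned Delta table.
# All paths, column names, and the quality rule below are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("orders-etl")  # hypothetical job name
    .getOrCreate()
)

# Extract: read raw, semi-structured events from cloud storage.
raw = spark.read.json("s3://example-bucket/raw/orders/")  # hypothetical path

# Transform: normalize types and drop obviously bad records.
cleaned = (
    raw
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("order_date", F.to_date("order_ts"))
    .withColumn("amount", F.col("amount").cast("double"))
    .filter(F.col("order_id").isNotNull())
)

# Simple data-quality gate: fail the run if too many rows lack an amount.
total = cleaned.count()
missing = cleaned.filter(F.col("amount").isNull()).count()
if total > 0 and missing / total > 0.01:  # 1% threshold is illustrative
    raise ValueError(f"Data quality check failed: {missing}/{total} rows missing amount")

# Load: write to a partitioned Delta table for downstream consumers.
(
    cleaned.write
    .format("delta")
    .mode("append")
    .partitionBy("order_date")
    .save("s3://example-bucket/curated/orders/")  # hypothetical path
)
```

In practice a job like this would run as a scheduled Databricks workflow, with the quality gate surfacing failures to monitoring rather than silently loading bad data.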
Requirements:
- Strong experience with Databricks and Apache Spark (PySpark / Scala / SQL)
- Proficiency in Python or Scala
- Experience with cloud platforms such as AWS, Azure, or GCP
- Solid understanding of data warehousing concepts and ETL processes
- Hands-on experience with Delta Lake and lakehouse architecture (see the upsert sketch after this list)
- Proficiency in SQL and experience with relational database systems
- Experience with workflow orchestration tools (e.g., Airflow, Azure Data Factory)
- Familiarity with version control systems like Git
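To give a flavor of the hands-on Delta Lake experience called for above, here is a short sketch of an idempotent upsert (MERGE) using the delta-spark Python API. The table paths, schema, and join key are assumptions for illustration.

```python
# Illustrative Delta Lake upsert (MERGE) via the delta-spark Python API.
# Table paths, schema, and the order_id join key are assumptions.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("orders-upsert").getOrCreate()

# Incoming batch of changed rows (e.g., produced by an upstream ETL job).
updates = spark.read.format("delta").load("s3://example-bucket/staging/orders/")

# Existing curated table to merge into.
target = DeltaTable.forPath(spark, "s3://example-bucket/curated/orders/")

# MERGE keyed on order_id: update rows that already exist, insert new ones.
# Re-running the same batch yields the same result, so the job is idempotent.
(
    target.alias("t")
    .merge(updates.alias("s"), "t.order_id = s.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```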