Design and develop data ingestion pipelines (Databricks background preferred).
Performance-tune and optimize Databricks jobs.
Evaluate new features and refactor existing code.
Develop and integrate software applications using suitable development methodologies and standards, applying standard architectural patterns and accounting for critical performance characteristics and security measures.
Collaborate with Business Analysts, Technical Manager, Architects and Senior Developers to establish the physical application framework (e.g. libraries, modules, execution environments).
Perform end-to-end automation of the ETL process for the various datasets ingested into the big data platform.
Mentor junior developers while remaining hands-on in development work.
Work with the QA and automation teams; strong attention to detail is required to cover all testing scenarios.
Handle and support current production applications.
Perform code reviews and discuss changes with the Technology Manager.
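As a sense of the day-to-day work, the ETL automation described above typically includes validation and deduplication of incoming records. The following is a minimal pure-Python sketch of one such cleaning step; the parsing and dedup rules (lowercase field names, dedupe on an `id` key) are illustrative assumptions, not the team's actual logic, and a real pipeline would express this in PySpark on Databricks.

```python
import json

def clean_records(raw_lines):
    """Parse JSON-lines input, drop malformed rows, normalize field
    names to lowercase, and deduplicate on 'id' (hypothetical rules)."""
    seen = set()
    cleaned = []
    for line in raw_lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed input rather than failing the batch
        rec = {k.lower(): v for k, v in rec.items()}
        key = rec.get("id")
        if key is None or key in seen:
            continue  # drop records missing an id, or already seen
        seen.add(key)
        cleaned.append(rec)
    return cleaned
```

In a PySpark job the same intent would be expressed with DataFrame operations such as `dropDuplicates(["id"])` and schema-enforced reads.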
Requirements
Must be a team player.
Must have at least 5 years of IT development experience.
Must have strong analytical and problem-solving skills.
Must have experience designing solutions, performing code reviews, and mentoring junior engineers.
Must have strong SQL and backend experience, as well as experience working on data-driven projects.
Must have experience with the following: Python/PySpark, SQL, Scala, Databricks, Spark/Spark Streaming, big data tooling, Linux, Kafka
Nice to have: Azure Data Factory
Tech Stack
Azure
ETL
Kafka
Linux
PySpark
Python
Scala
Spark
SQL
Additional Expectations
Must be willing to flex work hours to support application launches and manage production outages when necessary.
Follow best practices and document the process for code merges and releases (Bitbucket).