Develop and implement best-in-class data pipelines, data lakes, and consumption/acceleration layers to ensure optimal capacity utilisation and meet SLAs.
Analyse business requirements and transform them into code, specific analytical reports, and tools.
Design, build, test and deploy cutting-edge solutions at scale, impacting a multi-billion-dollar business.
Work closely with the product owner and technical lead, playing a major role in the overall delivery of assigned projects and enhancements.
Learn and research on the go, working on new requests and projects as well as supporting production.
Provide business insights by leveraging internal tools and systems, databases, and industry data.
Own a data subject area and ensure its availability and accuracy.
Document requirements, data lineage, and subject matter in both business and technical terminology.
Requirements
Bachelor’s degree in computer science or a related discipline with 3+ years of experience.
Minimum 2 years of experience in Big Data and distributed computing.
Proven experience building pipelines with Big Data technologies – Hadoop, Spark, Hive, Dataproc, BigQuery, Kafka, and Airflow, to name a few.
Deep understanding of the Hadoop ecosystem, strong conceptual knowledge of Hadoop architecture components, and experience working with at least one Big Data technology using Java, Python, or Scala.
Strong knowledge of deploying and managing applications in AWS, Azure, or GCP.
Strong scripting skills for processing large amounts of data, and high proficiency in SQL.
Solid knowledge of Linux systems with the ability to troubleshoot issues in complex, distributed, multi-tier architectures.
Experience building secure, scalable, and highly available services.
Experience with data science and machine learning is a plus.
Excellent hands-on working knowledge of object-oriented and functional languages: Python, Java, C++, Scala.
Experience with microservices frameworks is a plus.
Good written and verbal communication skills.
Tech Stack
Airflow
AWS
Azure
BigQuery
Google Cloud Platform
Hadoop
Java
Kafka
Linux
Python
Scala
Spark
SQL
Benefits
Beyond our great compensation package, you can receive incentive awards for your performance.
Other great perks include a host of best-in-class benefits: maternity and parental leave, PTO, health benefits, and much more.