Design, develop, and deploy large-scale batch and streaming data processing pipelines using technologies such as Dataflow, Apache Beam, Spark, Akka, and Pub/Sub.
Apply expertise across multiple data storage technologies such as Bigtable/HBase, BigQuery, Spanner, and CloudSQL/Postgres.
Work with stakeholders to understand business problems, develop use cases, and translate them into pragmatic and effective technical solutions.
Design and develop appropriate schemas for data based on an understanding of the domain problem.
Manage data lineage and ensure data security with appropriate tools and methodologies.
Collaborate with data scientists, architects, and other stakeholders to ensure alignment between technical and business strategy.
Continuously monitor, refine, and report on the performance of data management systems.
Mentor junior data engineers, reviewing their outputs and directing their professional development.
Requirements
2-4 years of experience in data engineering, particularly in designing and developing data pipelines.
Proven expertise with technologies such as Dataflow, Apache Beam, Spark, Akka, and Pub/Sub.
Experience with a range of data storage technologies, including Bigtable/HBase, BigQuery, Spanner, and CloudSQL/Postgres.
Ability to design data schemas based on an understanding of the domain problem.
Experience with data security and data lineage methodologies and tools is preferred.
Familiarity with agile development methodologies.
Exceptional communication skills, able to explain complex technical concepts in clear, plain English.
BSc degree in Computer Science, Engineering, or a related field, or equivalent work experience.