Role Overview
Contribute to the design, development, and maintenance of real-time data processing applications
Manipulate streaming data: ingestion, transformation, and aggregation
Keep up to date with new technologies and techniques through ongoing research and development
Collaborate closely with the Data DevOps team and other multidisciplinary teams
Work comfortably in an Agile environment across the full SDLC
Take full ownership of assigned projects and tasks
Document processes and run knowledge-sharing sessions
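The ingestion, transformation, and aggregation responsibilities above can be sketched in plain Scala. This is a minimal, self-contained illustration using standard collections to stand in for a real stream; the `Event` record, its fields, and the CSV input format are illustrative assumptions, and in production the data would arrive from a source such as Kafka rather than a `Seq`.

```scala
// Illustrative sketch only: plain collections stand in for a real stream,
// and the Event shape and CSV format are assumed for the example.
case class Event(sensor: String, value: Double)

object StreamSketch {
  // Ingest: parse raw records, dropping malformed lines.
  def ingest(raw: Seq[String]): Seq[Event] =
    raw.flatMap { line =>
      line.split(",") match {
        case Array(sensor, v) => v.toDoubleOption.map(Event(sensor, _))
        case _                => None
      }
    }

  // Transform: filter out readings outside the valid range.
  def transform(events: Seq[Event]): Seq[Event] =
    events.filter(_.value >= 0.0)

  // Aggregate: average value per sensor.
  def aggregate(events: Seq[Event]): Map[String, Double] =
    events.groupBy(_.sensor).map { case (s, es) =>
      s -> es.map(_.value).sum / es.size
    }
}
```

In a framework such as Kafka Streams or Spark Structured Streaming, the same three stages map onto source, stateless operator, and windowed/keyed aggregation respectively.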
Requirements
Strong knowledge of Scala
Familiarity with distributed stream-processing frameworks such as Spark, Kafka Streams, or Kafka Connect
Knowledge of monolithic versus microservice architecture concepts
Familiarity with the Apache ecosystem, including Hadoop modules (HDFS, YARN, HBase, Hive, Spark) and Apache NiFi
Familiarity with containerization and orchestration technologies such as Docker and Kubernetes
Familiarity with time-series or analytics databases such as Elasticsearch
Experience with Amazon Web Services (S3, EC2, EMR, Redshift)
Familiarity with monitoring and visualization tools such as Prometheus and Grafana
Familiarity with version-control tools such as Git
Solid understanding of data warehouse and ETL concepts (Snowflake preferred)
Strong analytical and problem-solving skills
Good learning mindset
Ability to prioritize and handle multiple tasks and projects