About this role
Role Overview
We are looking for a senior Data Engineer with 5+ years of professional experience to design and build data processing solutions in Python. The role spans the full data lifecycle: consuming data from sources such as REST APIs, working with relational and non-relational data stores, building scalable batch and streaming pipelines with technologies like Spark and Kafka, and delivering near real-time data to a variety of client systems through data warehouse and BI solutions on the public cloud. The full list of requirements follows below.
Requirements
5+ years of professional experience in a data engineering role.
Strong Python coding skills applied to data processing solutions, using related technologies such as Pandas and PySpark.
Experience consuming data from varied sources, such as REST APIs (see the ingestion sketch after this list).
Experience with relational and non-relational data stores (e.g., HBase, Cassandra, MongoDB) and object storage (e.g., S3, blobs), covering both consumption and design.
Experience with data streaming (Kafka, Kinesis, Flume) and message queuing (SQS, SNS, RabbitMQ, etc.); see the streaming sketch after this list.
Ability to ensure that the data model scales and enables high performance.
Experience with distributed data stores.
Batch and stream processing (Spark, Flink, Hadoop).
Experience building data pipelines, including data ingestion pipelines and scalable streaming pipelines.
ETL with solutions such as Talend, Informatica, or SQL Server Integration Services (SSIS).
Data warehousing (Snowflake, Redshift, Hive).
Experience implementing data warehouse solutions that provide near real-time data to a variety of client systems, and using SQL databases to construct data storage.
Reporting/BI experience is a plus: design, implementation, and enhancement of BI tools such as Looker, Power BI, or Tableau.
Experience designing and implementing data applications and services on a public cloud (AWS, GCP, or Azure) using PaaS platforms.
Familiarity with data privacy regulations and best practices.
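To give a flavor of the REST API ingestion work mentioned above, here is a minimal Pandas sketch. The endpoint, pagination parameters, and field names (API_URL, created_at, fetch_orders) are hypothetical assumptions, purely for illustration.

```python
import pandas as pd
import requests

# Hypothetical endpoint, for illustration only.
API_URL = "https://api.example.com/v1/orders"

def fetch_orders(page_size: int = 100) -> pd.DataFrame:
    """Pull paginated JSON records from a REST API into a DataFrame."""
    records, page = [], 1
    while True:
        resp = requests.get(
            API_URL,
            params={"page": page, "per_page": page_size},
            timeout=30,
        )
        resp.raise_for_status()
        batch = resp.json()
        if not batch:  # assume an empty page signals the end of the data
            break
        records.extend(batch)
        page += 1
    df = pd.DataFrame.from_records(records)
    # Light cleanup typical of an ingestion step: parse timestamps, drop dupes.
    if "created_at" in df.columns:
        df["created_at"] = pd.to_datetime(df["created_at"], utc=True)
    return df.drop_duplicates()

if __name__ == "__main__":
    orders = fetch_orders()
    print(orders.head())
```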
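Likewise, a minimal PySpark Structured Streaming sketch for the Kafka streaming work: it assumes a local broker at localhost:9092, an "events" topic carrying JSON messages with the schema shown, and local /tmp output paths; a real pipeline would target S3 or a warehouse instead.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

# Note: the Kafka source requires the spark-sql-kafka connector on the
# classpath, e.g. spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:<version>
spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Assumed message schema, for illustration only.
event_schema = StructType([
    StructField("user_id", StringType()),
    StructField("action", StringType()),
    StructField("ts", TimestampType()),
])

# Read the raw Kafka stream; `value` arrives as bytes, so cast then parse JSON.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "events")
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Append micro-batches to Parquet, with a checkpoint for exactly-once recovery.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "/tmp/events")
    .option("checkpointLocation", "/tmp/events-checkpoint")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```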
Tech Stack
Amazon Redshift
AWS
Azure
Cassandra
Cloud
ETL
Google Cloud Platform
Hadoop
HBase
Informatica
Kafka
MongoDB
Pandas
PySpark
Python
RabbitMQ
Spark
SQL
SSIS
Tableau
Benefits
100% Remote Work: Enjoy the freedom to work from the location that helps you thrive. All it takes is a laptop and a reliable internet connection.
Highly Competitive USD Pay: Earn excellent, market-leading compensation in USD that goes beyond typical market offerings.
Paid Time Off: We value your well-being. Our paid time off policies ensure you have the chance to unwind and recharge when needed.
Work with Autonomy: Enjoy the freedom to manage your time as long as the work gets done. Focus on results, not the clock.
Work with Top American Companies: Grow your expertise working on innovative, high-impact projects with industry-leading U.S. companies.