Senior Data Engineer – Real-Time & Distributed Systems, GCP at Innodata | JobVerse
Senior Data Engineer – Real-Time & Distributed Systems, GCP
Innodata
Remote
United States
Full Time
4 days ago
No H1B
Key skills
Airflow
Apache
Cloud
Distributed Systems
NoSQL
Python
Spark
AI
Analytics
Apache Spark
Google Cloud
Spanner
Pub/Sub
Composer
Performance Optimization
About this role
Role Overview
Design, build, and optimize scalable data pipelines for batch and real-time processing
Develop and maintain event-driven architectures for high-throughput systems
Ensure data reliability, performance, and low-latency processing across distributed environments
Collaborate with data scientists and application teams to enable analytics and AI use cases
Implement best practices in performance tuning, monitoring, and cost optimization
Requirements
Advanced proficiency in Python for backend and large-scale data processing
Strong experience building and managing big data pipelines in production environments
Hands-on expertise with workflow orchestration tools such as Airflow or Google Cloud Composer
Proven experience in batch and streaming data processing using Apache Spark and Apache Beam (Dataflow)
Experience designing and operating event-driven systems using Pub/Sub
Strong understanding of distributed systems architecture and scalability patterns
Experience managing globally distributed, low-latency datasets
Hands-on experience with NoSQL databases and/or Google Cloud Spanner
Strong knowledge of system reliability, fault tolerance, and performance optimization
Tech Stack
Airflow
Apache
Cloud
Distributed Systems
NoSQL
Python
Spark