Raas Infotek is a company seeking an AWS Data Engineer to lead technical teams in delivering data solutions. The role involves designing architectures, managing data ingestion, and developing scalable data pipelines while collaborating with stakeholders and optimizing data workflows.
Responsibilities:
- Lead the team technically to complete milestones on time
- Understand the complete requirements, create the architecture, and keep all stakeholders updated
- Create POCs
- Manage delivery/release to customer
- Develop services that ingest data from, and synchronize with, systems exposing the required data-access mechanisms, ensuring near-real-time updates
- Ingest data from multiple sources using Python and other ETL tools
- Design and implement an event-driven architecture using AWS EventBridge, Kafka, or SNS/SQS for real-time data streaming
- Design, implement, and maintain scalable data pipelines that integrate both on-prem and AWS cloud environments
- Develop efficient Python scripts and applications using libraries like pandas, NumPy, etc., to handle and process large datasets
- Work with various NoSQL databases (e.g., MongoDB, Cassandra, DynamoDB) to support high-performance data storage and retrieval
- Develop and deploy applications in a cloud-native architecture, leveraging modern cloud technologies for scalability and resilience
- Continuously monitor data workflows and systems, troubleshoot issues, and optimize performance for reliability and scalability
- Transition the existing pipeline to Microsoft SQL Server
- Collaborate with the business application owner on the existing data architecture, including data ingestion, data pipelines, business logic, data consumption patterns, and analytics requirements
- Design and document the target data, pipeline, processing, and analytics architecture
- Identify opportunities for optimization and consolidation
- Collaborate with the data team on decomposition of business logic and data transformation patterns
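To illustrate the event-driven ingestion responsibility above: in a typical SNS-to-SQS fan-out, each SQS message body wraps an SNS notification envelope, and a consumer unwraps it before processing. The sketch below shows only the message-handling logic in plain Python; the envelope field names follow the standard SNS notification format, while the normalization step (lower-casing keys, tagging the source topic) is a hypothetical example, not part of this posting.

```python
import json

def handle_sqs_message(message_body: str) -> dict:
    """Unwrap one SQS message carrying an SNS notification and return the
    embedded data record, normalized for downstream ingestion."""
    envelope = json.loads(message_body)        # outer SNS notification envelope
    record = json.loads(envelope["Message"])   # inner payload published by the producer
    # Hypothetical normalization: lower-case keys and tag the originating topic
    normalized = {key.lower(): value for key, value in record.items()}
    normalized["source_topic"] = envelope.get("TopicArn", "unknown")
    return normalized

# Illustrative SNS->SQS envelope (shape only; values are made up)
sample = json.dumps({
    "Type": "Notification",
    "TopicArn": "arn:aws:sns:us-east-1:123456789012:orders",
    "Message": json.dumps({"OrderId": 42, "Status": "shipped"}),
})
```

In production this handler would sit inside a polling loop (e.g. `boto3` `receive_message` against the queue) or a Lambda trigger; the parsing logic is the same either way.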
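The large-dataset processing bullet above usually comes down to chunked ingestion: reading a source in fixed-size pieces so arbitrarily large files never have to fit in memory. A minimal sketch with pandas, assuming a CSV source with an `amount` column (the column name and aggregation are illustrative, not from the posting):

```python
import io
import pandas as pd

def ingest_in_chunks(csv_source, chunksize: int = 2) -> dict:
    """Stream a CSV source in fixed-size chunks, aggregating as we go
    instead of loading the whole file into memory at once."""
    total_amount = 0
    row_count = 0
    # read_csv with chunksize= returns an iterator of DataFrames
    for chunk in pd.read_csv(csv_source, chunksize=chunksize):
        total_amount += chunk["amount"].sum()
        row_count += len(chunk)
    return {"rows": row_count, "amount_total": int(total_amount)}

# In-memory stand-in for a large file on S3 or disk
csv_data = io.StringIO("id,amount\n1,10\n2,20\n3,5\n")
```

The same pattern scales to files pulled from S3 or Glue job inputs; only the source handle changes.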
Requirements:
- 13+ years of experience
- AWS
- Glue
- SNS/SQS
- Python
- PySpark
- Data Lake
- CloudWatch
- CloudTrail
- Database design
- SQL