Demonstrating wide and deep knowledge in data engineering, data architecture, and data science.
Guiding, leading, and working with the team to drive toward the right solution.
Engaging frequently (80%) with the development team: facilitating discussions, providing clarification, accepting and refining stories, testing and validating, and contributing to design activities and decisions; familiarity with Waterfall and the Agile Scrum framework.
Owning and managing the backlog; continuously ordering and prioritizing it to ensure that 1-2 sprints/iterations of backlog are always ready.
Collaborating with UX on design decisions, demonstrating a deep understanding of the technology stack and its impact on the final product.
Conducting customer and stakeholder interviews and elaborating on personas.
Demonstrating expert-level skill in problem decomposition and ability to navigate through ambiguity.
Partnering with the Service Owner to ensure a healthy development process and clear tracking metrics that form a standard and trustworthy way of providing customer support.
Designing and implementing scalable and robust data pipelines to collect, process, and store data from various sources.
Developing and maintaining data warehouse and ETL (Extract, Transform, Load) processes for data integration and transformation.
Optimizing and tuning the performance of data systems to ensure efficient data processing and analysis.
Collaborating with product managers and analysts to understand data requirements and implement solutions for data modeling and analysis.
Identifying and resolving data quality issues, ensuring data accuracy, consistency, and completeness.
Implementing and maintaining data governance and security measures to protect sensitive data.
Monitoring and troubleshooting data infrastructure, performing root cause analysis, and implementing necessary fixes.
Requirements
Have a Bachelor's or higher degree in Computer Science, Information Systems, or a related field.
Have 6-10 years of proven experience as a Data Engineer or in a similar role, working with large-scale data processing and storage systems.
Have proficiency in SQL and database management systems (e.g., MySQL, PostgreSQL, or Oracle).
Have extensive knowledge of SAP systems, T-codes, data pipelines in SAP, and Databricks-related technologies.
Have experience building complex jobs for SCD (Slowly Changing Dimension) type mappings using ETL tools such as PySpark, Talend, or Informatica.
Have experience with data visualization and reporting tools (e.g., Tableau, Power BI).
Have strong problem-solving and analytical skills, with the ability to handle complex data challenges.
Have excellent communication and collaboration skills to work effectively in a team environment.
Have experience in data modeling, data warehousing, and ETL principles.
Have familiarity with cloud platforms like AWS, Azure, or GCP, and their data services (e.g., S3, Redshift, BigQuery).
Have advanced knowledge of distributed computing and parallel processing.
Experience with real-time data processing and streaming technologies (e.g., Apache Kafka, Apache Flink).
Knowledge of containerization and orchestration technologies (e.g., Docker, Kubernetes).
Certification in relevant technologies or data engineering disciplines.
Working knowledge of Databricks, Dremio, and SAP is highly preferred.
Tech Stack
Amazon Redshift
Apache
AWS
Azure
BigQuery
Cloud
Docker
ETL
Google Cloud Platform
Informatica
Kafka
Kubernetes
MySQL
Oracle
Postgres
PySpark
SQL
Tableau
Benefits
Contemporary work-life balance policies and wellbeing activities.