We are seeking an experienced IBM DataStage Engineer to support data integration and ETL development for a leading U.S. banking client. The ideal candidate will have strong expertise in recent versions of IBM DataStage (11.x or Cloud Pak for Data), along with Python and PySpark, for building scalable data pipelines on modern data platforms. The role involves designing, developing, and optimizing ETL workflows that support enterprise data warehousing, analytics, and regulatory reporting initiatives.
Key Responsibilities
- Design, develop, and maintain ETL jobs using IBM DataStage (version 11.x or later).
- Build scalable data pipelines and data transformation workflows across enterprise data platforms.
- Develop and maintain data processing scripts using Python and PySpark.
- Integrate DataStage workflows with big data environments such as Hadoop/Spark ecosystems.
- Work with large datasets from multiple banking systems including transactional and regulatory data.
- Optimize ETL performance and ensure data quality, consistency, and reliability.
- Collaborate with data architects, analysts, and application teams to implement end-to-end data integration solutions.
- Troubleshoot ETL failures, performance bottlenecks, and data integrity issues.
- Ensure compliance with banking data governance, security, and regulatory standards.
- Document ETL processes, data flows, and technical designs.
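To give a concrete flavor of the Python work behind responsibilities like "ensure data quality, consistency, and reliability," here is a minimal, standard-library-only sketch of record validation before a load step. All names (`validate_record`, `REQUIRED_FIELDS`, the sample records) are hypothetical illustrations, not part of the client's actual pipelines, which would typically run at scale in PySpark or DataStage.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical example: validate extracted transaction records before loading.
REQUIRED_FIELDS = ("txn_id", "account_id", "amount")

def validate_record(record: dict) -> list:
    """Return a list of data-quality errors for one record (empty = clean)."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            errors.append(f"missing {field}")
    # Amounts must parse as exact decimals (banking data: avoid float rounding).
    try:
        Decimal(str(record.get("amount", "")))
    except InvalidOperation:
        errors.append("amount is not numeric")
    return errors

def split_clean_and_rejected(records):
    """Partition records into loadable rows and rejects paired with reasons."""
    clean, rejected = [], []
    for rec in records:
        errs = validate_record(rec)
        if errs:
            rejected.append((rec, errs))
        else:
            clean.append(rec)
    return clean, rejected

records = [
    {"txn_id": "T1", "account_id": "A9", "amount": "125.50"},
    {"txn_id": "", "account_id": "A9", "amount": "oops"},
]
clean, rejected = split_clean_and_rejected(records)
print(len(clean), len(rejected))  # → 1 1
```

In a real pipeline the rejected rows would be routed to a reject link or quarantine table for investigation rather than silently dropped, which is the usual pattern in regulated banking environments.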
Required Qualifications
- Extensive experience in IBM DataStage development.
- Hands-on experience with recent IBM DataStage versions (11.x; Cloud Pak for Data integration preferred).
- Strong experience with Python and PySpark for data engineering tasks.
- Experience working with large-scale data processing and distributed systems.
- Solid understanding of ETL design, data warehousing concepts, and data modeling.
- Experience with SQL and relational databases (Oracle, DB2, SQL Server, or similar).
- Experience integrating ETL pipelines with Spark, Hadoop, or cloud-based data platforms.
- Knowledge of banking or financial services data environments is highly preferred.
- Strong troubleshooting, optimization, and analytical skills.
Preferred Qualifications
- Experience with cloud platforms such as AWS, Azure, or Google Cloud.
- Familiarity with data governance, data lineage, and regulatory reporting requirements in banking.
- Experience with CI/CD pipelines and DevOps practices for data engineering.
- Knowledge of data orchestration tools such as Airflow or Control-M.
- Exposure to data lake or lakehouse architectures.