Dice is a recruiting company seeking a Senior Data Engineer to build and support data pipelines. The role involves designing, developing, and maintaining ETL/ELT pipelines and ensuring the quality and performance of data processing systems.
Responsibilities:
- Build and support real-time and batch data pipelines using Big Data distributed systems and streaming technologies
- Design, develop, test, and maintain scalable ETL and ELT pipelines for processing large volumes of structured and unstructured data
- Develop data ingestion, transformation, and orchestration workflows using ETL tools such as IBM DataStage and modern scheduling/orchestration platforms
- Write complex SQL queries, stored procedures, and optimization logic for high-performance data processing
- Work extensively in UNIX/Linux environments for scripting, job automation, file handling, monitoring, and troubleshooting
- Develop and maintain applications using programming languages such as Python and Java for automation, data processing, and integration tasks
- Monitor production data pipelines, troubleshoot failures, perform root cause analysis, and provide production support within SLA timelines
- Implement best practices for data quality, validation, reconciliation, logging, monitoring, and operational support
- Work with orchestration and scheduling tools such as Airflow, Control-M, Autosys, or equivalent workflow automation platforms
- Support cloud and Big Data initiatives involving Hadoop, Spark, Kafka, distributed processing systems, and real-time streaming frameworks
- Participate in code reviews, technical design discussions, and Agile/Scrum ceremonies
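To illustrate the kind of work the responsibilities above describe, here is a minimal batch ETL sketch in Python (one of the languages named in the posting), using only the standard library. The sample data, function names, and reject-routing approach are illustrative assumptions, not part of this role's actual codebase; it simply shows extraction, validation with logging, and a stand-in "load" step.

```python
import csv
import io
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl_sketch")

# Hypothetical raw input; a real pipeline would read from files, Kafka, etc.
RAW = """id,amount,currency
1,100.50,USD
2,,USD
3,42.00,EUR
"""

def extract(raw: str) -> list[dict]:
    """Read raw CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Validate and convert rows; route bad records to a reject list."""
    good, rejected = [], []
    for row in rows:
        try:
            row["amount"] = float(row["amount"])
            good.append(row)
        except (TypeError, ValueError):
            log.warning("rejecting row %s: bad amount", row.get("id"))
            rejected.append(row)
    return good, rejected

def load(rows: list[dict]) -> dict[str, float]:
    """Aggregate amounts per currency (a stand-in for a warehouse write)."""
    totals: dict[str, float] = {}
    for row in rows:
        totals[row["currency"]] = totals.get(row["currency"], 0.0) + row["amount"]
    return totals

if __name__ == "__main__":
    good, rejected = transform(extract(RAW))
    print(load(good))     # {'USD': 100.5, 'EUR': 42.0}
    print(len(rejected))  # 1
```

In production this shape is typically wrapped in an orchestrator task (Airflow, Control-M, etc.) so that reject counts and log output feed the monitoring and SLA reporting mentioned above.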
Requirements:
- Must have 14+ years of work experience successfully delivering complex data-related projects end to end
- Experience with Big Data technologies such as Hadoop, Spark, Kafka, Hive, or distributed computing systems
- Strong hands-on experience in ETL development and enterprise data integration projects
- Expertise in IBM DataStage and data warehousing concepts
- Strong SQL development and query optimization experience
- Proficiency in UNIX/Linux shell scripting and system-level operations
- Strong programming experience in Python and/or Java
- Knowledge of real-time data pipeline development and streaming architectures
- Experience with scheduling and orchestration tools such as Airflow, Autosys, Control-M, or equivalent
- Experience supporting production environments including incident management, debugging, monitoring, and performance tuning
- Strong analytical, troubleshooting, and communication skills
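The SQL development and query-optimization experience listed above can be sketched with Python's built-in sqlite3 module. The table, index, and data are hypothetical; the point is the pattern of indexing a grouping column so an aggregate query avoids a full scan and sort.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
-- Index on the GROUP BY column so SQLite can read rows in customer order.
CREATE INDEX idx_orders_customer ON orders (customer);
""")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("acme", 100.0), ("acme", 50.0), ("globex", 75.0)],
)

def total_per_customer(conn: sqlite3.Connection) -> dict[str, float]:
    """Aggregate spend per customer, the kind of query the role calls for."""
    cur = conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return dict(cur.fetchall())

print(total_per_customer(conn))  # {'acme': 150.0, 'globex': 75.0}
```

On a production warehouse the same idea applies at larger scale: inspect the query plan (e.g. `EXPLAIN QUERY PLAN` in SQLite, `EXPLAIN` in most warehouses) and add or adjust indexes and partitioning so the plan matches the access pattern.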