ICF, a global advisory and technology services provider, is seeking a Data Engineer to develop, implement, and maintain architecture solutions for a large enterprise data warehouse. The role involves optimizing data pipeline architectures, ensuring data integrity, and collaborating with cross-functional teams to meet business requirements.
Responsibilities:
- Implement and optimize data pipeline architectures for sourcing, ingestion, transformation, and extraction processes, ensuring data integrity and compliance with organizational standards
- Develop and maintain scalable database schemas, data models, and data warehouse structures; perform data mapping, schema evolution, and integration between source systems, staging areas, and data marts
- Automate data extraction workflows and create comprehensive technical documentation for ETL/ELT procedures; collaborate with cross-functional teams to translate business requirements into technical specifications
- Establish and enforce data governance standards, including data quality metrics, validation rules, and best practices for data warehouse design and architecture
- Develop, test, and deploy ETL/ELT scripts using SQL, Python, Spark, or other relevant languages; optimize code for performance and scalability (a brief sketch of this kind of work follows this list)
- Tune data warehouse systems for query performance and batch processing efficiency; apply indexing, partitioning, and caching strategies
- Perform advanced data analysis, validation, and profiling using SQL and scripting languages; develop data models, dashboards, and reports in collaboration with stakeholders
- Conduct testing and validation of ETL workflows to ensure data loads meet SLAs and quality standards; document testing protocols and remediation steps
- Troubleshoot production issues, perform root cause analysis, and implement corrective actions; validate data accuracy and consistency across systems
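As a rough illustration of the ETL/ELT development and data-quality work described above, here is a minimal PySpark sketch. The paths, table names, and validation rules (`/staging/orders/`, `warehouse.fact_orders`) are hypothetical placeholders for illustration only, not ICF's actual systems or standards.

```python
# Minimal PySpark ETL sketch: ingest a staging extract, validate it,
# and load it into a partitioned warehouse table.
# All paths, table names, and quality rules are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Source: raw staging extract (hypothetical path)
raw = spark.read.parquet("/staging/orders/")

# Transform: standardize types and derive a partition column
orders = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
)

# Validate: enforce simple data-quality rules before loading
invalid = orders.filter(F.col("order_id").isNull() | (F.col("amount") < 0))
if invalid.limit(1).count() > 0:
    raise ValueError("Data-quality check failed: null keys or negative amounts")

# Load: write to a date-partitioned warehouse table (hypothetical name)
(orders.write
       .mode("overwrite")
       .partitionBy("order_date")
       .saveAsTable("warehouse.fact_orders"))
```

In practice, the validation step would apply whatever quality metrics and rules the governance standards define, and failures would be logged and remediated per documented testing protocols rather than simply raising an exception.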
Requirements:
- Minimum of 3 years of experience in data analysis
- Strong analytical and problem-solving skills with attention to detail
- Proficiency in SQL and ability to develop complex queries (e.g., multi-join), tune performance, and troubleshoot
- Experience with Unix/Linux shell scripting for ETL automation
- Familiarity with database tools and platforms (e.g., Teradata, Oracle, non-relational databases)
- Excellent verbal and written communication skills; ability to collaborate across all levels
- Ability to prioritize and multi-task in a fast-paced environment
- Knowledge of Java/J2EE, REST APIs, Web Services, and event-driven microservices
- Experience with Kafka streaming, schema registries, and OAuth authentication
- Familiarity with Spring Framework, GCP services, Git, CI/CD pipelines, containerization, and data ingestion and data modeling
- Experience with Databricks concepts and terminology (e.g., workspace, catalog)
- Proficiency in Python and Spark
- Background in architecting real-time data ingestion solutions using microservices and Kafka (see the ingestion sketch after this list)
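As a rough illustration of the Kafka-based real-time ingestion experience listed above, here is a minimal consumer sketch using confluent-kafka-python. The broker address, topic, and group id are hypothetical, and a production microservice would add schema-registry deserialization, OAuth credentials, and retry/dead-letter handling.

```python
# Minimal Kafka ingestion sketch using confluent-kafka-python.
# Broker, topic, and group id are hypothetical placeholders.
import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "broker:9092",   # hypothetical broker
    "group.id": "ingestion-service",      # hypothetical consumer group
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["orders-events"])     # hypothetical topic

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        # Decode and hand the event to downstream processing
        event = json.loads(msg.value().decode("utf-8"))
        print(f"Ingested event key={msg.key()}: {event}")
finally:
    consumer.close()
```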