ICF is a global advisory and technology services provider seeking a Senior Data Engineer to join its team. The role involves creating dashboards, designing scalable data ingestion pipelines, and ensuring data quality and governance in a consulting environment.
Responsibilities:
- Create dashboards in AWS QuickSight that combine visuals (charts, graphs, etc.) with tables, enabling end users to slice and dice data and gain insights into various business processes
- Design and maintain scalable Spark-based data ingestion pipelines with adaptive change management to accommodate evolving business needs and technical requirements
- Lead centralized orchestration for both batch and event-driven workflows, ensuring seamless and efficient data movement throughout the platform
- Develop reusable templates and self-service solutions to enable efficient updates and enhancements to data models, empowering teams to manage changes independently
- Optimize distributed compute resources to enhance performance, reliability, and cost-effectiveness of data processing environments
- Define and enforce data contracts, manage schema versioning, and automate metadata processes to uphold reliable data standards and strong governance
- Collaborate in a federated model to operationalize essential compliance requirements, including handling of personally identifiable information (PII), data retention, and consistent naming conventions across datasets
- Enforce robust data quality checks, including schema validation, null handling, uniqueness, volume, freshness, and distribution metrics, as well as referential integrity across all datasets
- Embed orchestration of data quality checks at various checkpoints within the pipeline to ensure ongoing compliance and reliability
- Log, audit, and measure all quality results to provide transparency, accountability, and continuous improvement in data quality management
- Work with architects as a technical leader, contributing to the establishment of engineering standards and best practices and guiding critical design decisions
- Partner with business and domain owners to understand domain data structure and translate requirements into reliable and scalable data products
- Lead incident triage, conduct root cause analysis, and drive continuous improvements in platform reliability and data quality
- Define and track key performance indicators (KPIs) for data quality, freshness, stability, adoption, and cost
- Demo work to clients and end users in both small and large virtual settings to gather feedback on enhancing dashboards to meet business requirements
- Work within the Scaled Agile Framework (SAFe), collaborating with other team members to ensure solutions meet client needs with the highest quality
Requirements:
- Bachelor's Degree
- 1+ years of experience working with tools like JIRA, GitHub, and Confluence
- 2+ years of experience working with cloud platforms in AWS
- 2+ years of experience with relational database and data warehousing concepts
- 1+ years of experience with Python, Scala, and Spark technologies
- 1+ years of experience with data orchestration tools such as NiFi, Airflow, and Step Functions
- 1+ years of experience with serverless or cloud-native analytics platforms
- Candidate must be able to obtain and maintain a Federal Public Trust clearance
- Candidate must reside in the U.S., be authorized to work in the U.S., and perform all work within the U.S.
- Candidate must have lived in the U.S. for three (3) full years out of the last five (5) years
- Familiarity with data profiling, data catalogs, lineage tools, or observability platforms
- Prior experience or knowledge in contributing to or leading federated data governance
- 5 years of excellent problem-solving skills and end-to-end quantitative thinking
- Ability to self-organize, prioritize and conduct work on multiple projects under tight deadlines in a fast-paced environment
- Prior experience in consulting or healthcare is an advantage but not essential