CACI seeks a data engineer who will be responsible for designing, building, validating, and maintaining scalable data pipelines and analytics solutions, with a strong emphasis on Databricks and data quality. This role partners closely with product owners, analysts, data scientists, and software engineers to translate business and technical requirements into reliable, testable, and high-quality data solutions that support programmatic goals.

Responsibilities:

Design, develop, optimize and maintain scalable data pipelines and transformations using Databricks, Apache Spark and SQL
Implement data ingestion, transformation, and orchestration workflows to support batch and, where applicable, real-time processing
Perform data quality assurance activities, including identifying and resolving inconsistencies in data flow, values outside legitimate ranges, and illogical data responses; develop data quality reports and investigate and resolve data anomalies or errors using SAS, Excel, and other software as warranted
Use technical expertise, initiative, creativity, critical thinking, and strong communication and interpersonal skills daily to solve data quality problems in support of technical development efforts
Implement data quality controls to ensure accuracy, completeness, and reliability of datasets
Document data pipelines, transforms, business rules and data dependencies using appropriate technical documentation methods (e.g., data flow diagrams, data dictionaries, etc.)
Serve as a liaison to, and coordinate with, a multidisciplinary team
Collaborate with the program team to identify opportunities for process improvement, make strategic adjustments, and pursue opportunities that maximize programmatic impact
Communicate data issues, risks, and remediation approaches clearly to technical and non-technical team members

Requirements:

Must be able to obtain a Public Trust clearance
Bachelor's degree in Computer Science, Data Engineering, Information Systems, or a related technical field (or equivalent experience)
Demonstrated experience as a Data Engineer in a production environment
Strong hands-on experience with Databricks, including Spark-based data processing
Proficiency in SQL and at least one programming language such as Python
Excellent communication skills, including listening and writing, and experience interacting comfortably with scientists, epidemiologists, informaticians, and developers
Experience supporting analytics, reporting or machine learning workloads
Experience supporting public health, healthcare, or government data systems
Knowledge of data governance, data quality frameworks, or metadata management
Experience working with large-scale analytics or reporting environments
Familiarity with Power BI or other business intelligence tools
Prior experience supporting multiple teams or programs simultaneously (the 'steady hand' type)
Exposure to Agile or iterative delivery environments
Candidates located in the Atlanta, Georgia area are preferred
