Praescient Analytics is seeking an experienced Data Engineer to design, build, and maintain scalable data pipelines supporting advanced fraud analytics and investigative solutions for a federal oversight organization. This role involves ensuring diverse data sources are efficiently ingested, transformed, governed, and made available for analytics, machine learning, and investigative support.

Responsibilities:

Design, develop, maintain, and optimize scalable ETL pipelines supporting advanced analytics and investigative workloads
Ingest, transform, and integrate structured and unstructured data from diverse sources including flat files, JSON, XML, Excel, APIs, graph databases, relational databases, and other evolving data formats
Develop and optimize data pipelines supporting both streaming and batch ingestion frameworks
Manage, organize, and optimize data within modern cloud-based analytics platforms, including Databricks Unity Catalog, SQL Server managed instances, and Lakehouse architectures
Develop efficient SQL and Python-based data transformation processes that support downstream analytics, machine learning, graph analytics, and business intelligence solutions
Implement data quality validation, lineage tracking, metadata management, and monitoring processes to ensure data reliability and integrity throughout the analytics lifecycle
Collaborate with Data Scientists, Graph Data Scientists, Investigative Analysts, Forensic Accountants, and Project Managers to understand data requirements and support analytic initiatives
Troubleshoot pipeline failures, optimize performance, and continuously improve scalability, reliability, and maintainability of enterprise data solutions
Support enterprise data governance by implementing data management standards, documenting data assets, and ensuring compliance with enterprise data management (EDM) policies
Contribute to data architecture improvements, ingestion strategies, and modernization efforts that enhance overall analytic capabilities

Requirements:

Must have experience with fraud analysis
Three (3) or more years of professional experience in data engineering or a related technical field
Demonstrated experience designing, building, maintaining, and optimizing scalable ETL pipelines across diverse data sources
Strong SQL and Python programming skills, or equivalent technologies, for data ingestion, transformation, and processing
Experience ingesting and transforming data from flat files, JSON, XML, Excel, APIs, graph databases, relational databases, and other structured and unstructured data sources
Experience loading, managing, and optimizing data within Databricks Unity Catalog, SQL Server managed instances, or comparable cloud-based data platforms
Experience working with streaming and batch ingestion frameworks and modern Lakehouse architectures
Demonstrated ability to implement data quality controls, lineage tracking, reliability monitoring, and performance optimization processes
Familiarity with enterprise data governance, enterprise data management (EDM), metadata management, and data quality best practices
Strong analytical, problem-solving, written, and verbal communication skills

Experience in the following areas:

Supporting fraud detection, anomaly detection, financial oversight, program integrity, or investigative analytics environments
Building cloud-native data engineering solutions utilizing Azure Databricks, Azure Data Lake Storage (ADLS), Microsoft SQL Server, Microsoft Fabric, Azure Synapse Analytics, Power BI, Neo4j, Git repositories, or comparable cloud data platforms
Developing scalable data pipelines supporting machine learning, artificial intelligence (AI), graph analytics, natural language processing (NLP), or advanced analytics solutions
Working with public, non-public, commercial, financial, law enforcement, or cross-agency datasets supporting fraud detection and investigative missions
Designing and implementing Lakehouse architectures, Delta Lake, data partitioning strategies, and performance optimization techniques for large-scale analytics environments
Developing automated data quality validation, metadata management, lineage tracking, schema evolution, and monitoring capabilities
Supporting enterprise data governance initiatives, data catalogs, master data management, and compliance with organizational data standards
Utilizing data processing, orchestration, and workflow tools such as Apache Spark, Databricks Workflows, Azure Data Factory, Airflow, or comparable pipeline automation technologies
Collaborating within Agile software development teams using Git-based version control, sprint planning, backlog management, and continuous integration/continuous deployment (CI/CD) practices
Supporting Offices of Inspector General (OIGs), federal oversight organizations, law enforcement agencies, or other government data modernization initiatives

Data Engineer (Fraud Analytics & Investigative Support)
