Krasan Consulting Services is seeking a Senior Data Engineer / Data Engineering Lead to ensure the availability of clean, accurate, secure, well-governed, and timely data for analytics and decision-making. This role involves designing, building, and optimizing scalable data platforms and pipelines while enforcing strong data governance and compliance.
Responsibilities:
- Design, develop, evaluate, and maintain scalable, resilient, secure, and highly available data pipelines and platforms to support analytics, reporting, business intelligence, and advanced analytics use cases
- Build and maintain structured and unstructured data ingestion pipelines using ETL and ELT frameworks to move Early Childhood program data from multiple source systems into analytic and storage platforms
- Apply advanced SQL, Python, and Apache Spark to develop high-performance, distributed data transformations and processing workflows
- Develop scalable ingestion and transformation processes that support growing data volume, velocity, and complexity while optimizing performance, reliability, data quality, and cost efficiency
- Implement secure access design using least-privilege principles, role-based access control, and segregation of duties across data platforms
- Optimize data workflows using partitioning, indexing, compression strategies, query tuning, distributed processing patterns, and workload optimization
- Ensure data accuracy, consistency, integrity, availability, and performance across all pipelines and analytical environments
- Design and implement automated data quality checks, validation rules, anomaly detection, and monitoring to proactively identify and resolve data issues
- Lead data classification and compliance handling for sensitive data, including PII and PHI, in alignment with regulatory, contractual, and agency standards
- Ensure compliance with state and federal regulations, including FERPA, HIPAA, GDPR, COPPA, and Illinois data governance policies
- Use enterprise governance tools such as Microsoft Purview to manage metadata, data classification, lineage, and data discovery
- Enforce encryption standards, audit logging, lineage traceability, and breach notification requirements in coordination with DoIT security and governance teams
- Implement and maintain enterprise-grade metadata management, data catalogs, and end-to-end data lineage to improve data transparency, trust, and usability
- Develop and maintain data dictionaries, standardized metadata definitions, and business glossaries aligned with agency terminology
- Produce clear architecture diagrams, data flow diagrams, pipeline specifications, and technical documentation to support governance, onboarding, audits, and operational continuity
- Design and implement backup, disaster recovery, and business continuity strategies for data pipelines and platforms
- Conduct restore testing and recovery validation to ensure operational readiness and compliance with recovery objectives
- Develop and optimize data processing solutions using SQL, Python, and Spark on distributed platforms such as Databricks
- Integrate cloud data ecosystems including AWS, Azure, IBM CloudPad, and Google Cloud
- Design and maintain data warehouses such as BigQuery and Azure Synapse, data lakes such as Amazon S3 and Google Cloud Storage, and modern lakehouse architectures
- Use orchestration, monitoring, observability, and testing tools, including Airflow, dbt tests, Splunk, and similar platforms, to ensure reliability, performance, and cost control
- Coordinate delivery of data-driven dashboards, reports, and visualizations using tools such as Tableau and Power BI
- Ensure accurate and timely submission of mandated state and federal reports
- Translate complex technical and analytical concepts into clear insights for both technical and non-technical stakeholders
- Drive cost and performance optimization through usage analysis, capacity planning, query optimization, and platform monitoring
- Promote self-service analytics and interactive data exploration to improve data literacy and reduce reliance on static reporting
- Establish and enforce data engineering best practices, including secure data architecture, access control, data quality frameworks, metadata management, lineage standards, CI/CD automation, orchestration, monitoring, testing, and backup strategies
- Develop and maintain operational runbooks and desktop procedures documenting pipelines, architecture, security controls, and workflows
- Lead cross-agency collaboration with IDEC leadership, DoIT architects, data scientists, analysts, and software engineers
- Support advanced analytics and machine learning initiatives by operationalizing data science solutions, including batch and real-time data processing
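To give candidates a concrete sense of the automated data quality checks described above (validation rules plus basic anomaly detection), here is a minimal Python sketch; the field names, records, and thresholds are purely hypothetical and are not drawn from any actual IDEC pipeline:

```python
# Hypothetical enrollment records; a real pipeline would read these
# from the program's source systems rather than a literal list.
records = [
    {"child_id": "C001", "age_months": 34, "program": "preschool"},
    {"child_id": None, "age_months": 34, "program": "preschool"},
    {"child_id": "C003", "age_months": 250, "program": "head_start"},
]

def check_quality(rows):
    """Apply simple validation rules and collect failing row indexes per rule."""
    failures = {"missing_id": [], "age_out_of_range": []}
    for i, row in enumerate(rows):
        # Rule 1: every record must carry a non-empty identifier.
        if not row.get("child_id"):
            failures["missing_id"].append(i)
        # Rule 2: early-childhood ages should plausibly fall in 0-72 months.
        if not (0 <= row.get("age_months", -1) <= 72):
            failures["age_out_of_range"].append(i)
    return failures

print(check_quality(records))
# → {'missing_id': [1], 'age_out_of_range': [2]}
```

In production these rules would typically live in a testing framework such as dbt tests or a Spark job, with failures routed to monitoring rather than printed.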
Requirements:
- Extensive experience designing, building, and securing large-scale data pipelines and enterprise data platforms
- Advanced proficiency in SQL, strong experience with Python, and solid Apache Spark fundamentals
- Hands-on experience with data cataloging and governance tools, including Microsoft Purview
- Strong experience with data quality automation, metadata management, data lineage, and compliance handling for sensitive data such as PII and PHI
- Proven experience documenting architecture diagrams, data dictionaries, and pipeline specifications
- Strong knowledge of cloud-based data ecosystems and modern data architectures
- Excellent communication skills with the ability to explain complex technical concepts to diverse audiences
- Experience leading or coordinating data engineering teams and cross-functional initiatives
- Experience supporting public sector, education, or early childhood data systems
- Familiarity with machine learning pipelines, real-time data processing, and analytics enablement
- Strong background in data governance, CI/CD automation, metadata-driven architectures, and cost optimization strategies