Educology Solutions is seeking a Data Engineer to design, build, and operate Databricks-based Data & AI capabilities. The role involves engineering infrastructure for seamless data flow, building ETL/ELT pipelines, and enabling data scientists with high-quality datasets for machine learning and analytics.
Responsibilities:
- Build and scale Databricks AI/BI solutions end to end, combining governed semantic models, SQL, and performance-optimized query layers
- Develop and operationalize Databricks Genie experiences by curating datasets, metadata, and prompts for natural-language, self-service analytics
- Design and deliver Databricks dashboards and visual products that translate data into clear, actionable insights
- Design, implement, and optimize end-to-end data pipelines on Databricks, following Medallion Architecture principles
- Build robust and scalable ETL/ELT pipelines using Apache Spark and Delta Lake to transform raw (bronze) data into trusted, curated (silver) and analytics-ready (gold) data layers
- Operationalize Databricks Workflows for orchestration, dependency management, and pipeline automation
- Apply schema evolution and data versioning to support agile data development
- Connect and ingest data from enterprise systems such as PeopleSoft, D2L, and Salesforce using APIs, JDBC, or other integration frameworks
- Implement connectors and ingestion frameworks that accommodate structured, semi-structured, and unstructured data
- Design standardized data ingestion processes with automated error handling, retries, and alerting
- Develop data quality checks, validation rules, and anomaly detection mechanisms to ensure data integrity across all layers
- Integrate monitoring and observability tools (e.g., Databricks metrics, Grafana) to track ETL performance, latency, and failures
- Implement Unity Catalog or equivalent tools for centralized metadata management, data lineage, and governance policy enforcement
- Enforce data security best practices, including row-level security, encryption at rest/in transit, and fine-grained access control via Unity Catalog
- Design and implement data masking, tokenization, and anonymization for compliance with privacy regulations (e.g., GDPR, FERPA)
- Work with security teams to audit and certify compliance controls
- Enable data scientists by delivering high-quality, feature-rich datasets for model training and inference
- Support AIOps/MLOps lifecycle workflows using MLflow for experiment tracking, model registry, and deployment within Databricks
- Collaborate with AI/ML teams to create reusable feature stores and training pipelines
- Architect and manage data lakes on Azure Data Lake Storage (ADLS) or Amazon S3, and design ingestion pipelines to feed the bronze layer
- Build data marts and warehousing solutions using platforms like Databricks
- Optimize data storage and access patterns for performance and cost-efficiency
- Maintain technical documentation, architecture diagrams, data dictionaries, and runbooks for all pipelines and components
- Provide training and enablement sessions to internal stakeholders on the Databricks platform, Medallion Architecture, and data governance practices
- Conduct code reviews and promote reusable patterns and frameworks across teams
- Submit a weekly schedule of hours worked and progress reports outlining completed tasks, upcoming plans, and blockers
- Track deliverables against roadmap milestones and communicate risks or dependencies
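As a rough illustration of the bronze-to-silver promotion and data quality checks described above, the sketch below shows the kind of logic involved (plain Python for readability; in practice this would be implemented with Apache Spark and Delta Lake on Databricks, and all field names are hypothetical examples):

```python
# Minimal sketch of a bronze -> silver cleaning step: validate required
# fields, deduplicate on a natural key, normalize types, and collect
# rejected rows for automated alerting. Field names are illustrative only.

REQUIRED_FIELDS = ("student_id", "course_id")

def promote_to_silver(bronze_records):
    """Return (silver_rows, rejected_rows) from raw bronze records."""
    silver, rejects, seen = [], [], set()
    for rec in bronze_records:
        # Data quality check: required fields must be present and non-empty
        if any(not rec.get(f) for f in REQUIRED_FIELDS):
            rejects.append(rec)
            continue
        # Deduplicate on the natural key
        key = (rec["student_id"], rec["course_id"])
        if key in seen:
            continue
        seen.add(key)
        # Normalize types/formatting before promotion to the silver layer
        silver.append({**rec, "student_id": str(rec["student_id"]).strip()})
    return silver, rejects
```

In a production pipeline, the same checks would typically be expressed as Spark/Delta expectations, with rejected rows routed to the automated error handling and alerting called for above.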
Requirements:
- Hands-on experience with Databricks (Delta Lake, Apache Spark) and building AI/BI solutions, including dashboards, semantic models, and Genie-based natural-language analytics
- Deep understanding of ELT pipeline development, orchestration, and monitoring in cloud-native environments
- Experience implementing Medallion Architecture (Bronze/Silver/Gold) and working with data versioning and schema enforcement in enterprise-grade environments
- Strong proficiency in SQL, Python, or Scala for data transformations and workflow logic
- Proven experience integrating enterprise platforms (e.g., PeopleSoft, Salesforce, D2L) into centralized data platforms
- Familiarity with data governance, lineage tracking, and metadata management tools
- Experience with Databricks Unity Catalog for metadata management and access control
- Experience deploying ML models at scale using MLflow or similar MLOps tools
- Familiarity with cloud platforms like Azure or AWS, including storage, security, and networking aspects
- Knowledge of data warehouse design and star/snowflake schema modeling
- Prior experience with UMGC or USM preferred