E Source is a research, data/analytics, and technology-focused professional services firm dedicated to the utility industry in North America. The firm is seeking a Data Lake Data Engineer to design and build cloud data pipelines and lakehouse infrastructure for utility clients, ensuring data quality and supporting analytics and operational decisions.
Responsibilities:
- Design, build, and optimize ETL/ELT workflows in Databricks to ingest data from multiple sources (a minimal sketch of such a pipeline follows this list)
- Implement data cleansing, enrichment, and standardization processes across batch and streaming pipelines
- Build solutions for real-time analytics and ensure pipelines are scalable, performant, and fault-tolerant
- Optimize SQL queries, data models, and cloud resource usage across compute, storage, and networking
- Design and implement data architecture across data lakes, data warehouses, and lakehouses, including partitioning, indexing, and schema design
- Integrate data from diverse sources, including databases, APIs, IoT systems, and third-party platforms
- Collaborate with data scientists, analysts, and BI developers to deliver clean, well-structured data, and document data assets and processes to support discoverability
- Train and support core client staff who will maintain the data lake infrastructure and pipelines long-term
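To give candidates a flavor of the day-to-day work, here is a minimal, hypothetical sketch of the kind of batch pipeline described above: ingest raw meter readings, cleanse and standardize them, and write a partitioned Delta table. The bucket path, table name, and columns (meter_id, reading_kwh, read_ts) are illustrative assumptions, not E Source specifics.

```python
# A minimal sketch of the kind of Databricks pipeline this role builds.
# All paths, table names, and columns are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("meter-ingest").getOrCreate()

# Ingest: raw meter readings landed as CSV by an assumed upstream process.
raw = (spark.read
       .option("header", True)
       .csv("s3://example-bucket/landing/meter_readings/"))

# Cleanse and standardize: typed columns, deduplication, basic quality filter.
clean = (raw
         .withColumn("read_ts", F.to_timestamp("read_ts"))
         .withColumn("reading_kwh", F.col("reading_kwh").cast("double"))
         .dropDuplicates(["meter_id", "read_ts"])
         .filter(F.col("reading_kwh") >= 0)
         .withColumn("read_date", F.to_date("read_ts")))

# Load: append to a Delta table partitioned by date for downstream analytics.
(clean.write
 .format("delta")
 .mode("append")
 .partitionBy("read_date")
 .saveAsTable("utility_lake.silver_meter_readings"))
```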
Requirements:
- 3–7+ years of experience in data engineering, cloud data platforms, or a similar role
- Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field, or equivalent professional experience
- Hands-on experience building data pipelines in Databricks on AWS or a comparable cloud lakehouse platform
- Strong SQL skills and proficiency in Python and/or Scala for data transformation work
- Experience designing data architecture across data lakes, warehouses, or lakehouses, including partitioning, indexing, and schema design
- Working knowledge of streaming frameworks such as Spark Structured Streaming or Kafka (see the streaming sketch after this list)
- Comfort working directly with client stakeholders and training client staff to maintain infrastructure long-term
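For a sense of the streaming side of the role, here is a hedged sketch of a Spark Structured Streaming job that reads hypothetical grid events from Kafka and lands them in a Delta table with checkpointing for fault tolerance. The broker address, topic, schema, and paths are all placeholder assumptions.

```python
# Hedged sketch of a Spark Structured Streaming job; every name below
# (broker, topic, schema, table, checkpoint path) is hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import (StructType, StructField, StringType,
                               DoubleType, TimestampType)

spark = SparkSession.builder.appName("grid-events-stream").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("feeder_id", StringType()),
    StructField("event_ts", TimestampType()),
    StructField("load_mw", DoubleType()),
])

# Read from Kafka and parse the JSON payload into typed columns.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "grid-events")
          .load()
          .select(F.from_json(F.col("value").cast("string"),
                              event_schema).alias("e"))
          .select("e.*"))

# Write to Delta with a checkpoint so the stream is restartable
# and exactly-once into the table (i.e., fault tolerant).
(events.writeStream
 .format("delta")
 .option("checkpointLocation", "s3://example-bucket/checkpoints/grid-events/")
 .outputMode("append")
 .toTable("utility_lake.bronze_grid_events"))
```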
Preferred Qualifications (you'll stand out if you):
- Have experience working with or around utilities, energy, consulting, research, or adjacent fields
- Have worked with utility industry data (meter, customer, grid, or outage data) or have familiarity with IEC CIM standards
- Bring DevOps experience, including CI/CD pipelines, infrastructure-as-code (Terraform), and automated deployments
- Have helped build a data platform from the ground up, not just operated an existing one
- Are comfortable navigating ambiguity and making thoughtful architectural tradeoffs
- Can translate business problems into scalable technical solutions
- Communicate clearly with both technical and non-technical audiences
- Care about data quality, reliability, and long-term maintainability
- Enjoy working hands-on across the full data lifecycle, from ingestion to delivery