Fuse Fundraising is a dynamic consulting agency that partners with nonprofits to drive meaningful impact through strategic direct response marketing. The Data Engineer will help evolve the data platform, working across the full data lifecycle to optimize existing systems and design new ones, while ensuring data quality and governance.
Responsibilities:
- Architect scalable data solutions that align with our analytics, reporting, and data science objectives, refining existing systems and designing the foundation for new data product lines
- Design and implement architectures on our Databricks lakehouse, governed by Unity Catalog and built within the Microsoft Azure ecosystem
- Establish best practices for data modeling, storage organization, and access control across client datasets
- Evaluate and introduce innovative patterns and tooling that improve the reliability and maintainability of our lakehouse platform
- Build, deploy, and maintain robust ETL/ELT pipelines that ingest, transform, and load data from a variety of client sources and formats
- Develop and manage pipelines in Databricks Workflows using Python and Spark for transformation, with SQL supporting querying, reporting logic, and stored procedures
- Monitor pipeline health, optimize performance, and implement alerting to ensure data freshness, accuracy, and cost efficiency as data volumes grow
- Take ownership of our digital reporting data ingestion pipeline (currently powered by Fivetran), driving improvements in cleanliness, performance, and cost efficiency
- Evaluate ingestion tooling and approaches to ensure our solution scales with client needs and remains cost-effective over time
- Implement data validation and testing practices that ensure downstream consumers can trust the data they work with
- Use Unity Catalog to manage governance, access controls, and lineage across our lakehouse
- Maintain data dictionaries, pipeline runbooks, and data model documentation that keep our work explainable and accessible
- Identify and resolve data quality issues proactively in partnership with analytics teams
- Partner closely with analysts and data scientists to understand their data needs and deliver well-modeled datasets that power dashboards and machine learning workflows
- Build and maintain curated, analysis-ready datasets and semantic layers that enable self-service analytics, so analysts and internal teams can build their own reports without depending on the data team for every request
- Design data models with usability in mind, so business users can confidently navigate and query the data themselves
- Work cross-functionally with internal teams to translate business needs into technical solutions
- Stay current with emerging tools and techniques in the data engineering space, and bring new ideas back to the team
- Identify opportunities to automate manual processes and contribute to the evolution of our team's engineering standards
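To make the data-validation responsibility above concrete, here is a minimal plain-Python sketch of the kind of pre-publish quality checks involved. The column names (`donor_id`, `gift_amount`, `gift_date`) are illustrative assumptions, not an actual client schema, and in practice these checks would run as Spark expectations or dbt tests inside a Databricks workflow rather than as standalone Python.

```python
# Illustrative row-level validation sketch; schema and check names are
# hypothetical, not Fuse Fundraising's actual data model.
from dataclasses import dataclass
from datetime import date


@dataclass
class ValidationResult:
    check: str
    passed: bool
    failing_rows: int


def validate_gifts(rows: list[dict]) -> list[ValidationResult]:
    """Run simple quality checks before publishing a dataset downstream."""
    checks = {
        "donor_id is present": lambda r: r.get("donor_id") not in (None, ""),
        "gift_amount is positive": lambda r: isinstance(
            r.get("gift_amount"), (int, float)
        ) and r["gift_amount"] > 0,
        "gift_date not in future": lambda r: isinstance(
            r.get("gift_date"), date
        ) and r["gift_date"] <= date.today(),
    }
    results = []
    for name, predicate in checks.items():
        # Count failing rows per check so alerting can report what broke.
        failures = sum(1 for row in rows if not predicate(row))
        results.append(
            ValidationResult(check=name, passed=failures == 0, failing_rows=failures)
        )
    return results
```

The same pattern scales up in the lakehouse: each check becomes a dbt test or a Delta Live Tables expectation, and the failure counts feed the pipeline-health alerting described above.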
Requirements:
- 3+ years of experience in data engineering and architecture, with a focus on building production ETL/ELT pipelines and designing scalable systems with thoughtful tradeoffs around cost and complexity
- Strong proficiency in Python and Spark for data transformation, pipeline scripting, and automation, along with solid SQL skills for querying, reporting logic, and stored procedures
- Direct experience with Databricks (notebooks, Jobs/Workflows, Delta Lake) and familiarity with Unity Catalog or similar data governance frameworks
- Experience with ADLS or other cloud data lake storage systems
- Experience building datasets and data models that support BI tools, particularly Power BI
- Strong understanding of data quality principles and experience implementing validation and testing in pipelines
- Curious and growth-oriented, with strong communication skills across both technical and non-technical audiences
- Experience with Fivetran or other managed ingestion tools (Airbyte, Stitch, etc.)
- Experience with dbt on Databricks for the transformation layer
- Experience in a consulting or agency environment working across multiple client datasets
- Exposure to direct response fundraising, nonprofit, or marketing data