HubSpot is an AI-powered customer platform that helps businesses grow by connecting marketing, sales, and service. They are seeking a Principal Software Engineer to lead the evolution of their Data Hub, focusing on building large-scale data systems and ensuring reliability and usability for data-driven demand generation.
Responsibilities:
- Own core pieces of our data lake and analytics stack (e.g., Iceberg, Spark, batch and streaming pipelines) that power demand gen, segmentation, and scoring at scale
- Design and evolve data systems that balance cost, latency, data freshness, and reliability, making explicit tradeoffs informed by concepts such as the CAP theorem, efficient partitioning, and storage layout (see the sketch after this list)
- Partner closely with PM, product analytics, and GTM leaders to shape commercially meaningful solutions: better lead scoring, funnel visibility, audience building, and campaign attribution for marketers and sales
- Help make Data Hub an AI-agent-forward platform, where curated, evergreen datasets automatically feed AI agents and reporting surfaces rather than requiring manual stitching or ad-hoc pipelines
- Own platform-scale outcomes: Influence technical direction across the Data Hub product line and shape the architecture for unified profiles, segmentation, and datasets that other teams can build on
- Be a high-leverage, hands-on builder: Write code and build systems while leading end-to-end delivery of high-impact, multi-quarter initiatives, setting standards for reliability, observability, testing, and incident response
- Lead through architecture and influence: Define reusable patterns for ingestion, transformation, quality, sync, and observability, and mentor senior engineers and tech leads
- Use AI code agents: Actively use AI-assisted development tools to speed iteration, reduce toil (e.g., scaffolding, tests, refactors), and improve code quality, while defining best practices for a human-in-the-loop approach
- Champion incremental, outcome-focused delivery: Break down big, ambiguous problems into incremental milestones that deliver value early and often, balancing long-term platform bets with clear business impact (ARR, adoption, usage, efficiency)
- Raise the bar on engineering practices: Model strong habits around documentation, design reviews, testing, and observability, and help establish reliability and data quality standards so downstream AI agents and data activation use cases can trust the data they receive
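To make the batch/streaming and freshness-versus-cost responsibilities above concrete, here is a minimal, hypothetical PySpark sketch of a micro-batch pipeline landing CRM events in an Iceberg table. The catalog, topic, table, and column names are illustrative assumptions, not HubSpot's actual stack.

```python
# Minimal sketch, assuming the Iceberg runtime jar and a Spark catalog named
# "lake" are already configured. Topic, table, and column names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("crm-events-ingest")
    .config(
        "spark.sql.extensions",
        "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    )
    .getOrCreate()
)

# Read raw CRM events from Kafka and project the fields we care about.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "crm-events")
    .load()
    .select(
        F.col("key").cast("string").alias("contact_id"),
        F.col("value").cast("string").alias("payload"),
        F.col("timestamp").alias("event_ts"),
    )
)

# Append into an Iceberg table. A longer trigger interval lowers compute cost
# and small-file churn at the expense of data freshness: the kind of tradeoff
# this role is expected to make explicitly.
query = (
    events.writeStream
    .format("iceberg")
    .outputMode("append")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/crm-events")
    .trigger(processingTime="5 minutes")
    .toTable("lake.analytics.crm_events")
)
query.awaitTermination()
```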
Requirements:
- Deep experience building large-scale data systems with Apache Spark and modern table formats like Apache Iceberg, including efficient partitioning, clustering, and file layout for both heavy ingestion and low-latency reads (see the sketch after this list)
- Pragmatic application of distributed systems principles and the CAP theorem to design fault-tolerant, horizontally scalable services that balance availability, consistency, latency, and cost where it matters
- Ability to turn ambiguous business goals into clear data models, contracts, and SLAs across multiple storage and compute layers (e.g., Iceberg, warehouses, logs, CRM stores)
- Ability to influence technical direction across the Data Hub product line and shape the architecture for unified profiles, segmentation, and datasets that other teams can build on
- Comfort staying hands-on: writing code and building systems while leading end-to-end delivery of high-impact, multi-quarter initiatives, and setting standards for reliability, observability, testing, and incident response
- Experience defining reusable patterns for ingestion, transformation, quality, sync, and observability, and mentoring senior engineers and tech leads
- Active use of AI-assisted development tools to speed iteration, reduce toil (e.g., scaffolding, tests, refactors), and improve code quality, along with defining best practices for a human-in-the-loop approach
- Ability to break down big, ambiguous problems into incremental milestones that deliver value early and often, balancing long-term platform bets with clear business impact (ARR, adoption, usage, efficiency)
- Strong habits around documentation, design reviews, testing, and observability, plus experience establishing reliability and data quality standards so downstream AI agents and data activation use cases can trust the data they receive
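As a purely illustrative example of the partitioning, clustering, and file-layout work called out in the first requirement, the sketch below lays out an Iceberg table for heavy ingestion and low-latency reads. All catalog, table, and column names are assumptions for the sake of the example.

```python
# Minimal sketch, assuming Spark is configured with the Iceberg SQL extensions
# and a catalog named "lake". Table and column names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("layout-sketch").getOrCreate()

# Hidden partitioning: day-level pruning for time-range scans, plus bucketing
# on contact_id to spread heavy ingestion across writers.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lake.analytics.contact_events (
        contact_id BIGINT,
        event_type STRING,
        event_ts   TIMESTAMP,
        payload    STRING
    )
    USING iceberg
    PARTITIONED BY (days(event_ts), bucket(16, contact_id))
""")

# Cluster rows within data files so point lookups by contact touch fewer files.
spark.sql("""
    ALTER TABLE lake.analytics.contact_events
    WRITE ORDERED BY contact_id, event_ts
""")

# Periodic compaction keeps read latency low after high-volume streaming writes.
spark.sql("""
    CALL lake.system.rewrite_data_files(
        table => 'analytics.contact_events',
        options => map('target-file-size-bytes', '536870912')
    )
""")
```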