Own platform architecture and technical direction for how we ingest, transform, and serve data across highly variable input formats: business entity data sourced from thousands of government agencies, registries, and third-party providers, each with its own schema, cadence, and reliability profile
Design and build systems for scale: both the infrastructure we need today and the infrastructure we'll need at 2–5x our current volume
Scope and drive complex projects end-to-end, breaking ambiguous problems into well-defined milestones with clear deliverables and timelines
Design AI-powered tooling, built on LLMs, AI agents, and agent orchestration, to improve how we acquire and maintain data
Partner with product engineering, data science, and business teams to understand data needs and translate them into platform capabilities
Establish and maintain data governance and quality standards across the platform, ensuring the integrity and reliability of the data our customers depend on for compliance and risk decisions
Requirements
7+ years of professional software engineering experience, with meaningful time spent on data infrastructure, data engineering, or backend platform work (targeting Senior- to Staff-level engineers)
Experience designing and operating systems at meaningful scale, ideally within a larger or rapidly scaling engineering organization
Track record of independently owning and delivering complex, multi-milestone projects, from scoping through launch
Strong data modeling instincts and deep familiarity with SQL, pipeline orchestration (Airflow, Dagster, etc.), and data transformation patterns
Experience with distributed data processing frameworks (Spark, Flink, Beam, or similar) and an understanding of when and how to apply parallelization to scale pipelines beyond single-node limits
Proficiency in one or more of: Python, Ruby, JavaScript/TypeScript, Java