San Francisco, California, United States of America
Full Time
$150,000 - $250,000 USD
No Visa Sponsorship
Key skills
Python, SQL, AI, Data Engineering
About this role
Role Overview
Design and build data systems that power reliable AI workflows across enterprise environments
Develop pipelines for collecting, cleaning, transforming, labeling, and evaluating domain-specific data used by AI systems
Create data quality frameworks that identify coverage gaps, ambiguity, drift, duplication, leakage, and other failure modes
Build tools and workflows that help teams turn raw customer data into usable context for retrieval, evaluation, reasoning, and execution
Partner with AI Researchers and AI Engineers to understand how data quality affects system behavior and production outcomes
Develop synthetic data, annotation, and feedback-loop strategies to improve system performance in areas where real-world data is sparse or noisy
Analyze customer workflows and datasets to determine what information AI systems need, where that information should come from, and how it should be represented
Communicate clearly with internal teams and customer stakeholders about data assumptions, limitations, risks, and tradeoffs
Requirements
Experience Building Data Systems for AI: You have built data pipelines, evaluation datasets, labeling workflows, retrieval corpora, or similar systems that improve model or agent behavior
Strong Data Engineering Fundamentals: You write clean Python and SQL, understand data modeling and pipeline reliability, and can build systems that are maintainable under production constraints
Research-Oriented Builder: You are comfortable investigating how data quality, structure, and representation affect AI system performance
AI-Native Working Style: You use AI tools daily to accelerate coding, analysis, debugging, exploration, and workflow automation
Comfort with Ambiguous Data: You can reason through messy enterprise datasets, incomplete documentation, conflicting business definitions, and changing requirements
Bias Towards Measurement: You prefer to make data quality and system behavior observable through concrete metrics, evaluations, and experiments
Customer Environment Readiness: You can work directly with customer teams to understand their data, ask precise questions, and explain tradeoffs clearly
Ownership Mentality: You take responsibility for whether the data layer enables the AI system to deliver reliable value in production
Tech Stack
Python
SQL
Benefits
100% covered medical, dental, and vision for employees and dependents
401(k), plus additional perks (e.g., commuter benefits, in‑office lunch)
Access to state‑of‑the‑art models, generous usage of modern AI tools, and real‑world business problems
Ownership of high‑impact projects across top enterprises
A mission‑driven, fast‑moving culture that prizes curiosity, pragmatism, and excellence