Responsibilities
Build and own data infrastructure — pipelines, warehousing, ETL/ELT, data quality; make sure the foundation is solid
Analyze product usage and user behavior — identify patterns, segment users, surface what matters from the noise; think like a product person, not just a data person
Build models that ship — LLM-based systems, traditional ML (classification, clustering, NLP), evaluation frameworks; whatever the problem needs
Define and track the metrics that matter — activation, retention, engagement, PQLs; connect data to product and GTM decisions
Run experiments and measure impact — A/B tests, causal analysis, cohort studies; rigorous but fast
Turn data into product conviction — you don’t just hand off charts; you tell the team what to do and why
Requirements
5+ years across data science, data engineering, and analytics — you do all three, not just one
Strong SQL and Python — complex queries, data modeling, scripting, analysis; this is your daily toolkit
Experience with Databricks or an equivalent modern data platform (Snowflake, BigQuery)
LLM experience — fine-tuning, prompt engineering, embeddings, RAG, evaluation; not just API calls
Traditional ML depth — classification, regression, clustering, NLP, feature engineering; you pick the right tool for the problem
Product mindset — you filter signal from noise, understand user behavior, and connect analysis to product decisions
Pipeline engineering — you build reliable, scalable data pipelines, not notebooks that break in production
Clear communicator — you present findings to non-technical stakeholders with clarity and conviction