Role Overview

Design, build, and deploy machine learning solutions for audience targeting, lookalike generation, and individual propensity scoring
Own the complete ML lifecycle, from exploratory analysis and experimentation through production deployment and operational monitoring
Develop and ship production ML systems spanning self-supervised representation learning, vector similarity search, and supervised classifiers
Leverage distributed computing (Spark/Databricks) and cloud data platforms (AWS, Snowflake) to build and run production ML pipelines at scale
Ensure model quality through rigorous evaluation practices: from embedding validation and retrieval quality to supervised model calibration and production monitoring
Engineer features at scale from demographic, behavioral, and identity data — including handling missing values, encoding strategies, and pipeline-level data quality validation
Contribute ML logic directly into shared production services, working alongside data engineering, software engineering, and product teams

Requirements

8+ years of experience in Data Science or Machine Learning, with a proven track record of delivering high-impact end-to-end ML solutions
Expert-level proficiency in Python and SQL
Strong experience with big data and cloud infrastructure (Spark/Databricks, AWS S3, or equivalents)
Expertise deploying and maintaining production ML pipelines, including batch model training, large-scale scoring runs, async job orchestration, evaluation, and monitoring
Strong experience in audience intelligence or AdTech, with deep knowledge of audience modeling, lookalike/similarity systems, and ML-driven targeting at scale
Hands-on experience with vector similarity and approximate nearest neighbor systems (FAISS or equivalent), including index construction, search quality tradeoffs, and production embedding serving
Experience with software engineering best practices: git, automated tests, CI/CD, and code deployment
Exceptional communication skills with the ability to influence technical and non-technical stakeholders
M.S. or PhD in computer science, applied mathematics, statistics, data science, or a quantitative field with strong ML/modeling foundations
Experience with GenAI tooling and LLM integration — particularly building structured recommendation or explanation layers grounded in ML model outputs
Experience with self-supervised or representation learning approaches, particularly Transformer-based architectures for structured or semi-structured data
Production experience with PyTorch for deep learning and embedding models, and with scikit-learn and XGBoost for supervised classification pipelines

Tech Stack

AWS
Cloud
Python
PyTorch
Scikit-Learn
Spark
SQL

Benefits

Eligible for the Company Bonus Plan (targeting 15% of Base Salary).
Comprehensive benefits with excellent medical, vision, and dental coverage.
Health Savings Account (HSA) and Flexible Spending Account (FSA) options.
Employer-paid life insurance, with voluntary additional coverage available.
Voluntary short- and long-term disability, accident, and critical illness insurance.
Flexible hybrid work policy.
Flexible unlimited paid vacation plus 80 hours of paid sick leave.
10 paid company holidays per year plus the week between Christmas and New Year’s off.
401(k) plan with 100% match up to 3%, plus 50% match up to 5% (subject to IRS limits).
Cell phone reimbursement stipend.
Monthly parking or commuter stipend for VA-based employees.

Principal Data Scientist
