Build production augmentation and profile models: Design, train, evaluate, and deploy models against real survey data, owning the full path from experiment to production
Innovate on solutions: We aim to build the best simulation of survey responses in the industry. You'll design the experiments, baselines, and models to get us there
Use the latest technology: Fine-tune the best open-weights LLMs, build new models from scratch with custom architectures, or use third-party models through an API; whatever gets us the best result
Partner with Data Science: Work with the data science team to ensure the integrity of our statistical testing and experiment framework
Leverage AI tooling: Use Claude Code and agentic programming tools where appropriate

Strong knowledge of maths, probability, statistics and algorithms
Programming skills, including proficiency in Python
Knowledge of Java is a plus
Knowledge of traditional machine learning tools and techniques (support vector machines, gradient boosting)
An understanding of LLMs and their architecture, ideally with experience in fine-tuning, e.g. LoRA, distillation
Familiarity with SQL; knowledge of Databricks is a plus
Outstanding problem-solving and analytical skills
You like to push the limits of your knowledge every day
You own your conclusions and are comfortable presenting them and shaping outcomes

AI/ML Engineer, Synthetic Data

Key skills