Reflection is a company on a mission to build open superintelligence and make it accessible to all. The company is seeking a core member of its Applied AI team to drive model fine-tuning and evaluations for enterprise customers, working hands-on with customer data and deploying adapted models to production.
Responsibilities:
- Fine-tune Reflection's open-weight models for customer-specific use cases: prepare datasets, configure training runs (SFT, preference optimization, reinforcement fine-tuning), and iterate based on evals
- Build and maintain evaluation infrastructure: design eval suites, curate test sets, establish baselines, and measure whether fine-tuned models actually improve on the tasks customers care about
- Prepare training data from raw customer inputs: inspect data quality, clean and format datasets, identify adversarial or noisy samples, and build reproducible data pipelines
- Debug and diagnose training and inference issues: interpret loss curves, catch data quality problems, and flag anomalous training dynamics (e.g., loss spikes, divergence, or plateaus) that indicate something is wrong
- Support end-to-end deployments of fine-tuned models across hybrid environments (public cloud, VPC, and on-premises), helping ensure inference performance and reliability in production
- Contribute to evolving playbooks, evaluation benchmarks, and best practices as part of a growing fine-tuning and evals practice