Meta is seeking AI research scientists to help build the data foundation for its most advanced large language models. The role involves collaborating with cross-functional teams to develop foundational models, and improving data velocity across workflows by architecting efficient data curation systems and pipelines.
Responsibilities:
- Collaborate with cross-functional teams to develop Meta’s next foundational models
- Advance our understanding of data research, such as how to overcome data walls and how best to create synthetic data
- Fundamentally improve our data velocity across workflows and projects by contributing to the advancement of data tooling
- Architect efficient and scalable data curation systems and pipelines
- Execute high-priority projects in pre-training, mid-training, or post-training data curation
- Apply specialized expertise in agentic data, synthetic data, reasoning data, web parsing, coding data, data scaling laws, or data mix optimization
- Lead complex technical projects end-to-end