xAI is dedicated to creating AI systems that enhance human understanding of the universe. The company is seeking a Member of Technical Staff to scale synthetic coding data, optimize mid-training data mixtures, and develop robust evaluation methods for AI training checkpoints.
Responsibilities:
- Scale synthetic coding data to trillions of tokens with large-scale Docker verification
- Distill the intelligence of flagship models into flash models through synthetic data generation
- Optimize mid-training data mixtures to boost the ceiling for RL
- Engineer long-context data recipes
- Develop robust and diverse evaluations for mid-training checkpoints