About this role
About the Team
We are building the next generation of content safety and governance systems at TikTok, with a strong focus on protecting minors and teens at global scale.
Our team sits at the intersection of product, policy, and AI, leveraging large language models and multimodal technologies to design safety mechanisms that are proactive, scalable, and measurable. We work closely with Policy, Legal, Algorithm, and Engineering teams to translate complex regulatory and safety requirements into real-world product solutions that protect users while preserving a high-quality user experience.
As a project intern, you will work on impactful short-term projects that give you a glimpse of real-world professional experience. You will gain practical skills through on-the-job learning in a fast-paced work environment and develop a deeper understanding of your career interests.
Applications will be reviewed on a rolling basis - we encourage you to apply early.
Successful candidates must be able to commit to an internship period of at least 3 months.
Responsibilities
- Support model strategy and operations for TikTok Minor Safety, working closely with cross-functional teams to participate in the end-to-end process of model design, optimization, training, and evaluation. Identify performance gaps and vulnerabilities in models, and propose effective actions to continuously improve model quality.
- Participate in the full data production lifecycle, including defining dataset standards, executing model evaluations, and ensuring high-quality data delivery.
- Explore and adopt the latest Large Language Model (LLM) tools to continuously optimize data production workflows and processes, improving efficiency and scalability. Stay up to date with industry trends to help build more intelligent and efficient data systems.
- Research emerging model training methodologies from both academia and industry, identify weaknesses in existing training data, and propose innovative solutions to improve data generalization, production efficiency, and coverage.