Lead and grow a distributed team focused on Generative AI
Define and execute the strategy for acquiring, processing, curating, annotating, versioning, and ensuring the quality of large-scale datasets to support diverse customer use cases
Enable novel capabilities for controllable and precise multi-modal content generation with Generative AI, solving real customer problems
Build and maintain scalable, efficient data, pre-training, and post-training pipelines across the ML model lifecycle
Collaborate with researchers, ML engineers, and PMs to align on data, architecture, and control needs for new models
Evaluate and integrate tools and methodologies to enhance infrastructure and workflows
Champion data diversity, bias mitigation, and ethical AI practices
Provide technical leadership and foster a culture of innovation
Partner across teams to align priorities and share best practices
Requirements
Ph.D. in CS, Data Science, Engineering, AI/ML, or a related field, or equivalent experience
10+ years in engineering and research leadership, managing the full lifecycle of ML models from research to direct impact on customer workflows
Deep understanding of ML data lifecycles, especially for modern generative models
Expertise in data quality, governance, versioning, and annotation platforms
Proficiency with cloud and data platforms (AWS, Azure, GCP, Spark, Databricks, Snowflake)
Strong grasp of data privacy, security, and ethical AI principles
Excellent leadership and cross-functional collaboration skills
Experience with large-scale image/video datasets and training generative models