Python, AI, ML, NLP, Natural Language Processing, Generative AI, GenAI, LLM, Hugging Face
About this role
Role Overview
Lead projects and own processes for creating, validating and annotating data for use in LLM/ML applications
Design/improve workflows to create data for AI/ML training and evaluation
Dive deep into existing workflows and processes to gather data and insights
Work closely with client stakeholders to understand goals, gather requirements, propose solutions, and execute them
Contribute to establishing best practices and standards for generative AI development
Requirements
5+ years of relevant experience with data creation, curation, and analysis for GenAI applications
MA in (computational) linguistics, data science, computer science (AI / ML / NLU), quantitative social sciences, or a related scientific / quantitative field; PhD strongly preferred
Ability to collaborate directly with technical stakeholders including senior project managers, data engineers, and research scientists
Knowledge of how the components of GenAI products or services work together
Excellent problem-solving skills, with the ability to think critically and creatively to develop innovative AI solutions
Experience with Natural Language Processing (NLP) techniques and tools, such as spaCy, NLTK, or Hugging Face
Proficiency in Python for handling and transforming large datasets
Tech Stack
Python
Benefits
Providing technical mentorship and guidance to junior team members