AWS, Azure, Cloud, Google Cloud Platform (GCP), PyTorch, TensorFlow, Hugging Face, AI, ML, NLP, Generative AI, Large Language Models (LLMs), Collaboration
About this role
Role Overview
Lead the design, training, and optimization of large-scale language models for African languages and dialects, ensuring accuracy and contextual relevance.
Develop innovative solutions for data pre-processing, tokenization, and embedding, specifically for low-resource and multilingual datasets.
Implement and fine-tune transformer-based architectures (e.g., GPT, BERT) for specific NLP applications like text generation, translation, and sentiment analysis.
Design scalable pipelines for data collection, annotation, and pre-processing to support model development.
Collaborate with linguists, cultural experts, and AI researchers to ensure the models align with diverse user needs and cultural sensitivities.
Deploy, monitor, and maintain production-level LLM systems in cloud-based environments, ensuring robust performance and reliability.
Establish frameworks for model evaluation, leveraging metrics such as perplexity, BLEU, and human-centered benchmarks.
Stay at the forefront of advancements in NLP, LLMs, and generative AI, integrating emerging research into Awarri's language model development efforts.
Document processes, architectures, and outcomes for knowledge sharing and collaboration across teams.
Requirements
Minimum of 3 years of experience in AI/ML engineering, with a focus on NLP and large language models.
Strong expertise in transformer-based models such as GPT, BERT, T5, or equivalent.
Proficiency with frameworks like TensorFlow, PyTorch, or Hugging Face Transformers.
Experience in training and fine-tuning LLMs on large-scale, multilingual datasets.
Strong programming skills, with expertise in NLP libraries and tools.
Experience in building pipelines for text data collection, pre-processing, and annotation.
Familiarity with cloud platforms like AWS, Azure, or GCP for deploying scalable NLP solutions.
Demonstrated ability to work on open-ended research problems and collaborate with interdisciplinary teams, including linguists and cultural experts.
A proven track record of developing and deploying NLP systems that deliver measurable value in production.
Tech Stack
AWS
Azure
Cloud
Google Cloud Platform
PyTorch
TensorFlow
Benefits
Be part of a pioneering initiative to shape the future of AI in Africa.
Collaborate with a passionate and diverse team of engineers, researchers, and innovators.
Contribute to projects with real-world impact, driving inclusivity and representation in AI.