Deploy AI models and services on cloud platforms such as AWS, Google Cloud, or Azure
Collaborate with cross-functional teams, including data scientists, researchers, and other engineers, to integrate LLMs and RAG into broader projects
Extend agentic capabilities, including agent orchestration and tool calling, using MCP and A2A
Communicate technical concepts and project progress to non-technical stakeholders and team members
Conduct experiments to fine-tune parameters and optimize model performance
Implement techniques to improve the efficiency and scalability of LLM and RAG systems
Stay up to date on current developments in the generative-AI space
Monitor and maintain deployed models to ensure they perform reliably and meet performance benchmarks
Ensure code quality by following best practices in software development, including version control, testing, and continuous integration and continuous deployment (CI/CD)
Document code and maintain comprehensive technical documentation to support team knowledge sharing and project handovers
Requirements
5+ years of experience delivering production-ready, industrial-strength code and implementing CI/CD pipelines
Experience in Python and with libraries such as TensorFlow and PyTorch
Experience with neural network architecture, including transformers, and NLP techniques
Experience with LLM tool or function calling and agentic orchestration
Experience handling and preprocessing large datasets, and with data pipelining and ETL processes
Experience deploying ML models on cloud platforms such as AWS, Google Cloud, or Azure
Experience with software engineering fundamentals, including version control, testing, and CI/CD
Ability to design efficient algorithms for data retrieval and model training, and optimize models through hyperparameter tuning
Ability to travel up to 25% of the time
Master’s degree
Tech Stack
AWS
Azure
Cloud
ETL
Python
PyTorch
TensorFlow
Benefits
Health, life, disability, financial, and retirement benefits