Engineering and deploying production‑ready generative AI solutions, including LLMs, VLMs, and multimodal models, with a strong emphasis on inference, scalability, and reliability
Designing and operating LLMOps pipelines, including model versioning, fine‑tuning, evaluation, deployment, rollback, and lifecycle management
Building and maintaining AI platforms and services that support prompt management, embeddings, vector search, retrieval‑augmented generation (RAG), and tool‑calling workflows
Integrating generative AI capabilities into enterprise applications using APIs, microservices, and event‑driven architectures
Implementing MLOps best practices, including CI/CD for models, automated testing, performance benchmarking, observability, logging, and cost monitoring
Optimizing model performance across latency, throughput, accuracy, and cost using techniques such as quantization, caching, batching, and model routing
Collaborating with cloud, data, security, and product teams to ensure solutions meet enterprise standards for security, governance, and responsible AI
Producing clear technical documentation and operational runbooks, and communicating delivery status and business value to stakeholders
Mentoring engineers and contributing to reusable frameworks, standards, and platform capabilities
Requirements
A master’s degree in Computer Science, AI, Machine Learning, or a related field, or equivalent hands‑on industry experience
Proven experience deploying and operating generative AI models in production, rather than only research or experimentation
Strong proficiency in Python, with practical experience using PyTorch, TensorFlow, Hugging Face, and transformer‑based architectures
Experience with AI platform and MLOps tooling, such as model registries, experiment tracking, orchestration, CI/CD pipelines, and monitoring solutions
Solid understanding of cloud‑native architectures, containers, and scalable inference patterns (e.g., Kubernetes‑based deployments)
Hands‑on experience with RAG systems, vector databases, embeddings, prompt optimization, and evaluation frameworks
Strong software engineering discipline, including testing, code reviews, documentation, and production support
Excellent problem‑solving, collaboration, and communication skills, with the ability to work effectively across engineering and business teams
A delivery‑focused mindset, comfortable owning systems in production and continuously improving them
Tech Stack
Cloud
Kubernetes
Microservices
Python
PyTorch
TensorFlow
Benefits
Contemporary work-life balance policies and wellbeing activities
Comprehensive private medical care options
Safety net of life insurance and disability programs