Cloud, Python, AI, Machine Learning, ML, LLM, RAG, LangChain, Agentic, MLOps, SaaS, Communication, Remote Work
About this role
Role Overview
Set the technical direction for AI/ML infrastructure: defining standards, reviewing architectures, and driving decisions on tooling and frameworks.
Build and own the production infrastructure that takes ML and LLM-powered solutions from prototype to live product, ensuring reliability, scalability, and fast iteration cycles.
Lead the development of agentic AI systems for domain-specific business problems, including systems of agents, fine-tuning, RAG pipelines, and full lifecycle ownership.
Define model evaluation strategies, performance monitoring, and key indicators to measure the real-world impact of deployed solutions.
Ensure AI systems are safe, observable, and maintainable by building in guardrails, drift detection, and responsible deployment practices.
Collaborate with Product and Engineering to translate complex business problems into well-scoped AI projects, and communicate results clearly to non-technical stakeholders.
Mentor and support other engineers on the Data team in productionizing ML and AI solutions.
Requirements
7+ years of experience as a Machine Learning Engineer or Applied AI Engineer, with a meaningful portion in production environments.
Strong Python skills and experience building robust, production-grade ML systems.
Hands-on experience with LLMs and agentic AI frameworks (e.g. LangChain, PydanticAI, or similar), including RAG pipelines and tool-calling agents; fine-tuning experience is a plus.
Experience with MLOps practices: model registry, monitoring, and drift detection.
Proven ability to take ML solutions from prototype to production, including evaluation frameworks, versioning, and iteration.
Comfortable working in a cloud-native environment and collaborating with SRE on deployment and infrastructure concerns.
Ability to drive architectural decisions and define ML best practices across a team.
Strong communication skills: comfortable translating technical findings into business impact for diverse stakeholders.
Experience with multi-tenant ML model deployment in a SaaS environment is a plus.
Tech Stack
Cloud
Python
Benefits
Attractive compensation package
Hybrid model with remote days to support balance and flexibility.
Enjoy up to 15 days of remote work from abroad each year.
helloCSE: discount platform for fitness, shopping, culture, cinema, live events
Multicultural colleagues, after-work events, team-building & more.