Role Overview
What Your Day Might Look Like:
- Make AI production-ready: Design and maintain the infrastructure that takes ML models from experimentation to reliable, scalable deployment.
- Build automated ML pipelines: Create repeatable workflows for training, evaluation, deployment, and retraining — with versioning and reproducibility built in.
- Deploy and serve models: Package models as production services across cloud, on-premise, or hybrid environments, with performance and reliability in mind.
- Monitor what matters: Track model performance, data drift, system health, and production signals to support better retraining and troubleshooting decisions.
- Enable AI teams: Work closely with Data Scientists, AI Engineers, and Software Engineers to improve how models are tested, deployed, and maintained.
- Set the standard: Contribute to best practices around CI/CD, model registry, observability, security, and governance.
- Support GenAI at scale: Help deploy and optimize LLM-based systems, including inference services, GPU usage, and RAG infrastructure where needed.
- Keep systems secure and reliable: Ensure ML deployments follow strong practices around access control, data governance, and operational resilience.
Requirements
Your Superpowers 🚀:
- BSc or MSc in Computer Science, Software Engineering, or a related STEM field.
- 5+ years of experience in MLOps, DevOps, platform engineering, or ML engineering, with exposure to ML systems in production.
- Strong Python skills and good software engineering fundamentals.
- Hands-on experience with ML lifecycle tools such as MLflow, Kubeflow, SageMaker, Vertex AI, Azure ML, or similar.
- Experience deploying models using tools like BentoML, TorchServe, Triton Inference Server, or equivalent serving frameworks.
- Strong experience with Docker, Kubernetes, CI/CD, and production-grade deployment workflows.
- Comfortable working across cloud environments — AWS, Azure, GCP, or hybrid setups.
- Experience with monitoring and observability tools such as Prometheus, Grafana, or similar.
- Understanding of model performance, drift, retraining, reproducibility, and production reliability.
- Strong collaboration skills — able to work across Data Science, Engineering, and client-facing teams.
Bonus Points for:
- Experience with Terraform, Pulumi, or infrastructure-as-code practices.
- Experience with feature stores such as Feast or Tecton.
- Familiarity with data and model versioning tools such as DVC or Delta Lake.
- Experience with Kafka or event-driven ML workflows.
- Hands-on experience serving LLMs in production using vLLM, TGI, Triton, or similar.
- Familiarity with model optimization techniques such as quantization or GPU memory tuning.
- Experience operating RAG infrastructure, vector databases, and embedding pipelines.
- Exposure to LLM evaluation and observability tools such as LangSmith, RAGAS, or custom evaluation frameworks.
Tech Stack
- AWS
- Azure
- Docker
- Google Cloud Platform
- Grafana
- Kafka
- Kubernetes
- Prometheus
- Python
- Terraform
Benefits
Perks on Perks:
- Competitive salary and hybrid work model – come hang out in our Athens office or work remotely from anywhere in the European Economic Area (EU, Switzerland, etc.) or the UK (up to 6 weeks per year).
- Training budget to level up your skills with the top tech partners in the market (Microsoft, AWS, Salesforce, Databricks, etc.) – whether it’s certifications or courses, we’ve got you covered.
- Private insurance, top-tier tech gear, and the chance to work with a stellar crew.