Platform Reliability Engineer – Agentic AI at Search Atlas | JobVerse
Search Atlas
Platform Reliability Engineer – Agentic AI
San Francisco, California, United States of America
Full Time
1 hour ago
$70,000 USD
Key skills
Kubernetes
Python
Terraform
AI
ML
Agentic
MLOps
EKS
GKE
ArgoCD
GitOps
SaaS
About this role
Role Overview
Architect the Autonomous Backbone
Design and maintain the Kubernetes-based platform (EKS/GKE)
Engineer for Zero-Touch
Enable true "zero manual execution" at the infrastructure level
Optimize ML inference pipelines for real-time agent decision-making
Design self-healing systems
Build distributed tracing and monitoring for complex agentic interactions
Implement guardrails and safety controls for autonomous agent execution
Requirements
6+ years in Platform Engineering, SRE, or Infrastructure roles within high-growth SaaS environments
Proven experience supporting AI/ML systems at scale
Mastery of Terraform, ArgoCD, and GitOps workflows
Expert-level Kubernetes (EKS/GKE) networking, scaling, security, and multi-tenancy patterns
Hands-on experience with MLOps pipelines for autonomous agents
Proficiency in Python for building custom platform tools and automation
Deep expertise in distributed tracing and monitoring for complex, event-driven systems
Experience with high-frequency data pipelines, web crawling at scale, real-time processing, and low-latency requirements
Tech Stack
Kubernetes
Python
Terraform