AI Infrastructure Engineer at MMC Group LP | JobVerse
AI Infrastructure Engineer
MMC Group LP
United States
Full Time
1 hour ago
No H1B
Key skills
Kubernetes
Linux
Python
Go
Caching
CI/CD
About this role
Role Overview
Own production reliability: availability, latency, error budgets, incident response, postmortems, and follow-ups
Build and maintain observability: metrics, logs, traces, alerting, SLOs/SLIs, dashboards
Improve deployment safety: CI/CD, rollout strategies (canary/blue-green), automated rollback, runbooks
Capacity planning and cost control: GPU/CPU sizing, autoscaling, queue/backpressure management, cost attribution
Security and compliance: secrets management, least privilege, patching, vulnerability response
Disaster recovery and operational readiness: backups, failover plans, game days
Develop and maintain the GPU inference serving stack (APIs, schedulers, workers, batching, caching)
Requirements
Linux fundamentals
Networking fundamentals
Experience with Kubernetes
Experience with incident response
Experience with observability tools
Strong software engineering ability in at least one of: Go / Python
Ability to reason about performance tradeoffs and measure before optimizing
Tech Stack
Kubernetes
Linux
Python
Go
Benefits
Stock options available for core team members.
401(k) plan for employees.
Comprehensive health, dental, and vision insurance.
The latest and best office equipment.