Software Engineer – AI Infrastructure, Training, Inference at SPREEAI | JobVerse
SPREEAI
United States
Full Time
Posted 9 hours ago
Key skills
Cloud
Distributed Systems
Docker
Java
Kubernetes
Python
PyTorch
Ray
Go
C++
C
AI
ML
About this role
Role Overview
Design and build scalable infrastructure supporting training and inference workflows.
Develop high-performance APIs and backend services for AI model serving.
Optimize GPU utilization, latency, and throughput for multimodal workloads.
Build distributed systems supporting large-scale generative models.
Improve observability, monitoring, and reliability of AI systems.
Partner closely with Applied Science teams to productionize research systems.
Drive improvements in deployment workflows, automation, and platform usability.
Requirements
Degree in Computer Science, Engineering, or a comparable combination of education and practical experience.
Strong object-oriented programming skills (Python, C++, Java, Go, or similar).
Strong data structures and algorithms foundations.
Experience building production backend or distributed systems.
Understanding of cloud infrastructure concepts and containerized systems.
Experience with Kubernetes, Docker, or container orchestration.
Familiarity with GPU-based ML workloads or distributed training/inference systems.
Experience with model serving frameworks (vLLM, Triton, Ray Serve, or similar).
Experience with observability tools and performance debugging.
Familiarity with PyTorch or ML workflows.
Interest in optimizing systems for efficiency, scalability, and developer velocity.
Tech Stack
Cloud
Distributed Systems
Docker
Java
Kubernetes
Python
PyTorch
Ray
Go