About this role

Sciforium is an AI infrastructure company that develops advanced AI models and operates a proprietary serving platform. In this role, you will architect and lead development of the company's next-generation model serving platform, mentor engineers, and help shape engineering direction.

Responsibilities:

Lead the technical direction of the model serving platform, owning architecture decisions and guiding engineering execution
Build core serving components including execution runtimes, batching, scheduling, and distributed inference systems
Develop high-performance C++ and CUDA/HIP modules, including custom GPU kernels and memory-optimized runtimes
Collaborate with ML researchers to productionize new multimodal models and ensure low-latency, scalable inference
Build Python APIs and services that expose model capabilities to downstream applications
Mentor and support other engineers through code reviews, design discussions, and hands-on technical guidance
Drive performance profiling, benchmarking, and observability across the inference stack
Ensure high reliability and maintainability through testing, monitoring, and engineering best practices
Troubleshoot and resolve complex issues across GPU, runtime, and service layers

Lead Software Engineer, Model Serving Platform
