CoreWeave is The Essential Cloud for AI™, delivering a platform that enables innovators to build and scale AI confidently. The Senior Software Engineer I, Inference will lead design work and drive architecture within the team, improving latency, throughput, and reliability across multiple services while mentoring junior engineers.
Responsibilities:
- Lead design reviews and drive architecture within the team; decompose multi-service work into clear milestones
- Define and own SLIs/SLOs; ensure post-incident actions land and reliability improves release-over-release
- Implement advanced optimizations (e.g., micro-batch schedulers, speculative decoding, KV-cache reuse) and quantify impact
- Strengthen incident posture through capacity planning, autoscaling policy, graceful degradation, and rollback/traffic-shift strategies
- Mentor IC1/IC2 engineers; review cross-team designs and elevate coding/testing standards
- For IC4: own an area spanning multiple services and teams (e.g., request routing & adaptive scheduling, cost-per-token analytics, GPU resource isolation)