Nutanix is a leader in AI-ready infrastructure, simplifying the deployment of generative AI solutions. The Software Engineer 2 - LLM Inference role involves architecting and developing scalable, fault-tolerant services, optimizing performance for machine learning applications, and collaborating with global teams to deliver high-quality products.
Responsibilities:
- Architect, design, and develop horizontally scalable, containerized, fault-tolerant services on Kubernetes
- Improve system performance for low-latency and high-throughput use cases
- Optimize any part of the stack, including low-level systems
- Leverage and contribute to relevant open-source cloud native projects
- Develop scalable, efficient, and fault-tolerant observability architectures for collecting, analyzing, and reporting metrics for various platform services
- Collaborate closely with globally located product management and backend development teams to deliver high-quality products in a fast-paced environment
- Contribute to all stages of the product development cycle: technical design, development, test, experimentation, analysis, and launch
- Be a team player by reviewing code and design docs and giving feedback on product specs, mocks, and documentation
- Participate in ongoing process definition and technology selection to keep our technology stack current with relevant trends
- Continuously learn and improve your technical and non-technical abilities