Cogniware AI is a leading innovator enabling businesses to achieve greater productivity through efficient and scalable AI inference solutions. The Senior AI/ML Systems Engineer will design, build, optimize, deploy, and operate AI/ML software products with a focus on LLM performance, quality, reliability, and production scale.
Responsibilities:
- Design, develop, and productionize AI/ML software products and platform capabilities for LLM and GenAI workloads
- Build and optimize model serving and inference systems across CPU/GPU environments
- Improve LLM performance using techniques such as batching, caching, quantization, profiling, parallelism, and system tuning
- Develop frameworks and pipelines to measure and improve model quality, including evaluation, experimentation, regression testing, and benchmarking
- Build scalable APIs, services, and platform components for model deployment, orchestration, and lifecycle management
- Implement observability for AI systems, including logging, monitoring, tracing, model/system metrics, and failure analysis
- Partner with ML engineers, application engineers, and product teams to turn prototypes into reliable production systems
- Support CI/CD and MLOps workflows for model packaging, release management, testing, and rollback
- Troubleshoot performance bottlenecks across application, system, infrastructure, and GPU layers