Jansoft Global is seeking a highly skilled AI Cost Optimization & Workflow Engineer to design, optimize, and scale AI-powered workflows while keeping cloud and AI infrastructure costs under control. The role focuses on LLM usage, GPU workloads, cloud spend, and AI orchestration pipelines, with the goal of lowering cost while improving automation, performance, and reliability.
Responsibilities:
- Monitor and optimize LLM API usage (token consumption, model selection, inference cost)
- Reduce cloud infrastructure costs for AI workloads running on AWS, Azure, or GCP
- Optimize GPU utilization for training and inference environments
- Implement autoscaling and workload right-sizing strategies
- Build cost observability dashboards for AI workloads
- Analyze and reduce redundant AI pipeline executions
- Design and implement scalable AI workflows (RAG, agent-based systems, automation pipelines)
- Optimize prompts to reduce token usage and latency
- Implement caching strategies for AI responses
- Build orchestration pipelines using tools like LangChain, Airflow, or similar frameworks
- Integrate AI services with enterprise systems (ITSM, CRM, ERP, internal APIs)
- Improve throughput, response time, and reliability of AI services
- Implement model performance monitoring and drift detection
- Establish AI usage guardrails and cost governance policies
- Collaborate with FinOps and Cloud teams to align AI spend with business KPIs
- Document architecture and optimization strategies
Requirements:
- 5+ years of experience in Cloud Engineering, DevOps, MLOps, or AI Engineering
- Strong experience with AWS, Azure, or GCP
- Hands-on experience integrating and optimizing LLM APIs
- Strong Python scripting skills
- Experience with Kubernetes and containerized environments
- Experience with Infrastructure as Code (Terraform preferred)
- Understanding of cloud cost management tools
- Experience designing RAG or AI automation pipelines