ZenBusiness is dedicated to helping entrepreneurs launch and manage their businesses with simplicity and support. The Conversational AI & Prompt Engineer role focuses on enhancing AI interactions by crafting effective dialogue flows and optimizing prompts to ensure high-quality customer experiences.

Responsibilities:

Analyze conversation transcripts and user feedback to identify areas of confusion, failure, and prompt leakage
Work with the Customer Impact Team Product Lead to define and track conversational KPIs (e.g., resolution rate, containment rate, user satisfaction)
Optimize prompts and model selection for cost efficiency, response latency, and scalability in production environments
Collaborate with engineers to improve conversation-specific evaluation criteria (e.g., NLU accuracy, intent recognition)
Design and maintain evaluation frameworks to measure prompt performance using golden datasets and automated scoring (e.g., LLM-as-judge, rubric-based scoring, precision/recall of intent routing)
Implement guardrails to reduce hallucinations, prevent prompt injection, and ensure compliant, safe responses
Collaborate to design, map, and implement complex conversation flows, including error recovery and contextual handoffs (e.g., escalation to human support)
Own the continuous optimization of system prompts and instructions for LLMs (Gemini, OpenAI) to ensure Velo's responses are accurate, consistent in tone, and on-brand
Design and optimize structured outputs, function calling, and tool-routing logic to ensure accurate data capture and downstream system integrations

Requirements:

5+ years of experience, including 2+ years in Conversational AI, Applied LLM Engineering, Prompt Engineering, or NLP systems in production environments
Deep experience designing and optimizing prompts for GPT, Gemini, or similar models, including structured outputs and function calling
Practical experience designing and tuning RAG pipelines (chunking, embeddings, retrieval evaluation)
Experience building evaluation datasets and running prompt experiments (A/B testing, automated scoring, regression testing)
Proficiency in Python or TypeScript; experience integrating LLM APIs in production systems
Ability to analyze conversational performance using data and logs to drive measurable improvements
Strong systems thinking, empathy for users, and ability to translate business logic into scalable AI behavior
Experience with agentic systems similar to Decagon, Agentforce, Fin, or Sierra
