Risepoint is an education technology company that provides world-class support and trusted expertise to universities and colleges. They are seeking a Senior AI Engineer to design, implement, and operationalize AI systems with a focus on structured evaluation and production-grade reliability, contributing to their AI-powered Student Journey Platform.
Responsibilities:
- Build and maintain evaluation frameworks (LLM-as-Judge, rubric-based scoring, regression test suites) to measure output quality, reliability, and drift, and debug production-level issues as they are detected
- Architect and implement multi-agent workflows with clear coordination, tool usage, and failure handling patterns
- Build structured observability into AI systems (tracing, prompt/version tracking, evaluation logging, cost and latency monitoring)
- Define and enforce quality gates for AI features using automated evals prior to production release
- Optimize inference performance (latency, token usage, caching, batching, routing across models)
- Collaborate with product and engineering teams to translate business requirements into testable AI system designs
- Contribute to code reviews, architectural discussions, and internal standards for AI development
- Design and implement Retrieval-Augmented Generation (RAG) systems and Model Context Protocol (MCP) servers using structured and unstructured enterprise data
- Develop and manage fine-tuning workflows (SFT, preference optimization, or related techniques) including dataset preparation, versioning, and validation
Requirements:
- 3-5 years of full-stack engineering experience with strong fundamentals in object-oriented programming, applicable design patterns, and AI-focused system design
- Professional experience in Python, C#, Java, or a similar language used in production systems
- Experience with LLM evaluation and observability tooling (e.g., Langfuse, LangSmith, OpenTelemetry-based tracing, custom evaluation harnesses)
- Experience implementing guardrails, policy enforcement, and safety layers in AI-driven systems, leveraging LLM-as-Judge for validation and continuous improvement
- Familiarity with performance optimization techniques for LLM-based systems (latency, caching, routing, batching)
- Experience building production-grade RAG systems (retrieval pipelines, chunking strategies, embeddings, reranking, context construction)
- Experience contributing to internal AI standards, reusable frameworks, or platform-level tooling
- Experience deploying AI systems in cloud environments (AWS, Azure, GCP)
- Experience with Databricks (model serving endpoints, MLflow)