Coram AI is reimagining video security for the modern world with a cloud-native platform built on computer vision and AI. They are looking for an AI Research Engineer to design and build autonomous agents powered by the latest LLMs, with a focus on reliable, high-performance systems that operate in real environments.
Responsibilities:
- Design and build autonomous agents using state-of-the-art LLMs
- Implement tool use, retrieval pipelines, memory systems, and multi-step reasoning flows
- Engineer prompts and system instructions for robustness, reliability, and speed
- Optimize latency, cost, and throughput in production
- Build evaluation frameworks to measure agent accuracy, tool correctness, and failure modes
- Create high-quality datasets for training, fine-tuning, and benchmarking
- Develop introspection tooling to debug reasoning chains, hallucinations, and tool misuse
- Run structured experiments to improve agent performance through iterative testing
Requirements:
- BS, MS, or PhD in Computer Science, Engineering, Machine Learning, or a related technical field from a top university
- 2+ years of experience building software systems (experience working with LLMs, AI agents, or ML systems highly preferred)
- Strong programming ability in Python, with experience in Go or TypeScript a plus
- Experience working with modern LLM APIs (OpenAI, Anthropic, etc.) and building applications powered by foundation models
- Experience building or contributing to production systems that must be reliable, observable, and scalable
- Ability to diagnose and mitigate LLM failure modes such as hallucinations, tool misuse, and reasoning errors
- Strong experimental mindset with a data-driven approach to improving system performance
- Excellent communication skills (written and verbal) in English
- Passion for building cutting-edge AI systems at the speed of a fast-growing startup
- Resilient and adaptable in challenging, fast-paced environments
- Ability to work onsite; we move faster when we're in the same room
- Strong experimental mindset with a scientific approach to evaluation and iteration
- Experience working with modern LLMs, RAG pipelines, tool calling, and agent frameworks
- Deep understanding of failure modes in LLM systems and how to mitigate them
- Experience building production systems in Python, Go, or TypeScript
- Familiarity with distributed systems, APIs, and real-time infrastructure
- Comfort shipping systems that must be reliable, observable, and measurable
- Experience building evaluation harnesses or LLM benchmarking systems
- Background in machine learning, applied research, or systems performance optimization
- Experience optimizing inference latency and cost at scale
- Experience debugging complex agent behaviors in real-world environments