Rational Dynamics is an early-stage startup building customized AI reasoning systems for cognitively complex tasks. The company is seeking a Senior Machine Learning Engineer to join a small, senior team developing AI systems for critical environments, with responsibility for the reliability and performance of production ML systems.
Responsibilities:
- Own, extend, and improve production ML systems: training pipelines, evaluation frameworks, model serving infrastructure, and monitoring. Focus on delivering reliable capability to customers
- Optimize models for latency, cost, and reliability with a bias toward correctness in environments where errors are not recoverable
- Translate research experiments into production-grade capability that solves real customer problems, as an embedded member of the research & ML team
- Design and maintain evaluation and testing infrastructure that supports fast, high-quality research and deployment, so Rational Dynamics can move quickly and deliver a high-quality product with confidence
- Integrate third-party model APIs and LLM orchestration frameworks into the platform
- Support the deployment of agents into complex, high-stakes enterprise environments
- Continuously improve system performance through disciplined benchmarking and iteration
Requirements:
- Orientation toward customer impact. You measure your work by whether it solves real problems, not by technical sophistication alone
- 5+ years of experience building and maintaining ML systems in production
- Track record of shipping ML systems where reliability and correctness were non-negotiable, as opposed to demo-quality or research-only work
- Command of machine learning fundamentals and modern deep learning frameworks such as PyTorch or JAX
- Strong skills in latency and cost optimization at scale, including efficient inference, serving optimization, and resource-aware model deployment
- Strong programming skills in Python, with experience in at least one of C++, Rust, or Go
- Comfort operating on a small team with minimal process, high ownership, and significant ambiguity
- Demonstrated experience deploying ML solutions in real production environments serving end users or customers
- Experience with RAG pipelines, vector databases, or LLM orchestration frameworks such as LangChain or LlamaIndex
- Prior work with third-party model APIs such as OpenAI or Anthropic at scale
- Experience building or deploying custom agents in common agent frameworks
- Experience in regulated or high-consequence industries such as finance, healthcare, defense, or critical infrastructure
- Prior early-stage or small-team experience where you owned architectural and technical decisions end-to-end