Lattice is a people success platform focused on building cultures where employees and companies thrive. The company is seeking a Senior Software Engineer to join its AI Engineering team, responsible for developing evaluation frameworks and agent infrastructure to enhance AI performance across the organization.
Responsibilities:
- Design and ship a robust, end-to-end AI evaluation framework, covering offline evals, production tracing, and human-in-the-loop feedback loops, connected across all of Lattice’s AI use cases
- Define and instrument the metrics that actually matter: agent task completion, hallucination rates, response quality, user engagement, and downstream business outcomes
- Build and maintain evaluation datasets, test harnesses, and automated scoring pipelines to catch regressions before they ship
- Identify and surface the drivers of agent quality improvement, giving the team clear signals on where to invest
- Architect and implement reusable agent infrastructure: multi-turn conversation workflows, recommendation services, LLM DAGs, and standardized agent topology patterns using LangGraph
- Build and scale RAG pipelines and retrieval infrastructure, including vector store management and retrieval quality optimization
- Make principled build vs. buy decisions across LLM providers, agent frameworks, and evaluation tooling, balancing capability, cost, latency, and vendor risk
- Contribute to production AI systems with a strong focus on reliability, observability, and performance, not just prototypes
- Own projects end-to-end: scope them, drive them to completion, and bring in the right people at the right time
- Partner with engineering leads and managers to inform technical direction on agent quality and evaluation strategy; you’ll be expected to hold intelligent, substantive conversations about methodology, not just implementation
- Raise the AI engineering bar across the broader team through code review, documentation, and thoughtful technical debate
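For concreteness, the evaluation-pipeline responsibilities above (evaluation datasets, automated scoring, catching regressions before they ship) might look, in highly simplified form, like the sketch below. All names, the exact-match scorer, and the baseline threshold are illustrative assumptions, not Lattice’s actual stack:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt: str
    expected: str

def exact_match(output: str, expected: str) -> float:
    # Simplest possible scorer; a production pipeline would layer in
    # LLM-as-judge, semantic similarity, or task-specific rubrics.
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0

def run_eval(agent: Callable[[str], str],
             dataset: list[EvalCase],
             scorer: Callable[[str, str], float] = exact_match) -> float:
    """Run the agent over an eval dataset and return the mean score."""
    scores = [scorer(agent(case.prompt), case.expected) for case in dataset]
    return sum(scores) / len(scores)

def regression_gate(score: float, baseline: float, tolerance: float = 0.02) -> bool:
    """Fail CI if the score drops more than `tolerance` below the baseline."""
    return score >= baseline - tolerance

# Hypothetical stub agent standing in for a real LLM-backed system.
dataset = [
    EvalCase("What is the capital of France?", "Paris"),
    EvalCase("What is 2 + 2?", "4"),
]
stub_agent = lambda p: {"What is the capital of France?": "Paris",
                        "What is 2 + 2?": "4"}[p]
score = run_eval(stub_agent, dataset)
print(score, regression_gate(score, baseline=0.9))  # → 1.0 True
```

The regression gate is the key design choice: evals only catch regressions before they ship if a score drop actually blocks the merge, rather than landing on a dashboard after the fact.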
Requirements:
- 5+ years of professional software engineering experience with significant time spent on production AI/ML systems
- Deep hands-on experience with LLM-based systems: prompt engineering, RAG pipelines, agent orchestration, evaluation metrics, and model fine-tuning
- Proven ability to work with data and apply statistics, particularly in designing and interpreting experiments
- Proven ability to build and operate agentic AI systems in production: multi-step workflows, multi-agent topologies, and the failure modes that come with them
- Strong command of AI evaluation: you've built eval frameworks before, you know the difference between a good eval and a vanity metric, and you have opinions about it
- Production-grade Python engineering: clean, maintainable, testable code
- Hands-on experience with LangGraph or a comparable agent orchestration framework: you've built real agent workflows, not just followed tutorials
- Experience with LangSmith or comparable LLM observability tooling for tracing, evaluation, and debugging
- You read AI papers and blogs regularly and are a trusted source on AI trends
- Experience with vector databases (Pinecone or similar) and retrieval system design
- Comfort with the AWS ecosystem or other cloud infrastructure (e.g., GCP): Lambdas, queues, and cloud-native architecture
- Familiarity with TypeScript is a plus. Our full-stack engineers use it and cross-pollination is valuable
- Clear eyes: you see problems as they are, not as you'd like them to be. You surface hard truths early and address them directly
- Ship, shipmate, self: you prioritize the product and your teammates. Low ego, high ownership
- You're as comfortable in ambiguity as you are in well-defined problems: early foundations mean you'll encounter both
- Strong technical communication: you can debate evaluation methodology with an AI lead and explain it clearly to an EM in the same afternoon
Nice to have:
- Experience with RLHF, LoRA, or other model adaptation techniques
- Background in traditional ML (supervised/unsupervised, neural networks) and knowing when an LLM is overkill
- Experience with MLOps tooling: MLflow, Datadog, and CI/CD pipelines for model deployment
- Published work, conference talks, or open-source contributions in AI/ML
- Experience in HR tech, people analytics, or other domains where data quality and trust are critical
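As a concrete illustration of the retrieval-quality work the requirements reference, recall@k is one of the standard metrics for judging a retrieval pipeline. A minimal sketch (the function name and document IDs are hypothetical, and a real system would compute this over a labeled query set):

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)

# Vector store returned d1, d3, d2; the labeled relevant set is {d1, d2}.
ranked = ["d1", "d3", "d2"]
print(recall_at_k(ranked, {"d1", "d2"}, k=2))  # → 0.5
print(recall_at_k(ranked, {"d1", "d2"}, k=3))  # → 1.0
```

Tracking a metric like this across index and embedding changes is what "retrieval quality optimization" typically means in practice: it turns "the RAG answers feel worse" into a measurable regression.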