Smartsheet has been empowering teams for over 20 years with innovative work management solutions. They are seeking a Senior Software Engineer II for their AI Platform Engineering team to lead the design of core infrastructure for AI experiences and ensure safety and performance in AI-driven features.
Responsibilities:
- Build the AI Platform Foundation: Lead the design and ownership of the core infrastructure that serves as the backbone for all Smartsheet AI experiences. Focus on building a robust, multi-tenant environment that reduces friction for internal teams, allowing them to deploy reliable and scalable AI features with ease
- Standardize the AI Developer Path: Architect high-level abstractions and "Golden Path" APIs that democratize AI development across Smartsheet. By insulating product teams from infrastructure complexity, you will enable them to ship intelligent features with high velocity while guaranteeing safety and consistency at scale
- Engineer AI Trust & Safety Systems: Establish the mission-critical monitoring and quality assurance layers that protect Smartsheet customers. By creating rigorous evaluation pipelines, you will ensure every AI-driven feature meets the high bar for safety, data privacy, and deterministic performance expected by our enterprise partners
- Drive Technical Strategy: Partner with principal engineers to define the technical roadmap for Smartsheet’s AI infrastructure, making architectural decisions that will shape how we build with AI for years to come
Requirements:
- 8+ years of software engineering experience, with at least 2 years working directly with LLMs in production
- Deep, hands-on experience with prompt engineering and context engineering; you understand how model behavior changes with framing, structure, and input design
- Strong working knowledge of RAG architectures: chunking strategies, embedding models, retrieval evaluation, and failure diagnosis
- Experience building or extending LLM evaluation frameworks; you have designed scorers, worked with golden datasets, and thought carefully about what good looks like
- Strong Python skills; comfortable working in data-heavy environments (Databricks, Delta tables, or equivalent)
- Ability to communicate complex quality findings (written and verbal) to both technical and non-technical stakeholders; you can explain what's broken, why it matters, and what needs to happen next without losing the room
- Strong cross-functional judgment; you know when to escalate, when to resolve independently, and how to build credibility across engineering, product, and AI platform teams
- A bias for clarity in ambiguous situations; when failure modes are murky and trade-offs are real, you bring structure and a clear point of view rather than waiting for consensus
- Prior work in an Applied AI or LLMOps platform within a product company
- Experience with Kubernetes (EKS/GKE) for AI workloads: managing GPU scheduling, auto-scaling based on token throughput, and using tools like Karpenter for cost-efficient node provisioning
- Infrastructure as Code (IaC): Using Terraform, Pulumi, or AWS CDK to provision Vector Databases, SQS queues, and S3 buckets
- Vector Databases: Proficiency in managing and optimizing Pinecone, Milvus, Weaviate, or Databricks Vector Search
- AI Gateways: Building or configuring proxies (like LiteLLM or Kong AI Gateway) to handle rate-limiting, PII masking, and cost-tracking
- LLM Observability: Setting up tracing tools like Langfuse, LangSmith, or MLflow to monitor 'Time to First Token' (TTFT) and trace the source of hallucinations
- Model-Based Evals: Implementing automated scoring systems (like RAGAS or DeepEval) that use an 'LLM-as-a-Judge' to grade production outputs