AWS, Azure, Cloud, Distributed Systems, Google Cloud Platform, Microservices, AI, ML, LLM, RAG, Agentic, GCP, Google Cloud, Serverless, Caching, SaaS, Mentoring, Communication, Remote Work
About this role
Role Overview
Lead the design, documentation, and communication of end-to-end system architectures that align with Ozmo's strategic goals, including AI-first capabilities that handle non-deterministic outputs, continuous evaluation, and graceful fallbacks.
Architect scalable AI inference and retrieval stacks (vector/graph stores, embedding pipelines, RAG systems, caching, batching, streaming), balancing accuracy, cost, and latency targets.
Define and own architectural standards for AI-enabled systems, including agent orchestration, memory/state management, guardrails, model versioning, evaluation gates, and safe rollout practices.
Architect multi-tenant AI patterns across the platform, including per-customer data isolation, access control, model configuration, data residency, and cost allocation.
Collaborate with customers' technical teams to evaluate requirements, architect integrations, and ensure seamless deployment aligned with their existing systems.
Partner with product, data, ML, and engineering leaders to translate AI opportunities into deployable system designs; guide tradeoffs between deterministic services and model-driven components.
Create reusable reference architectures and "golden paths" for teams shipping AI features, accelerating delivery while maintaining quality standards.
Mentor engineers and foster a culture of innovation, experimentation, and technical excellence.
Ensure AI systems adhere to Ozmo’s reliability, security, and compliance requirements, including PII handling, policy enforcement, auditability, and incident response for AI components.
Contribute to long-term technical roadmaps and architectural reviews that include AI-first design principles.
Requirements
7+ years architecting and delivering enterprise-scale SaaS platforms, including multi-tenant systems, distributed architectures, and complex integrations
Hands-on experience designing AI-enabled systems: RAG architectures, LLM integrations, agentic workflows, vector databases, and AI evaluation frameworks
Deep expertise balancing AI system tradeoffs: accuracy vs. latency vs. cost, deterministic vs. model-driven components, and build vs. buy decisions
Strong foundation in distributed systems and cloud-native architecture (AWS, Azure, or GCP), including microservices, event-driven architectures, serverless patterns, API design, and domain-driven design (DDD) principles, as well as AI infrastructure patterns such as model hosting, inference optimization, observability for non-deterministic systems, and security/compliance
Deep expertise in multi-tenant SaaS architecture, including data isolation, per-customer configuration, cost allocation, and scalability patterns for AI workloads
Proven experience leading AI-enabled architecture initiatives and mentoring technical teams on intelligent system design
Track record of creating reusable reference architectures that accelerate team delivery while maintaining quality and security standards
Strong communication skills, with the ability to translate complex technical concepts for diverse audiences: engineers, executives, and customers
Passion for continuous learning, experimentation, and improving technical systems at scale
While a Bachelor's degree is preferred, we place greater value on proven, relevant experience.
Tech Stack
AWS
Azure
Cloud
Distributed Systems
Google Cloud Platform
Microservices
Benefits
Medical, vision, dental, and life insurance, along with short- and long-term disability
Plenty of paid time off (PTO) that grows the longer you’re with Ozmo, as well as paid holidays
401(k) retirement savings plan with employer matching
Paid maternity and bonding leave for new parents
Paid pawternity leave when you bring a new pet into your life
One-month sabbatical after you have been with Ozmo for five years
Flexible, remote work arrangements to support your best work