A Place for Mom is the leading platform guiding families through every stage of the aging journey. They are seeking a Staff Software Engineer to join their Agentic Platform team, responsible for building foundational AI capabilities and infrastructure. The role involves designing and building core platform primitives, ensuring safety and compliance, and collaborating with product teams to enhance the platform's usability.

Responsibilities:

Design and build core platform primitives including provider abstraction layers (OpenAI, Anthropic, Google), structured output validation, streaming infrastructure, and token management systems
Own safety and compliance infrastructure including composable guardrail systems, PII detection/redaction, audit logging, and privacy-first observability that never leaks sensitive data to third parties
Build evaluation infrastructure that enables systematic quality measurement for non-deterministic LLM outputs—datasets, scorers (exact match, LLM-as-judge, schema validation), CI/CD integration, and regression detection
Lead churn containment strategy: design provider adapters and SDK architecture that absorb rapidly changing LLM provider SDKs without breaking consuming applications
Architect prompt lifecycle management systems including version control, Langfuse integration, GitHub-based review workflows, and deployment pipelines
Design Agent-as-a-Service infrastructure for long-running async tasks using AWS EventBridge, DynamoDB, and PostgreSQL
Collaborate with consuming teams to understand their needs, onboard them to the platform, and provide technical support
Influence architecture, technology selections, and engineering standards across the broader organization
Create reference implementations and technical documentation that enable other engineers to successfully adopt the platform
Champion quality engineering practices including comprehensive testing, type safety, and observability

Requirements:

8+ years of software engineering experience with significant time spent building platform infrastructure, developer tools, SDKs, or distributed systems
Production experience with LLM/AI systems—you've built and operated systems using OpenAI, Anthropic, or similar providers, and understand the unique challenges (token limits, non-determinism, provider outages, model deprecations)
Strong TypeScript expertise—this is our company standard, and you'll be designing APIs that other TypeScript developers consume
Experience designing APIs and abstractions that other engineers love to use—you understand the balance between power and simplicity
Understanding of safety and compliance in AI systems—PII handling, guardrails, audit logging, and responsible AI practices
Experience with event-driven architectures and async processing patterns (EventBridge, SQS, or similar)
Understanding of observability and monitoring for distributed systems—metrics, tracing, alerting, and debugging production issues
Strong communication and technical writing skills—ability to document systems clearly and work with internal customers across multiple teams
Track record of technical leadership with or without formal management: influencing architecture, mentoring engineers, and driving technical decisions
Experience with cloud infrastructure (AWS preferred: Fargate, DynamoDB, RDS, S3, EventBridge)
Experience building SDK or platform products consumed by multiple teams
Experience with prompt engineering, prompt management systems, or LLM evaluation frameworks
Familiarity with NestJS, Prisma, or similar TypeScript backend frameworks
Experience with streaming architectures (SSE, WebSockets) for real-time AI applications
Background in building multi-tenant platform infrastructure
Experience with hexagonal architecture / ports and adapters patterns
Contributions to open-source LLM tooling or frameworks
