Pindrop is the Real Human + Right Human® Identity Trust Platform for the AI era, focused on identity verification and deepfake detection. The company is seeking a Staff Software Engineer for its Authentication team to design and operate real-time distributed authentication services, ensure operational reliability, and lead cross-functional initiatives.
Responsibilities:
- Own one or more core authentication domains end-to-end across design, implementation, migrations, deprecations, and longer-term technical direction
- Design and operate real-time distributed authentication services as cloud-native, containerized microservices with explicit behavior under load, partial failure, and degraded dependencies
- Take full operational ownership, including on-call that covers nights and weekends, after-hours releases, incident response, postmortems, and lasting fixes
- Define and execute safe change strategies for auth and model releases through staged rollouts, production validation, clear rollback criteria, and rehearsed playbooks
- Work with research and ML teams to ship models into auth products and to use language-model-based tools where they measurably improve incident handling, log and metric analysis, runbooks, or policy and rules workflows
- Lead cross-functional initiatives that standardize auth APIs and policies, simplify ML-powered decision paths, and reduce operational and integration overhead across engineering, product, research, and customer-facing teams; back those initiatives with empirical evidence tied to clear business outcomes from design through adoption
Requirements:
- 8+ years of software development experience
- Significant experience designing and operating latency-sensitive backend APIs or services at scale in domains such as authentication, payments, or risk
- Hands-on production MLOps experience with reproducible training, data and model versioning, promotion gates, online monitoring tied to business outcomes, and rollback for regressions
- Experience using large-language-model-based tooling (for example, AI-augmented design and development with Claude Code or Codex) or similar techniques in production or internal workflows such as incident and log summarization, runbook and documentation assistance, or structured decision support, with attention to evaluation, guardrails, and failure modes
- Strong operational instincts, including ownership of on-call for critical services, leading incidents, and turning runbook, alert, and SLO work into durable reliability patterns rather than one-off fixes
- Strong programming and debugging skills in at least one modern backend language (Go and Python are common in our stack), and comfort designing and debugging cloud-native, containerized, and asynchronously connected services using queues, streams, or workflows
- Familiarity with identity, security, or fraud detection domains is a plus but not required