Judi Health is an enterprise health technology company providing a comprehensive suite of solutions for employers and health plans. They are seeking a Senior Scalability Engineer focused on caching infrastructure and performance optimization to own the architecture and improvement of caching solutions across their platform.

Responsibilities:

Own caching infrastructure: Design, implement, and maintain caching architecture using Valkey/Redis (ElastiCache) for high-throughput healthcare applications processing millions of transactions per day
Build shared libraries: Develop and evolve caching libraries and patterns used across multiple engineering teams, establishing best practices for cache key design, invalidation strategies, and performance monitoring
Partner with engineering teams: Work directly with product teams to design and implement caching solutions tailored to their specific use cases, providing technical guidance and hands-on support during implementation
Drive performance optimization: Conduct deep performance analysis using profiling tools to identify bottlenecks beyond caching—database queries, application code, infrastructure—and deliver measurable improvements
Establish performance standards: Define performance benchmarks, implement monitoring and alerting, and help teams measure the impact of optimizations through data-driven analysis
Contribute to observability: Enhance observability infrastructure (LGTM stack) to track cache hit rates, latency patterns, and system performance metrics across the platform
Demonstrate technical leadership: Mentor engineers on performance best practices, lead architecture reviews, and represent the Scalability team in cross-functional planning discussions
Responsible for adherence to the Capital Rx Code of Conduct, including reporting of noncompliance

Requirements:

10+ years of software engineering experience with demonstrated progression into technical leadership roles
3+ years of experience leading technical initiatives, mentoring engineers, or serving as a subject matter expert on complex systems
Strong expertise in Python (Flask/SQLAlchemy) for production applications
Deep PostgreSQL knowledge: Advanced query optimization, indexing strategies, triggers, stored procedures, plan analysis, and experience with replication and clustering
Production caching experience: Proven track record designing and implementing caching strategies at scale using Redis, Valkey, Memcached, or similar technologies
Performance optimization expertise: Demonstrated ability to profile applications, identify bottlenecks, and deliver measurable performance improvements (latency reduction, throughput gains, cost savings)
AWS experience: Production experience with Aurora RDS, Lambda, ElastiCache, EC2, ECS, and S3
Systems thinking: Ability to analyze performance across the full stack — application, database, caching, infrastructure — and make architectural tradeoffs
Collaboration and communication: Strong written and verbal communication skills, with the ability to work autonomously while driving proactive collaboration in a remote environment
Rust development experience or strong interest in learning Rust for high-performance systems
Infrastructure as code: Experience with Terraform or similar IaC tools for managing cloud infrastructure
Observability tools: Hands-on experience with Grafana, Prometheus, Loki, or similar monitoring/alerting platforms
High-throughput systems: Background in building systems that handle millions of requests per day with strict SLA requirements
Previous Pharmacy Benefits Manager (PBM) or healthcare technology experience

Senior Scalability Engineer - Caching & Performance Optimization
