Attentive is an AI marketing platform for 1:1 personalization that is redefining the way brands and people connect. The company is seeking a Staff Software Engineer to lead the evolution of its streaming platform, focusing on architectural design and modernization to improve reliability and scalability.
Responsibilities:
- Architect and evolve Attentive’s next-generation event streaming platform: Design high-throughput, low-latency, and cost-efficient solutions that power mission-critical products and use cases across Attentive’s ecosystem
- Enhance streaming developer experience: Build and refine self-serve tools for event observability, debugging, load testing, and system configuration, empowering teams to experiment independently and ship quickly
- Simplify and modernize streaming architecture: Remove unnecessary abstraction layers, enable direct access for power users, and ensure the platform is flexible for both 'paved path' and advanced use cases
- Solve complex distributed systems challenges with primitives for reliable stream processing, such as rate limiting, deduplication, and delayed message delivery
- Champion best practices and technology selection: Stay ahead of industry advancements in event streaming, advocating for tools and approaches that balance innovation with long-term reliability
- Collaborate across engineering: Partner with product, data, and infrastructure teams to launch new customer-facing features, integrations, and scalable solutions built on streaming infrastructure
Requirements:
- 10+ years of experience architecting and supporting high-throughput distributed systems at scale, especially those involving event streaming or messaging platforms
- Strong expertise in the internals, tradeoffs, and operating models of distributed streaming technologies such as Kafka, Flink, Pulsar, and/or Spark
- Proven track record of leading major platform or architectural initiatives that span multiple teams, including modernization, migration, simplification, or adoption of new infrastructure patterns
- Deep experience designing systems for scale, reliability, debuggability, and efficiency, including handling high-throughput workloads and complex failure scenarios in production
- Strong proficiency in Java and backend systems design, with the ability to work across application, platform, and infrastructure layers
- Ability to debug and optimize end-to-end streaming systems, from schemas and serialization to consumer behavior, JVM performance, networking, and infrastructure bottlenecks
- Familiarity with resource scheduling, data locality, and how infrastructure choices impact cost and system behavior
- Experience with observability and developer tooling for streaming (e.g., tracing, metrics, replay)
- Infrastructure-as-code expertise (Terraform, Helm) and comfort with Kubernetes (EKS) and cloud-native environments
- Demonstrated ability to influence technical strategy, communicate tradeoffs clearly, and lead through collaboration rather than authority
- Excitement for tackling ambiguous, high-impact platform problems in a fast-moving environment, with sound judgment about where to innovate versus standardize