Design and implement extremely high-volume, fault-tolerant, scalable backend systems that process and manage petabytes of customer data.
Improve the algorithms that elastically schedule load across clusters of thousands of machines at runtime.
Improve systems to provide performance guarantees to customers in a shared-everything multi-tenant architecture.
Lead and contribute to re-architecting our internal message-processing technology to petabyte-per-day scale.
Help manage exabytes of data using the latest technologies, such as Kafka, Kubernetes, and Docker.
Work across Sumo, interfacing with multiple teams, including Search, Security, and Metrics & Tracing, to identify requirements and architect solutions that meet their core data-ingest needs.
Requirements
B.S. or higher in Computer Science or a related discipline (M.S. a plus)
5+ years of industry experience with a proven track record of ownership and delivery
Experience developing scalable distributed data processing solutions
Experience in multi-threaded programming
Experience running large, scalable distributed services built on a microservice architecture
Willingness to participate in occasional on-call rotations, and experience doing so. Rotations are scheduled approximately every 6-8 weeks and run for two weeks: one week as primary, then one week assisting the primary only if needed, with 12-hour shifts starting between 9 and 11am PDT/MDT/CDT.