Alluxio powers the data layer for modern AI and analytics. The company is seeking a Senior Software Engineer to work on high-impact systems problems, including optimizing metadata management and designing fault-tolerant services for multi-region environments.
Responsibilities:
- Cache and metadata enhancements - design and implement improvements to caching policies, eviction logic, and metadata scalability to increase performance and reliability
- Data path optimization - refine I/O pipelines for S3/GCS/HDFS/POSIX to reduce latency and improve throughput using concurrency and scheduling techniques
- Distributed systems reliability - strengthen consistency, replication, and fault-tolerance mechanisms across large-scale clusters
- Feature development and integration - collaborate with product and solution-engineering teams to deliver features that support AI and analytics workloads
- Code quality and peer collaboration - participate in design reviews, provide constructive feedback, and ensure robust testing and observability in production systems
- Design, build, and optimize distributed components within Alluxio’s orchestration layer
- Investigate performance bottlenecks and propose scalable solutions using profiling, tracing, and benchmarking tools
- Collaborate cross-functionally with fellow engineers, architects, and the open-source community to drive improvements
- Contribute to releases and stability efforts, ensuring enterprise-grade reliability across global deployments
Requirements:
- Strong computer-science fundamentals and a passion for large-scale distributed systems
- Professional experience developing in Java, C++, or Go
- Practical knowledge of concurrency, replication, distributed coordination, and performance tuning
- Experience with distributed storage, caching, or data-access layers (e.g., Spark, Presto, Hadoop, Kubernetes)
- Bachelor's or advanced degree in Computer Science or a related technical field (or equivalent experience)