Databricks is the data and AI company that enables data teams to solve the world's toughest problems. As a software engineer on the Runtime team, you will build next-generation distributed data storage and processing systems that improve performance and support diverse workloads.
Responsibilities:
- Develop the de facto open source standard framework for big data
- Provide reliable, high-performance services and client libraries for storing and accessing very large amounts of data on cloud storage backends, e.g., AWS S3, Azure Blob Store
- Build a storage management system that combines the scale and cost-efficiency of data lakes, the performance and reliability of a data warehouse, and the low latency of streaming
- Make it simple to orchestrate and operate tens of thousands of data pipelines
- Build the next-generation query optimizer and execution engine that is fast, tuning-free, scalable, and robust
Requirements:
- BS (or higher) in Computer Science or a related technical field, or equivalent practical experience
- Comfortable working towards a multi-year vision with incremental deliverables
- Motivated by delivering customer value and impact
- 8+ years of production-level experience in Java, Scala, or C++
- Strong foundation in algorithms and data structures and their real-world use cases
- Experience with distributed systems, databases, and big data systems (Apache Spark, Hadoop)