About this role

Databricks is the data and AI company that empowers data teams to solve complex problems. The role involves building next-generation distributed data storage and processing systems that enhance relational query performance while supporting diverse workloads.

Responsibilities:

Building next-generation distributed data storage and processing systems that can outperform specialized SQL query engines in relational query performance
Providing reliable, high-performance services and client libraries for storing and accessing massive amounts of data on cloud storage backends
Developing a storage management system that combines the scale and cost-efficiency of data lakes with the performance and reliability of a data warehouse
Making it simple to orchestrate and operate tens of thousands of data pipelines through a higher-level abstraction for expressing data pipelines
Building a next-generation query optimizer and execution engine that is fast, tuning-free, scalable, and robust

Requirements:

BS (or higher) in Computer Science, a related technical field, or equivalent practical experience
Comfortable working towards a multi-year vision with incremental deliverables
Motivated by delivering customer value and impact
8+ years of production-level experience in Java, Scala, or C++
Strong foundation in algorithms and data structures and their real-world use cases
Experience with distributed systems, databases, and big data systems (e.g., Apache Spark, Hadoop)

Staff Software Engineer - Distributed Data Systems
