Spotify is seeking Data Engineers to join its Artist-First AI Music lab, which focuses on developing innovative generative products for music. The role involves building and maintaining large-scale data pipelines, collaborating with cross-functional teams, and ensuring data quality and reliability.
Responsibilities:
- Build and maintain large-scale data pipelines, including ML pipelines, with data processing frameworks like Scio and Python-based tools on Google Cloud Platform
- Leverage data engineering best practices in continuous integration and delivery
- Help drive optimization, testing and tooling to improve data quality and reliability
- Collaborate with engineers, product managers, subject matter experts, and stakeholders while taking on learning and leadership opportunities that arise every day
- Work in cross-functional, agile teams to continuously experiment, iterate, and deliver on new product objectives
Requirements:
- At least 3 years of professional experience working in a product-driven environment
- Experience working with high-volume, heterogeneous data using distributed systems and big data technologies, such as Python- or Scala-based frameworks (e.g., Scio, Ray, Apache Spark) or similar tools for distributed data processing
- Proficient in designing and building distributed data pipelines in Python, Scala, or Java, with experience in frameworks like Scio on platforms such as Dataflow
- Understanding of data modeling, data access, and data storage techniques, and the ability to apply them to both batch and analytical processing (e.g., using BigQuery for analysis)
- Values iterative software processes, data-driven development, reliability, and responsible experimentation, with attention to cost efficiency and best practices in data engineering
- Thrives in collaborative environments and enjoys working with cross-functional teams
- Creative problem solver who is passionate about building outstanding products that add real value to millions of people
- Enthusiastic about learning more about turning research ideas into products operating at scale