Spotify needs no introduction. One of the largest music streaming platforms, Spotify had over 248 million users on the date this blog was written and hosts over 50 million songs. I have been an avid Spotify user ever since Spotify came to India. It is my favorite music streaming platform. I have always been amazed by how Spotify seems to know me so well, while I also keep discovering new music and new artists with it. Spotify Music Recommendation has been on my reading list for quite some time. So when I asked you in last week's Instagram poll what I should write about and you picked Spotify, I was pumped even more.
Now, to cover the entire recommendation system in a short blog like this would be extremely difficult. Spotify has over five machine learning teams, each specializing in something different, with over 200 engineers in total. Spotify has many features, and each has a specialized model behind it. So I am going to pick just one: the Spotify Home.
Data and The Three Layer system
Oskar Stål speaks of the Spotify machine learning systems as consisting of three layers, on each of which various machine learning methods are applied.
The bottom layer is the data. On the content side there is the song itself, of course, along with tags, artist and album information, the lyrics of the song, and text mined from the web, including reviews and interviews.
Spotify also has complete data on its users: what music they listen to, how long they listen, which genres and artists, and so on.
The middle layer includes different models used to represent the data, or models learned from the data in the layer below. These are shared by all the feature models of Spotify, which we will see below. Some of the things being modeled include user affinity to an artist, clustering of songs based not only on audio but on all the other tags as well, and similarity between items using embeddings. These can help answer questions like: given a song, give me ten similar songs.
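To make the "given a song, give me ten similar songs" question concrete, here is a minimal sketch. It assumes each song already has an embedding vector and uses cosine similarity as the similarity measure. The song ids, the vectors, and the function names are all made up for illustration; this is not Spotify's actual model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def most_similar(song_id, embeddings, k=10):
    """Return the ids of the k songs whose embeddings are closest to song_id."""
    query = embeddings[song_id]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != song_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:k]]

# Hypothetical two-dimensional embeddings, hand-written for this example.
embeddings = {
    "song_a": [1.0, 0.0],
    "song_b": [0.9, 0.1],
    "song_c": [0.0, 1.0],
    "song_d": [-1.0, 0.0],
}

print(most_similar("song_a", embeddings, k=2))
```

In a real system the vectors would be learned and the search would run over millions of songs, so an approximate nearest-neighbour index would replace the full sort used here.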
Each of the features, e.g., Discover Weekly, Yearly Wrapped, Your weekly Mix, Home Page Personalisation, etc., has a separate model managed by an individual team. These form the topmost layer of this system.
We will be focussing on just one - the Home Screen Personalisation. You can get an overview of the rest in this talk.
The Spotify Home
The Spotify Home is the default screen that users like us come to each time we open the app. Spotify Home is such an intriguing space. Here are a few reasons why.
a. Almost every time I open the app, I am greeted by a new home. How and when does this new decoration happen?
b. Everybody's home screen is unique. I am yet to see two home screens that even resemble each other closely.
c. Spotify is one of those apps which doesn't echo too much. It shows what I like, but also something new. How does this happen?
d. This screen is the first impression each time a user opens the app. How do you engineer something so important?
This blog is a minimal answer to all those pressing questions. So let's go!
Below is a sample image of the Spotify home (from one of the slides of Ben Carterette's presentation).
We can think of the Spotify Home as a 2D array. We can also think of it as a bookshelf. Each row is a shelf. Books on a shelf are Tiles in Spotify.
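The bookshelf picture can be written down as a tiny data structure. This is only an illustrative sketch: the `Shelf` class, the shelf titles, and the tile names are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Shelf:
    """One row of the home screen: a title plus an ordered list of tiles."""
    title: str
    tiles: List[str] = field(default_factory=list)

# The home screen is a list of shelves, i.e. a (ragged) 2D array of tiles.
home = [
    Shelf("Recently played", ["Playlist A", "Album B", "Podcast C"]),
    Shelf("Made for you", ["Daily Mix 1", "Discover Weekly"]),
]

def tile_at(home, row, col):
    """Look up a tile by shelf index (row) and position on the shelf (col)."""
    return home[row].tiles[col]

print(tile_at(home, 1, 0))
```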
Now let us try to frame a very surface-level question. How can we arrange tens of millions of songs into these shelves for each user so that the user has the best possible experience, one that brings value to them? By value, we mean helping them quickly find something they will enjoy, tuned and personalized to their taste, while leaving room to explore so that they won't get bored. All of this in real time.
In comes - BaRT.
BaRT - Bandits for Recommendations as Treatments
The algorithm that guides Home Screen Personalisation today is the BaRT algorithm. You can find the official paper here.
Traditionally, recommender systems focused on just one thing: exploitation. In recommender systems, exploitation means recommending content (e.g., products, movies, music playlists) with the highest predicted user engagement. We know this is what the user will like, so we keep recommending it. This is one of the main reasons why social media spaces have become echo chambers.
Another issue with traditional recommender systems is that they can only exploit or ignore.
Recommender systems know what to do when certainty is high: they exploit items of high relevance and ignore items of low relevance. But when certainty about an item is low, they still either exploit it or ignore it. When an item that is actually high in relevance gets ignored this way, an important experience is lost.
But this isn't the best approach for Spotify. People will get bored of listening to the same songs over and over again. We need to give them room to explore a bit too, and find a way to balance both. In parallel, we need to explain the recommendations so that users can understand them. BaRT is the solution the Spotify team came up with for these problems.
BaRT - The Algorithm
The algorithm consists of four essential parts: a reward model, a stochastic policy, a counterfactual training method, and a propensity score scheme.
a. Reward Model
The job of the reward model is to predict the user response given a context and recommended item:
where X is the context (e.g. recent user listening, region, platform), A is the recommended item, R is the user response (e.g. stream = 1, no stream = 0), and θ is a set of model parameters.
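As a rough illustration of what a reward model does, the sketch below predicts the probability of a stream from a context and an item. It assumes a plain logistic model over concatenated features, which is a stand-in chosen for simplicity and not the model class used in the paper. The feature encoding and the function name are hypothetical.

```python
import math

def predict_stream_probability(context, item, theta):
    """Predicted probability that R = 1 (a stream), given context X and item A.

    context, item: lists of numeric features (hypothetical encodings).
    theta: one weight per feature, i.e. the model parameters.
    """
    features = context + item  # concatenate context and item features
    score = sum(w * f for w, f in zip(theta, features))
    return 1.0 / (1.0 + math.exp(-score))  # logistic link maps score to (0, 1)

# Toy example: two context features and one item feature.
context = [1.0, 0.0]
item = [1.0]
theta = [0.5, -0.2, 1.0]
print(predict_stream_probability(context, item, theta))
```

In practice θ would be fitted on logged impressions and stream outcomes rather than written by hand.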
b. Stochastic Policy
A stochastic policy function $\pi(s_1 s_2 \dots s_n, a_1 a_2 \dots a_n): S \times A \to [0, 1]$ is a probability distribution function that gives the probability that the action sequence $a_1 a_2 \dots a_n$ is chosen in the state sequence $s_1 s_2 \dots s_n$.
In Spotify's case, given that the reward is uncertain, how do we know when to explore and when not to? They make that decision with a stochastic policy called epsilon-greedy. It explores an item less as it learns more about the quality of that item.
Epsilon-greedy is a simple method to balance exploration and exploitation by choosing between the two randomly. Epsilon is the probability of choosing to explore, so the policy exploits most of the time with a small chance of exploring. [geeksforgeeks]
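A minimal sketch of the epsilon-greedy rule described above: with probability epsilon pick an item uniformly at random (explore), otherwise pick the item with the highest predicted reward (exploit). The item names and predicted rewards are made up, and the `rng` parameter is only there so the randomness can be controlled.

```python
import random

def epsilon_greedy(predicted_rewards, epsilon, rng=random):
    """Choose one item from a dict mapping item -> predicted reward."""
    items = list(predicted_rewards)
    if rng.random() < epsilon:
        # Explore: ignore the predictions and pick any item uniformly.
        return rng.choice(items)
    # Exploit: pick the item with the highest predicted reward.
    return max(items, key=predicted_rewards.get)

# Hypothetical predicted stream probabilities for three playlists.
predicted = {"playlist_a": 0.10, "playlist_b": 0.70, "playlist_c": 0.35}
picks = [epsilon_greedy(predicted, epsilon=0.1) for _ in range(1000)]
print(picks.count("playlist_b") / len(picks))
```

With epsilon set to 0.1, most picks are the top-scoring playlist, and the rest are spread across all three.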
c. A Counterfactual training method
From Wikipedia: a randomized controlled trial (or randomized control trial; RCT) is a type of scientific experiment that aims to reduce certain sources of bias when testing the effectiveness of new treatments; this is accomplished by randomly allocating subjects to two or more groups, treating them differently, and then comparing them with respect to a measured response.
Ideally, data should be collected through a randomized controlled trial. But we cannot recommend items to users regardless of their relevance. Hence, a bias is added and the objective is adjusted to approximate a randomized controlled trial.
Remember our stochastic policy from earlier, epsilon-greedy? This will act as the bias.
Thus, the objective function now becomes,
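d. Propensity scores (i.e., the logging policy)
Assume you are given a database of impressions and stream outcomes. The direct approach to building a recommender from this is to train a regression model on the rows of the database. Because those rows were collected under the logging policy rather than at random, each observation is weighted by the inverse of its propensity score, that is, the probability with which the logging policy showed that item.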
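Here is a small sketch of inverse propensity weighting under stated assumptions: the logging policy is epsilon-greedy over a fixed set of items, and the log is a list of (streamed, propensity) pairs. The numbers and function names are invented for illustration. The weighted average estimates the stream rate we would have seen had items been shown uniformly at random, which is the randomized-trial setting.

```python
def epsilon_greedy_propensity(item, best_item, n_items, epsilon):
    """Probability that an epsilon-greedy logging policy showed `item`."""
    propensity = epsilon / n_items      # chance of being picked while exploring
    if item == best_item:
        propensity += 1.0 - epsilon     # plus the chance of being exploited
    return propensity

def ips_stream_rate(logs):
    """Self-normalized inverse-propensity-weighted average of stream outcomes.

    logs: list of (streamed, propensity), with streamed in {0, 1}.
    """
    weighted_streams = sum(streamed / p for streamed, p in logs)
    total_weight = sum(1.0 / p for _, p in logs)
    return weighted_streams / total_weight

# Two items, epsilon = 0.2: the favoured item is shown 90% of the time.
p_best = epsilon_greedy_propensity("b", "b", n_items=2, epsilon=0.2)
p_other = epsilon_greedy_propensity("a", "b", n_items=2, epsilon=0.2)

# Hypothetical log: 9 impressions of the favoured item (all streamed),
# 1 impression of the other item (not streamed).
logs = [(1, p_best)] * 9 + [(0, p_other)] * 1

naive_rate = sum(streamed for streamed, _ in logs) / len(logs)
print(naive_rate, ips_stream_rate(logs))
```

The naive average is dominated by the item the logging policy favoured. The weighted average counts each item equally, correcting for how often it was shown.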
Now let us try to explain how this works in simple words.
Remember the shelves and the tiles (cards) on them?
All of the above are combined to give us a diminishing action space of items, given a user's past behavior. We use that to fill the cards on the shelves.
First, fill up the first card on each shelf from the diminishing action space of items. Each of these can be mapped to a title, which gives you an explanation.
Next, fill up the rest of each shelf from a diminishing action space of shelf items, i.e., items belonging to the same title as the first item.
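The two steps above can be sketched as follows. This is a simplified, purely greedy version with no exploration, assuming we already have a predicted score per item and a grouping of items by shelf title. The titles, item ids, scores, and the `build_home` function are all hypothetical.

```python
def build_home(candidates, score, n_shelves=2, n_cards=3):
    """Fill a home screen shelf by shelf.

    candidates: dict mapping shelf title -> non-empty list of item ids.
    score: function taking an item id and returning its predicted reward.
    Returns a list of (title, cards), best shelf first.
    """
    # Step 1: the best item of each title becomes its first card, and
    # shelves are ordered by how strong that first card is.
    ranked_titles = sorted(
        candidates,
        key=lambda title: max(score(item) for item in candidates[title]),
        reverse=True,
    )
    home = []
    for title in ranked_titles[:n_shelves]:
        # Step 2: fill the shelf with items of the same title. Taking them
        # in descending score order is the same as repeatedly picking the
        # best item from a shrinking set of remaining items.
        cards = sorted(candidates[title], key=score, reverse=True)[:n_cards]
        home.append((title, cards))
    return home

scores = {"a": 0.9, "b": 0.2, "c": 0.5, "d": 0.7}
candidates = {
    "Because you listened to X": ["b", "c"],
    "Jump back in": ["a", "d"],
}
for title, cards in build_home(candidates, scores.get, n_shelves=2, n_cards=2):
    print(title, cards)
```

Replacing the two greedy picks with an explore-or-exploit choice such as epsilon-greedy would bring this closer to the bandit behaviour described earlier.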
And this is how you get your Spotify home page. 🥳
Video: Music Recommendations at Spotify - Oskar Stål, Spotify
Video: Personalization of Spotify Home and TensorFlow - Tony Jebara
James McInerney's blog summarising the paper
Introduction to policy gradients
People who worked on this:
James completed his Ph.D. in AI at the University of Southampton. He then worked as a Research Associate with universities like Princeton and Columbia. He joined Spotify in 2016 as a Senior Research Scientist. He is now with Netflix.
Ben comes from a background in music, having completed his bachelor's in music at New York University. After wearing the hat of a software engineer for a while, Ben joined Seed Specific as a Senior Data Engineer. He then joined Spotify as an ML engineer in 2015.
Samantha completed her Ph.D. in Engineering Science and Mathematics from Northwestern University. She joined IBM right after and joined Spotify in 2016. She now works as a Machine Learning Engineer with LinkedIn.
Karl comes from a background in Electrical Engineering. He worked as a software developer, and then in web development, before joining Spotify as Senior Data Engineer, Personalization and Discovery in 2016. He now works with NVIDIA.
After working as an IT Consultant at Nsein Technologies for over a year, Hugues joined Yahoo as Senior Research Engineer - Geo Mining. After over 6 years with Yahoo, he joined Spotify in 2016 as a Research Scientist. Within Spotify, he has worked in Context Understanding and in Search & Recommendation Modeling, and is now working with Knowledge Representation & Management.
After serving as a Research Intern with organizations like Suplec, Alois joined niland as a Machine Learning Scientist. He later joined Spotify in 2017.
After completing his bachelor's at BITS Pilani, Rishabh did his doctorate at UCL. During this time, he worked with Microsoft as a visiting Research Scholar. He co-founded an AI research company, UserContext.AI. He later joined Spotify as a research scientist in 2017.
Tony Jebara, the current VP of Engineering, Head of ML at Spotify, earned his doctorate from MIT in 2002. After completing his doctorate, he served as Associate Professor at Columbia University and worked as Director of ML with Netflix. His important works include the Netflix Home page and the Spotify Home page.
After earning his master's degree from KTH Royal Institute of Technology, Oskar Stål joined mBlox Inc. as the Director of Development. In 2009 he joined Spotify as its Chief Technology Officer. He is now the VP of Personalisation at Spotify, leading efforts to build experiences that bring value to Spotify's users on its platform.