Predicting Song Popularity Using Spotify Data

The world is becoming increasingly data-obsessed, and the music industry is no exception. Data is beginning to influence the very nature of the creative process behind making music. There is now a ‘formula’ for what is considered a good track, one that is meant to predict how likely an audience is to take a liking to that track.

Let’s consider Spotify, the ‘Netflix’ of music streaming platforms. Not only does Spotify provide access to a vast library of songs from the past and present, it also listens and observes your behaviour at an intricate level. It gathers information on the songs you listen to, how far into a song you get before you skip, and how often you listen to a song. It then takes that information and looks for other songs that you may enjoy, packaging them on a weekly basis in a playlist called ‘Discover Weekly’.


Spotify also compares you to other users who share your taste in music. If someone named Carter has listened to a similar set of songs as you, the two of you are likely to share the same taste, so the songs Carter has listened to that you have not are recommended to you. This is done using data gathered from millions of users.
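To make the Carter example concrete, here is a minimal sketch of that idea, usually called user-based collaborative filtering. It is only an illustration and not Spotify's actual system. The user names, song names, play counts, and the `recommend` function are all made up for this example, and cosine similarity is just one common way to compare two listeners.

```python
from math import sqrt

# Hypothetical listening history: user -> {song: play count}.
# These names and numbers are invented for illustration only.
plays = {
    "you":    {"song_a": 5, "song_b": 3, "song_c": 4},
    "carter": {"song_a": 4, "song_b": 2, "song_c": 5, "song_d": 6},
    "dana":   {"song_e": 7, "song_f": 1},
}


def cosine_similarity(a, b):
    """Cosine similarity between two {song: play count} dicts."""
    shared = set(a) & set(b)
    dot = sum(a[song] * b[song] for song in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, plays):
    """Rank songs the user has not heard, weighted by listener similarity."""
    mine = plays[user]
    scores = {}
    for other, theirs in plays.items():
        if other == user:
            continue
        similarity = cosine_similarity(mine, theirs)
        if similarity <= 0:
            continue  # no overlap in taste, so ignore this listener
        for song, count in theirs.items():
            if song not in mine:
                scores[song] = scores.get(song, 0.0) + similarity * count
    return sorted(scores, key=scores.get, reverse=True)


recommendations = recommend("you", plays)
print(recommendations)
```

In this toy data, Carter overlaps heavily with you while Dana does not overlap at all, so only Carter's unheard song is suggested.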

But it gets a little more complicated than that. Each song is broken down into various attributes, each rated from 0 to 1. For example, energy is an attribute. If the song has a ton of energy, it will be rated closer to 1. If it is low in energy, it will be rated closer to 0. Measuring energy requires deeper analysis of the track: its speed, how loud it is, the amount of noise, onset rate, timbre, and so on. Of course, Spotify’s algorithm does this automatically for each track on the platform. Other attributes that each track is broken down into include danceability, valence, speechiness, and acousticness.
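As a rough illustration of what such a breakdown looks like as data, the sketch below stores five attributes for one track and checks that each sits between 0 and 1. The `TrackAttributes` class, the 0.5 cut-off for "high" versus "low", and the example values are assumptions made for this sketch; they are not taken from Spotify or from the chart in this post.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TrackAttributes:
    """Hypothetical container for one track's 0-to-1 attribute ratings."""
    energy: float
    valence: float
    danceability: float
    acousticness: float
    speechiness: float

    def __post_init__(self):
        # Every attribute is expected to be rated from 0 to 1.
        for field in fields(self):
            value = getattr(self, field.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(
                    f"{field.name} must be between 0 and 1, got {value}"
                )

    def describe(self, threshold=0.5):
        """Label each attribute 'high' or 'low' (threshold is an assumption)."""
        return {
            field.name: "high" if getattr(self, field.name) >= threshold else "low"
            for field in fields(self)
        }


# Invented values for a single imaginary track.
example_track = TrackAttributes(
    energy=0.82, valence=0.61, danceability=0.77,
    acousticness=0.08, speechiness=0.12,
)
labels = example_track.describe()
print(labels)
```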


What does this mean for artists?

The conversations being held between artists, labels, and streaming platforms are taking on a whole new meaning. Spotify now has the power to push your music to the audience most likely to love it, based on data gathered on their taste in music. The amount of data being gathered on this music platform (not only on user behaviour, but also on specific song attributes) is both scary and fascinating. And this is not unique to Spotify: all music streaming platforms gather data on tracks as well as on their users.

It’s likely that artists’ teams are going to have to become more data-centric. Artists will need an analyst to interpret how their fans interact with their music. They will have to ask the right questions and use the available data to increase the bottom line and ensure that their music is reaching the right people at the right time.

With all this in mind, and being a visual learner, I was very curious to know what the shape of popular hip-hop songs looks like.

This will make more sense in a second.

I gathered data on the song attributes of the top 25 songs on Billboard’s Top 100 Hip Hop Tracks for the last three years. I cleaned up the data, chose the attributes that matter, and visualized it. Here it is:

[Figure: attribute profile of popular hip-hop songs (Predicting Song Popularity Using Spotify Data Hip Hop.png)]

This is what current popular hip-hop music looks like. It is high in energy, valence, and danceability. It is low in acousticness and speechiness.
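One plausible way to arrive at a profile like this is to average each attribute across the collected songs. The sketch below shows that step only; it assumes simple averaging, which may not be exactly how the chart above was produced. The three rows are made-up placeholders, not the Billboard data behind the figure, and `average_profile` is a hypothetical helper name.

```python
from statistics import mean

ATTRIBUTES = ["energy", "valence", "danceability", "acousticness", "speechiness"]

# Made-up placeholder rows, NOT the data used for the chart in this post.
tracks = [
    {"energy": 0.80, "valence": 0.60, "danceability": 0.90,
     "acousticness": 0.10, "speechiness": 0.20},
    {"energy": 0.70, "valence": 0.50, "danceability": 0.80,
     "acousticness": 0.20, "speechiness": 0.30},
    {"energy": 0.60, "valence": 0.70, "danceability": 0.70,
     "acousticness": 0.30, "speechiness": 0.10},
]


def average_profile(tracks, attributes=ATTRIBUTES):
    """Average each attribute over all tracks to get one 'shape' per genre."""
    return {name: mean(track[name] for track in tracks) for name in attributes}


profile = average_profile(tracks)
for name, value in profile.items():
    print(f"{name:>13}: {value:.2f}")
```

Plotting the resulting values on a radar chart, one axis per attribute, would give a single outline for the whole set of songs.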

Does this mean that we can predict the popularity of a song? I mean, couldn’t rising and established artists use this data when creating music to ensure that their songs are a hit? Established artists have more resources, so are they already using this data in creating tracks? Do Drake or The Weeknd know the exact combination of danceability, valence, speechiness, acousticness, and energy that their fans are inclined to, and create using that data? Once the general public becomes more data-savvy, it is safe to say that some artists will publicly opt out of using data within their creative process, and some won’t.

The bigger question is, should artists use data when it comes to the creative process behind their music? It is hard to deny that some authenticity is lost if they do, but failure is almost guaranteed if they don’t, at least to some degree. It makes me wonder about the songs that I identify and emotionally connect with. Did these tracks come from a genuine place, stemming from an experience that the artists went through, or are they just data in the form of sound that triggers an emotional connection within me based on research and other users’ behaviour?

Who’s to say? And if we can’t tell the difference, does it even matter?

Del Mahabadi


Spotify for Developers:

Nielsen Holdings Inc.: An American information, data, and measurement company