Using Spotify data to predict what songs will be hits (techxplore.com)
106 points by lelf on Sept 9, 2019 | 32 comments

Did I miss something or does this project include popularity measures as features?

In the section on dataset features, they include "popularity" (calculated by Spotify) as well as Billboard chart stats like weeks, rank, and a custom-made "score". To me it's not clear whether these features were hidden from the train/test sets or whether the popularity features were only used in their "artist past performance" measures.

If they included these popularity features, it's like asking "can we predict whether a song is a hit just by looking at how popular it is?" If it is the case that they peeked into the future and observed ex-post song popularity, obtaining just 89% accuracy hints at how unpredictable song success truly is. Check out [1] for a famous study of song success which experimentally demonstrates the unpredictability of song success.

[1] Salganik, M. J., Dodds, P. S., & Watts, D. J. (2006). Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market. Science, 311(5762), 854–856. https://doi.org/10.1126/science.1121066

From the paper:

>To extend previous work, in addition to audio analysis features, we consider song duration and mine an additional artist past-performance feature. Artist past-performance for a given song represents how many prior Billboard hits the artist has released before that track’s release date

emphasis mine.

I wonder how accurate a model using this feature alone would be.
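For what it's worth, a single-feature baseline like that takes only a few lines of scikit-learn. Here's a hedged sketch; the data is entirely synthetic (the real test would use the paper's dataset), and the toy relationship between prior hits and hit probability is invented for illustration:

```python
# Sketch: how predictive is "number of prior Billboard hits" alone?
# All data below is synthetic, standing in for the paper's dataset.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 2000
prior_hits = rng.poisson(0.5, size=n)   # most artists have zero prior hits

# Toy assumption: hit probability rises with past performance
p_hit = 1 / (1 + np.exp(-(prior_hits - 1)))
is_hit = rng.random(n) < p_hit

X = prior_hits.reshape(-1, 1)
scores = cross_val_score(LogisticRegression(), X, is_hit, cv=5)
print(f"single-feature accuracy: {scores.mean():.2f}")
```

Even in a toy world like this, one past-performance column buys you accuracy well above the base rate, which is consistent with the tweet-prediction results below.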

Right, this sentence made it unclear to me whether they only used the popularity features to compute past-performance, or whether they included past-performance in addition to other popularity features.

To your question, other work on success prediction of tweets [1, 2] demonstrates that past-performance is indeed much more predictive than the typical content features. This way of looking at success of "cultural products" assumes it depends to varying extents on both inherent "quality" (measured by content features), and the social processes of sharing (which are much harder to understand ahead of time, as the paper I referenced in my parent post shows).

[1] Martin, T., Hofman, J. M., Sharma, A., Anderson, A., & Watts, D. J. (2016). Exploring Limits to Prediction in Complex Social Systems. Proceedings of the 25th International Conference on World Wide Web - WWW ’16, 683–694. https://doi.org/10.1145/2872427.2883001

[2] Bakshy, E., Hofman, J. M., Mason, W. A., & Watts, D. J. (2011). Everyone’s an Influencer: Quantifying Influence on Twitter. Proceedings of the Fourth ACM International Conference on Web Search and Data Mining - WSDM ’11, 65. https://doi.org/10.1145/1935826.1935845

Hah, my friend and I did nearly the exact same project in college, though minus the publication. We had an open-ended project for an intro machine learning class we were taking.

We ended up using the Million Song Dataset (I'm not sure Spotify gave out this data six years ago), which includes various info about roughly a million songs, including artist, length, and supposedly Echo Nest API results for things like "danceability". We then merged this with a list of something like 250k play counts, only to find that the Echo Nest data was quite literally all set to null. So I went to their API, signed up for a developer key, and spent six days querying to fill out our dataset.

We were massive novices at machine learning, so we were basically just script-kiddying it, and pretty much none of the models we made over a 24ish-hour period (because we were dumb college students doing things last minute) had any significant accuracy. Finally we made a random forest model that was able to predict, with 80% accuracy, the "magnitude" of plays, i.e. roughly whether a song would get a million plays or a thousand.

When we broke it down (model explainability is an awesome feature), we found that out of everything interesting we had done with feature investigation, data cleaning, etc., the model was about 90% based on which artist made the song. In retrospect that makes sense, in a sort of cynical way: even a great song by an unknown artist rarely makes it big. The moral of the story, I guess, is that machine learning isn't magic.
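That kind of breakdown is easy to reproduce in scikit-learn. Here's a minimal sketch (all data and feature names are synthetic, invented for illustration) of binning plays into orders of magnitude and then reading off feature importances:

```python
# Sketch: classify order-of-magnitude of play counts, then inspect
# which features dominate. All data here is synthetic for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
n = 5000
artist_popularity = rng.random(n)        # stand-in for artist identity
danceability = rng.random(n)
duration = rng.normal(240, 60, size=n)   # seconds

# Toy world where plays depend almost entirely on the artist
log_plays = 3 + 4 * artist_popularity + 0.2 * danceability + rng.normal(0, 0.3, n)
magnitude = log_plays.astype(int)        # e.g. 3 -> thousands, 6 -> millions

X = np.column_stack([artist_popularity, danceability, duration])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, magnitude)

for name, imp in zip(["artist", "danceability", "duration"], clf.feature_importances_):
    print(f"{name:12s} {imp:.2f}")
```

In a toy setup like this the artist column unsurprisingly swallows most of the importance, which mirrors the 90% figure above.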

I still have all the data, and I've been meaning to revisit it now that I actually have a better understanding of the field. It's on my list of things to revisit/do, a very long list

Ha. I did this exact same thing for a project in college using echonest and linear regression. In the end, we were unable to find a single statistically significant coefficient. We ended up having to change our project completely. Kudos to your team for finding something there

I also did something similar in college but, due to similar issues, pivoted to genre classification with extracted audio features. With that, though, I was actually able to get a pretty accurate classifier going.

First experiment I would do is remove the artist field from the input :)

I also wanted to do this for the "eventual" revisit. We also wanted to try more computationally intensive training systems, including possibly neural networks, but my poor 4th gen i5 just could not cope. I'm waiting for AMD accelerated training to be mostly trivial, so possibly forever.

And maybe eliminate the top 100 artists?

"90% based on which artist made the song"

Doesn't that demonstrate that the actual business case, producers discovering new artists, doesn't even factor into the model's task of discovering which new songs will actually be hits?

It seems like the former is much harder than the latter in this case.

I think what you will find is that the process of an artist being discovered is basically the same as getting into YC.

It's about the artist and whether they have star quality and are saleable. There are plenty of song writers to write the actual songs.

Classic case is someone like Sia Furler who has written a ton of hits for other artists.


Keep in mind that our results and findings, while making "sense" post-hoc, are predicated on our process and procedure being correct, which honestly, as complete newbies to the whole field, was roughly a coin flip.

Not only that, but sales figures and marketing spend are also not included.

An interesting '06 study. Participants were divided into groups and given access to obscure online songs. Some groups could see number of prior downloads.

Song popularity in different groups was very weakly correlated. Some of the same songs would sometimes be hits and sometimes busts for no apparent reason.

In other words, even song popularity (in an i.i.d. trial) doesn't predict song popularity.


It’s crazy how much sway Spotify and other streaming apps have over what music becomes popular. Radio pop hits aside, I know lots of people whose music preference is something along the lines of “a few bands I like and their associated Spotify recommendations”. Music preferences and trends have always been culturally motivated, both bottom-up through grassroots word of mouth, and top-down by those creatively in charge at record labels.

But now with algorithmic suggestions there is a third trend-motivator, which is music listeners and music producers reflecting on the overall system for statistical trends and correlations. I expect this motivator to push out the other two because it makes labels more (and safer) money and consumers seem to like it. The conditions for success are now highly metricked and quickly A/B testable. It’s like the internet before and after PageRank—and just the same, I would expect a cottage industry of “SEO” to pop up for the music industry.

I'll take algorithms over payola. At least they suggest things to me that I'd like to hear.

No, algorithms can be used as tools to hide payola far better. If someone subtly tweaks the algorithm to favor a certain artist, how do you ever catch that?

It's particularly concerning because there was a study showing that, across different control groups of music listeners, new songs became popular just based on which ones became popular earlier.

I mean, try this trick on yourself: find a random song and listen to it 20 times. Odds are you'll like it a lot better after the first several listens than the first time.

If in addition to that all your friends happened to know and like the song, and they started advertising the artists heavily, bam! It's a hit. It's just groupthink, and we're all susceptible to it.

Really? Ed Sheeran is so good that he had 5 of the top 10 hits simultaneously?

If that's the end result, I think we need to bring back payola.

Meh? Charts have always been poor representations of musical quality/popularity. I use Spotify because the algorithms suggest good music based on my taste, and despite the weird polarity of it they always find new music that interests me.

I have no problem with Spotify's algorithms. They do a very good job with a very difficult problem.

When I was 16, two decades ago, I loved music: I went to concerts, knew the local bands and their entourages, visited the library to borrow CDs to copy onto tape, etc. I still discover albums by my favorite bands that I seem to have missed. All this to say that discoverability is such a big plus in my book that I would be happy for Spotify to take over radio in the lives of those less musically interested.

Agreed! I discover many songs through friends who are big on Spotify and it's great. Though my personal approach to discovering new music now is going through the full discographies of artists I love (so neat "seeing" their evolution), and then Shazaming any great song I hear while out and about (and then YouTubing the heck out of that artist to quickly mine for any other potential treasures). So far this has been a wonderful approach, especially since I inevitably end up researching the artists and songs I go through, learning so much more about the background stories like song meanings and the issues of those times.

As an aside, and speaking of the nonmusical elements that make a song popular, three other things do it for me. The first is when I heard that particular song (I'm a huge fan of Garth Brooks' "Callin' Baton Rouge" because it was what my host parents played during a fantastic summer I spent on exchange). The second is the CD cover or other imagery associated with the artist (it actually makes me sad to delete humdrum songs from my library if they have fantastic album covers). The third is the subsection of songs I grew up with, which is different from the summer-exchange case because these are emotionally tied to a time in my life when everything was fresh and exciting; they were not just theme songs but entire albums running through the background of my youth. Objectively speaking, there's nothing too special about Oasis or Weezer or ATB or Robert Miles or Darude or Aqua except that they found me at a particularly impressionable time in my life. Simon and Garfunkel are in this gathering as well, but it's just sheer luck that my friends were turning hipster and so I was exposed. They were actually what caused me to spring off from the popular tunes of the day into more timeless classics : )

Yeah, I didn’t mean that it’s worse or better, only different!

I'm not sure if it's better or worse than how it was back in the big radio days, where records would get play based on kickbacks, swag, and brown paper bags with drugs or cash.

Isn't this basically what radio, mtv, and every music technology does in a way?

Yes, but there was always a human at MTV deciding which videos to play!

Per the paper, in regards to fixing the data set imbalance:

> In order to balance our data, we randomly sampled 12,000 non-hits from the Spotify data and created a new dataset. This dataset contained approximately 12k non-hits and 12k hits (∼24k tracks total).

Won't an indiscriminate random sampling of the non-hits introduce a temporal sampling bias, since a) music trends change over time and b) music output is not equal across years?

I was thinking along the same lines but in general.

How do you account for the fact that humans judge and like new songs against what they've already heard, which excludes songs from the future they haven't heard yet?

Do we trust that SVM/NN account for it somehow?

Is there a way to limit the corpora or have a point-in-time training set, i.e. train only against inputs that precede each track's release?

Of course this could just be a fine tuning knob against the other temporal style/fashion trends you would expect in the data-set.
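A point-in-time evaluation is straightforward to set up: train only on tracks released strictly before a cutoff, test on tracks at or after it, and optionally roll the cutoff forward. A sketch (field names are assumptions):

```python
# Sketch: temporal train/test split so a model is only ever evaluated on
# tracks released after everything it was trained on.
# The "year" field name is an assumption for illustration.
def temporal_split(tracks, cutoff_year):
    """Train on tracks released before cutoff_year, test on the rest."""
    train = [t for t in tracks if t["year"] < cutoff_year]
    test = [t for t in tracks if t["year"] >= cutoff_year]
    return train, test

def rolling_evaluation(tracks, cutoff_years):
    """Re-split at each cutoff, mimicking predicting each year from its past."""
    return [temporal_split(tracks, y) for y in cutoff_years]
```

Unlike a random split, this guarantees the model never sees stylistic trends from "the future" of any test track.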

So, more motion towards ever more boring music in bulk and more marginalisation of quirky, weird, different, interesting, intelligent music. Yeah, progress. :-(

Step 2: Train a GAN to try to fool the classifier: instant hit generator?

Check the new hit by artist Justin Timberbrook: Baby You!



Hey baby you \ Summer sun beatin' down \ I texted my ex the other night \ Going down to the bar to see if I can hook up

<1 second burst of static>

I like the way you shake it \ (Shake it, shake it, shake it) \ Got dolla dolla bills in area codes \ Yeah

<Sirens and machine gun noises>

Not going to lie, you got me. Thought that was real until the last line.

I really want to know what it would sound like. Given the images that image classifiers make, I can only imagine it would be "spooky" and "strange" in an uncanny valley way?
