Recommending music on Spotify with deep learning (benanne.github.io)
140 points by benanne on Aug 5, 2014 | hide | past | favorite | 23 comments

This would be awesome. The one problem I've had with Spotify's radio service is that it only picks up on genres and such. Many times I might like only 1 or 2 bands from a more obscure genre, but Spotify radio does not understand this no matter how many songs I downvote.

In fact, my "starred" playlist radio is currently stuck on some kind of 80s metal thing. I have no idea how it got that way, but it refuses to play anything but metal, when that's not really what I like. If I make a copy of my starred playlist and make a radio from the new playlist, it reflects the actual songs at hand.

There's also the matter of unique bands that, while fitting into a very broad genre, don't really fit into any widely known sub-genre, which makes radio for the band almost useless in many cases. The "related bands" feature helps a bit, but it only works with popular bands and is often badly mismatched.

I've had a similar experience. At one point the radio was 90% Childish Gambino despite my playlist containing only one of his songs.

This is all great stuff. But until Spotify fixes some very low-hanging-fruit problems, such as never again playing a song after I click "don't like", I find it hard to believe they will use anything remotely as complex as this.

Well yes, but it is "just" an internship, in which you can work on something other than what the company is currently doing. I hope they eventually will use this work, after fixing the low-hanging fruit.

This was my thought exactly. It seems like they're putting the cart before the horse a bit.

Great post! It was fun to see 4 tracks in the (first convolutional layer) filter 242 set that I recognized from my own 'ambient' Spotify playlists, and it was pretty impressive at the topmost layer as well. Loved the approach of looking at a few tracks that maximally or near-average activate each filter.

Curious if you think the low-level features learned from the vector_exp latent factors are different from, say, unsupervised learning with sparse autoencoders? For example, are there phonemes associated with Chinese pop or Spanish rap that are learned at a low level, that the network might not learn with "unlabeled" data?

I haven't tried it, so I can't say for sure, but my intuition is that learning to predict the latent factors is a much less 'complex' task than learning good features to reconstruct the input (i.e. the spectrograms), in terms of the required capacity of the model.

With a purely unsupervised approach, you are basically wasting capacity on modeling aspects of the data that are relevant for reconstructing the input, but not for solving the task at hand. With the supervised approach, by contrast, the model doesn't have to care about precise pitches and timing, because those are not relevant for recommendation (and latent factor prediction) anyway, so no model capacity is wasted on them. With a fairly complex task such as this one, I think that probably makes a big difference.
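To make the capacity argument concrete, here is a toy numpy sketch (all shapes and weight matrices are made up for illustration, not taken from the post): an autoencoder-style objective must hit a reconstruction target as large as the input itself, while latent factor prediction only has to hit a 40-dimensional target, so anything in the input that doesn't help predict the factors can be ignored.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy shapes (hypothetical): 8 spectrogram excerpts of 100 frames x 128
# mel bands, flattened, and 40 collaborative-filtering factors per track.
X = rng.standard_normal((8, 100 * 128))
factors = rng.standard_normal((8, 40))

def mse(a, b):
    return np.mean((a - b) ** 2)

# Unsupervised objective: reconstruct the full input from a 40-dim code.
# The target has 12,800 dimensions, so capacity goes into every detail of X.
W_dec = rng.standard_normal((40, 100 * 128)) * 0.01
code = rng.standard_normal((8, 40))
reconstruction_loss = mse(code @ W_dec, X)

# Supervised objective: regress only the 40 latent factors.
# Anything in X that doesn't help predict them can simply be ignored.
W_reg = rng.standard_normal((100 * 128, 40)) * 0.01
prediction_loss = mse(X @ W_reg, factors)
```

The contrast is in the target dimensionality (12,800 vs. 40), not in the specific losses, which are plain MSE in both cases here.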

Is it possible to get my own listening history natively from Spotify?

I'm not interested in scrobbling to another service, etc. This seems like it should be a well-documented, widely used part of the Spotify API, but as far as I know, it does not exist.

I once heard that they don't store that info, though I would love to be proven wrong.

They definitely store it, as you can find it on the Play Queue page under the "History" tab.

I think this data is only local to each computer. I use Spotify on two computers and a phone, and the history tab does not sync between these devices.

The web API [1] looks like it just lets you play with a user's playlists.

I agree listening history would be more useful.

[1] https://developer.spotify.com/web-api/

So what's the activation function on the last layer? Linear?

With the intriguing exception of the Global Temporal Pooling layer, this matches up with a lot of my ideas for music analysis. Nice work!

Yep, it's linear. This is essentially a regression task and the distribution of the factors across the dataset is pretty close to Gaussian in most cases, so it made sense not to have any nonlinearities there.

As a sidenote, if it weren't for the L2-pooling in the global temporal pooling layer, the network would be completely piecewise linear from input to output :)
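For reference, a global temporal pooling stage can be sketched in a few lines of numpy. Pooling the feature maps over the time axis with several statistics and concatenating the results makes the output independent of track length; the exact combination of statistics below (mean, max, and L2) is my assumption for illustration. Note that the L2 term is the only one that isn't piecewise linear:

```python
import numpy as np

# Hypothetical feature maps: a batch of 4 tracks, 256 filters, 60 time steps.
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 256, 60))

# Global temporal pooling: collapse the time axis with several statistics
# and concatenate them, so the result no longer depends on track length.
mean_pool = feats.mean(axis=2)                 # piecewise linear in the input
max_pool = feats.max(axis=2)                   # piecewise linear in the input
l2_pool = np.sqrt((feats ** 2).mean(axis=2))   # NOT piecewise linear

pooled = np.concatenate([mean_pool, max_pool, l2_pool], axis=1)
```

Swapping the L2 term for mean/max only would indeed leave every operation (including ReLUs and the linear output layer) piecewise linear end to end.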

Great stuff! However, I thought that the Echo Nest tools that they bought were already doing such stuff. If not, what are they currently using for content-based recommendation?

That's true, there is quite some overlap with what the Echo Nest are doing, albeit using a different approach. As I mentioned in the post, a lot of different collaborative filtering techniques are currently used together, so the same could be done for content-based techniques.

Nice post - I am not extremely familiar with mel-spectrograms, but I was wondering if you have considered using advanced wavelet transforms as features?

I haven't really optimized that part of the pipeline; for now I'm just using what I always use, because getting the data and massaging it into the right format is quite time-consuming.

There is a lot of knowledge about good time-frequency representations for music analysis among the people who were formerly at The Echo Nest (which was acquired by Spotify a few months ago), so it would definitely be a good idea to apply their knowledge to this and come up with a better input representation. I only have a few weeks left though, so it doesn't have a high priority at the moment :)
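For anyone unsure what a mel-spectrogram is: it's just a magnitude/power STFT warped onto the perceptual mel frequency scale and log-compressed. A self-contained numpy sketch (the filterbank construction and all parameter values here are illustrative defaults, not the ones used in the post; a library like librosa would normally do this for you):

```python
import numpy as np

def hz_to_mel(f):
    # Common 2595*log10 form of the mel scale (variants like Slaney differ).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters with centers spaced evenly on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, c, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, c):
            fb[i, k] = (k - lo) / max(c - lo, 1)   # rising slope
        for k in range(c, hi):
            fb[i, k] = (hi - k) / max(hi - c, 1)   # falling slope
    return fb

def mel_spectrogram(y, sr=22050, n_fft=1024, hop=512, n_mels=128):
    # Framed power STFT, then mel-warp and log-compress.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log1p(power @ mel_filterbank(sr, n_fft, n_mels).T)

# Toy signal: 3 seconds of a 440 Hz sine at 22050 Hz.
sr = 22050
t = np.arange(3 * sr) / sr
S = mel_spectrogram(np.sin(2 * np.pi * 440.0 * t), sr=sr)
```

The resulting `S` is a (time frames x mel bands) matrix, which is the kind of 2D input a convolutional network can consume directly.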

This web page is using up a lot of processing power and crashes a lot. And the playlists don't work, for me at least; they just open a blank page.

How come people with papers on machine learning are doing internships? Is this absolutely necessary before landing a job?

I can't really answer this question in general, but I can speak for myself: while it's not something a lot of PhD students do in Belgium (it seems to be more common in the US actually), I felt like it would be a great opportunity to experience what working in industry is like. When I finish my PhD, I'll have to decide whether I want to stay in academia or move to industry, and this internship is hopefully going to help me make that decision.

You can get research internships at large companies. If your PhD grant doesn't cover summer at full-time rates, SV companies pay on the order of $5k-$8k per MONTH for interns. Actually, even if your PhD grant covers your summer expenses, it probably won't match the internship salary.

Why wouldn't they? Tech internships tend to be well-paid and a lot of fun.
