- The algorithm is kind of like collaborative filtering, but it filters over previous listening sessions of the same user rather than different users.
- Backend is written in Clojure on Datomic Cloud Ions. Having Datalog queries, for one thing, was a huge help. I'm not sure whether Datomic's performance characteristics will suit this use case in the long term, but I'm planning to stick with it until I have to move.
Sorry the installation procedure is a little janky. I was going to polish it but decided it would be a better use of time to move immediately into developing mobile apps instead since those will reach a wider audience. Let me know if you run into any install issues though.
I've been using this for ~1.5 weeks now and it's been really nice. I hope someone else finds it useful.
What things would you be interested in reading about? I'll likely have enough material for another writeup soon.
I was interested to see this other post show up on HN yesterday too: https://news.ycombinator.com/item?id=20574182#20583543
At this early stage, I haven't started using any matrix factorization techniques since scaling isn't a huge issue when you only have two active users, but it'll be very relevant later on.
So now I mostly listen to NTS. If I don't like what's on the air I search for a recent episode that looks good. I highly recommend it. https://nts.live.
I'm planning later to make it adapt to preferences changing over time and also contextual factors like working out. There's a lot of good research on context-based music recommendation.
I am sure other services exist based on key, chords, dynamics etc - any recommendations?
As time goes on I am planning to integrate things like this into Lagukan. I think a major problem with mainstream recommenders is that they mix together the two problems of 1) deciding when and how often to play songs the user already knows, 2) recommending entirely new songs. Lagukan is an ensemble recommender, i.e. it combines different algorithms. The core algorithm I've made is focused entirely on problem #1, but the plan is to incorporate lots of different approaches (e.g. algorithms based on compositional elements) for problem #2, and then have Lagukan be able to adapt to what works for individual people.
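A rough sketch of what "ensemble recommender that adapts per person" could look like, in Python rather than the Clojure the backend uses. Everything here is an assumption for illustration: the `Ensemble` class, the recommender names, and the rule of picking a recommender in proportion to how often this user completed its past picks. It is not Lagukan's actual code.

```python
import random


class Ensemble:
    """Hypothetical ensemble: chooses among recommenders per user,
    favoring the ones whose past picks this user completed."""

    def __init__(self, recommenders, prior=(1, 1), rng=None):
        # recommenders: dict of name -> callable(session) -> track
        self.recommenders = recommenders
        # Per-recommender [completions, skips], seeded with a small prior
        # so that every recommender starts with a non-zero weight.
        self.stats = {name: list(prior) for name in recommenders}
        self.rng = rng or random.Random()

    def weight(self, name):
        completions, skips = self.stats[name]
        return completions / (completions + skips)

    def recommend(self, session):
        # Sample one recommender, weighted by its completion rate so far.
        names = list(self.recommenders)
        weights = [self.weight(n) for n in names]
        name = self.rng.choices(names, weights=weights, k=1)[0]
        return name, self.recommenders[name](session)

    def feedback(self, name, completed):
        # Implicit feedback: a completed play or a skip.
        self.stats[name][0 if completed else 1] += 1


# Usage: two stand-in recommenders, one per problem described above.
ens = Ensemble({
    "known_songs": lambda session: "a song you already know",
    "new_songs": lambda session: "a song you have never heard",
})
name, track = ens.recommend(session=[])
ens.feedback(name, completed=True)
```

Sampling instead of always choosing the best-scoring recommender means the weaker ones still get tried occasionally, so the weights can recover if someone's taste shifts.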
Most recommendations are either too broad (Louis Armstrong and Tony Bennett are not the same kind of jazz as Miles Davis or Weather Report) or too conservative (YouTube Music's Your Mixtape basically repeats the same songs over and over, never venturing out). And some seem more like paid adverts than recommendations.
So please grow and give me good recommendations. And I will pay for it.
Spotify, for example, acquired a company that looks at how music is structured in order to come up with recommendations. This should help with discovering new music that does not yet have enough plays to be picked up by more traditional recommendation systems.
Hopefully it'll be worth the hassle, but please try to make it easier to install somehow?
Licensing within the music industry is the #1 enemy of getting good music cue suggestions across all music artists though, as many streaming providers can't just access any song they want to, because the agreements to play on each service are so limited and segmented.
Pandora’s concept of a music genome is quite fascinating too. For whatever reason the repeat rate gets too high though, even if you seed it with multiple songs.
The current app was optimized for development time, just so I could get a prototype working at least for my own use as quickly as possible.
I think they mainly want to prevent people from competing with them; I'm not sure if they care too much about charging you money for it (I guess they probably would if you were making a lot of money).
Also, thanks for your work on Mopidy!
I still subscribe to Spotify but for anything I love I buy the vinyl/flac copy and store it on my Roon server at home.
edit: so after a quick search for "what is smoothing in machine learning", I think I'm already doing something similar, though basic. Each value in the matrix is a tuple of (completions + j, skips + k), where j and k are constants. For example, "In listening sessions where A was completed, B was completed 3 times and skipped 4 times" gives raw counts of (3, 4); if j and k are both 2, the effective values are (5, 6). I got this idea from my Bayesian textbook.
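A minimal sketch of that smoothing idea, in Python rather than the Clojure the backend uses. The function names, the session format (a list of `(track, completed)` pairs), and the defaults of j = k = 2 are assumptions for illustration, not the real implementation.

```python
def build_counts(sessions):
    """counts[a][b] = [completions, skips] of track b,
    tallied over sessions in which track a was completed."""
    counts = {}
    for session in sessions:
        completed = {track for track, done in session if done}
        for a in completed:
            row = counts.setdefault(a, {})
            for b, done in session:
                if b == a:
                    continue
                pair = row.setdefault(b, [0, 0])
                pair[0 if done else 1] += 1
    return counts


def smoothed(counts, a, b, j=2, k=2):
    """Add constant pseudo-counts j and k to the raw (completions, skips)."""
    completions, skips = counts.get(a, {}).get(b, (0, 0))
    return (completions + j, skips + k)


def completion_rate(counts, a, b, j=2, k=2):
    """Smoothed estimate that b gets completed given a was completed.
    Assumes j + k > 0, so unseen pairs do not divide by zero."""
    c, s = smoothed(counts, a, b, j, k)
    return c / (c + s)


# The example from the comment: in sessions where A was completed,
# B was completed 3 times and skipped 4 times.
sessions = ([[("A", True), ("B", True)]] * 3
            + [[("A", True), ("B", False)]] * 4)
counts = build_counts(sessions)
print(smoothed(counts, "A", "B"))  # (5, 6)
```

With pseudo-counts in place, a pair that has never been observed falls back to (j, k), i.e. a 50% estimate when j equals k, instead of an undefined ratio.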
In addition, Spotify's algorithms have the significant constraint of needing to optimize profit, and not all tracks they have are worth the same amount (that's what someone told me, anyway).
Also, sometimes algorithms rely too much on explicit feedback like thumbs up/down rather than implicit feedback (like skips). Deezer's "Flow" feature is an example. The way you use it is basically the same as for Lagukan, but from reading Reddit posts about it, it apparently just doesn't work very well. There was even a comment from a Deezer employee saying that it won't respond to skips; you have to use thumbs up/down to make it adapt.
Implicit feedback can be more messy to work with, but it's much more plentiful (and useful, if you use it correctly).
My listening is very eclectic though and highly dependent on current mood so I'll be following your project closely.
Also, if you sign up for an auth token on this page (https://lagukan.com/docs/), I'll send you periodic updates as the project progresses.
Is it something you can use in your model?
(Actually, it already contains a hierarchical model like this, but it's used for adapting to the user's current mood, not for recommending new music)
If you've got a Spotify premium account, it'll use Spotify's recommendation API to occasionally mix in new songs based on your current listening session. Presumably their API uses objective metrics.
So it's on the roadmap for sure. I'm hoping to get more scaffolding up so I can have some kind of revenue stream, and then I'll start to iterate much more heavily on the algorithm.
related: signing up with email requires loading a google domain? that's... really unfortunate. understandably probably not a priority for you, but hopefully one day i can try this out without loading google resources