Show HN: Lagukan, a highly personalized music service (lagukan.com)
95 points by jacobobryant 80 days ago | 76 comments

If anyone's interested in some technical details:

- The algorithm is kind of like collaborative filtering, but it filters over previous listening sessions of the same user rather than different users.

- Backend is written with Clojure on Datomic cloud ions. Having Datalog queries for one thing was a huge help. Not sure if the performance characteristics of Datomic will make it work for this use case in the long term, but I'm planning to stick with it until I have to move.

Sorry the installation procedure is a little janky. I was going to polish it but decided it would be a better use of time to move immediately into developing mobile apps instead since those will reach a wider audience. Let me know if you run into any install issues though.

I've been using this for ~1.5 weeks now and it's been really nice. I hope someone else finds it useful.

Would be fascinated to see a writeup on Datomic being used in anger.

I have a writeup for a previous app I made (a budgeting app) here: https://jacobobryant.com/post/2019/ion/

What things would you be interested in reading about? I'll likely have enough material for another writeup soon.

Very cool Jacob!

Thanks Dustin :)

I was interested to see this other post show up on HN yesterday too: https://news.ycombinator.com/item?id=20574182#20583543

At this early stage, I haven't started using any matrix factorization techniques since scaling isn't a huge issue when you only have two active users, but it'll be very relevant later on.

I've never found a perfect recommendation for me simply because data-driven approaches are 1) backwards looking and 2) assume I have a stable set of preferences. My preferences change every month, are tied to certain activities (e.g. workout), and based on mood.

So now I mostly listen to NTS. If I don't like what's on the air I search for a recent episode that looks good. I highly recommend it. https://nts.live.

I'll check nts out. Although it's not super developed yet, I do intend for lagukan to handle unstable preferences. It already kind of handles different sets of songs--i.e. it doesn't think "You like song A," it thinks "You just listened to song A, and previously when you listened to A you also listened to B". So it can adapt based on the songs in your current listening session, although it takes a while for it to learn the different sets.
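To make the "when you listened to A you also listened to B" idea concrete, here is a minimal sketch of session co-occurrence scoring. It is not Lagukan's actual code (the backend is Clojure on Datomic); the function names, the data layout and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import defaultdict


def build_cooccurrence(sessions):
    """Count, for each ordered pair (a, b), how many past sessions contained both songs."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        songs = set(session)  # ignore repeats within one session
        for a in songs:
            for b in songs:
                if a != b:
                    counts[a][b] += 1
    return counts


def rank_candidates(counts, current_session, candidates):
    """Rank candidates by how often they shared a session with the songs played so far."""
    scores = {
        c: sum(counts[s][c] for s in current_session)
        for c in candidates
        if c not in current_session
    }
    # Highest score first; ties broken alphabetically (an arbitrary choice).
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical listening history for a single user.
past_sessions = [
    ["A", "B", "C"],
    ["A", "B"],
    ["D", "E"],
]
counts = build_cooccurrence(past_sessions)
ranking = rank_candidates(counts, ["A"], ["B", "C", "D", "E"])
print(ranking)
```

With only "A" in the current session, "B" ranks first because it shared two past sessions with "A", and "C" follows with one. A new set of songs only starts to rank well after it has appeared together in several sessions, which matches the "takes a while to learn" caveat.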

I'm planning later to make it adapt to preferences changing over time and also contextual factors like working out. There's a lot of good research on context-based music recommendation.

Loving NTS, thanks. How do I get the artist/title of the track I'm listening to? Really want to save some of this.

Most shows display their tracklist e.g. https://www.nts.live/shows/ruf-dug/episodes/ruf-dug-28th-jun... scroll down a bit. Often the hosts talk about the tracks.

Cannot tell you how much pleasure NTS has brought me. AMAZING

Thanks! Music is sooo dope.

I've mostly come to the conclusion that basing music recommendations on bands/albums/songs just doesn't work for whatever my taste is. It seems like the priors are so swamped by popular stuff that I rarely hear anything even approaching the songs I like. Apple Music's For You is particularly bad. Pretty sure in a moment of weakness I must have liked a Mumford and Sons song years ago because it thinks I love folk music. It wants me to listen to country. Data quality is clearly an issue because it thinks I like the rappers Frank Black and Gene because I like Pixies front-man Frank Black and 90s UK indie posers Gene.

I am sure other services exist based on key, chords, dynamics etc - any recommendations?

I'm not sure off the top of my head, but that's definitely an active area of music recommendation research. There was someone in the lab at my university (BYU) who was/is working on a project he called pop*, although I'm not finding it with a quick google search. It's a system for composing new music, but part of the idea is that by figuring out how to make a system that understands the elements of music composition, it'll help in other areas like music recommendation too.

As time goes on I am planning to integrate things like this into Lagukan. I think a major problem with mainstream recommenders is that they mix together the two problems of 1) deciding when and how often to play songs the user already knows, 2) recommending entirely new songs. Lagukan is an ensemble recommender, i.e. it combines different algorithms. The core algorithm I've made is focused entirely on problem #1, but the plan is to incorporate lots of different approaches (e.g. algorithms based on compositional elements) for problem #2, and then have Lagukan be able to adapt to what works for individual people.

Exactly. It is also a question of moment and granularity. I may be in the mood for a certain kind of jazz today and for different music tomorrow morning.

Most recommendations are either too broad (Louis Armstrong and Tony Bennett are not the same kind of jazz as Miles Davis or Weather Report) or too conservative (YouTube Music's Your Mixtape basically repeats the same songs over and over, never venturing out). And some seem more like paid adverts than recommendations.

So please grow and give me good recommendations. And I will pay for it.

I'll do my best! If you haven't already, you can create an account at https://lagukan.com/docs/ and then I'll send you updates as the project progresses.

Just want to comment on the title of the app. "Lagukan" in Bahasa (Indonesia's official language) means "Sing it" or "Make it a song"

So is Lagukan an actual word then? I lived in Malaysia for two years (as an LDS missionary), so I know about "lagu" and "-kan", but wasn't sure if the combination was actually correct or not.

When I first saw the name, I immediately connected it to the Malay word "lagu". But no, you don't see a word like this in the real world, although you can certainly combine the two parts.

In Indonesia no one actually uses it in day-to-day conversation, but it does appear in some old textbooks and poetry.

Somewhat related: Gustav Söderström from Spotify recently discussed various topics with Lex Fridman, including the problem of music recommendations. This was pretty interesting.

For example, Spotify had acquired a company that looks into how music is structured in order to come up with recommendations. This should help in discovering new music that does not yet have enough plays to be picked up by more traditional recommendation systems.


Looks interesting, I'll definitely check that out. It's also comforting to hear about music recommendation companies getting acquired ;)

Oh man... I haven't been able to find any recommendation algorithm where I liked even 1% of the songs it recommended. Spotify's is a trash fire. Last.fm's wasn't great. Pandora is okay. I am very interested in this project, but the setup instructions are... long. Not hard if you're good with computers, but long.

Hopefully it'll be worth the hassle, but please try to make it easier to install somehow?

It would probably be better if these sites found a way to compile songs from DJ mix tracklists, scan for the highest quality audio from sites like YouTube and Spotify, and then stream entire mix tracklists by genre as individual tunes. That way, matching songs within a genre would be better targeted. I fear a future where AI picks all the music, because what I like to hear is usually never what I hear on streaming queues.

Licensing within the music industry is the #1 enemy of getting good queue suggestions across all artists, though. Many streaming providers can't just access any song they want to, because the agreements to play on each service are so limited and segmented.

Have you tried putting 100 songs you really like that are in similar sub genres into a new playlist in Spotify, then playing the last song and letting it generate new suggestions from there? Kind of a weird hack, but I’ve been getting good results from it that have improved my #2 daily list selections a lot.

Pandora’s concept of a music genome is quite fascinating too. For whatever reason the repeat rate gets too high though, even if you seed it with multiple songs.

Unfortunately I like a little bit of everything, so I don't have 100 songs I like in the same genre. I have 1000 songs from wildly different genres, so I'm very hard to generate suggestions for.

Ah. You might get some benefit even doing 10 song playlists and letting them run after the last song, with ml driving the song selection. This has worked well for me personally. Might be worth experimenting with.

Have you tried Gnoosic (http://www.gnoosic.com)? That is the one that works best for me.

I'm thinking of incorporating gnoosic (or something similar--haven't looked yet to see if they have an API) into Lagukan for recommending new artists. Mainly my work has focused on re-recommending songs you already know.

No, I didn't know about it, thank you!

When I use the "Start Radio" function on a song in Google Music it often leads to good stuff.

Yes, definitely! I'm working on mobile apps right now which will be super easy to use, and then I'll likely come back and improve the desktop app.

The current app was optimized for development time, just so I could get a prototype working at least for my own use as quickly as possible.

Oh, if it's a prototype, then that's understandable. Disregard, I will beta-test and report back!


I think you need to add a step to install libspotify-dev on Ubuntu, mopidy-spotify-web from your repo wouldn't install for me unless I installed that first.

Also, did you already install `mopidy-spotify` (https://github.com/mopidy/mopidy-spotify#installation)? I think that package is supposed to have all the needed dependencies, but maybe it's out of date.

Yes I did, alas.

Oh, good to know! I'll update the docs. I'm on arch linux so I wasn't aware. Should've spun up a VM to test it first or something.

I was under the impression that using Spotify's APIs in a commercial project required an agreement with them, presumably that costs money. Have you investigated that or will you have to drop the Spotify aspect if/when you start taking money?

I've applied for a commercial Spotify API key. They let you start using it right away. I haven't heard back from them (it's been a couple weeks). It's possible they'll decide not to approve my use case, in which case I'd have to drop spotify support.

I think they mainly want to prevent people from competing with them; I'm not sure if they care too much about charging you money for it (I guess they probably would if you were making a lot of money).

Also, thanks for your work on Mopidy!

Good luck with that! I've been hoping to see Spotify support for streaming or searching or casting to a roodready endpoint but Spotify has not been helpful.

Thanks! Do you have a link for what roodready is? I did a quick search but didn't find anything.

Spelling mistake on my part. I meant "Roon Ready"; it's a standard that the company Roon Labs created for their streaming solution. I personally use Roon at home as I got tired of losing access to music I loved when Spotify changed contracts.

I still subscribe to Spotify but for anything I love I buy the vinyl/flac copy and store it on my Roon server at home.

I would like to try this out, but am not willing to go through the current setup process. You should add a way to be notified when you are done with the first version of the app. I would gladly give my email or twitter.

If you sign up for an auth token at the top of this page (https://lagukan.com/docs/), that'll get your email into the system. I should've explained that; I'll update the website.

Is it just me, or do all of the "personalized music services" only take a few next songs until they are playing you Justin Bieber, Taylor Swift or mumble rap? These things just converge towards popularity.

yeah, that's been my experience too. Hence the plan with Lagukan is to go the other way--we might start out with popular music as a guess, but it'll adapt heavily to your individual preferences as it collects data.

What I would love is for you to run the music through SampleCNN to reduce it to a vector representation, and then use that for smoothing. This could help with the cold start problem.

I'll look into that! I'm no machine learning expert, but I'll certainly be digging into it more as the project progresses.

edit: so after a quick search for "what is smoothing in machine learning", I think I'm doing something similar, though basic, already. Basically each value in the matrix is a tuple of (completions + j, skips + k), where j and k are constants. e.g. "In listening sessions where A was completed, B was completed 3 times and skipped 4 times" gives raw counts of (3, 4), but if j and k are both 2, the effective values are (5, 6). I got this idea from my Bayesian statistics textbook.
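The pseudo-count arithmetic above can be sketched in a few lines. This is an illustration only, not Lagukan's implementation: the function names are hypothetical, and turning the smoothed pair into a completion probability (the mean of a Beta distribution with those parameters) is my assumption about how such counts would typically be used.

```python
def smoothed_counts(completions, skips, j=2, k=2):
    """Add constant pseudo-counts j and k to the raw (completions, skips) pair."""
    return (completions + j, skips + k)


def completion_probability(completions, skips, j=2, k=2):
    """Estimated chance the song is completed, using the smoothed counts.

    Assumption: treat the smoothed pair as Beta parameters and take the mean.
    """
    c, s = smoothed_counts(completions, skips, j, k)
    return c / (c + s)


pair = smoothed_counts(3, 4)              # raw (3, 4) becomes (5, 6)
p_seen = completion_probability(3, 4)     # 5 / 11
p_unseen = completion_probability(0, 0)   # 2 / 4, a neutral guess with no data
print(pair, p_seen, p_unseen)
```

The constants keep a pair with no history at a neutral 0.5 instead of an undefined 0/0, and they stop a single skip or completion from swinging the estimate to 0 or 1.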

Thank you thank you thank you. I've been wanting something like this for my local MP3 collection for a while. (Might also use it for an RPi music server in the car :D)

Wait, Spotify is not already using skips and how much of a song was played to rank songs and build a preference profile? What is it doing exactly, then?

I don't know since it's closed source, but I too would think that Spotify is already using skips, but in practice I think their methods are muddied up by other factors, like I mentioned here: https://news.ycombinator.com/item?id=20585757

In addition, Spotify's algorithms have the significant constraint of needing to optimize profit, and not all tracks they have are worth the same amount (that's what someone told me, anyway).

Also, sometimes algorithms rely too much on explicit feedback like thumbs up/down rather than implicit feedback (like skips). Deezer's "Flow" feature is an example. The way you use it is basically the same as for Lagukan, but from reading reddit posts about it, it apparently just doesn't work very well. There was even a comment from a Deezer employee saying that it won't respond to skips; you have to use thumbs up/down to make it adapt.

Implicit feedback can be more messy to work with, but it's much more plentiful (and useful, if you use it correctly).

Also--for the profit optimizing thing, that's why a significant part of my plan is to structure the business in a way that lets me optimize the algorithm for user satisfaction. e.g. I'm thinking about helping artists to sell concert tickets and taking a cut of that.

This is a very compelling app idea for me. However nearly all of my music is on Google Music these days so I have no easy way to use it.

My listening is very eclectic though and highly dependent on current mood so I'll be following your project closely.

Glad to hear! Unfortunately there isn't an official API for google music, but there is an unofficial python API. In fact, Mopidy has a GMusic extension already. I could probably add support to the current desktop app fairly quickly, although it might not work on Windows, let alone the mobile apps. Would you use it on mac/linux if I added support for that though?

Also, if you sign up for an auth token on this page (https://lagukan.com/docs/), I'll send you periodic updates as the project progresses.

I would absolutely try it out on mac.

Same. Would definitely want to use this if it would support GPM or YTM.

I've just updated Lagukan so it'll play from tracks you've uploaded to google music. I've added instructions to enable google music to the docs: https://lagukan.com/docs/

I'll check it out.

Perfect. I'll send you an email once it's working.

Have you seen https://news.ycombinator.com/item?id=20574182

Is it something you can use in your model?

I did see it! I haven't looked into it deeply yet, but I think it'll be valuable for later when I need to make the algorithm scale better.

To me it seems like a recommendation can be based on two main factors: objective similarity (distance in some space, or matching some features), and the statistics from users who connect two pieces together by liking both. It's not clear how to balance them.

In practice I think objective similarity ends up being a form of the latter, i.e. basically the average of all users' preferences. The plan with Lagukan is to have a hierarchical model that'll default to the objective/averaged similarity but then shift over to a more individualized model as data is collected on the given user.

(Actually, it already contains a hierarchical model like this, but it's used for adapting to the user's current mood, not for recommending new music)
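One simple way to get the "default to the average, then shift to the individual" behaviour is a shrinkage estimate. This is a sketch under stated assumptions: the thread does not say how Lagukan's hierarchical model is built, and `blended_score` and `prior_strength` are hypothetical names.

```python
def blended_score(global_score, user_completions, user_skips, prior_strength=10.0):
    """Blend a population-level score with one user's own completion rate.

    global_score is treated as if it were backed by `prior_strength` plays.
    With no user data the result equals global_score; as the user's plays
    accumulate, their observed completion rate dominates.
    """
    plays = user_completions + user_skips
    return (prior_strength * global_score + user_completions) / (prior_strength + plays)


cold_start = blended_score(0.8, 0, 0)      # no user data: falls back to 0.8
some_data = blended_score(0.8, 1, 9)       # (8 + 1) / 20 = 0.45
lots_of_data = blended_score(0.8, 10, 990) # close to the user's own rate of 0.01
print(cold_start, some_data, lots_of_data)
```

Here `prior_strength` controls how many plays it takes before the individual data outweighs the averaged default; a larger value means slower adaptation.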

Wait, are you saying you don't currently use any objective similarity metrics (distance or feature based) in your model?

Not directly. It starts out doing random shuffling over your existing collection (local mp3s, spotify playlists and spotify saved tracks), and then uses your skipping data to build the model.

If you've got a spotify premium account, it'll use spotify's recommendation API to occasionally mix in new songs based on your current listening session. Presumably their API uses objective metrics.

Do you see the value of incorporating objective similarity metrics into your recommendation engine? I don't know if you can tap into something like Gracenote API, but I'm sure it would improve the quality of your recommendations.

Yeah, I'm planning to integrate as many metrics as I can. The key thing is just that the system has to be able to adapt on the fly. So Lagukan will have all these different inputs/different recommendation engines, but it'll only use them as hypotheses to be tested rather than the final word. As we get more data about an individual user, the system will figure out how effective the different engines are for that particular user.

So it's on the roadmap for sure. I'm hoping to get more scaffolding up so I can have some kind of revenue stream, and then I'll start to iterate much more heavily on the algorithm.
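As a rough illustration of treating each recommendation engine as a hypothesis to be tested per user, the sketch below tracks how often songs from each engine were completed and derives a weight from the smoothed completion rate. Everything here is assumed for the example: the class, the engine names and the weighting rule are hypothetical and not taken from Lagukan.

```python
class EngineTracker:
    """Tracks, for one user, how often songs picked by each engine were completed."""

    def __init__(self, engines, prior_completions=1, prior_skips=1):
        self.prior = (prior_completions, prior_skips)
        self.stats = {name: [0, 0] for name in engines}  # [completions, skips]

    def record(self, engine, completed):
        """Record one piece of implicit feedback for a song chosen by `engine`."""
        self.stats[engine][0 if completed else 1] += 1

    def weights(self):
        """Smoothed completion rate per engine, normalized so the weights sum to 1."""
        pc, ps = self.prior
        rates = {
            name: (c + pc) / (c + s + pc + ps)
            for name, (c, s) in self.stats.items()
        }
        total = sum(rates.values())
        return {name: rate / total for name, rate in rates.items()}


# Hypothetical engine names.
tracker = EngineTracker(["session_history", "audio_features", "popularity"])
start = tracker.weights()  # equal weights before any feedback

for _ in range(8):
    tracker.record("session_history", completed=True)
for _ in range(8):
    tracker.record("popularity", completed=False)

after = tracker.weights()
print(start, after)
```

After eight completions for one engine and eight skips for another, the weights move from an even split to roughly 0.60, 0.33 and 0.07. A real system would probably also want some exploration so that a low-weight engine still gets tried occasionally.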

Is this always going to need Mopidy?

No. I'm working on mobile apps which of course won't use Mopidy. After that I'll come back to the desktop version, and there's a good chance I'll roll my own instead of continuing to use Mopidy. (If I keep using Mopidy, I'll improve the install process so it's not so painful).

edit: figured out, thanks for quick feedback!

There should be two buttons that say "Sign in with email" and "Sign in with Google" respectively, either of those will create an account for you. If they're not showing up, maybe see if you've got a browser extension blocking it?

ah okay, yeah i'm blocking gstatic.com

related: signing up with email requires loading a google domain? that's... really unfortunate. understandably probably not a priority for you, but hopefully one day i can try this out without loading google resources

yeah, sorry about that. I use firebase for authentication since it's the quickest-to-set-up auth solution I'm aware of. But later on I'll see what I can do.

setting this up is profoundly laborious, not to mention risky considering I have to enter my spotify username/password in a config file somewhere...

Yeah, sorry about that. This prototype was optimized for development speed; I'm working on mobile apps now which will be easier to use. I'm not positive if the plaintext username/password is strictly required, but it's in docs from the spotify plugin's author (https://github.com/mopidy/mopidy-spotify#configuration). I think it may be related to the fact that a native integration with spotify is only possible through a lib that spotify abandoned a while ago, so it's a bit of a hack.

Spotify's closed-source abandonware library, libspotify, requires the username and password. I'm not convinced by the argument that storing those details in a suitably protected text file is particularly risky, but I appreciate others may have other ideas. You can actually read the password from the system keyring, but we don't document that very well since it's more of a hack (if I remember correctly).
