How I Would Use the Google Prediction API (To Find Your Musical Profile)

carbocation · on May 22, 2010

In the article, the author admits that this is a hypothesis about how the service might work. It's really just an introductory overview of Naive Bayes, and it doesn't address an actual use case of the Google prediction engine (at least, not one that they have confirmed). The actual examples from Google all seem to have discrete outcomes so far.

Naive Bayes is almost certainly going to be something that they offer; it seems like it's just a question of when.

physcab · on May 22, 2010

Right. At this point you kind of have to just imagine the workflow (which is actually what I do quite a bit before tackling an analytics problem). You have to envision the ultimate goal for your output, understand what's being done to your data, and then make sure you correctly accumulate and format your data to insert into the system. When they flesh out their API docs (and let me use their API) I will write another post!
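To make the "accumulate and format your data" step concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the label-first CSV layout, the helper name `to_training_csv`, and the toy rows are hypothetical and are not taken from any confirmed Prediction API documentation.

```python
import csv
import io

def to_training_csv(examples):
    """Turn (label, text) pairs into CSV text with the label in the first column.

    The label-first layout is an assumed convention, not a documented format.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for label, text in examples:
        # Collapse newlines and repeated spaces so each example stays on one row.
        writer.writerow([label, " ".join(text.split())])
    return buf.getvalue()

# Made-up examples: the outcome we want predicted, then the raw input text.
examples = [
    ("rock", "loud  guitars\nand drums"),
    ("jazz", "saxophone and upright bass"),
]
print(to_training_csv(examples))
```

The point is only that the desired output (the label) and the cleaned-up input sit side by side in one row before anything is sent to a service.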

carbocation · on May 22, 2010

Right. I don't mean this negatively, but your post is not really about Google's prediction tool at all. It's a general introduction to Naive Bayes. I understand that what got you enthusiastic to write this up was Google's announcement, but at the end of the day I could remove any reference to Google and the post would be just fine. I suppose I got excited, despite knowing about the general procedure already, because I thought this post was directed at how to actually use the Google offering. It's just a title issue, I guess.

physcab · on May 22, 2010

Sorry about that. A lot of the meta-discussion I had with others (mostly non-techs or semi-techs) this past week about the Prediction API was simply "But what would I use this for?" kinds of questions. That's what this post was meant to address.

Incidentally, this is only my second post, and I'll continue to write both general insights into existing public data and more technical (with-code) posts geared towards those who want to get their hands dirty.

carbocation · on May 23, 2010

Fantastic! This community (or, at least, I) will absolutely devour content written by someone with your background.

milkshakes · on May 22, 2010

Broken link? It comes up blank here.

[edit: the page loads, but doesn't render in Chrome for OS X] Also, the links are broken (they have an extra http://).

mattmillr · on May 22, 2010

Same for me. Anybody know why this breaks in Chrome?

[edit: looks like the source is truncated. Several closing tags -- including a few divs, body, and html -- are missing. Not sure why Chrome can't handle that though.]

physcab · on May 22, 2010

Yeah, that's weird. I'm not sure why. If you go to the homepage or take out the trailing slash, it loads in Google Chrome.

lt · on May 22, 2010

Similar artists and genres? Boring.

I wanted some tech like Midomi's or SoundHound's music fingerprinting mixed in with this. Show me new artists that sound similar to artists that I like. Better yet, similar to a mix of artists that I like. Now that would be nice.

physcab · on May 22, 2010

Good idea! But I also want people to understand what I'm talking about to some degree :)

This is actually quite difficult to do. First you need to identify which features of a song are representative of its genre (a song might have 3 million of them). Then you need to build a model that can classify songs accurately based on those features. This has to be done in a speedy way, because you know, you don't have time to wait for a few million songs to process...
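As a rough illustration of those two steps (pick features, then fit a classifier on them), here is a minimal Naive Bayes sketch in Python. It is a toy under stated assumptions: the feature tokens, genre labels, and training rows are invented, real audio features would come from signal processing rather than hand-written tags, and nothing here reflects how any actual service works.

```python
import math
from collections import Counter, defaultdict

def train(rows):
    """Count labels and per-label feature occurrences from (label, features) rows."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for label, features in rows:
        label_counts[label] += 1
        for f in features:
            feature_counts[label][f] += 1
            vocab.add(f)
    return label_counts, feature_counts, vocab

def predict(model, features):
    """Return the label with the highest log-probability under Naive Bayes."""
    label_counts, feature_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n in label_counts.items():
        # Log prior for the label.
        score = math.log(n / total)
        # Add-one (Laplace) smoothing so unseen features don't zero out a label.
        denom = sum(feature_counts[label].values()) + len(vocab)
        for f in features:
            score += math.log((feature_counts[label][f] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up toy data: hand-written tags standing in for extracted audio features.
training = [
    ("rock", ["distorted_guitar", "fast_tempo", "loud"]),
    ("rock", ["distorted_guitar", "loud"]),
    ("jazz", ["saxophone", "swing_rhythm", "quiet"]),
    ("jazz", ["saxophone", "slow_tempo"]),
]
model = train(training)
print(predict(model, ["loud", "distorted_guitar"]))
```

With millions of features per song, the counting itself stays cheap; the hard part is the feature extraction that this sketch skips entirely.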

Relevant MATLAB code if you want to try your hand: http://labrosa.ee.columbia.edu/~dpwe/resources/matlab/finger...

Relevant algorithm you might want to try (Hidden Markov Model): http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131...

Company that does recommendations based off audio fingerprinting pretty well: http://www.bmat.com/

tamarindo · on May 22, 2010

Interesting discussion of a topic I haven't read much about elsewhere.

Anyone know of any books that dig into this or related topics?

sketerpot · on May 22, 2010

A good book for this sort of thing is Toby Segaran's excellent Programming Collective Intelligence. It walks you through this sort of fascinating thing with easy examples and clear explanations. It's sprinkled with simple Python code.

http://oreilly.com/catalog/9780596529321

If you want a good introduction to Naive Bayes classifiers, there is a pretty readable explanation in Artificial Intelligence: A Modern Approach. It's an expensive book, but I'm sure you can find a copy in any well-stocked university library.

http://aima.cs.berkeley.edu/

elibryan · on May 22, 2010

Tom Mitchell's Machine Learning is also very good (despite being a bit older): http://www.cs.cmu.edu/~tom/mlbook.html

jafl5272 · on May 22, 2010

Somebody should try to use it to predict Google's next move...