

How I Would Use the Google Prediction API (To Find Your Musical Profile) - physcab
http://www.thedatascientist.com/2010/05/22/how-i-would-use-the-google-prediction-api/

======
carbocation
In the article, he admits that this is a hypothesis about how the service
might work. It's actually just an introductory overview of naive bayes, and
doesn't address an actual use case of the G prediction engine (at least, not
that they have confirmed). The actual examples from google all seem to have
discrete outcomes, so far.

Naive bayes is almost definitely going to be something that they offer — it
seems like it's just a question of 'when'.

~~~
physcab
Right. At this point you kind of have to just imagine the workflow (which is
actually what I do quite a bit before tackling an analytics problem). You have
to envision an ultimate goal of what you want your output to be, an
understanding of what's being done to your data, and then make sure you
correctly accumulate and format your data to insert into the system. When they
flesh out their API docs (and let me use their API) I will write another post!

~~~
carbocation
Right. I don't mean this negatively, but your post is not really about
Google's prediction tool at all. It's a general setup to Naive Bayes. I
understand that what got you enthusiastic to write this up was Google's
announcement, but at the end of the day I could remove any reference to google
and the post would be just fine. I suppose I got excited, despite knowing
about the general procedure already, because I thought this post was directed
at how to actually use the google offering. It's just a title issue, I guess.

~~~
physcab
Sorry about that. A lot of the meta-discussion I had with others (mostly non-
techs or semi-techs) this past week about the Prediction API was simply _"But
what would I use this for?"_ kinds of questions. That's what this post was
meant to address.

Incidentally this is only my second post, and I'll continue to write both on
general insights to existing public data, as well as more technical (with-
code) posts geared towards those who want to get their hands dirty.

~~~
carbocation
Fantastic! This community (or, at least, I) will absolutely devour content
written by someone with your background.

------
milkshakes
broken link? it comes up blank here

[edit: the page loads, but doesn't render in chrome for os x] also, the links
are broken (they have an extra <http://>)

~~~
mattmillr
Same for me. Anybody know why this breaks in Chrome?

[edit: looks like the source is truncated. Several closing tags -- including a
few divs, body, and html -- are missing. Not sure why Chrome can't handle that
though.]

------
lt
Similar artists and genres? Boring.

I wanted some tech like Midomi's or Soundhound's music fingerprinting mixed in
with this. Show me new artists that sound similar to artists that I like.
Better yet, similar to a mix of artists that I like. Now that would be nice.

~~~
physcab
Good idea! But I also want people to understand what I'm talking about to some
degree :)

This is actually quite difficult to do. First you need to identify which
features of a song are representative of its genre (a song might have 3
million of them). Then you need to build a model that can classify songs
accurately based on those features. This has to be done in a speedy way,
because you know, you don't have time to wait for a few million songs to
process...
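
The two steps above (pick features, then build a classifier on them) can be
sketched roughly as follows. This is only a toy illustration in Python, not
how Google's service or any fingerprinting product works: it assumes each song
has already been reduced to a small set of named features, the feature names
and training songs are invented, and naive Bayes is used only because it is
the classifier discussed in this thread.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count genres and per-genre feature occurrences.

    examples: list of (set_of_feature_names, genre) pairs.
    """
    genre_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for features, genre in examples:
        genre_counts[genre] += 1
        for name in features:
            feature_counts[genre][name] += 1
            vocabulary.add(name)
    return genre_counts, feature_counts, vocabulary


def classify(model, features):
    """Return the genre with the highest naive Bayes log-score."""
    genre_counts, feature_counts, vocabulary = model
    total = sum(genre_counts.values())
    best_genre, best_score = None, -math.inf
    for genre, n in genre_counts.items():
        score = math.log(n / total)  # log prior for this genre
        for name in vocabulary:
            # Laplace-smoothed estimate of P(feature present | genre)
            p = (feature_counts[genre][name] + 1) / (n + 2)
            score += math.log(p if name in features else 1 - p)
        if score > best_score:
            best_genre, best_score = genre, score
    return best_genre


# Hypothetical training data: made-up features for four made-up songs.
songs = [
    ({"distorted_guitar", "fast_tempo"}, "metal"),
    ({"distorted_guitar", "screamed_vocals"}, "metal"),
    ({"acoustic_guitar", "slow_tempo"}, "folk"),
    ({"acoustic_guitar", "banjo"}, "folk"),
]
model = train(songs)
print(classify(model, {"distorted_guitar", "slow_tempo"}))  # metal
```

With real audio the hard part is the first step: the features would come from
signal processing rather than hand-written labels, and there would be far more
of them, which is where the speed concern comes in.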

Relevant MATLAB code if you want to try your hand:
[http://labrosa.ee.columbia.edu/~dpwe/resources/matlab/finger...](http://labrosa.ee.columbia.edu/~dpwe/resources/matlab/fingerprint/)

Relevant algorithm you might want to try (Hidden Markov Model):
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131...](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131.2084&rep=rep1&type=pdf)

Company that does recommendations based off audio fingerprinting pretty well:
<http://www.bmat.com/>

------
tamarindo
Interesting discussion of a topic I haven't read much about elsewhere.

Anyone know of any books that dig into this or related topics?

~~~
sketerpot
A good book for this sort of thing is Toby Segaran's excellent Programming
Collective Intelligence. It walks you through these techniques with easy
examples and clear explanations, and it's sprinkled with simple Python
code.

<http://oreilly.com/catalog/9780596529321>

If you want a good introduction to Naive Bayesian classifiers, there was a
pretty readable explanation in Artificial Intelligence: a Modern Approach.
It's an expensive book, but I'm sure you can find a copy in any well-stocked
university library.

<http://aima.cs.berkeley.edu/>

~~~
elibryan
Tom Mitchell's Machine Learning is also very good (despite being a bit older)
<http://www.cs.cmu.edu/~tom/mlbook.html>

------
jafl5272
Somebody should try to use it to predict Google's next move...

