How I Would Use the Google Prediction API (To Find Your Musical Profile) (thedatascientist.com)
28 points by physcab on May 22, 2010 | 14 comments



In the article, he admits that this is a hypothesis about how the service might work. It's really just an introductory overview of Naive Bayes, and it doesn't address an actual use case of the Google prediction engine (at least, not one they have confirmed). The actual examples from Google all seem to have discrete outcomes, so far.

Naive Bayes is almost definitely going to be something that they offer; it seems like it's just a question of 'when'.
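
For anyone who hasn't seen one, here is a minimal sketch of a Naive Bayes classifier with a discrete outcome, written from scratch in Python. It is not Google's API and not the article's code; the artist tokens, the genre labels, and the `train`/`predict` helpers are all made up for illustration, and the add-one (Laplace) smoothing is just one common choice.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (tokens, label) pairs; tokens is a list of strings."""
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        for t in tokens:
            token_counts[label][t] += 1
            vocab.add(t)
    return label_counts, token_counts, vocab

def predict(model, tokens):
    """Return the label with the highest log-posterior under the naive assumption."""
    label_counts, token_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(token_counts[label].values()) + len(vocab)
        for t in tokens:
            if t in vocab:  # ignore tokens never seen in training
                # add-one smoothing so unseen (token, label) pairs are not zero
                score += math.log((token_counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical listening histories, labelled with a genre.
examples = [
    (["miles_davis", "john_coltrane"], "jazz"),
    (["john_coltrane", "bill_evans"], "jazz"),
    (["metallica", "slayer"], "metal"),
    (["slayer", "megadeth"], "metal"),
]
model = train(examples)
print(predict(model, ["bill_evans", "miles_davis"]))
```

The output is a single discrete label, which matches the kind of examples Google has shown so far.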


Right. At this point you kind of have to just imagine the workflow (which is actually what I do quite a bit before tackling an analytics problem). You have to envision an ultimate goal of what you want your output to be, understand what's being done to your data, and then make sure you correctly accumulate and format your data to insert into the system. When they flesh out their API docs (and let me use their API) I will write another post!


Right. I don't mean this negatively, but your post is not really about Google's prediction tool at all. It's a general introduction to Naive Bayes. I understand that what got you enthusiastic to write this up was Google's announcement, but at the end of the day I could remove any reference to Google and the post would be just fine. I suppose I got excited, despite knowing about the general procedure already, because I thought this post was directed at how to actually use the Google offering. It's just a title issue, I guess.


Sorry about that. A lot of the meta-discussion I had with others (mostly non-techs or semi-techs) this past week about the Prediction API was simply "But what would I use this for?" kinds of questions. That's what this post was meant to address.

Incidentally this is only my second post, and I'll continue to write both on general insights to existing public data, as well as more technical (with-code) posts geared towards those who want to get their hands dirty.


Fantastic! This community (or, at least, I) will absolutely devour content written by someone with your background.


broken link? it comes up blank here

[edit: the page loads, but doesn't render in Chrome for OS X] also, the links are broken (they have an extra http://)


Same for me. Anybody know why this breaks in Chrome?

[edit: looks like the source is truncated. Several closing tags -- including a few divs, body, and html -- are missing. Not sure why Chrome can't handle that though.]


Yeah, that's weird. I'm not sure why. If you go to the homepage or take out the trailing slash, it loads up in Google Chrome.


Similar artists and genres? Boring.

I wanted some tech like Midomi's or SoundHound's music fingerprinting mixed in with this. Show me new artists that sound similar to artists that I like. Better yet, similar to a mix of artists that I like. Now that would be nice.


Good idea! But I also want people to understand what I'm talking about to some degree :)

This is actually quite difficult to do. First you need to identify which features of a song are representative of its genre (a song might have 3 million of them). Then you need to build a model that can classify songs accurately based on those features. This has to be done in a speedy way, because you know, you don't have time to wait for a few million songs to process...

Relevant MATLAB code if you want to try your hand: http://labrosa.ee.columbia.edu/~dpwe/resources/matlab/finger...

Relevant algorithm you might want to try (Hidden Markov Model): http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131...

Company that does recommendations based off audio fingerprinting pretty well: http://www.bmat.com/
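
To make the two steps above concrete (features first, then a model that classifies on them), here is a rough sketch using a nearest-centroid classifier. This is a deliberately simple stand-in, not the fingerprinting code or the Hidden Markov Model from the links. The two-number feature vectors (think scaled tempo and brightness) and the genre labels are invented for illustration; real audio features would be far larger.

```python
import math
from collections import defaultdict

def centroids(songs):
    """songs: list of (feature_vector, genre). Returns the mean vector per genre."""
    sums = {}
    counts = defaultdict(int)
    for vec, genre in songs:
        if genre not in sums:
            sums[genre] = [0.0] * len(vec)
        for i, x in enumerate(vec):
            sums[genre][i] += x
        counts[genre] += 1
    return {g: [s / counts[g] for s in sums[g]] for g in sums}

def classify(cents, vec):
    """Pick the genre whose centroid is closest in Euclidean distance."""
    return min(cents, key=lambda g: math.dist(cents[g], vec))

# Hypothetical, already-extracted features scaled to [0, 1].
songs = [
    ([0.2, 0.3], "jazz"),
    ([0.3, 0.2], "jazz"),
    ([0.9, 0.8], "metal"),
    ([0.8, 0.9], "metal"),
]
cents = centroids(songs)
print(classify(cents, [0.25, 0.25]))
print(classify(cents, [0.85, 0.70]))
```

Training here is a single pass over the data and classifying a song costs one distance per genre, which is the kind of property you want when there are millions of songs. The hard part, picking features that actually capture how a song sounds, is not solved by this sketch.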


Interesting discussion of a topic I haven't read much about elsewhere.

Anyone know of any books that dig into this or related topics?


A good book for this sort of thing is Toby Segaran's excellent Programming Collective Intelligence. It walks you through this sort of fascinating thing with easy examples and clear explanations. It's sprinkled with simple Python code.

http://oreilly.com/catalog/9780596529321

If you want a good introduction to Naive Bayesian classifiers, there was a pretty readable explanation in Artificial Intelligence: a Modern Approach. It's an expensive book, but I'm sure you can find a copy in any well-stocked university library.

http://aima.cs.berkeley.edu/


Tom Mitchell's Machine Learning is also very good (despite being a bit older) http://www.cs.cmu.edu/~tom/mlbook.html


Somebody should try to use it to predict Google's next move...



