Hacker News
Google's cloud-based machine learning tools (developers.google.com)
73 points by coffee on Feb 27, 2012 | 11 comments


The free pricing tier looks like a joke.

100 predictions a day isn't enough to test whether it works. Somebody needs to commit far more of their time than that to build a machine learning application that really works. When you look at what's holding machine learning back, it's not the algorithms or the hardware requirements; it's that people don't want to do the work of creating high-quality training sets and validating them.


There's a meetup at Google's Crittenden Campus (Mountain View) that'll be going over this next Wednesday: http://www.meetup.com/sv-gtug/events/51114452/


At least, by making it a for-pay service, the likelihood of Google just shutting it down in the middle of the night is lessened (although still present). However, anyone competent enough to be passing data to an API can pass data to one of the many open source ML libraries that are available for many languages. I don't see the point.


Here is the 'point':

* Training models can be rather computationally expensive. If your business requires training new models very often, doing this on EC2 can be prohibitively expensive, whereas the Prediction API handles that for you.

* Just hooking up to an open source ML library isn't the whole story. You still need to backtest different algorithms and tune their parameters, i.e. you need some machine learning know-how. The Prediction API does all this for you automatically and probably uses a much larger set of algorithms than you would bother to test yourself.
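To make the second point concrete, here is a rough sketch of what "backtesting different algorithms and parameter tuning" looks like when you do it by hand. It is only an illustration and says nothing about how the Prediction API works internally. The toy dataset, the two candidate models (a majority-class baseline and k-nearest-neighbours), and every function name below are assumptions made up for this example.

```python
import random
from collections import Counter


def make_toy_data(n=200, seed=0):
    """Two noisy 2-D clusters labelled 0 and 1 (synthetic, illustration only)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 2.0
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        data.append((point, label))
    rng.shuffle(data)
    return data


def majority_predict(train, x):
    """Baseline: ignore x and predict the most common training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def knn_predict(train, x, k):
    """k-nearest-neighbours vote using squared Euclidean distance."""
    def dist(item):
        (px, py), _ = item
        return (px - x[0]) ** 2 + (py - x[1]) ** 2

    nearest = sorted(train, key=dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(data, predict, folds=5):
    """Mean held-out accuracy over `folds` contiguous splits of `data`."""
    fold_size = len(data) // folds
    scores = []
    for f in range(folds):
        test = data[f * fold_size:(f + 1) * fold_size]
        train = data[:f * fold_size] + data[(f + 1) * fold_size:]
        correct = sum(predict(train, x) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


def select_model(data, ks=(1, 3, 5, 11)):
    """Score the baseline and each k-NN setting; return (best_name, scores)."""
    scores = {"majority": cross_val_accuracy(data, majority_predict)}
    for k in ks:
        # Bind k as a default argument so each lambda keeps its own value.
        scores[f"knn(k={k})"] = cross_val_accuracy(
            data, lambda train, x, k=k: knn_predict(train, x, k)
        )
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, scores = select_model(make_toy_data())
    for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {acc:.3f}")
    print("selected:", best)
```

Even in this tiny case you have to pick the candidate algorithms, the parameter grid, and the validation scheme yourself, which is the know-how the comment is talking about.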


It's great to see these tools available on a cloud computing basis. Just make sure you read the ToS:

1.2. From Customer to Google. By submitting, posting or displaying any Customer Data on or through the Service, Customer gives Google a worldwide, non-sublicensable, non-transferable, non-exclusive, terminable, limited license to reproduce, adapt, modify, translate, publish, publicly perform, publicly display and distribute any Customer Data for the sole purpose of enabling Google to provide Customer with the Service in accordance with the Agreement.


Isn't that the standard lawyerspeak to actually run the service? If I understood correctly, when reading those you just need to check that it is limited to the service provided ("for the sole purpose of enabling Google to provide Customer with the Service").

I've found the following (it's about photographs and social networks) quite helpful in explaining the issues: http://www.readwriteweb.com/archives/getty_images_says_googl...


So basically, "by uploading your data to the service, you give Google the rights to use the data to deliver the service." Not seeing the problem.


Incidentally, here is an article I wrote today introducing three new machine learning tools being launched by startups:

http://www.readwriteweb.com/hack/2012/02/three-new-tools-bri...

Discuss: http://news.ycombinator.com/item?id=3640954


Is it only me, or does anyone else think that the 40k predictions per day limit is a little too low for some?


The limit is actually 60k. And you can probably raise that if you talk to them, as they ask you to.


Lol. What a joke. Another BU that needs to be shut down.



