
Amazon Machine Learning – Make Data-Driven Decisions at Scale - leef
https://aws.amazon.com/blogs/aws/amazon-machine-learning-make-data-driven-decisions-at-scale/
======
rm999
Meh. The more I do machine learning in industry the more I realize how little
the ML part matters compared to everything else. A typical project I've seen
takes 3-6 months and contains thousands of lines of code, but the machine
learning part will take a week or two and be 100 lines of code. What Amazon ML
is doing would probably take an hour and 30 lines of R code you can easily
find online.

And here's the not-too-hidden secret: the ML part is the fun part. It's a big
reason we spend months creating banking.csv. Josh Willis did a very funny
presentation at MLconf partly about this. It's like waiting in line at a theme
park for an hour, and then paying someone to cut in line at the last minute
and record the ride for you.
[https://www.youtube.com/watch?v=4Gwf5zsg4vI&feature=youtu.be...](https://www.youtube.com/watch?v=4Gwf5zsg4vI&feature=youtu.be&t=657)
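
To give a sense of how small the modelling step can be, here is a minimal sketch of logistic regression fitted by batch gradient descent in plain Python. It is not Amazon ML's implementation and not the R code mentioned above; the function names, learning rate, epoch count, and toy data are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(weights, bias, x):
    # Probability that example x belongs to class 1.
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_logistic(rows, labels, lr=0.1, epochs=500):
    # Batch gradient descent on the mean log loss.
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_proba(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Toy, made-up data: a single feature, labelled 1 when the value is large.
rows = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
labels = [0, 0, 0, 1, 1, 1]
weights, bias = train_logistic(rows, labels)
```

Everything that produces `rows` and `labels` in a real project (the months spent creating banking.csv) sits outside this snippet.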

~~~
sputknick
Agree 100%. In that light, does anyone know how far away we are from having
data wrangling be more automated? I saw a demo for a product called Paxata a
few weeks ago, and it looked like a good start. Does anyone know more about
things like that?

~~~
sixdimensional
There are lots of new attempts at data wrangling approaches/tools, each with
different caveats: Datameer, Platfora, Trifacta...

------
vmarsy
Is this just Amazon catching up with Azure ML, which launched last year? (And
cutting prices by 80%)

 _Azure ML also supports R and Python custom code, which can be dropped
directly into your workspace._

And this was even before Microsoft acquired Revolution Analytics. Amazon ML
seems to be less flexible with regard to importing your own models:

 _Q: Can I export my models out of Amazon Machine Learning?

No.

Q: Can I import existing models into Amazon Machine Learning?

No._

[http://blogs.microsoft.com/blog/2014/06/16/microsoft-azure-m...](http://blogs.microsoft.com/blog/2014/06/16/microsoft-azure-machine-learning-combines-power-of-comprehensive-machine-learning-with-benefits-of-cloud/)

[https://aws.amazon.com/machine-learning/faqs/](https://aws.amazon.com/machine-learning/faqs/)

[http://azure.microsoft.com/en-us/services/machine-learning/](http://azure.microsoft.com/en-us/services/machine-learning/)

~~~
aficionado
No... it's Amazon ML and Azure ML trying to catch up with BigML. They copied
many things from our service but forgot to copy the ease of use. Services like
Azure ML, Amazon ML and even Google Predict API work like a black box, and
lock your model away, making you extremely dependent on their proprietary
service. With BigML, you can easily export your models and use them anywhere
for _free_. If the goal is to democratize machine learning, then the ability
to extract your models and use them as you see fit is essential, and only
BigML offers that level of freedom.

~~~
okisan
I just tried out BigML and it looks awesome. I use the Google Prediction API
to fill in a form value during a web request, so I need the result
immediately. Why does BigML require two web requests and take so long to
return a prediction from a trained model?

~~~
aficionado
If you use BigML's web forms, the first request caches the model locally so
that all the subsequent predictions are performed directly in your browser.

------
ris
Yeah sure, why not make your business process depend on a closed proprietary
cloud-based product?

(in all fairness, Amazon is better than many about not unexpectedly
withdrawing products)

~~~
psaintla
I would be less worried about that and more worried about cost. I know of two
different startups that aren't profitable but would be if they hadn't put
their entire platform on amazon services. One of those startups was lucky
enough to be acquired but it's going to take them many unprofitable years to
migrate away.

------
minimaxir
So the pricing is $100 per million data points, at minimum. That doesn't seem
like it scales well for big data at all.

However, that's 5x cheaper than what BigML is offering
([https://bigml.com/pricing/credits](https://bigml.com/pricing/credits)) for
its ad hoc service, so I might be wrong.
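
Taking the $100 per million data points figure at face value, the scaling is linear and easy to sketch. The rate below is the number from this comment, not a quoted price sheet, and the function name is hypothetical.

```python
def estimate_cost(data_points, usd_per_million=100.0):
    # Linear cost model: pay the same rate for every million data points.
    return data_points / 1_000_000 * usd_per_million

# One million data points at the rate quoted in the comment.
one_million = estimate_cost(1_000_000)

# A "big data" sized job: 2.5 billion data points at the same rate.
big_job = estimate_cost(2_500_000_000)
```

There is no volume discount in this model, which is the concern: the bill grows in direct proportion to the data.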

~~~
aficionado
BigML cofounder here. Most BigML customers doing machine learning at scale use
either BigML subscriptions (starting at $30/mo) or private deployments – both of
which provide unlimited model training and predictions and are suitable for
developers and large enterprises alike. In addition, with BigML you can export
your models (for cluster analysis and anomaly detection and not just
classification/regression) to run locally and/or to be incorporated in related
systems and services.

------
discardorama
Did they basically just put a wrapper around VW[1]?

[1]
[https://github.com/JohnLangford/vowpal_wabbit](https://github.com/JohnLangford/vowpal_wabbit)

~~~
mturmon
No -- see [https://aws.amazon.com/machine-learning/faqs/](https://aws.amazon.com/machine-learning/faqs/) --

"Q: What algorithm does Amazon Machine Learning use to generate models?

Amazon Machine Learning currently uses an industry-standard logistic
regression algorithm to generate models."

But disappointingly:

"Q: Can I export my models out of Amazon Machine Learning?

No.

Q: Can I import existing models into Amazon Machine Learning?

No."

Note that they are doing classification and regression on iid feature vectors.
Of course, ML is much larger than this setting, but this setting is generic
enough that it has some applicability to lots of problems.

~~~
etrain
This does not mean they are not using Vowpal Wabbit. It is very easy to run
Vowpal Wabbit with a logistic loss function.

Also, vw _is_ what I'd consider "industry standard."
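
For readers unfamiliar with the approach, here is a rough sketch of the general technique: online gradient descent on a logistic loss over hashed features, with labels in {-1, +1}. This is not Vowpal Wabbit's code and says nothing about what Amazon actually runs; the table size, learning rate, hash function, and toy stream are assumptions chosen for illustration.

```python
import math
import zlib

NUM_BITS = 18  # assumed table size: 2**18 weight slots

def hashed_index(feature_name):
    # crc32 is stable across runs, unlike Python's built-in hash() on strings.
    return zlib.crc32(feature_name.encode("utf-8")) & ((1 << NUM_BITS) - 1)

def margin(weights, features):
    # Dot product between the weight table and a sparse {name: value} example.
    return sum(weights[hashed_index(name)] * value for name, value in features.items())

def sgd_update(weights, features, label, lr=0.5):
    # One online step on the logistic loss log(1 + exp(-label * margin)),
    # with label in {-1, +1}.
    m = label * margin(weights, features)
    scale = label / (1.0 + math.exp(min(m, 700.0)))  # clamp to avoid overflow
    for name, value in features.items():
        weights[hashed_index(name)] += lr * scale * value

# Made-up two-example stream, replayed a few times.
stream = [
    ({"word=free": 1.0, "word=win": 1.0}, +1),
    ({"word=meeting": 1.0, "word=agenda": 1.0}, -1),
]
weights = [0.0] * (1 << NUM_BITS)
for _ in range(20):
    for features, label in stream:
        sgd_update(weights, features, label)
```

A positive `margin` means the example is scored as the +1 class. The model that comes out is still a logistic regression, which is why the FAQ wording alone does not rule VW in or out.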

------
pinkunicorn
I am really amazed at the kind of things Amazon turns into a service. And this
ML service is just wowing. I have fiddled with basic SVMs before, but this
takes away the part of writing code and makes it sort of an end-user
product (you are still expected to know the basics of ML). On the other hand, I
also don't think this will take off very well. Maybe a few companies/startups
who have cash in their pocket will use it/try it out, but the audience is
really limited beyond that in my opinion.

~~~
gallamine
> Maybe a few companies/startups who have cash in their pocket will use it/try
> it out

Honestly, I'd see it the other way around. Small companies without a DS team
might be drawn to this. I don't see how any company with a lick of sense would
lock down their prediction model into AWS. They very clearly won't let you
export your model once the training is done.

~~~
minimaxir
Small companies without a DS team will likely fall into the ML pitfalls which
make the resulting analysis invalid.

------
addisonj
At first glance, this looks to go somewhat beyond Google's Prediction API,
which (at least from my experience) is pretty limited in its usefulness.

It's nice to see tools for analyzing your data as well as multi-class
classification, and some tunable parameters, but this doesn't seem to bring
anything 'new' to the game.

All the hard parts (feature selection, noise, unlabeled data, etc.) are still up
to the end user, which makes me wonder how many people will try this out and
get poor results.

It would be nice to get an idea of what sort of model they are using on the
backend or even having a choice of models.

~~~
alooPotato
What differences did you notice beyond Google's Prediction API?

~~~
addisonj
This may be different now, but when I used Prediction API a few years ago, I
don't remember it having any data analysis tools or multi-class
classification. The UI was also pretty lacking. Haven't looked at it in a while
but perhaps it has gotten better?

------
aficionado
Did anyone actually give it a try? I only get this error with any dataset
(even a humble Iris): Amazon ML cannot create an ML model: 1 validation error
detected: Value null at 'predictiveModelType' failed to satisfy constraint:
Member must not be null

~~~
mloudon
go to the datasources tab and see if there's an error message from data source
creation. i had the same error due to an issue with variable names.

------
saurabhtandon
I like the "Introduction to Machine Learning" which sort of briefly outlines
the basics of machine learning for people who don't know about it.

------
orionblastar
I predict we will see more cloud based machine learning services. Since
machine learning is hard to learn and write for the average person, providing
the services will greatly help them.

It would be good if an open source tool like LibreOffice did
machine learning in its spreadsheet app. It would be a good feature to add,
and then the competitors would have to add it to their software as well.

------
chrischen
Google's competing product:
[https://cloud.google.com/prediction/docs](https://cloud.google.com/prediction/docs)

------
sandstrom
Cannot find it (in N. Virginia)? Is that only me?

(if anyone has the direct link for the console, please share :)

~~~
dbarlett
[https://console.aws.amazon.com/machinelearning/home?region=u...](https://console.aws.amazon.com/machinelearning/home?region=us-east-1#/)

~~~
sandstrom
Thanks!

(weird, still doesn't show in the menu)

~~~
discodave
It takes the teams a while to launch everything.

------
rcpt
Some have already taken this kinda thing a few steps further:

[http://www.automaticstatistician.com/](http://www.automaticstatistician.com/)

