
Ask HN: Who uses Azure ML? - Avalaxy
I was checking out this video (https://www.youtube.com/watch?v=kZ04LnSjWek) on Azure Machine Learning the other day and I was very impressed. Basically you can click-and-drag all sorts of components together to create a model, test it, compare algorithms, visualize data sets and results and then publish the model as a working web service. With this tool you can set up a fairly complex machine learning model in just one hour, including 'implementing' different algorithms and comparing their results. Especially for building a proof of concept and for experimenting, I think it's extremely powerful.

I have never heard anyone here talk about using it, so I was wondering: why not? Bad experiences? Nobody knows about it? Can anyone tell me about their experiences?
======
BIackSwan
It's very good for quick prototyping. I have used it for experimentation and it
speeds up your iterations by a significant amount.

From my blog post [1] - "I explored a whole bunch of options on what platform
to use for training ML - Scikit Learn, Tensorflow, AWS Machine Learning and
Microsoft Azure. Azure’s ease of use puts it in another league altogether -
especially for people who want to just apply existing ML algorithms. Drag and
drop, kicking off experiments parallely, easy to construct flows etc. allow
you to move really really fast as compared to writing code.

This might become a bottleneck later as we scale or it might become more
expensive to run - but the tools they provide save tons of time, especially
for prototyping and experimentation. I highly recommend trying out Azure -
their free tier gives a good taste of what’s possible on their platform."

[1] http://blog.doctorc.in/post/156115218607/predicting-medical-chronic-conditions-with-machine

~~~
Avalaxy
Very interesting! Bookmarking your blog post for later. I'm still deciding
whether I should focus on Scikit/TF/AzureML, or on several of them. Any
comparisons between them are welcome. It would also be nice to see a cost
breakdown for each of them (as in development time and $$ to run it).

~~~
BIackSwan
It's very contextual. So the answer is "it depends".

Azure ML will cost you $$ once you move past the free tier, whereas the other
tools can scale without that cost. BUT for the other tools, which are free in
$$, you need more powerful hardware as you start using larger datasets.

Also, the ease of use of Azure is a benefit in itself: you don't need to learn
any coding library or how to use it, which speeds up your iteration loops. So
the dev time spent in Azure will be less than with the others because of how
it's designed. I am not so sure about using it at a large, production-level
scale though; I haven't hit that case yet.

I can't get more specific than this because I don't know what your
requirements/constraints are and what sort of resources are available to you.

~~~
a3camero
Although you can sign up for Bizspark (or if you have a startup: Bizspark
Plus) and use quite a bit for free.

------
joakleaf
I have used it several times. I uploaded a CSV file and threw neural networks
or random decision forests at it to see how well they worked.

It works well when you are prototyping and is quite accessible. It is also
good for testing various algorithms, and the GUI gives you a good sense of the
'flow of data' (input, filters, algorithms, comparisons, output, etc.).

However, it becomes painful really quickly, and it is not for production!

For the particular project, I ended up using RDFs (because they showed the
most promising results in Azure), and implemented RDFs by hand. After some
tweaking, my own implementation performed better than the one on Azure, so I
was satisfied with that. BTW, RDFs are quite easy to implement!

I think I'll switch to TensorFlow for equivalent future prototyping, mainly
because I can run it locally and the script (Python) support on Azure ML is
terrible.

~~~
Avalaxy
Interesting experience, thanks!

> For the particular project, I ended up using RDFs (because they showed the
> most promising results in Azure), and implemented RDFs by hand. After some
> tweaking, my own implementation performed better than the one on Azure, so I
> was satisfied with that.

What language/libraries did you implement this in? Is this Python? Do you have
any idea what makes it faster than Azure?

~~~
joakleaf
Java, actually... But not by choice -- the client requested that the project
be written in Java.

It is always hard to compare these algorithms because you have lots of
parameters, and I must admit that I did not tweak the settings on Azure.

I don't know why (or indeed if) it was faster or gave better results. You
cannot really time the algorithms on Azure the same way you can locally. But
because I ran my own implementation on a fast local machine, it probably felt
a lot faster -- that, and my implementation was more specific to the overall
problem.

In any case, speed when constructing the RDFs is not the main concern -- it is
the quality of the model you create. Here I could add some specific tweaks for
the particular problem (e.g. weighting different classes) that were not
available in Azure.
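
As an illustration of what weighting classes can mean in a decision forest,
here is a minimal Python sketch. The implementation discussed in this thread
was in Java and is not shown, so this is an assumption and not the actual code.
It computes a class-weighted Gini impurity, one common way to make a rare class
count more when choosing splits. The function name `weighted_gini` and the
`class_weight` mapping are hypothetical.

```python
def weighted_gini(labels, class_weight):
    """Gini impurity where each sample counts with its class weight.

    Classes missing from class_weight default to a weight of 1.0.
    """
    totals = {}
    for label in labels:
        totals[label] = totals.get(label, 0.0) + class_weight.get(label, 1.0)
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in totals.values())


# Nine samples of a common class and one of a rare class (made-up data).
labels = ["common"] * 9 + ["rare"]
print(weighted_gini(labels, {}))             # every sample counts equally
print(weighted_gini(labels, {"rare": 9.0}))  # the rare sample counts 9 times
```

With equal weights this node looks nearly pure (impurity about 0.18); with the
rare class weighted up it looks maximally mixed (0.5), so a split that isolates
the rare sample becomes more attractive.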

Finally, using the RDFs for classification is extremely fast and
straightforward (basically you just branch through a large set of trees), and
it was nowhere near the biggest bottleneck in the specific project.
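
The "branch through a large set of trees" step can be sketched roughly as
follows in Python. The tuple-based tree layout and the names `predict_tree` and
`predict_forest` are made up for illustration; a real implementation may store
trees and combine votes differently.

```python
from collections import Counter

# A tree is either a leaf ("leaf", label) or a split
# ("split", feature_index, threshold, left_subtree, right_subtree).
# This layout is hypothetical and chosen only to keep the example short.


def predict_tree(node, x):
    """Branch through one tree until a leaf is reached."""
    while node[0] == "split":
        _, feature, threshold, left, right = node
        node = left if x[feature] <= threshold else right
    return node[1]


def predict_forest(trees, x):
    """Classify x by majority vote over all trees."""
    votes = Counter(predict_tree(tree, x) for tree in trees)
    return votes.most_common(1)[0][0]


# Three tiny hand-built trees over two features.
forest = [
    ("split", 0, 0.5, ("leaf", "a"), ("leaf", "b")),
    ("split", 1, 2.0, ("leaf", "a"), ("leaf", "b")),
    ("split", 0, 1.5, ("leaf", "a"), ("leaf", "b")),
]

print(predict_forest(forest, [1.0, 3.0]))
```

Each prediction costs one comparison per tree level, which is why
classification tends to be cheap compared with building the forest.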

It just works a whole lot better locally.

------
roopalik
Thanks for your interest in Azure ML! I am part of the product team and I am
happy to answer any questions you have.

Specifically on pricing: once the model is built using Azure ML Studio, the
deployed web service can be accessed using the dev/test tier. The purpose of
the dev/test tier is for you to play around with the web service and ensure
that it is giving you the results you are expecting. Once you are satisfied
with the results and plan to go into production, we have paid tiers starting
at $100 per month (Standard S1), which includes 100K API calls and 25 hours of
compute. The exact plan you might need (S1, S2 or S3) depends on your
application and the number of calls and amount of compute it uses. You can
change or cancel plans at any time, paying only for what you used.

Here [1] is a case study of Track Revenue, a startup using Azure ML to
optimize marketing offers for its customers' campaigns. They are a member of
the BizSpark program [2] (mentioned earlier in this thread) at MS, which
offers startups a significant number of free Azure credits that can be applied
against any Azure service usage, including Azure ML.

Let me know if you have any further questions. You can reach me directly at
roopalik@microsoft.com.

[1] https://customers.microsoft.com/en-us/story/this-maverick-moved-from-amazon-to-azure-and-boosted-customer-revenue-per-click-by-38-percent
[2] https://bizspark.microsoft.com/

------
oduis
I used it for analyzing some organisational customer data. It worked very well
if you can convert your (offline) data to one of the supported formats.

Yes, price is an issue if you are a heavy user of web services with masses of
calls. But for corporate environments in particular, where it is difficult to
set up a machine and the amount of data to analyze is reasonable, I can really
recommend it.

------
hcrisp
To my knowledge, it only works with dataframes, has limited algorithms, and
results in non-portable models due to vendor lock-in. That said, it makes ML
easier for the non-programmer, is seeing fast growth due to MS investment, and
integrates well with custom R and Python code.

~~~
Avalaxy
> and results in non-portable models due to vendor lock-in

Yea, I was wondering about this. The free tier
(https://azure.microsoft.com/en-us/pricing/details/machine-learning/)
says that publishing it as a production API is not available. I wonder how you
can get data out of it without the API.

I'm also wondering how the costs in a real life scenario would compare to
running your own software on bare metal. Maybe it would be a good trade-off to
experiment with Azure ML to find the winning algo combination and then
implement it yourself and host it on your own server.

~~~
conjectures
Not MS Azure, but AWS - a lot of their more advanced APIs seem to rack up bills
quite quickly.

Their prediction API is $0.0001 per hit. So if you get r requests per second
cost per hour is r * 0.0001 * 60^2. Cost to rent an EC2 server is x per hour.

So if x < r * 0.0001 * 60^2, it's cheaper to rent. Say you need an r3.large;
that's x = $0.166. So renting is cheaper if you get more than 0.46 requests
per second, which sounds like quite a light load.
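
The break-even arithmetic above can be checked with a few lines of Python. The
prices are the ones quoted in this comment and may be out of date; the function
names are illustrative.

```python
SECONDS_PER_HOUR = 60 ** 2


def api_cost_per_hour(requests_per_second, price_per_request):
    """Hourly bill for a pay-per-request API at a steady request rate."""
    return requests_per_second * price_per_request * SECONDS_PER_HOUR


def break_even_rate(server_price_per_hour, price_per_request):
    """Requests per second above which renting the server is cheaper."""
    return server_price_per_hour / (price_per_request * SECONDS_PER_HOUR)


# Figures quoted in the comment: $0.0001 per hit, $0.166 per hour for the server.
rate = break_even_rate(0.166, 0.0001)
print(f"break-even: {rate:.2f} requests/second")
```

This only compares the two hourly prices; it ignores the time needed to set up
and maintain your own server.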

------
SteveCoast
If you have Mathematica, its ML is insanely automated and easy to prototype
with.

https://www.wolfram.com/mathematica/new-in-10/highly-automated-machine-learning/

A Mathematica license is $99 or something.

------
pplonski86
You should check out MLJAR; it is a really good tool for rapid prototyping and
development of machine learning models. An example:
https://youtu.be/ijmw94h4qCk

I read an article about Azure ML:
https://www.linkedin.com/pulse/why-microsofts-azure-ml-offering-failing-ben-taylor-ai-hacker

