
Cloud AutoML: Making AI accessible to every business - theDoug
https://www.blog.google/topics/google-cloud/cloud-automl-making-ai-accessible-every-business/
======
cdl
I think "making AI accessible to every business" is a bit of a stretch. While
there's no doubt that the AutoML suite will bring tremendous benefits to
businesses with recommendation and speech and image recognition needs, it
falls short of providing more useful insights such as those gleaned by
association rules, clustering (i.e. segmentation), and general probabilistic
models.

I think that if AI is to be accessible to every business then it will deliver
insights rather than the machinery to produce the insights. This is especially
true in the context of small businesses.

~~~
TheIronYuppie
Disclosure: I work at Google on Kubeflow

Can you say more about the specific insights you'd like us to provide? The
more specific the better :) Happy to see what we can do!

~~~
cdl
I'm not personally looking for the insights myself, just an observation from
working with SMBs trying to leverage data science more generally to improve
their businesses.

------
kmax12
Despite the claim to make AI accessible to every business, this release is
fairly limited in that it only applies to images. We will have to see how they
extend it going forward. Given the technology it's based on, I'd expect text,
audio, and video to come next.

However, I'm curious if they plan to support structured/relational datasets
which are definitely something every business needs. In Kaggle's 2017 State of
Data Science [0] survey, data scientists said they spent 65% of their time
using relational datasets vs 18% for images. Given that Kaggle is owned by
Google, this must be something on their radar.

For those data scientists, I maintain an open source library for automated
feature engineering called Featuretools
([https://github.com/featuretools/featuretools](https://github.com/featuretools/featuretools)).
For people interested in trying it out, we have demos
([https://www.featuretools.com/demos](https://www.featuretools.com/demos)) to
help you get started.

[0] [https://www.kaggle.com/surveys/2017](https://www.kaggle.com/surveys/2017)
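
The core idea behind automated feature engineering on relational data can be
sketched in a few lines: aggregate a child table onto its parent along the
relationship key. This toy sketch only illustrates the concept; it is not
Featuretools' actual API, and the table and column names are made up:

```python
# Toy sketch of automated feature engineering on relational data:
# aggregate a child table (transactions) onto its parent (customers)
# along the relationship key. Names and API are illustrative only.
from statistics import mean

transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 5.0},
]

def aggregate_features(rows, key, column, aggs=(len, sum, mean, max)):
    """Group child rows by `key`, then apply each aggregate to `column`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row[column])
    return {
        k: {f"{agg.__name__}({column})": agg(vals) for agg in aggs}
        for k, vals in groups.items()
    }

# One feature vector per customer: count, total, average, and max amount.
features = aggregate_features(transactions, "customer_id", "amount")
```

Tools like Featuretools generalize this by stacking such aggregations across
many tables and relationships automatically.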

~~~
TheIronYuppie
Disclosure: I work at Google on Kubeflow

We're just getting started! Stay tuned for lots more AutoML goodness.

~~~
innagadadavida
Recently YouTube started pulling kids-eating-Tide-Pods videos. Can AutoML
figure this out, or do they use manual labor to do it? Couldn't this have been
done before? I mean, can it detect stupid/dangerous videos automatically and
pull them when they threaten to become an epidemic?

------
zengid
I don't think this is 'democratizing' AI but rather centralizing Google's
control of a utility service.

~~~
ovi256
You would be right if ML were as accessible as your hypothetical
"utility service" implies. It is not, and it is getting further from it.

If you compare ML to electricity, we're still in the stage where a few
players have found that electrifying their manufacturing plants makes sense.
Small players can't afford the investment in machinery and skills. Maybe when
the machinery is hidden behind a "utility" provider (which would also bring
down the required skill level) they will.

~~~
zengid
What makes it inaccessible? Are GPUs prohibitively expensive? Are pretrained
models unavailable? Is the software source code closed off?

~~~
eggie5
I would say training a CNN from scratch or even fine-tuning one takes a lot of
domain knowledge and best practices which often are not standardised yet.
Besides, we don't even know why they generalise in the first place! See:
[https://arxiv.org/abs/1611.03530](https://arxiv.org/abs/1611.03530),
[https://arxiv.org/abs/1711.11561](https://arxiv.org/abs/1711.11561)
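
To illustrate why fine-tuning is less daunting than training from scratch: the
common pattern is to freeze a pretrained feature extractor and train only a
small head on top. A minimal stand-in sketch (the "extractor" here is a toy
function, not a real CNN):

```python
# Sketch of the fine-tuning pattern: keep a pretrained feature extractor
# frozen and train only a small linear head on top of its features.
# The "extractor" below is a toy stand-in, not a real network.
def pretrained_features(x):
    # Frozen: pretend this is a network trained on a large dataset.
    return [x, x * x]

def train_head(data, lr=0.05, epochs=500):
    """Train only the head weights w, b; the extractor is never updated."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            f = pretrained_features(x)
            err = sum(wi * fi for wi, fi in zip(w, f)) + b - y
            # Gradient step touches only the head parameters.
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
            b -= lr * err
    return w, b

# Tiny dataset where the frozen features make the target linearly fittable.
data = [(x / 2, (x / 2) ** 2) for x in range(-4, 5)]
w, b = train_head(data)
```

The head has only a handful of parameters, which is why fine-tuning can get
away with far less data than end-to-end training.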

------
whoisjuan
Clarifai has been doing this for 4 years... I really like their service and
they have fair pricing. It would be interesting to see how this compares (on
quality and pricing).

~~~
TuringNYC
Curious if anyone from the product or tech team for AutoML could describe how
this differs from MetaMind (sadly, now subsumed into Salesforce). Richard
Socher seems to have achieved some of this in 2015 with MetaMind
(...or...perhaps he just had a lot of Turkers behind the scenes hand-crafting
networks to fit data drops...)

------
strin
Democratization should allow users to "own" their models. This is not the case
in Cloud AutoML. Users cannot download their models and host them elsewhere.
This dependency means Google can have control over the business's AI
capabilities.

~~~
jorgemf
Does it really say that somewhere? As far as I know, when you train TensorFlow
models they are stored in GCS (gs:// buckets); I thought this would be
similar. Otherwise I don't know how they plan to integrate this with the API
they have for uploading your models and making requests.

------
benkarst
Google wants to monopolize, not democratize, AI.

I wonder if Google tests for Doublethink skills before you can get hired there
now.

~~~
manigandham
Perhaps you mean monetize instead? They certainly aren't the only ones who can
do ML.

~~~
benkarst
Monetize and monopolize are synonymous in Silicon Valley. Read Thiel's Zero to
One.

------
sharemywin
If I collect a bunch of data and train a model, is the data/model mine or
theirs?

What if in the future they change their minds and decide to change the TOS?
Did I just build on top of quicksand?

~~~
TheIronYuppie
Disclosure: I work at Google on Kubeflow

The data remains yours and the model is yours - e.g. if you delete your
account, the data and model go away (think of it like data or a model you
stored on a VM). However, what I think you're looking for is the ability to
actually download the model, and I'm afraid that's not possible.

Are you looking to avoid lock-in? Or something else?

~~~
chimtim
Then what is "Auto ML"? It sounds like just another cloud service.

~~~
TheIronYuppie
Disclosure: I work at Google on Kubeflow

Correct, it's a cloud service, based on the research Google published on
model exploration[1][2]. There are research examples today where this service
produced better models than humans were able to achieve by hand or with
genetic algorithms (models trained faster and/or with better error rates)[3].

[1]
[https://static.googleusercontent.com/media/research.google.c...](https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/46180.pdf)

[2] [https://blog.acolyer.org/2017/10/02/google-vizier-a-service-...](https://blog.acolyer.org/2017/10/02/google-vizier-a-service-for-black-box-optimization/)

[3] [https://arxiv.org/abs/1712.00559](https://arxiv.org/abs/1712.00559)

~~~
charls_Aws
Excuse me for saying this, but you don't have to put that disclosure sentence
with every comment ...(personally, I find it kind of irritating)

~~~
TheIronYuppie
My worry is that people deep link to a comment and think that I'm
astroturfing. Trying to balance spam vs. full disclosure.

Note I left it off of this one. :)

------
prats226
For those who don't want to wait for access:
[https://nanonets.com](https://nanonets.com). Just upload your training data
and we will automatically provide a machine learning API.

------
alanlewis
It's not clear from the demo video, but will this help with labeling data? In
my experience, that is the most time consuming part of creating models.

~~~
T-A
From [https://cloud.google.com/automl/](https://cloud.google.com/automl/)

Integration with human labeling

For customers with images but no labels yet, we provide a team of in-house
human labelers that will review your custom instructions and classify your
images accordingly. You will get training data with the same quality and
throughput Google gets for its own products, while your data remains private.
You can use the human labeled data seamlessly to train a custom model.

------
jorgemf
This has to be very expensive for companies. A good business for Google.

~~~
illumin8
This seems to be a reaction by Google to the Amazon SageMaker release in
November:
[https://aws.amazon.com/sagemaker/](https://aws.amazon.com/sagemaker/)

It's great to see that other cloud providers are acknowledging the talent and
training data gaps that many large enterprises face when adopting deep
learning.

Disclaimer: I work for AWS

~~~
TheIronYuppie
Disclosure: I work at Google on Kubeflow

This is an externalization of the service we use at Google internally called
Vizier[1], first discussed publicly in June[2].

The idea is that instead of having to build a model yourself, we can use ML
(yes, it uses ML to provide ML) to autotune your model and solve your business
problem. Basically, instead of having to deal with all the steps of opening an
editor, choosing an algorithm, tweaking, debugging, etc., just provide your
structured or unstructured data and we'll help you answer your question (which
is what customers actually care about).

[1]
[https://research.google.com/pubs/pub46180.html](https://research.google.com/pubs/pub46180.html)
[2]
[https://www.youtube.com/watch?v=Z2YL4XJKVpQ](https://www.youtube.com/watch?v=Z2YL4XJKVpQ)
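
The suggest/evaluate/report loop described in the Vizier paper can be sketched
like this; the random-search suggester below is just a placeholder for
Vizier's actual search algorithms, and every name is illustrative:

```python
# Sketch of a black-box optimization loop in the style the Vizier paper
# describes: the service suggests trial parameters, the client evaluates
# them and reports the objective back. Random search stands in for the
# real algorithms; all names here are illustrative.
import random

def suggest(space, rng):
    """Propose one trial by sampling each parameter's range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}

def optimize(objective, space, trials=300, seed=0):
    rng = random.Random(seed)
    best_params, best_value = None, float("inf")
    for _ in range(trials):
        params = suggest(space, rng)   # service proposes a trial
        value = objective(**params)    # client runs it and measures
        if value < best_value:         # service records the feedback
            best_params, best_value = params, value
    return best_params, best_value

# Toy objective with its minimum at x=3, y=-1.
space = {"x": (-5.0, 5.0), "y": (-5.0, 5.0)}
best, val = optimize(lambda x, y: (x - 3) ** 2 + (y + 1) ** 2, space)
```

The point is that the objective is a black box to the optimizer: the client
only ever reports measured values, never gradients or model internals.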

~~~
illumin8
Same idea for Sagemaker. Nice to see I get a bunch of instant downvotes - I
sometimes wonder why even bother participating in this community.

~~~
TheIronYuppie
Disclosure: I work at Google on Kubeflow

Interesting! I read up on Sagemaker here[1] and didn't see any AutoML style
training/tuning features, but you would certainly know better than me :)

[1]
[https://aws.amazon.com/blogs/aws/sagemaker/](https://aws.amazon.com/blogs/aws/sagemaker/)

~~~
Bollack
As far as I know, they haven't implemented HPO in SageMaker yet. They're
planning to add it soon, but there's still no date announced.

~~~
jorgemf
Does HPO mean hyperparameter optimization? Because AutoML has little to do
with that; AutoML is mostly about the architecture of the model, not about
hyperparameter optimization.

~~~
joshuamorton
model shape is a hyperparameter ;)

~~~
jorgemf
by the same logic, the researcher is another hyperparameter

(I know you are right, but so many people here think AutoML is exactly the
same as the HPO they have been doing for a long time)

~~~
joshuamorton
That's fair. Yes, AutoML is not simply tuning the learning rate and picking
your favorite nonlinearity; it's fancier than that, but it's still tuning
hyperparameters.
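
The point that "model shape is a hyperparameter" amounts to encoding
architecture choices in the same search space as classic hyperparameters. A
minimal sketch (all names and values illustrative):

```python
# "Model shape is a hyperparameter": once architecture choices are
# encoded in the search space, architecture search and classic HPO use
# the same machinery. All names and values here are illustrative.
import random

search_space = {
    "learning_rate": [1e-1, 1e-2, 1e-3],   # classic hyperparameter
    "num_layers": [2, 4, 8],               # architecture choice
    "units_per_layer": [64, 128, 256],     # architecture choice
    "activation": ["relu", "tanh"],        # architecture choice
}

def sample_config(space, rng):
    """Draw one candidate configuration to train and evaluate."""
    return {name: rng.choice(options) for name, options in space.items()}

config = sample_config(search_space, random.Random(42))
```

Real neural architecture search uses far richer spaces and smarter searchers
than this, but the framing is the same: one space, one optimizer.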

------
eanzenberg
“Making AI accessible to every business” with image classifiers (because every
business needs image classifiers) ^_^

~~~
theDoug
Not everyone _needs_ accessibility ramps, but we all benefit from them. :)

Many businesses need them but don't have the staff or expertise; many may
want them for fun functionality (building their own "Not Hotdog" app). But
the aim is ease of model creation for anyone with a bit of data and time.

(Disclosure: I work in Google Cloud)

~~~
peatmoss
I get that it's a metaphor, but I seriously am having a hard time equating
automated image classification to accessibility ramps.

~~~
pathseeker
You wouldn't if you worked at Google.

~~~
eanzenberg
Because google is notoriously bad at profitable products, besides search?

------
tabeth
AI without _tons_ of data is, well, overkill. That being said, it seems Google
is sharing a few pretrained models, which is nice.

~~~
rasmi
Disclaimer: I do ML-based work in Google Cloud, but I am not on the AutoML
team.

The post says there is transfer learning involved, which means in practice you
need much less data than you would if you were creating a classifier from
scratch. Of course, more (good) data may yield better results, but it seems
one of the goals behind this release is specifically to give custom (your own
labels, not just generic object detection), high-performance image
classification to those who don't have access to Google-scale training sets.

~~~
ska
Transfer learning is hardly a panacea, however much some would like it to be.

~~~
TheIronYuppie
Disclosure: I work at Google on Kubeflow

Can you say more? I don't think anyone is saying it's magic pixie dust, but it
does dramatically reduce the amount of data you need.

~~~
_delirium
I'd probably phrase it as "can" dramatically reduce the amount of data you
need rather than "does". Getting transfer learning to work in any kind of
reliable way is still very much open research, and the systems I've seen are
heavily dependent on basically every variable involved: the specific data
sets, domains, model architectures, etc., with sometimes pretty puzzling
failures.

I don't doubt Google has managed to make something useful work, though I'm
more skeptical of how general the ML tech is. One advantage of an API like
this is that it allows control over many of those variables. I'm not sure if
this is what it does, but you could even start out by making a transfer-
learning system that's heavily tailored to transfering from _one_ specific
fixed model, which coupled with some Google-level engineering/testing
resources, could produce much more reliable performance than in the general
case.

~~~
TheIronYuppie
Disclosure: I work at Google on Kubeflow

As you can see here[1], we do provide quite a bit of information about the
accuracy and training of the underlying model.

Additionally, AutoML already (often) provides better-than-human-level
performance[2]. Your comment about transferring from one heavily tailored
fixed model is basically what it's doing - it takes something domain-specific
(vision) and allows you to transfer it to your domain.

[1]
[https://youtu.be/GbLQE2C181U?t=1m15s](https://youtu.be/GbLQE2C181U?t=1m15s)

[2]
[https://static.googleusercontent.com/media/research.google.c...](https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/46180.pdf)

