
Google launches an end-to-end AI platform - kjhughes
https://techcrunch.com/2019/04/10/google-expands-its-ai-services/
======
siavash
Make sure you've read the service terms[1] if you plan on building apps for
speakers, cars, TVs or smart watches...

 _12.1 The following terms apply only to current and future Google Cloud
Platform Machine Learning Services specifically listed in the "Google Cloud
Platform Machine Learning Services Group" category on the Google Cloud
Platform Services Summary page:_

 _Customer will not, and will not allow third parties to: (i) use these
Services to create, train, or improve (directly or indirectly) a similar or
competing product or service or (ii) integrate these Services with any
applications for any embedded devices such as cars, TVs, appliances, or
speakers without Google's prior written permission. These Services can only
be integrated with applications for the following personal computing devices:
smartphones, tablets, laptops, and desktops_

[1] [https://cloud.google.com/terms/service-terms#12-google-cloud...](https://cloud.google.com/terms/service-terms#12-google-cloud-platform-machine-learning-group-and-google-cloud-machine-learning-engine)

~~~
jimrandomh
I have a hard time imagining who this wouldn't be a dealbreaker for. These
terms also mean you can't use it for open source, and you can't use it if you
don't know what the ultimate application is going to be. And you probably
can't resell technology you create, because no one who buys it is going to
want that restriction, either.

~~~
zeroxfe
It's not a dealbreaker for in-house ML applications, particularly in the
enterprise (banks, telcos, etc.), which is a huge market for cloud providers.

~~~
sodosopa
That's how my company would apply the platform to our work. My boss is there
now; it will be interesting to see what he finds out versus what we're doing
with our current IBM platform.

~~~
tixocloud
Which industry are you in? Would love to get your thoughts on the IBM
platform.

~~~
sodosopa
Health care. If you're running on-prem, it does a good job of integrating
Kubernetes where it's not something my data scientists have to worry about.
Also fairly easy to tie in our data mart and all of our databases - MS,
Oracle, DB2, MongoDB, etc.

------
minimaxir
This is being announced now in the Google Next keynote.

This platform doesn't take the this-AI-is-magic-and-can-solve-everything angle
of many AI SaaS startups announced on Hacker News; instead it focuses on _how
to actually integrate AI into production workflows_, which is something I wish
were discussed more often in AI.

The announcements here, including AutoML Tables (which is coincidentally
similar to my own Python package:
[https://news.ycombinator.com/item?id=19492406](https://news.ycombinator.com/item?id=19492406)),
the new BigQuery BI tools, and the new Google Sheets integrations, make me a
very happy data scientist.

I'm taking the rest of the week to figure out how to integrate everything
announced into my team.

~~~
amrrs
Looks like Google is taking the cloud AI market from AWS by building an
ecosystem and tools for non-data-scientists - a consumer-level product.
Surely IBM could do something similar with their recent Red Hat acquisition,
but will they?

~~~
kerng
Haven't Azure and Amazon had these offerings for multiple years already?
Including workflow tools, like Flow.

~~~
camuel
So what exactly are Azure or Amazon offering that matches "AutoML Tables" or
an "end-to-end AI platform that ALSO runs on premises"?

~~~
pplonski86
There is Automated ML on Azure (in their ML Studio):
[https://docs.microsoft.com/en-us/azure/machine-learning/serv...](https://docs.microsoft.com/en-us/azure/machine-learning/service/tutorial-auto-train-models)

From Amazon there is no automated ML solution (AutoML that lets you train
models with a few clicks).

------
ricklamers
I've tried using Google's AutoML to classify medical images as part of a
Kaggle competition just to see what their process is like and how well it
performs. This was Q4 2018, so things might have changed slightly.

In brief, my experience was quite frustrating. First, getting my dataset to
the cloud required quite a bit of manual labor. Uploading my 25 GB of images
over my 40 Mbit/s connection wasn't really ideal, so I ended up spawning a
virtual machine on GCP, downloading directly from Kaggle, then unzipping the
files and writing them to Google Cloud Storage with some terminal commands.

Furthermore, to get the label data into AutoML I had to write a script that
generated the exact CSV format AutoML requires - a format hidden somewhere in
their documentation, with no mention in the AutoML environment itself.
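
For anyone facing the same step: the exact column layout may have changed
since Q4 2018 (check the current docs), but the label-CSV generation amounted
to something like this sketch - bucket paths and label names here are
invented:

```python
import csv

def write_automl_labels_csv(examples, out_path):
    """Write (gcs_uri, labels) pairs in the one-row-per-image CSV layout
    AutoML Vision expected: the GCS URI first, then that image's labels."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        for gcs_uri, labels in examples:
            writer.writerow([gcs_uri, *labels])

# Hypothetical bucket and label names, just to show the shape.
examples = [
    ("gs://my-bucket/img_001.png", ["nucleoplasm"]),
    ("gs://my-bucket/img_002.png", ["cytosol", "mitochondria"]),
]
write_automl_labels_csv(examples, "labels.csv")
```

Trivial, but you only find out the required format by digging.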

Nothing too cumbersome but generally not a very user friendly experience, or
something I wish to repeat many times if I get a new/different dataset.

Ultimately, once the labeled data was in, the model started training. Then I
found that I didn't really have the tools and information to assess the
model's performance. They did a decent job of characterizing performance
through precision-recall graphs and displaying incorrect predictions, but
that didn't really satisfy me. I was interested in more detail about where it
was misclassifying images, specifically how classification performance was
distributed across the 28 classes the model was predicting (in a multi-label
context).
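
For what it's worth, once you can export raw predictions this per-class
breakdown is straightforward with scikit-learn - a toy 3-class indicator
setup below, standing in for the 28 classes:

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# Multi-label ground truth and predictions as binary indicator
# matrices of shape (n_samples, n_classes).
y_true = np.array([[1, 0, 0],
                   [1, 1, 0],
                   [0, 0, 1],
                   [0, 1, 1]])
y_pred = np.array([[1, 0, 0],
                   [1, 0, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

# average=None returns one score per class - exactly the per-class
# view AutoML's UI didn't surface.
prec, rec, f1, support = precision_recall_fscore_support(
    y_true, y_pred, average=None, zero_division=0)
for i, (p, r) in enumerate(zip(prec, rec)):
    print(f"class {i}: precision={p:.2f} recall={r:.2f}")
```

But that only works if the platform lets you get the raw predictions out,
which is the crux of the complaint.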

This is the point where I think the downside of working with a platform such
as AutoML starts showing. I tried reaching out to someone about gaining more
insight into model performance by opening a ticket, since there was no phone
number. After a couple of days I finally received an email from a product
representative, who told me that for any assistance I should contact one of
their local cloud partners.

These are third-party vendors that typically assist companies in deploying
cloud-based applications on GCP. However, of the two companies highly
recommended on Google's vendor page that I called, one told me they had no
experience with AutoML and that I was on my own; the other didn't reply at
all.

In my view, choosing a product such as AutoML is not yet a good path for a
company that is serious about adopting AI to improve its business. I see this
space as wide open for competition, with current solutions not cutting it
from my point of view.

------
mlthoughts2018
It’s so disingenuous for Google to brand these efforts as “democratizing AI.”
It is precisely the opposite.

This is classic commoditization of your complement. On one hand, Google is
pushing to centralize the integration, data management and computing platforms
for machine learning, so that these things become as much of a commodity as
possible.

On the other side, they are offering massive compensation packages or acqui-
hiring as much AI talent as they can, not really because they have useful work
for these folks to do, but to artificially reduce the supply of statistical-
algorithm talent, making their consulting and pre-packaged AI solutions go up
in value - in much the same way De Beers hoards silos full of diamonds to
keep diamond prices artificially high.

This is very much the opposite of democratization, and my advice to anyone
considering services like this, or Amazon's out-of-the-box models, is: don't
do it!

If you think it’s going to save you money over paying competitively to get
your own in-house machine learning staff, you’re wrong, and you’re going to
waste probably hundreds of thousands of dollars before you learn you’re wrong.

~~~
m0zg

> It’s so disingenuous for Google to brand these efforts as “democratizing AI.”

Exactly. NVIDIA building fast consumer GPUs and CUDA/cuDNN is "democratizing
AI". FB (and Google) releasing open source deep learning toolkits is
"democratizing AI". People releasing reproducible research code and datasets
are "democratizing AI".

Cloud vendor lock-in and proprietary hardware, software, _and_ datasets is not
in any way "democratizing" anything.


> not really because they have useful work for these folks to do

This, however, is where your argument flies off the rails, IMO. They offer
this much because there's very limited supply of people who can do both
research and development at the same time. Meaning, they don't just write
papers, but also can code pretty well. It's usually one or the other, and
hardly ever both. And the total in this case is much greater than the sum of
its parts.

~~~
mlthoughts2018
I agree with you that the supply of those workers is low (I am one of them!
And hiring more is extremely hard!) ... but it doesn’t mean Google / FB / etc
have meaningful work for them. A friend of mine was hired as a senior ML
person at Facebook and ended up working solely on cartoon avatars and page
responsiveness / latency optimization (not using ML).

When he raised the issue to managers that he wasn’t working on anything
related to his specialization (deep learning for NLP) and this made him
unhappy, the response was essentially, “Get in line.” I’ve heard similar
stories about Google from a former boss who had been a long time manager in
Google.

You essentially get paid super well to be put out to pasture so that your
skill isn’t being used by other companies (leading to more demand for Google’s
managed AI solutions).

In order to get career-developing work, you have to play political games or
get hired in a non-standard way, like acqui-hire or poached, where you can
negotiate your projects as part of your hiring conditions. Even then it will
probably only be respected for a short time while it’s convenient for Google,
and they’ll find a way to manage you out of that situation when they want to.

There are some tremendously talented AI engineers in places like Google. Some
of them create awesome products and tools. A bunch of others sit around and
atrophy working on dumb shit, locked in golden handcuffs, just to ensure
they're not on the market and able to help a company build things in-house
more cheaply and optimally than buying managed services from Google.

~~~
m0zg
As an ex-Googler, that's not how you're supposed to operate at those
companies. You find something you like to do, talk to the team, ensure the
other team has open spots and would like to take you on, and move your shit
from one desk to the other. Done. You're working on your specialization, if
that's what you like to do.

Discussing stuff with your manager is utterly pointless because your interests
aren't really aligned. You want to do something else. Your manager wants you
to do whatever you're doing now because finding a replacement for you is a bit
of a pain in the ass. She gets no brownie points if you leave.

It is true that Google has a ton of PhDs who just copy one protobuffer into
another and browse memegen all day while earning half a million dollars a
year. But they also have a ton of PhDs who do meaningful work, too. It's not
really Google's problem that someone can't be bothered to look around and find
something meaningful for themselves to do. Or to be more exact, it is a
problem _for_ Google, because there are a lot of people who can be deployed in
higher leverage occupations, but not one that Google itself can solve, because
one of the main tenets of how they operate is _nobody tells you what to do_.
You're supposed to figure it out on your own. A lot of people can't deal with
that.

~~~
wokwokwok
That might be how Valve works, but I’m quite sure it’s not how Apple /
Facebook / Google do.

Just move your shit to another desk and join whatever team you want without
talking to your manager or HR or updating your goals?

Yeah... nah.

~~~
m0zg
Well yeah, you have to tell your manager of course, but _they can't stop you_
from moving to another team.

Don't know about Apple or Amazon, but Google/FB are like that. If you're very
senior, a delay of a few months (no more than 6, no matter how senior) might
be imposed so that you hand off your stuff; if you're less senior, a few
weeks is usually enough. It is also expected (at Google, don't know about FB)
that you'll stay on each team for at least a year, and that you'll wind down
your obligations in an orderly fashion, which I think you'll agree is not
unreasonable.

But _nobody_ will force you to do work you really don't like to do. In
contrast, at most other companies it's easier to get a job _at another
company_ than to move to another team.

~~~
NovaX
There are many ways at Google to stop people from moving teams. For me it was
a code yellow. I'm hazy on the details, but it was referred to as indentured
servitude - or stated more politely as such by our senior director, who
expected 3-year stints. This was at MTV, probably a toxic environment due to
CxO visibility.

~~~
m0zg
Code yellow is by its very nature a finite-duration thing. I've never been a
part of one that was longer than a month. It's reasonable to not allow
transfers for the duration, if it helps to resolve the code yellow IMO.

~~~
NovaX
We had 4-6 month cyclical code yellows. It's not unlike the game where 20%
projects could only be within the same team without significant blowback.
Things are only reasonable when used as designed, which is not how all of
Google operates for everyone. You can talk in the general sense, but you
cannot speak definitively especially when such actions are blessed by the
executive team.

------
andrewtbham
I have tried Google AutoML, Microsoft cognitive services and looked at
Clarifai for image labeling and classification. I ended up writing my own
system. Maybe this new Google product will be better... definitely room for
improvement in this space.

------
petard
This will make a bunch of startups' lives really hard. I think it makes it
harder to justify investing in your own ML pipeline, or even building your
own models, for many use cases.

~~~
pplonski86
I'm running a startup which offers the same solution as Google AutoML Tables.
Recently I decided to go open source. I will need to compare my solution with
Google AutoML Tables (in terms of final model accuracy). But anyway, I think
the best model accuracy is often not the most important thing in ML
solutions.

Any ideas what I can do in this situation? Can I compete with Google?

~~~
holoduke
Usually Google is good at the initial release of a product, but they lack
good customer care and support. They have non-transparent pricing and they
disrespect privacy. Those are things you can differentiate on.

~~~
sodosopa
> lack good customer care and support.

Depends on how much you're paying them and the SOW you've signed.

> non-transparent pricing and they disrespect privacy.

For pricing, talk with them. Or use a cloud broker.

~~~
tatersolid
> For pricing, talk with them.

You’re joking, right? Every other day there’s a top 10 article on HN about
Google locking out a whole business customer with no humans to speak with.

~~~
sodosopa
Far from it. If you have an actual SOW and contract with them, you can define
your support terms and get better support than if you just sign up with a
business account and avoid talking with salespeople.

------
silasdavis
Paraphrased from the article: AutoML Tables takes a generic table and can
predict a column's value (for an unseen partial row, I assume). Their product
page emphasises how easy it is for developers with limited machine learning
experience. What are the consequences of using such structured-data
predictions devoid of interpretation or quantified uncertainty? Presumably
such predictions could be from a set of discrete values; how is that
smoothed? What is the result of providing definite results in the absence of
intention?

~~~
perturbation
AutoML essentially uses heuristics or an optimization algorithm to select a
model architecture and train a model. Feature engineering / feature
synthesis, as well as interpretability, remain open challenges.

If I'm understanding your questions correctly, the main problems I see with
this are:

\- Using raw data instead of feature engineering (less of a problem given
feature synthesis libraries like
[https://www.featuretools.com/](https://www.featuretools.com/) and other
heuristic methods). I'd expect Google to do a good job of basic things like
normalization of raw input features before training.

\- Using features that it really shouldn't (if you just throw ML at your
database for say, loan applications, then sensitive / personally identifying
information can/will be used as features)

\- Lack of insight / understanding as to what is driving the model. This can
be partially overcome with post-training methods like LIME, Shapley values,
etc.

I wouldn't expect predictions to be from a set of discrete values - if (say)
predicting housing values and training a NN, the output should be continuous
and based on the input features.
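
On the insight point: permutation importance is one simple model-agnostic way
to get a rough view of what drives a black-box model (synthetic data below;
this is a generic post-training technique, not what AutoML does internally):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# Synthetic regression data: with shuffle=False, only the first 3 of
# 6 feature columns actually drive the target.
X, y = make_regression(n_samples=300, n_features=6, n_informative=3,
                       shuffle=False, random_state=0)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Shuffle one feature at a time and measure how much the score drops;
# big drops mean the model relies on that feature.
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
ranked = np.argsort(result.importances_mean)[::-1]
print("features by importance:", ranked.tolist())
```

LIME and Shapley values give finer-grained, per-prediction attributions, but
this kind of global check is often the first thing you want.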

~~~
mritchie712
Another common error I see is timing (e.g. using data from the "future" to
predict an event). To build on your loan example, if you inadvertently
include the current FICO score of an applicant who applied 12 months ago, it
will be unfairly correlated with the loan's current performance.
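
One way to guard against that is to record when each value became known and
only build features from values observed before the cutoff. A pandas sketch
(table and column names invented for illustration):

```python
import pandas as pd

# Credit events per applicant, with the time each value became known.
events = pd.DataFrame({
    "applicant_id": [1, 1, 2, 2],
    "fico_score":   [640, 710, 680, 655],
    "observed_at":  pd.to_datetime(
        ["2018-01-01", "2019-01-01", "2018-01-01", "2019-01-01"]),
})

def fico_as_of(applicant_id, cutoff):
    """Latest FICO score known at `cutoff` - never values observed
    later, which would leak future information into the features."""
    known = events[(events.applicant_id == applicant_id)
                   & (events.observed_at <= cutoff)]
    return int(known.sort_values("observed_at").fico_score.iloc[-1])

# Applicant 1 applied mid-2018: we must see 640, not the later 710.
print(fico_as_of(1, pd.Timestamp("2018-06-01")))
```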

~~~
kmax12
This is very important! If you use Featuretools, we provide a mechanism to
avoid this very problem. See how we handle time in our documentation here:
[https://docs.featuretools.com/automated_feature_engineering/...](https://docs.featuretools.com/automated_feature_engineering/handling_time.html)

------
chibg10
I work in building and deploying production ML/AI models but I'm having a lot
of trouble cutting through the marketing jargon in this article and on
Google's website as well.

Can someone explain what this does in engineering terms? How does this differ
from something like AWS Sagemaker?

~~~
streetcat1
So for me, AutoML means machine learning without writing a single line of
Python - or any code at all. I.e., data and schema in, deployed models out.

With sagemaker you still need to provide the python code for training.

~~~
cle
SageMaker has a wide array of built-in training algorithms:

[https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html](https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html)

------
jedberg
This is where Google really has a big lead on AWS -- the AI space. AWS has AI
tools, but Google's are better and easier to use.

The big question is: if all your data sits in AWS, because the app that
generates the data is there, do you reach across and use the Google AI tools,
or are those tools compelling enough to get you to move your app and all your
data to GCP?

~~~
agentofoblivion
What AI tools? SageMaker is awesome. What can Google do for me? All I've seen
from them is marketing hype. I also haven't looked very hard.

~~~
agentofoblivion
Note: this is a real question, asked with real curiosity. What are Google's
best AI tools?

------
crooked-v
Neat. How long until they shut it down?

~~~
andybak
Please don't post glib, clichéd comments. There's a valid point behind this,
but the same one-liner appearing several times on every Google product
announcement is just tiresome.

~~~
crooked-v
You say 'glib', but I actually want the answer to the question, so I know
whether to bother investing any time learning about this or not.

~~~
Gigablah
"but I actually want the answer to the question"

No, you don't. Because any rational person would know that nobody is going to
have or tell you the answer. So, still glib and disingenuous to boot.

------
meeks
Does any of this announcement affect Google Datalab? It feels like they keep
Datalab as a second-class citizen: it doesn't even get a spot in the menu bar
in the cloud console, and it doesn't get any love while all of these let-us-
build-the-model-for-you products get announcements and upgrades.

------
Liveanimalcams
The point of AutoML, from what I've gathered, is to make it as easy as
possible to get a model working in production. AutoML - at least AutoML
Vision - uses transfer learning to retrain some number of layers of their
pretrained network (the name of the one Google uses is escaping me right
now). Choosing how many layers to retrain is the value they offer; it keeps
iterating, optimizing for accuracy.
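
Google hasn't published the internals, but the general shape of transfer
learning is: freeze the pretrained layers and retrain only a small head. A
minimal sketch, with a fixed random projection standing in for the frozen
pretrained layers and scikit-learn as the retrained head:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for a pretrained network's lower layers: a fixed projection.
# In real transfer learning these weights stay untouched ("frozen").
W_frozen = rng.normal(size=(64, 16))

def frozen_features(x):
    # Frozen layer + ReLU: maps raw inputs to a feature vector.
    return np.maximum(x @ W_frozen, 0)

# Only the small task-specific head is trained on the new dataset.
X_raw = rng.normal(size=(200, 64))
y = (X_raw[:, 0] > 0).astype(int)  # toy labels
head = LogisticRegression(max_iter=1000).fit(frozen_features(X_raw), y)
print("train accuracy:", head.score(frozen_features(X_raw), y))
```

The knob AutoML turns for you - how many layers to unfreeze - is what you'd
otherwise tune by hand in a framework like TensorFlow.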

I've had good results with it, but you do have to do things their way, and
it's not always well documented. If you want more control you should create
your own model and host it on Google App Engine; otherwise AutoML is what it
is - no way to customize or tune it other than changing the training data you
give it.

------
Gravityloss
Google launches: 1500 HN entries
[https://hn.algolia.com/?query=google%20launches&sort=byPopul...](https://hn.algolia.com/?query=google%20launches&sort=byPopularity&prefix&page=0&dateRange=all&type=story)

Google shuts: 700 HN entries:
[https://hn.algolia.com/?query=google%20shuts&sort=byPopulari...](https://hn.algolia.com/?query=google%20shuts&sort=byPopularity&prefix&page=0&dateRange=all&type=story)

(thought it would be funny to compare, it's not very scientific of course...)

~~~
what_ever
Using "" will be more correct?
[https://hn.algolia.com/?query=%22google%20shuts%22&sort=byPo...](https://hn.algolia.com/?query=%22google%20shuts%22&sort=byPopularity&prefix=false&page=0&dateRange=all&type=story)
gives 71 results and that includes shutting down websites as well.

Disc: Googler.

------
kerng
How is this different from Azure ML and Amazon's offerings?

I know Azure ML has been out for 3+ years, so I assume they have many
features and enterprise learnings baked in over the years.

Does anyone have a good comparison?

------
AndrewKemendo
I'd be curious to know who has used one of these end to end "AI" services in
production for a successful, market fitting product.

From my experience doing this for a handful of companies, it's almost always
better long term to use ML libraries that fit into the organizational
architecture that exists, rather than outsource the whole ML pipeline. It's a
serious amount of lock in to do that.

Maybe it's an easier sell if your entire pipeline is already built on GCP and
you're just tacking this on as a parallel path.

~~~
tixocloud
Agreed on the long-term benefits of fitting into existing architecture.

I did speak with someone in financial services who's been all-in on GCP, and
they are quite impressed by how everything is integrated and by not having to
shift data out of storage to train.

------
sgt101
TMForum is developing an end-to-end standard for model development and
management. Unfortunately it's closed, but I really think that standardised
components (feature store, model store, model registry) and model
non-functionals (identity, usage history, status) with standardised APIs are
pretty necessary. BTW: deployed to prod is only 50% of the story; some poor
fool has to manage (and account for) these things once they're live!

------
burtonator
I'm thinking of integrating this into Polar for people that use our cloud sync
support.

I should be able to use the documents and their tags to build derivative tag
data, so that when given a new document I can compute a set of automated tags
for the user.

This way when you save a document to your repository it's given some suggested
tags.
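
As a rough sketch of the idea - nearest-neighbour over TF-IDF, with an
invented toy corpus; Polar's real data model and similarity measure would
differ:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Tiny illustrative corpus of already-tagged documents.
docs = [
    ("Intro to convolutional networks for vision", {"deep-learning", "vision"}),
    ("Training RNNs for language modeling", {"deep-learning", "nlp"}),
    ("A field guide to garden birds", {"birds", "nature"}),
]
texts = [t for t, _ in docs]

vec = TfidfVectorizer().fit(texts)
nn = NearestNeighbors(n_neighbors=1).fit(vec.transform(texts))

def suggest_tags(new_text):
    """Suggest the tags of the most similar existing document."""
    _, idx = nn.kneighbors(vec.transform([new_text]))
    return docs[idx[0][0]][1]

print(suggest_tags("Convolutional networks in computer vision"))
```

A hosted AutoML-style service would replace the model here, but the
surrounding flow (tagged corpus in, suggested tags out) is the same.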

------
pplonski86
I can't find in the documentation what kind of algorithms they are using. Is
it only one algorithm type (neural networks)?

------
alexnewman
Only a fool would use any of these platforms, and large companies never will.
The ML industry is full of fledgling products that no one should ever use,
and clearly if you use this you will be locked into Google forever.

~~~
eyeball
some large companies here:

[https://www.datarobot.com/customers/](https://www.datarobot.com/customers/)

------
n4s33r
In before Google kills this

------
mlindner
I'm really not happy to see this. This further entrenches people into Google's
AI platform where all the input data will be harvested for ad revenue and even
further profiling of human data. Laws enshrining natural rights of privacy of
personal data can't come soon enough.

~~~
nealmueller
Google Cloud does not use customer data for Google's search graph. No cloud
does this.

From Google: "At Google Cloud, we do not access customer data for any reason
other than those necessary to fulfill our contractual obligations to you.
Technical controls require valid business justifications for any access by
support or engineering personnel to your content. Google also performs regular
audits of accesses by administrators as a check on the effectiveness of our
controls."

Justification reason codes for data access
[https://cloud.google.com/logging/docs/audit/reading-access-t...](https://cloud.google.com/logging/docs/audit/reading-access-transparency-logs#justification_reason_codes)

That said, I agree with what you've said here: "Laws enshrining natural rights
of privacy of personal data can't come soon enough." Nicely put.

