
Inside Google Brain - albertzeyer
http://www.wired.com/2014/07/google_brain
======
api
"Google isn't really a search company-- it's a machine learning company."

No, it's an advertising company. That's who pays the bills. At the end of the
day all this cool tech is to better understand and model human beings in order
to better push ads.

Sometimes it depresses me that so many of the world's most brilliant minds are
working on that, but a generation or two ago they'd all be building doomsday
bombs. I guess that's some progress.

~~~
cpeterso
Google is DoubleClick, literally. I don't think people would be so quick to
use a web browser or mobile phone built by DoubleClick instead of Google.

~~~
api
I do think people understand that, and I think it's one of the reasons Glass
seems like it's a failing product. People are drawing the line at wearing a
heads-up mobile camera controlled by an advertising company.

~~~
magicalist
You could say the opposite thing about Android: people embracing it in spite
of the possibility of a screen they take everywhere and look at all the time
controlled by an "advertising company".

And in practice, the same thing is true for Android and Glass: there aren't
ads popping up in your face all the time, because that would be a sure way to
alienate all your users (which is one of the reasons why the "they're just an
advertising company" phrase doesn't give any real insight).

I think it's more likely that having a giant thing on your face that doesn't
actually do all that much is more the reason for people seeming to start
drawing the line with Glass. That and meme status ("glasshole") and echo
chamber effects (most people outside of tech circles don't care either way).

------
daviddumenil
The original paper being discussed:

[http://arxiv.org/abs/1312.6082](http://arxiv.org/abs/1312.6082)

~~~
rtkwe
Strange they didn't link to google's own publication page:

[http://research.google.com/pubs/pub42241.html](http://research.google.com/pubs/pub42241.html)

------
cliveowen
"They’ve also found that the models tend to become more accurate the more data
they consume. That may be the next big goal for Google: building AI models
that are based on billions of data points, not just millions."

I'm not versed in machine learning, but it looks to me like any model whose
output quality is dependent on the quantity of data it ingests is deeply
flawed. There's no doubt a bigger number of samples will make the predictions
more accurate, but isn't the challenge to develop a system that is as accurate
as possible regardless of the number of data points it's fed, like the human
brain?

~~~
rwissmann
Until you have reached a very large subset of all available information, more
data allows you to make better predictions. Period. That is as true for
machine learning as it is of the human brain.

You often want your models to also perform well when you have fewer data
points. Those are two separate - if in effect related - design goals.

~~~
niangb
This is true to the extent that you are not overfitting your dataset. Neural
networks and random trees are quite good at fitting anything! And still they
can perform poorly on your validation set.

~~~
MaysonL
Overfitting will only occur when the dataset is too small for the model.

~~~
niangb
Not only. Your example is too particular. I would say that overfitting tends
to occur when you do not understand the underlying dynamics of the system you
are trying to model. Any model with enough degrees of freedom can fit anything
and still explain nothing.
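
To make the "fits anything, explains nothing" point concrete, here is a
minimal sketch, not anything from the article or the paper. The data
(y = 2x plus noise), the sample sizes, and the function names are all
assumptions chosen for illustration. It compares a two-parameter line against
a 1-nearest-neighbour "memorizer", which effectively has one degree of freedom
per training point:

```python
import random


def make_data(n, rng, noise=0.5):
    """Sample n points from y = 2x + Gaussian noise, x uniform in [0, 1)."""
    return [(x, 2 * x + rng.gauss(0, noise)) for x in (rng.random() for _ in range(n))]


def fit_line(train):
    """Ordinary least squares for y = a*x + b (two parameters)."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    a = sum((x - mx) * (y - my) for x, y in train) / sum((x - mx) ** 2 for x, _ in train)
    return lambda x: a * x + (my - a * mx)


def fit_memorizer(train):
    """1-nearest-neighbour: returns the y of the closest training x."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]


def mse(model, data):
    """Mean squared error of model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
train, val = make_data(200, rng), make_data(1000, rng)
line, memo = fit_line(train), fit_memorizer(train)
print("line      train/val MSE:", mse(line, train), mse(line, val))
print("memorizer train/val MSE:", mse(memo, train), mse(memo, val))
```

The memorizer reproduces every training point exactly, so its training error
is zero, yet it does worse than the line on held-out data because it has
memorized the noise. Adding more training data does not change that here,
since the model's capacity grows along with the dataset.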

------
roneesh
These Wired click-bait headlines are terrible.

~~~
dang
They certainly are. Can anyone suggest a better one here? We want accurate and
neutral.

~~~
michael_nielsen
Could submissions from Wired (& other sites which practice link-baity titles)
be automatically put in a moderation queue? They wouldn't appear in the "New"
queue until the title had been reviewed by a moderator, and fixed if
appropriate. That sort of fix seems to be necessary a large fraction of the
time. Regularly having such titles on the front page is a small but real hit
to HN's quality.

~~~
dang
There actually is a status like that, sort of. But Wired isn't currently in
that bucket. We may have to put it in there; their titles have gotten
noticeably (to me) more sensational lately.

------
woodchuck64
Seems inevitable that the company with the world's largest distributed compute
engine (estimated at 40 petaflops as of 2012, must be 80 petaflops by now)
would eventually get into machine-learning.

------
bcRIPster
Anyone else notice the sample street number photos looked a lot like some of
the CAPTCHA images going around not too long ago? Hmm...

~~~
VikingCoder
That's old, published and confirmed news:

[http://techcrunch.com/2012/03/29/google-now-using-recaptcha-...](http://techcrunch.com/2012/03/29/google-now-using-recaptcha-to-decode-street-view-addresses/)

------
thisjepisje
I knew this was a Wired article without even looking at the URL.

~~~
soapdog
Me too... argh, I hate those headlines. It's proof of how far tech and
digital culture journalism has fallen.

I remember in my pre-broadband days here in Brazil subscribing to Dr. Dobb's
Journal to the tune of 25 USD per issue and being happy. Each issue was filled
with little gems that would advance my knowledge a lot... these days it's all
glossy, photoshopped covers and over-the-top headlines.

:-(

~~~
icebraining
I don't think tech journalism has really fallen - Wired has always been fluffy,
and Dr. Dobb's is still around [1]. I think the change happened more on the
shelves, as laymen with an interest in technology replaced geeks as the most
profitable group.

[1] [http://www.drdobbs.com/](http://www.drdobbs.com/)

------
sidcool
It seems they missed mentioning Ray Kurzweil, the AI master who's a Director
of Engineering at Google.

~~~
eitally
To clarify, Ray is _a_ director, not _the_ director. Google has many
engineering directors.

~~~
sidcool
Corrected it. But I do think he deserves a mention when talking about AI and
Google in particular.

------
michaelochurch
_Google is not really a search company. It's a machine-learning company._

It wants to be seen that way, but until it abolishes closed allocation (and
maybe it has, but I haven't heard anything to indicate that it has) it will
just be an _ads_ company. It's still a pretty good place to work, by industry
standards, but the percentage of people who'll get to work on machine learning
is very low.

Google definitely wants to have the image of being _the_ machine learning
company because that's a great way to attract talent (even if that talent is
mostly wasted under closed allocation). And if you land in the right place,
there is interesting work. The reality most people face, though, is that most
people (especially outside of Mt. View) aren't going to get real projects and
won't be anywhere near the machine learning work.

Google does have a lot of talent and probably would be the undisputed #1 tech
company if it implemented open allocation, though.

~~~
programminggeek
Google is an Advertising company and it delivers ads via these platforms like
search, mobile apps, etc.

It clearly is a tech company, but to pretend it's anything other than an ad
delivery machine is sort of ignoring the elephant in the room.

Everything goes back to ads. Even Google doesn't bite the hand that feeds it.

~~~
dsymonds
That's a common trope, but it doesn't make sense. We don't call the New York
Times an advertising company simply because the majority of its revenue comes
from ads. Why apply a different standard to Google?

------
astazangasta
>About a year later, Google had reduced Android’s voice recognition error rate
by an astounding 25 percent.

lol. This is the grand payoff?

~~~
deong
Considering that for the preceding 25 years or so, progress in the state of
the art had been annual reductions of far less than 1%, it's a pretty big
deal.

~~~
astazangasta
From an academic perspective, maybe. In terms of practical use of machine
methods, not much. Machine learning is largely hype. I pity the army of PhDs
they must have building training sets.

~~~
sp332
Have you tried Google's voice recognition? It's been fantastic for me.

------
gunnario
"This form of internal code-sharing has already helped another cutting-edge
Google technology called MapReduce catch fire." Map-Reduce is not a Google
technology.

~~~
paperwork
I believe it is. Obviously map and reduce operations have existed for a long
time, but MapReduce was Google's work, before it was popularized by Hadoop and
friends. Soon after MapReduce became popular, there was a paper published (by
non-Googlers) which 'described' the algorithm and poked some fun at the
terminology:

[http://lambda-the-ultimate.org/node/1669](http://lambda-the-ultimate.org/node/1669)

~~~
dekhn
Most people miss this important point: the most important part of MapReduce is
the shuffle step, which is a global, partitioned disk-to-disk sort. See the
FlumeJava paper for a bit more discussion on why this is relevant. Everything
else about MapReduce is just framework to make programmers' lives easier. BTW,
not having strong typing in classic MR is a major pain point. Flume goes a long
way toward addressing this in a practical way.
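
As a rough illustration of where the shuffle sits, here is a toy, in-memory
word count. It is a sketch only: the function names and the `num_partitions`
parameter are made up for this example, and a real shuffle is a distributed
disk-to-disk sort rather than a dictionary in one process.

```python
from collections import defaultdict


def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield word.lower(), 1


def shuffle_phase(pairs, num_partitions):
    """Shuffle: route each key to one partition, then sort each partition by key."""
    partitions = [defaultdict(list) for _ in range(num_partitions)]
    for key, value in pairs:
        # str hashes are salted per process but stable within a single run,
        # so every occurrence of a key lands in the same partition.
        partitions[hash(key) % num_partitions][key].append(value)
    return [sorted(p.items()) for p in partitions]


def reduce_phase(partitions):
    """Reduce: each partition is handled independently, one key at a time."""
    return {key: sum(values) for part in partitions for key, values in part}


def word_count(documents, num_partitions=3):
    """Run map, shuffle, and reduce in sequence on a list of strings."""
    return reduce_phase(shuffle_phase(map_phase(documents), num_partitions))


print(word_count(["the quick brown fox", "the lazy dog", "The fox"]))
```

The map and reduce functions are trivial. The grouping and sorting in
`shuffle_phase` is what lets each reducer see all values for a key together.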

BTW, this point elucidates the importance of this classic Google interview
question: [http://www.glassdoor.com/Interview/Sort-a-million-32-bit-int...](http://www.glassdoor.com/Interview/Sort-a-million-32-bit-integers-using-only-2MB-of-RAM-QTN_120936.htm)
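
One common way to approach that question is an external merge sort: sort
chunks that fit in memory, spill each sorted run to disk, then do a k-way
merge of the runs. The sketch below assumes that approach. It is not
necessarily the answer the interviewers expect; `external_sort` and
`chunk_size` are hypothetical names, and Python's own per-object overhead
means this would not literally fit a 2MB budget.

```python
import heapq
import os
import tempfile


def _write_run(sorted_chunk):
    """Write one sorted chunk to a temp file, one integer per line."""
    fd, path = tempfile.mkstemp(suffix=".run")
    with os.fdopen(fd, "w") as f:
        f.writelines(f"{v}\n" for v in sorted_chunk)
    return path


def _read_run(path):
    """Lazily yield integers from a run file."""
    with open(path) as f:
        for line in f:
            yield int(line)


def external_sort(values, chunk_size):
    """Yield values in sorted order, sorting at most chunk_size items at a time."""
    run_paths, chunk = [], []
    for v in values:
        chunk.append(v)
        if len(chunk) >= chunk_size:
            run_paths.append(_write_run(sorted(chunk)))
            chunk = []
    if chunk:
        run_paths.append(_write_run(sorted(chunk)))
    try:
        # heapq.merge keeps only one pending item per run in memory.
        yield from heapq.merge(*(_read_run(p) for p in run_paths))
    finally:
        for p in run_paths:
            os.remove(p)


print(list(external_sort([5, 3, 9, 1, 7, 2, 8], chunk_size=3)))
```

The sort-runs-then-merge structure is the same shape as the shuffle described
above, just on one machine instead of many.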

