
What Hinton’s Google Move Says About the Future of Machine Learning - osdf
http://moalquraishi.wordpress.com/2013/03/17/what-hintons-google-move-says-about-the-future-of-machine-learning/
======
nonsequ
I know HN thinks Google is much cooler than IBM, but is it weird that I like
IBM's chances of progressing from ML to AI? For one thing, Watson was a very
impressive demonstration. For another, it has the old materials science know-
how to create neuron-inspired chip architecture. And here's an important one:
Fortune 500 companies with lots of valuable data trust IBM to solve their
problems. Anybody here with other thoughts, caveats? Other companies with good
shots at commercializing ML/AI technology?

~~~
dude_abides
Great point. As per the author's thesis, we are in phase 2, and phase 3 is
going to be initiated by a startup, and not for another 20 years.

But IBM is currently doing (and making good progress at) exactly what the
author describes as phase 3.

------
conductrics
I think what makes Hinton's move surprising is that he has a long-established
academic lab and so many current top researchers went through it: Yann
LeCun (ANNs/deep learning), Chris Williams (GPs), Carl Rasmussen (GPs), Peter
Dayan (neuroscience and TD-learning), Sam Roweis (RIP). As you note, industrial
research labs (with Nobel-prize-winning researchers) have been around at IBM,
NEC, AT&T Bell Labs, etc. One thing I think about is what happens to the
quality of research as top folks who have an established record of producing
new researchers are pulled from that role. Also, I'm not sure startups have
anything to do with making technology real. Is Google still a startup?

------
auggierose
Machine brains are already much "smarter" than human brains. For certain
tasks, that is, like calculation. With increased computing power, the set of
such tasks will keep growing. But will machines ever be REALLY smarter than humans?
I will only believe that when I see it. This question might be (but does not
necessarily have to be) related to the question of all questions: Can machines
have consciousness, like we do?

~~~
igravious
There is a growing realization that cognition is fundamentally _embodied_
cognition. If you think of the mind as an ethereal entity removed from its
physicality (or at the very least made of a different substance from the
body - that is to say, substance dualism), then it is easy to imagine scenarios
where the container is unimportant: minds can be uploaded and downloaded, and
whether the housing is a machine or a human doesn't matter.

If we come to accept cognition as fundamentally embodied then it becomes less
sensible to compare cognition across differing architectures - human cognition
will always be quite unlike any other type of cognition except itself. I think
machines will have consciousness (why should they not be able to, what is so
special about us that would limit this phenomenon to us?) but it will be a
machine consciousness and radically different from ours.

I think we're going to have to get a lot more fine-grained about how we talk
about features and functions of brains whether human or machine. You've
already put "smarter" in quotes which shows that already you're aware of how
blunt and crude our terms are.

Does this all seem reasonable?

~~~
auggierose
I understand your point of view, which is basically the one shared by many
people in CS. But personally, I don't think that machines will ever develop
consciousness as we have it, because I understand how current technology
works, and there is no consciousness there. I would have no qualms about
shutting down a machine, even if it begged me to keep it running.

~~~
rdtsc
As a computer science freshman I imbued computer systems with magic. Oh
look, I feed this machine numbers and it spits out words (text to speech). Or I
search for something and a magic algorithm finds me the result. As I learned
more about algorithms and data structures, that magic disappeared. For a while
I had the same feeling about hardware: this magic black square on the motherboard
that can execute a set of a couple hundred or so assembly instructions many
billions of times per second. Then I took a hardware architecture class and
poof! The magic disappeared. We started with transistors and built up to designing
our own CPU.

I am guessing something similar is going on with our understanding of the
brain and mind. I think we just haven't figured out a good way to model and
represent knowledge. There was enormous optimism at the end of the 50s that
superhuman AI would take over in just a few decades. But it didn't happen. We
have sort of been stomping our feet in place (I personally don't consider
playing chess an AI achievement). I think there will be a breakthrough -- maybe
it will be a simple organization of existing ML and knowledge representation
methods (neural networks mixed with evolutionary algorithms) or some new
framework - OR - enough advances in very specific applications (chess playing,
image recognition and speech recognition) will slowly chip away at this "magic"
AI core until maybe nothing is left. And we'll look back at that and at our
brains and say "ah, it wasn't that complicated after all, it is just all these
specific subsystems working together"...

------
aheilbut
This is reading way too much into it. Google happens to have a very nice
confluence of money, data, people, and interesting applications at the moment.
But there is, and always has been, a back-and-forth of ideas and people between
academia and industry in machine learning and all other fields.

------
davmre
As an ML researcher, this article isn't persuasive to me for a few reasons:

\- Computing power is getting exponentially cheaper even as computing
requirements increase. The resources available to a university lab in the
future will be much greater than those available today, even given the same
budget. Of course this is also true for industry, but this growth is not a
unique advantage of industry.

\- Other scientific fields already have equipment costs that are orders of
magnitude larger than CS. Physicists regularly write grant proposals for
multimillion-dollar pieces of equipment. If building large clusters is
necessary for academic research to stay relevant, academics will start
building large clusters. The foundational work done at Bell, IBM, Xerox, etc.
in the 70s and 80s was not due to resource constraints in academia (academics
had expensive computers too, and also did plenty of good work during that
time); it was because those companies had the right combination of smart
people and an immediate need to find practical solutions to difficult
problems.

\- Finally, and most importantly, _even in the age of big data_ almost all
fundamental research can be done quite successfully at small scales with
modest hardware requirements. Notice that Hinton et al. have spent 6+ years
developing deep learning _in academia_ , and it's only in the past couple of
years that it's matured to the point of implementation at scale.

Here's the basic pipeline of most machine learning research: you come up with
a new approach for training SVMs, or multilayer perceptrons, or some new type
of more interesting model. First you develop your ideas conceptually, with
some equations on a whiteboard. If you're a theorist, you might prove some
theorems. Next you write a toy implementation in Matlab or Python to show that
your method actually works, and that you get improvement over previous work
_for the dataset size you're using_. This could mean that your method is
faster -- which indicates it'll be able to scale to bigger data -- or that
it's smarter / taking advantage of some new type of structure, in which case
it still ought to get decent (if not state-of-the-art) results on small data.
_Only then_ , usually after publishing a few papers and working out the kinks,
does it generally make sense to put in the effort to implement and test a big,
efficient distributed version of your algorithm. And while that last part
might be best done by industry, the first few steps are easily possible in
academia and will continue to be for the foreseeable future.
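
To make that "toy implementation" step concrete, here's a minimal sketch in Python (assuming scikit-learn; the baseline and the "proposed" model are hypothetical stand-ins, not anyone's actual research code):

    # Toy-stage comparison: does the "new" method beat a sensible baseline on small data?
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)  # small stand-in dataset (569 samples)

    # "Previous work": a plain linear model as the baseline.
    baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000))

    # "Your new method": here just a small multilayer perceptron as a placeholder.
    proposed = make_pipeline(StandardScaler(),
                             MLPClassifier(hidden_layer_sizes=(32, 32),
                                           max_iter=2000, random_state=0))

    for name, model in [("baseline", baseline), ("proposed", proposed)]:
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")

The point is that a small-scale check like this is usually enough to decide whether a big, distributed implementation is worth the engineering effort.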

Case in point: Google Translate is a massive system whose performance rests
squarely on exploiting big data, in that they use the Internet as their
training set. But academic machine translation research still runs quite
effectively with smaller datasets on small clusters. The academics come up
with ideas, implement and test them, and some ideas flop while others take
off. The ideas that take off get picked up by Google and implemented in
Translate, where they hopefully end up pushing the envelope. So even though
the academics don't have the resources to work at massive scale (which most of
them don't want to do anyway -- ML researchers are usually more interested in
ML than in building distributed systems) their research still has impact,
through transfer to industry. This sort of relationship has been the model for
academic/industry research collaboration for quite a while, and I don't think
it's dead yet.

~~~
aspis
I attended a talk by Quoc Le at UCSD recently, and he made the case that it is
necessary to get the algorithms tested at large scale, rather than spending too
much time on them at small scale.

He presented a graph comparing some models and their accuracy as the
number of features was scaled up to the tens of thousands, his point being
that some models that work best at a smaller number of features fall off as the
number is scaled up. Unfortunately the slides he has on his web page are
outdated, so I haven't been able to find that reference. I'd be very happy if
one of you knows which paper he was referring to. In the old slides he refers
to this paper, which makes something of the same point:
[http://ai.stanford.edu/~ang/papers/nipsdlufl10-AnalysisSingl...](http://ai.stanford.edu/~ang/papers/nipsdlufl10-AnalysisSingleLayerUnsupervisedFeatureLearning.pdf)
It shows how simple unsupervised models with dense feature extraction reach
the state-of-the-art performance of more complex models.
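
As a rough illustration of that single-layer idea (my own sketch, assuming scikit-learn; the paper's actual pipeline uses image patches, whitening, and a triangle k-means encoding, which I'm glossing over here):

    # Single-layer unsupervised feature learning, in miniature: learn a k-means
    # "dictionary", encode samples by distance to each centroid, then train a
    # plain linear classifier on those dense features.
    from sklearn.cluster import MiniBatchKMeans
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.svm import LinearSVC

    X, y = load_digits(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Unsupervised step: k-means centroids act as a learned feature dictionary.
    km = MiniBatchKMeans(n_clusters=256, random_state=0).fit(X_tr)

    # Encode each sample by its (negated) distance to every centroid.
    F_tr, F_te = -km.transform(X_tr), -km.transform(X_te)

    # Supervised step: a simple linear classifier on top of the dense features.
    clf = LinearSVC(C=1.0, max_iter=10000).fit(F_tr, y_tr)
    print("test accuracy:", clf.score(F_te, y_te))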

Of course, I can see how it makes sense to at least do some small-scale
prototyping, to work out kinks like you say - but the lesson is that if you
are planning to do large-scale machine learning you can't necessarily use the
small-scale tests as a good guide for large-scale performance. It's certainly
promising if you get very good accuracy, speed, or both at small scale, though
neither will necessarily carry over to large scale. On the flip side, if your
method is worse than state-of-the-art at smaller scales, that doesn't mean it
won't beat state-of-the-art at large scales.

~~~
jmares
Data shows, as you say, that small scale performance is no indicator of large
scale performance.

How, then, do you decide which projects are worth trying at large scale?

------
ilaksh
I think that ML people should take a look at the AGI field. I also think that
more powerful techniques, specialized hardware like what Qualcomm's baby, Brain
Corporation, is building, and/or large peer computing networks will make
general intelligence accessible to small groups or individuals in fewer than
twenty years.

~~~
davmre
AGI has cool ideas, and is in some sense the "right" theoretical framework for
AI, but it's not clear that it gives any kind of practical path forward for AI
research. The main problem is that its basic idea -- an AI performing Bayesian
inference over a hypothesis class of all potential environment-generating
computer programs, with a Kolmogorov complexity prior -- is wildly
uncomputable, so to make it practical we'd need to find simple, computable
approximations that work on real problems. But this is basically what modern
ML research is _already trying to do_ \-- finding models that are complex
enough to capture interesting structure in the world, but still simple enough
for efficient inference to be practical.
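
For reference, the idealized formulation (Solomonoff-induction / AIXI style) looks roughly like this; the notation is my own gloss, not a quote from the AGI literature:

    P(h) \propto 2^{-K(h)}  % prior over environment-generating programs h, K = Kolmogorov complexity
    P(h \mid x_{1:t}) = \frac{P(x_{1:t} \mid h)\, P(h)}{\sum_{h'} P(x_{1:t} \mid h')\, P(h')}  % posterior after observations x_{1:t}

Since K(h) is uncomputable and the hypothesis class ranges over all programs, neither the prior nor the sum can actually be evaluated; practical work has to restrict h to some tractable model family, which is exactly the trade-off modern ML makes.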

~~~
ilaksh
"an AI performing Bayesian inference over a hypothesis class of all potential
environment-generating computer programs, with a Kolmogorov complexity prior,
-- is wildly uncomputable, so to make it practical we'd need to find simple,
computable approximations that work on real problems"

That's not what AGI is trying to do or how they are trying to do it.

~~~
wookietrader
It's at least one way, and it has been advocated by leading researchers in the
field. If you think differently, you should give references and explain what
your AGI definition is.

------
jfoutz
I'd wager human brains do a lot of stuff unrelated to solving the problem at
hand, like keeping the heart beating. Given that machines don't need to do
all of the underlying biological stuff, you can probably get away with fewer
connections.

~~~
jpadkins
It might be the same amount of overhead needed for an OS to keep tabs on its
hardware, cluster, etc. Just as brains need to translate to the physical
world via the nervous system, pure software needs to translate to the physical
world via an OS.

------
gingerlime
_by which time (2050s-2060s) we will have machine brains that are orders of
magnitude smarter than human ones (!)_

that's a fascinating yet chilling thought (granted, orders of magnitude
_dumber_ than those future thoughts of the machines)

------
jmares
Dear Googlers, it would be interesting to know how computational resources are
allocated to new ideas (e.g. Kurzweil's PRTM-based NLU system) at each stage,
from prototype genesis to mature technology. What are the factors that come
into play?

------
wookietrader
There is machine learning that is not related to big data, you know. Many
interesting problems in machine learning, and most of the hard ones, have a
computational demand for which a single i7 and 16 GB of RAM are more than
enough.

------
kespindler
I've been thinking about this _exact_ same trend ever since I saw Hinton's move to
Google, but I didn't have the historical background to make these comparisons.
Really nice job.

