
Geoffrey Hinton spent 30 years on an idea many other scientists dismissed - varunagrawal
https://torontolife.com/tech/ai-superstars-google-facebook-apple-studied-guy
======
rm999
>For more than 30 years, Geoffrey Hinton hovered at the edges of artificial
intelligence research, an outsider clinging to a simple proposition: that
computers could think like humans do—using intuition rather than rules.

This is so disrespectful to the thousands of researchers who had been studying
machine learning since well before 2012. It was well established by the
'80s/'90s, by researchers like Michael Jordan
([https://en.wikipedia.org/wiki/Michael_I._Jordan](https://en.wikipedia.org/wiki/Michael_I._Jordan))
and his students, that the future of teaching computers lay in statistics, not
rules.

It was even ingrained in popular culture: neural networks are how the AI brain
worked in Terminator 2, in 1991!
[https://www.youtube.com/watch?v=xcgVztdMrX4](https://www.youtube.com/watch?v=xcgVztdMrX4)

edit: I don't want to downplay Hinton's accomplishments; I've been lucky to
have been surrounded by and motivated by his work since I started learning
machine learning. I did my master's research on neural networks that were
partly inspired by his work, and it was a deep-networks paper he presented at
a NIPS 2006 workshop that got me really excited to stay in machine learning
while I was starting my career.

~~~
nostrademons
There've been cycles & fads in the meantime, though.

I remember that in the early 90s, neural nets were supposed to be the huge new
thing. Many scanners shipped with built-in OCR, the Apple Newton had
handwriting recognition on a PDA, and my Centris 660AV could do speech
recognition & text-to-speech out of the box.

But they ultimately weren't powerful enough to satisfy customers or
meaningfully change how people interacted with computers, so they failed in
the market, and the hype cycle moved on to the World Wide Web.

I guess this shows the power of continuing to study something when the fad
goes away, so that you're well positioned to capitalize when the _next_ fad
hits.

~~~
jstarfish
> I guess this shows the power of continuing to study something when the fad
> goes away, so that you're well positioned to capitalize when the next fad
> hits.

So true. VR popped up in the '90s/'00s (Virtual Boy, VRML, etc.), went
nowhere, and here we are again.

I suspect something similar will happen with blockchain/cryptocurrency. Only
once all the hype and speculation dies off will meaningful uses for the tech
become evident.

~~~
alanfalcon
Are you trying to suggest that Cryptokitties aren’t meaningful? Blasphemy :-)

------
NumberSix
The article is misleading if not false. Neural nets were hot in academic AI
research 30 years ago (1988). The original Perceptron had fallen out of favor
in part because of arguments, in Minsky and Papert's book Perceptrons, that it
could not implement an exclusive or (XOR).

[https://en.wikipedia.org/wiki/Perceptrons_(book)](https://en.wikipedia.org/wiki/Perceptrons_(book))

Neural nets fell out of favor in the 1970's but came back and became hot in
the early 1980's with work by John Hopfield and others that addressed the
objections.

[https://en.wikipedia.org/wiki/John_Hopfield](https://en.wikipedia.org/wiki/John_Hopfield)

Practical and commercial successes were limited in the 1980's and 1990's,
which led to a reasonable decline in interest in the method. There were some
commercial successes, such as HNC Software, which used neural nets for credit
scoring and was acquired by Fair Isaac Corporation (FICO).

[https://en.wikipedia.org/wiki/Robert_Hecht-Nielsen](https://en.wikipedia.org/wiki/Robert_Hecht-Nielsen)

I turned down a job offer from HNC in late 1992 and neural nets were still
clearly hot at that time.

Some people continued to use neural nets with limited success in the late
1990's and 2000s. I saw some successes using neural nets to locate faces in
images, for example. Mostly, though, they failed.

AI research is very faddish with periods of extreme optimism about a technique
followed by disillusionment. One may wonder how much of the current Machine
Learning/Deep Learning hype will prove exaggerated.

Also, traditional Hidden Markov Model (HMM) speech recognition is not
rule-based at all. It uses an extremely complex statistical model of speech,
fit by maximum likelihood.
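
A minimal toy of that idea (my own sketch, assuming numpy; real speech HMMs
are vastly bigger): the forward algorithm computes the likelihood
P(observations) that maximum-likelihood training pushes up.

    import numpy as np

    A = np.array([[0.7, 0.3], [0.4, 0.6]])  # state-transition probabilities
    B = np.array([[0.9, 0.1], [0.2, 0.8]])  # emission probabilities per state
    pi = np.array([0.5, 0.5])               # initial state distribution
    obs = [0, 1, 1, 0]                      # a toy observation sequence

    alpha = pi * B[:, obs[0]]               # forward recursion over time
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    print("P(obs) =", alpha.sum())          # likelihood of the sequence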

~~~
YeGoblynQueenne
Hinton himself did just fine in terms of academic popularity in the '80s and
'90s. We can look at his citations on Semantic Scholar:

[https://www.semanticscholar.org/author/Geoffrey-E-Hinton/1695689?year%5B0%5D=1976&year%5B1%5D=2017&sort=influence](https://www.semanticscholar.org/author/Geoffrey-E-Hinton/1695689?year%5B0%5D=1976&year%5B1%5D=2017&sort=influence)

Citations to his papers rose steadily, from between 88 and 107 in 1987 to
between 685 and 826 in 1999. That's hardly an unpopular researcher.

And for a bit of comparison with other machine learning researchers, here's a
link to a data set of family relations from a 1986 paper by Hinton:

[https://archive.ics.uci.edu/ml/datasets/Kinship](https://archive.ics.uci.edu/ml/datasets/Kinship)

At the bottom of that page, in the _Relevant Papers_ section, there are links
to two papers using the data set: Hinton's own paper that introduces it, and
one by Quinlan.

Clicking on the [Web Link] links for the two papers, I can see the references
to those papers. There is a single reference to Quinlan's paper. There are 43
to Hinton's, of which all but 6 are from 1999 or earlier. And those are not
self-references, nor references by Bengio, LeCun, et al. If there is a clique,
it is hard to see it.

So there was a lot of interest in Hinton's work even in the years he was
supposedly "exiled to the academic hinterland," as another article put it.

------
gameswithgo
Most other scientists dismissed neural networks? Is there some history I am
unaware of? That doesn't seem true. Did the article want to push the idea of
the lone rogue thinker a bit too much?

~~~
gumby
They are referring to Minsky and Papert's 1969 book "Perceptrons", which
argued that a single-layer NN can't be Turing complete because it can't do
XOR. The big 1990s breakthrough was having _multilayer_ networks...which
actually were also described in the Perceptrons book, though at the time
computers weren't powerful enough to implement them.
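
A toy sketch of the gap (my own illustration, assuming numpy; not code from
the book): no single-layer weight setting classifies XOR, while two
hand-picked layers do.

    import numpy as np

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 1, 1, 0])  # XOR labels

    # Single layer: brute-force a grid of weights and biases; none fit XOR.
    grid = np.linspace(-2, 2, 41)
    found = any(
        np.array_equal((X @ np.array([w1, w2]) + b > 0).astype(int), y)
        for w1 in grid for w2 in grid for b in grid
    )
    print("single layer solves XOR:", found)  # False

    # Two layers: XOR = AND(OR(x1, x2), NAND(x1, x2)), weights picked by hand.
    h = ((X @ np.array([[1, -1], [1, -1]]) + np.array([-0.5, 1.5])) > 0).astype(int)
    out = (h @ np.array([1, 1]) - 1.5 > 0).astype(int)
    print("two-layer output:", out)  # [0 1 1 0]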

~~~
blt
I don't understand how it took an entire book to show that a linear classifier
can't learn the XOR function.

~~~
gumby
There’s more to the book than my one-sentence summary. If Minsky and Papert
had something extensive to say on a subject, it’s probably not shallow.
There’s an OK, if brief, discussion of the book’s influence on NN research in
its Wikipedia entry.

It’s not a super long book, and it’s an excellent, eye-opening read even 50
years later.

------
scottlocklin
I wonder when they'll write this about Michael Jordan. "History doesn't repeat
itself, but it often rhymes."

Probably they'll never mention Friedman and Breiman, which seems pretty unfair
considering their gizmos have arguably had a bigger impact in "actual machine
learning gizmos deployed..."

~~~
rvo
Hah. I am genuinely confused about which Michael Jordan you were referring to.
I am assuming you mean Michael I. Jordan.

------
visarga
I see this article as an opportunity to learn a little bit more about Hinton's
personal life and personality. It's not a neural-nets article, so we shouldn't
dig too deep into the controversy about who invented what and whether they
were alone or not.

~~~
anonytrary
In my single interaction with Hinton, we were talking about a theory; he told
me how he had thought of it years ago and remembered people distinctly
disagreeing with him. I feel Hinton carries a tiny bit of salt with him
wherever he goes, which explains his sarcasm as well.

------
taeric
This is an odd story that seems to gloss over the downsides of neural
networks. The computational power needed to build some of these models is
enough to explain how slow uptake was, at least in large part. I would be
interested to see just how many multiplications go into a typical model
nowadays, in particular the training of one.
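
A quick back-of-envelope in Python (made-up but plausible conv-layer shapes,
not any specific model):

    # Multiplications in one forward pass of a small, hypothetical conv net.
    layers = [
        # (out H, out W, out channels, kernel H, kernel W, in channels)
        (112, 112, 64, 7, 7, 3),
        (56, 56, 128, 3, 3, 64),
        (28, 28, 256, 3, 3, 128),
    ]
    muls = sum(h * w * co * kh * kw * ci for h, w, co, kh, kw, ci in layers)
    print(f"{muls:,} multiplications per image")  # ~580 million
    # Training roughly triples that per example, times examples, times epochs.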

But that still skirts the big issue, which is generalization. We are moving,
it seems, to transfer learning. The danger is that we don't seem to have a
good theory of why it works. At a practitioner level, I don't think this is as
much of a problem. For the research, though, it is pretty shaky.

I think there is more than a strong chance this remains the future for a
while. And I am a layman in this field, at best. But this story presupposes
that the past was wrong for not being like the present. That is a tough bar.

~~~
js8
I believe neural networks are not quite the right approach, and their success
is misleading.

Historically, almost all ML approaches were based on separation in Euclidean
(vector) space, which is understandable because they were developed for much
weaker computers. However, really useful ML tasks require dealing with huge
nonlinearities, and the fact that you're training in linear space becomes less
relevant.

Neural networks have surmounted the nonlinearity difficulty by increasing the
number of layers. But it's a question whether a similar result couldn't be
achieved with Bayesian networks on binary representations (which is the
approach I favor).

There is some evidence that the precision of the linear calculation in a
neuron (for example, the resolution of the weights) doesn't really make much
difference to the performance of the neural network. Could it be that neural
learning through vector manipulation is just an artifact of the origin of
neural networks, and the really important thing is the overall organization of
the network (the layers)?
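
A toy numpy sketch of the low-precision point (my own illustration, not a
citation): train a tiny logistic regression, crush its weights to 8 levels
(~3 bits), and see how little the accuracy moves.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 20))
    y = (X @ rng.normal(size=20) + rng.normal(scale=0.5, size=1000)) > 0

    w = np.zeros(20)
    for _ in range(500):                    # plain gradient descent
        p = 1 / (1 + np.exp(-(X @ w)))      # sigmoid
        w -= 0.1 * X.T @ (p - y) / len(y)   # logistic-loss gradient step

    step = (w.max() - w.min()) / 7          # 8 evenly spaced weight levels
    w_q = np.round((w - w.min()) / step) * step + w.min()

    acc = lambda v: ((X @ v > 0) == y).mean()
    print(f"full precision: {acc(w):.3f}  8-level weights: {acc(w_q):.3f}")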

~~~
taeric
I'm assuming people are trying with Bayesian methods. Any good research or
practical examples showing advantages?

------
sonabinu
“I cannot imagine how a woman with children can have an academic career. ...”
This is the real truth and reality for anyone actively parenting and trying to
deeply understand and research anything. Grateful that this article chose to
include the quote.

------
intrasight
There weren't many, but there was a contingent of neural network researchers
going strong since the time I was in high school in the early 80s. Jerome
Feldman (University of Rochester) was a neighbor. James McClelland (Department
of Psychology) was a mentor of mine in the mid-80s. This field was far from
ignored. We used different names (connectionism, backpropagation), and, most
importantly, we had computers that were tens of thousands of times less
capable than what is available today.

------
itissid
His model for contrastive divergence learning, pre-2000 IIRC, was what really
set the base for his breakthrough in the mid-2000s. I think it took him some
time to make the jump from contrastive divergence learning to RBMs that learnt
good priors for deeper layers...

------
zerostar07
He was a well-known name in the 80s, and back again with RBMs in the 00s:
[https://www.youtube.com/watch?v=AyzOUbkUf3M](https://www.youtube.com/watch?v=AyzOUbkUf3M).
He and Sejnowski are some of the few names I remember from when I took an NN
class a long time ago. He was insistent on working on it when many others saw
it as a peripheral curiosity to their careers.

What's with everyone here?

------
wslh
There is something in Canada, because the best book (that I tried to
understand) about neural networks back in the 90s was "Neural Networks: A
Comprehensive Foundation" by Simon Haykin [1].

[1]
[https://en.wikipedia.org/wiki/Simon_Haykin](https://en.wikipedia.org/wiki/Simon_Haykin)

------
muglug
> His great-great-grandfather was George Boole

Is that true?

~~~
chubot
That is a pretty awesome bit of trivia. I assume they did some basic
fact-checking, and if Hinton's great-great-grandfather was not Boole, he
probably would have noticed this in the article and corrected it.

It certainly seems plausible:
[https://en.wikipedia.org/wiki/George_Boole](https://en.wikipedia.org/wiki/George_Boole)

~~~
muglug
Sorry, this was just a terrible boolean algebra joke that I was unable to
delete in time.

------
hawktheslayer
This was a very interesting article, but as a juggler, what interested me most
was that he learned to juggle grapes with his mouth. I need to run to the
store to pick up some grapes now!

------
singularity2001

       "an outsider clinging to a simple proposition: that computers could think like humans do—using intuition rather than rules. "
    

I stopped reading right there.

~~~
miketery
We know humans go by intuition first (the book The Righteous Mind supports
this if you want to read further).

So why isn't this something that's in the realm of possibility for computers?

~~~
perfmode
Intuition is an ill-defined term that often distances us from the actual
processes at hand.

~~~
miketery
Is it though? We might not understand it fully, but we understand its role in
allowing us to make quick decisions and not overload our cognition with the
mundane.

------
vemv
Despite hype and success, Machine/Deep Learning has its own limitations, which
is a generally admitted fact.

At a fundamental level, are our brains actually comparable to how ML works
(beyond some basic analogies)? Do we have a statistical engine running inside
our heads, needing tremendous "CPU power" to do anything remotely
useful/accurate?

I'd say no, and that conceptual mismatch indicates that the next big iteration
of AI will be something more like what D. Hofstadter has advocated and
researched.

(Using ML as a sidekick, why not. No need to trash the current progress.)

