
Artificial intelligence pioneer says we need to start over - elsewhen
https://www.axios.com/ai-pioneer-advocates-starting-over-2485537027.html
======
Animats
Machine learning may be nearing its ceiling.

The history of AI goes in cycles. Someone has a good idea which solves some
problems, followed by "strong AI Real Soon Now" enthusiasm, followed by that
idea hitting its ceiling. AI has been through search, backtracking, the
General Problem Solver, hill-climbing, and expert systems. Each was overhyped
at the time, and each hit its ceiling.

The big difference this time is that the ceiling with machine learning is high
enough for large-scale profitable applications. That wasn't the case with the
previous rounds. AI used to be a dinky field - about 30-50 people each at
Stanford, CMU, and MIT, plus a few tiny groups elsewhere. Now it's a huge
field with big companies and big profits. That makes it self-sustaining.

Hinton has a point in that we're missing something. Back-propagation is an
extremely inefficient method, especially since the slower you do it, the
better it seems to work. More generally, most of machine learning is "turn the
problem into an optimization problem and bang on it really hard with lots of
compute power". This works for a useful class of problems. But it has limits.

How long until the next big idea? The last "AI winter", after expert systems,
was 15 years.

~~~
zardo
>The big difference this time is that the ceiling with machine learning is
high enough for large-scale profitable applications. That wasn't the case with
the previous rounds.

My company (Fortune 500) depends on expert systems, some of which date back to the '80s AI boom. Doing what we do, at the scale we do it, would simply not be possible without them.

Lots of companies make a lot of money with expert systems, but they have no
incentive to fund AI research since expert systems have hit a clear capability
ceiling.

~~~
throwaway613834
> Lots of companies make a lot of money with expert systems, but they have no
> incentive to fund AI research since expert systems have hit a clear
> capability ceiling.

I'm not sure I follow the reasoning here. Is the suggestion that AI cannot do
better than expert systems?

~~~
mistercow
I think the point is that just because companies are making money off of AI
advances, that doesn't necessarily mean we won't hit another AI winter once we
hit a ceiling with current techniques.

~~~
throwaway613834
Okay but what's the relevance of expert systems to this argument?

~~~
tinym
This isn't the first round of AI that's been useful for "large-scale profitable applications."

------
Nomentatus
Back in the nineteen-nineties I'd built a neural-net creator, using a method other than back-propagation, that could create nets of about a hundred neurons that played tic-tac-toe well or perfectly (depending on the net created that day). This was on a 12 MHz 286, using what I called static point multiplication (based on shift instructions, not multiplication instructions). So I went to Toronto to see Professor Hinton, to share. My not being a big back-propagation fan was one thing that divided us before we met, but otherwise I thought we'd be pretty much on the same page.

It didn't go well. I happened to mention that I thought parallel programming would be a big help in speeding up neural nets, and that I'd toured such a multi-processor computer in Alberta. He blew up. I never got to explain what I'd done with neural nets. He tossed me out of his office summarily and wouldn't listen to another word I had to say. Somehow he'd gotten the idea into his head that I was a salesman for the company whose machine I'd seen (a machine too weird to be seriously considered anyway), and absolutely nothing I could say would persuade him that I wasn't a salesman. So the nets I'd created, and my methods other than back-propagation, never got mentioned. Weirdest meeting of my life, and the end of my desire to do anything with neural nets, since other academic departments were even more hostile to the idea back then.

~~~
sjg007
What's the idea? It sounds interesting. Why not publish a paper on it, or put up a blog post?

------
Eridrus
Obviously a clickbait headline; few people think current tech will lead to AGI. But while we all wait on the next breakthrough in AI, we can build much more impressive systems with what we have now than we could even a few years ago.

I went to a talk a few months ago from people who had built more scalable training algorithms for Restricted Boltzmann Machines (which Hinton pioneered) and asked them why they had chosen to use RBMs. Beyond some waffly answers about generative models being more interpretable than discriminative ones, the real answer seemed to be that the research program had started when RBMs were still in vogue, their funding was tied to it, and once it ran out they were going to switch to investigating Generative Adversarial Networks like everyone else.

~~~
dnautics
I don't think that's correct. There are a lot of people _who should know better_ who think that current ML technology will lead to AGI.

~~~
red75prime
Let me guess: you think so because we don't know what general intelligence is, yet you're sure the current approaches to ML can't be extended to produce it.

~~~
dnautics
We have simple models of things well short of general intelligence that ML can't do. To steal from Hofstadter: let's say I have a function that maps one string to another. I'll give you one example: "abc" -> "abd". What does "efg" map to? And how do you train the current generation of ML to solve problems like that in one shot, the way you just did?
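For what it's worth, the rule a human infers in one shot is trivial to write down once you know it; the hard part, which the example is pointing at, is inducing the rule from a single pair. This sketch simply hardcodes the inferred rule:

```python
def analogy(s: str) -> str:
    """The rule a human infers from "abc" -> "abd": increment the last letter."""
    return s[:-1] + chr(ord(s[-1]) + 1)
```

`analogy("efg")` gives `"efh"`; a learned model would need to recover this rule from one example, not merely apply it.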

I don't want to knock machine learning. There's a huge set of very interesting problems that it does solve, problems that were basically intractable until now and whose applications we haven't even scratched the surface of. But the existing techniques have understandable restrictions on the scope of problems they can solve.

I would also not be surprised if the best models included current ML strategies in some way: for example, as supervisor threads, or for initial processing of complicated input data into semantic-ish vector-valued forms.

------
iandanforth
Brief description of the issue for non-experts:

Supervised learning

You can judge the output of your network against ground truth. You say that's
a cat? Nope, it's a dog! And then slightly adjust your network so it's less
likely to give that wrong answer in the future. How exactly you adjust the
network is what backpropagation describes (in combination with something
called a learning rate).
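The "slightly adjust your network" step can be made concrete with a one-neuron sketch. This is a toy illustration, not any particular framework's API:

```python
import math

def train_step(w, b, x, y, lr=0.1):
    """One supervised update: guess, compare with ground truth, nudge the weights."""
    p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # the network's guess ("that's a cat?")
    err = p - y                               # how wrong it was ("nope, it's a dog!")
    # Backprop for this one-neuron net: the chain rule gives dLoss/dw = err * x.
    return w - lr * err * x, b - lr * err
```

Repeating this over many labeled examples is, at heart, all that supervised training does; the learning rate `lr` controls how big each nudge is.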

Unsupervised learning

You need to learn without someone telling you what the answer is. Most of the
time for biological intelligence there isn't an oracle describing the truth at
every moment of life to judge actions/decisions against. If you don't have
someone telling you you've made a mistake, how can you know when to adjust
your network? And if you don't know what the truth is, exactly how to adjust
the network becomes tricky.

Somehow biological brains work without that oracle, and there are lots of ideas about how they do it. But right now, for artificial intelligence, none of those ideas has been shown to work so well that it has taken off the way backprop has in the supervised-learning world.

Hinton wants to find that amazing algorithm.

~~~
eanzenberg
Actually:

>>Most of the time for biological intelligence there isn't an oracle
describing the truth at every moment of life

This is false for most human learning. Most schooled human intelligence looks like supervised learning.

Innate intelligence and other animal intelligence look like pre-trained models plus some reinforcement learning.

~~~
31reasons
I think innate intelligence and animal intelligence were developed by an evolutionary algorithm. You don't need back-prop for that. If the network is wrong, the animal just dies.

~~~
fngl51
I would agree with this approach. In evolution, it would be the equivalent of a conditional lethal mutation. In humans, and even non-humans, many behaviors are learned through a form of adaptive behavior that oftentimes becomes a form of abductive reasoning. What is learned is "good enough" to serve as ground truth until there is evidence to contradict it.

------
52-6F-62
Everything Hinton says strongly echoes Kuhn's philosophy of scientific paradigms. I guess it's not too much of a surprise that he thinks this is the case, but it also shows how intelligent (and shockingly humble) he is to make such a statement.

I wish I'd heard of the conference, dammit!

~~~
tw1010
I think it's great that top researchers are actually learning from Kuhn's
observations. I have hope that humanity can indeed learn from its earlier
historical mistakes, but it requires us to be very aware of them in order to
collectively avoid them.

~~~
52-6F-62
I agree.

Mr. Hinton was somewhat vague and brief in his statements here, though. What do you think the crisis point is that he's reached, exactly? (It must be clear enough for the man who introduced the paradigm in the first place to have caught on, yet the shape of the needed change unclear enough for him to recognize that somebody newer will have to introduce whatever comes next.)

~~~
kobeya
I think his quote is out of context. If he is saying that back-propagation networks alone are insufficient for AGI, that is neither surprising nor controversial. Hinton has never been working on the AGI problem. It sounds more like the author is using the quotes he or she got from Hinton in the most clickbaity way possible.

~~~
52-6F-62
I get what you're saying, but I don't think he was speaking explicitly about AGI. To me it didn't sound like they were leaving the applied problems he has worked on. Maybe I'm reading too much between the lines, but so far there haven't been any major successes in unsupervised learning models, and he's suggesting here that rather than continue with the models that have had success up to this point, it would be more worth the energy to start again at a lower level, rethinking the problem, in order to achieve similar results unsupervised.

This may be where my limited knowledge is my handicap, but I didn't think you would need AGI in order to achieve unsupervised learning. There would still be an input and an output, as there is now; but instead of massive datasets used in training, the models could simply experience the problem in real time, adjusting and learning based upon new results. Of course unsupervised learning models could contribute greatly to the pursuit of AGI, but I didn't think AGI was a prerequisite for unsupervised learning models (which is what I am taking from the article, or more like article abstract, here).

~~~
kobeya
> Maybe I'm reading too much between the lines, but so far there haven't been
> any major successes in un-supervised learning models

Word2vec, to name one; a variant of it underlies the Transformer, Google's new machine-translation framework, which beats the pants off every other approach. The system it is replacing is also unsupervised.

~~~
argonaut
Transformer is heavily reliant on supervised learning. Not sure where you're
getting that from.

~~~
kobeya
I was talking about word2vec.

------
S_A_P
So it says "In 1986, Geoffrey Hinton co-authored a paper that, four decades later, is central to the explosion of artificial intelligence." Is it just me, or is that three decades? I'm curious which part is the typo. Should it be '76? Or three decades?

~~~
zwerdlds
I was trying to rationalize it this way:

1980's - first decade

1990's - second decade

2000's - third decade

2010's - fourth decade

I guess, technically, the third decade of the paper's age ended in 2016, making 2017 the fourth.

But that seems like a silly way to rationalize it. It genuinely seems like a typo.

------
tbabb
It is fantastic to hear this from a seasoned academic about his own field. People get trapped in orthodoxy and become unconsciously unwilling to explore bold ideas in favor of minor tweaks. The framework of thinking itself becomes precious, but big advances usually come from discarding the old framework in favor of something radically new. The kind of humility and bold thinking needed for that kind of change is rare and hard to foster (especially while still remaining grounded in empiricism).

~~~
inetknght
I currently work in DNA analysis. I wholeheartedly agree.

------
fsavard
That article is pretty light on details. I wonder if he pointed toward a specific form of unsupervised learning.

Anyway, it's pretty funny in light of an intro I remembered from one of his old papers:

"It would be truly wonderful if randomly connected neural networks could turn
themselves into useful computing devices by using some simple rule to modify
the strength of synapses. This was the hope that lay behind the original Hebb
learning rule and it is the vision that has driven neural network modelers for
half a century. Initially, researchers tried simulating various rules to see
what would happen. _After a decade or two of messing around, researchers
realized that there was a much better way to explore the space of possible
learning rules: First write down an objective function [...] and then use
elementary calculus to derive a learning rule that will improve the objective
function._ " [1]

i.e., backprop.

So backprop was actually the solution to all that initial "messing around" with unsupervised rules. Though, to be fair (if I understand correctly), those rules had very little to do with modern "unsupervised learning" methods (e.g. autoencoders, which still rely on backprop or similar optimization).

[1]
[http://www.cs.toronto.edu/~fritz/absps/hebbdot.pdf](http://www.cs.toronto.edu/~fritz/absps/hebbdot.pdf)
published in 2003

------
bryananderson
At first glance, the obvious solution seems to be to create intelligence the same way nature did: through some sort of evolution. Some sort of algorithm where multiple "networks" mutate and reproduce in response to a fitness function. Mutations that make a network more fit increase its reproductive rate, while other mutations decrease it.

Evolutionary algorithms have been around for a while but haven't really taken
off. Maybe the problem is that you can't get from zero to intelligence with a
single fitness function.
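The mutate-reproduce-select loop described above is easy to sketch. Here's a minimal, hedged example that evolves a single number toward whatever an (entirely made-up) fitness function rewards:

```python
import random

def evolve(fitness, pop_size=20, generations=50, sigma=0.1):
    """Minimal evolutionary loop: mutate, score by fitness, keep the fittest."""
    pop = [random.uniform(-5, 5) for _ in range(pop_size)]
    for _ in range(generations):
        # each survivor produces two mutated offspring
        children = [x + random.gauss(0, sigma) for x in pop for _ in range(2)]
        children.sort(key=fitness, reverse=True)  # the fit reproduce next round
        pop = children[:pop_size]
    return max(pop, key=fitness)
```

With `fitness = lambda x: -(x - 2) ** 2` the population converges near 2. The comment's point stands, though: a real organism's "fitness function" kept changing at every evolutionary stage, which a single fixed function like this can't capture.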

Think about it: the "fitness function" for our single-celled ancestors was not how well they could thrive in a post-industrial service economy. Nor was it how well they could thrive as hunter-gatherers in the African savannah. It was a very, very different fitness function.

To get from zero to humans, nature had to evolve single-celled organisms, then mitochondria, then basic multi-celled organisms, then animals that lived in the ocean, then amphibious ones, then land-dwelling ones, then mammals, then apes, then early humans, then modern humans.

By the way, I left out the overwhelming majority of the steps.

At each step, the evolutionary pressures on these organisms were wildly
different. Each set of pressures was necessary in order for the next layer of
complexity to evolve.

I think that if we want to evolve an intelligent entity, we would probably
have to do it like this.

~~~
tehramz
What makes you think our brains, or our intelligence, are the most efficient or most effective? We happened to evolve a brain that allows what we consider sophisticated thought. That doesn't mean it's a great system. We also evolved to have a spine, which is a terrible design. There are tons of other examples of evolution designing things that are terrible and ineffective but get the job done. Is that what we're striving for?

~~~
uoaei
Why pooh-pooh ideas if you're not sure either? Let's explore, combine, and
mutate the iterations that each party thinks may be successful.

------
jfv
To be fair, reinforcement learning can be done with neural networks and can be seen as a form of "unsupervised learning." The results aren't (yet?) as spectacular as what we've seen in supervised learning, but it has potential.

I think the main problem is that we're incredibly dependent on advances in
computing power to make advances in deep learning. IMHO, we've only made
incremental _software_ progress in using deep networks since the advent of
convolutional networks.

------
bra-ket
If you are interested in biologically plausible models of cognition check out
'vector symbolic architectures' and 'associative memory' research of the 90s.

~~~
tw1010
Biological plausibility is just silly. What it mostly does is put artificial constraints on the problem that don't need to be there. It's all mathematics. However you want to interpret or metaphorize the mathematics is up to each math-phobic field, but constraining yourself to what randomly evolved does little good.

~~~
52-6F-62
While it's likely true that there are other ways to approach the problem, we're as yet unsure of how to get there. The existing biological model has at least proven itself to work to some level, at least insofar as Descartes has led us to believe.

~~~
tw1010
I would bet a lot of money that engineers testing model after model in a state
of flow, without any concern for biological validity, will yield a better
result than any biologically inspired thing will. Time efficiency is also an
important factor here. The model that will win is the one that is first to
"market" (academic paper).

~~~
52-6F-62
Can't say I disagree with you. In fact, I don't. To illustrate: I think it is far more likely we'll have "Jarvis" before we are building our own analogues of the human brain (or better).

I do want it to happen, though, and I think it will, whichever model wins. It's just too interesting a problem. How that takes shape, I can't dream of predicting. (E.g., how soon will our understanding of the use of organics in computing advance to such a level?)

------
rdlecler1
What we lack is an AI equivalent of a theory of aerodynamics. At the moment we're just brute-forcing random solutions when we have perfectly good systems we should be trying to reverse engineer: biological brains. This will likely require genetic and evolutionary algorithms, a developmental genotype-phenotype mapping, and competitive ecological systems for open-ended evolution. Right now we have back-propagation, which is effectively reinforcement learning: just one piece of the puzzle. I actually applied to Hinton's lab in 2009 for a postdoc position to address these very issues. He said they didn't have any money and were not taking new postdocs.

~~~
tehramz
I've never understood this about "AI." Why do we think our brains, or animal brains, are the most effective models for intelligence? Just because we evolved a brain that works a certain way doesn't mean it's the most efficient, or even efficient at all. Of course, I don't have much experience with AI beyond what I've read.

~~~
uoaei
Do you have much experience with brains? Are you aware that the brain's ratio of computational capacity to energy requirements is as yet unmatched?

Not to mention that, as a system, the brain is highly efficient at reusing modular structures for increasingly complex tasks.

I have a feeling that when people mention "AI modelled after a brain," they take that to mean a simplification of neuronal dynamics. But that is a very naïve way to do it. Looking at it from the system level is the more fruitful approach, IMHO.

------
zebrafish
I don't know. I think backprop is probably utilized a lot in biological
networks. Isn't that why we take tests in school? Obviously backprop doesn't
make sense in an unsupervised setting. There's no label to backprop on.

But here's an example:

I see a stove eye is black when cold, then when I see it turn red, I touch it.
Ow, it hurts. That's supervised learning. Don't touch things that are glowing
red when they don't normally glow red. Now, I see an iron pole glowing red.
It's not a spiral like the stove eye, but it's not normally glowing red and
now it is. I'd better not touch it. I can deduce that these two objects are
made out of a similar material since they're normally black or gray and now
glow red. I can also deduce that something glowing red means it's hot. That's
unsupervised learning.

To me, unsupervised learning is looking at a fluffy object with four legs,
eyes, a nose, tail, it moves, etc. and knowing that it's some kind of animal.
Creating groups of things like k-means. Supervised learning is your mom
telling you that this one example is a tiger. Unsupervised learning is
understanding the delta between your one labeled example and the rest of the
examples in your animal group is largest when the animal is NOT a tiger.
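The "creating groups of things like k-means" part is worth making concrete. Here's a minimal 1-D k-means sketch (toy data, invented for illustration, and no labels anywhere):

```python
def kmeans_1d(points, k=2, iters=20):
    """Unsupervised grouping: cluster unlabeled numbers by nearest center."""
    centers = points[:k]  # naive initialization: the first k points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # move each center to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return sorted(centers)
```

`kmeans_1d([1, 1.2, 0.8, 10, 10.5, 9.5])` settles on centers near 1 and 10 without ever being told there are "small" and "large" groups: the mom-free half of the distinction above.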

~~~
31reasons
I think unsupervised learning is fundamentally connected to the mission or goal given to the AI. Whatever the mission is, the AI needs to start learning the information landscape by itself, form classifications, and use that knowledge to make predictions and act so as to optimize the outcome of the mission it's given.

~~~
aeorgnoieang
But you can't even "give an AI a mission" without a model of the world! And if you expect the AI to achieve its mission in a way that's acceptable to you, you'd better be supplying it with a rich enough model for it to have a reasonable chance of hitting that tiny target in the space of possibilities.

Take self-driving cars as an example. Do you not think that any AI able to do this will essentially be "given" a huge body of knowledge before it's expected to learn anything by itself?

------
tchitra
I should point out that OpenAI put out a blog post / paper suggesting that evolution strategies can perform similarly to backpropagation on Atari / Q-learning tasks:

[https://blog.openai.com/evolution-
strategies/](https://blog.openai.com/evolution-strategies/)
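A minimal one-parameter sketch of the evolution-strategies update from that post (perturb the parameters with noise, score each perturbation, move in the reward-weighted direction; the reward function here is made up):

```python
import random

def es_step(theta, reward, lr=0.02, sigma=0.1, n=50):
    """One evolution-strategies update: a gradient estimated from perturbations."""
    base = reward(theta)  # baseline: subtracting it reduces the estimate's variance
    noise = [random.gauss(0, 1) for _ in range(n)]
    grad = sum((reward(theta + sigma * e) - base) * e for e in noise) / (n * sigma)
    return theta + lr * grad  # ascend the estimated reward gradient
```

No backprop anywhere: the "gradient" comes purely from sampling rewards, which is part of why this approach parallelizes so well.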

------
meri_dian
It's time we began to incorporate meso- to large-scale structure into neural nets. The human brain is able to accomplish what it can because of the way it organizes neurons into clusters and sub-organs (the thalamus, Broca's area, etc.).

------
tehramz
I think it's silly to model AI after how a brain works. We happened to develop the brain we have through many random iterations over billions of years. I don't understand why we think the human brain is the optimal model or method for intelligence. I guess it gives us something we kinda-sorta understand to use as an example, but it still seems a poor basis for a model.

------
bluetwo
I agree with the article. Throwing more cycles at a problem using the existing models isn't the answer. Better, more efficient models are needed.

And yes, I'm working toward that goal.

~~~
uoaei
Can you share something about your work? Or how I may get involved?

~~~
bluetwo
Not yet ready to share, but I will put out a bunch of data when the time is right.

The hardest part is finding interesting challenges for the AI to conquer.

------
billconan
There was this tutorial on how to implement a more biologically accurate neuron in JavaScript:

[https://medium.com/javascript-scene/how-to-build-a-neuron-
ex...](https://medium.com/javascript-scene/how-to-build-a-neuron-exploring-ai-
in-javascript-pt-2-2f2acb9747ed)

I asked the author how to put it into use, and how to train it, because clearly this biological neuron is not a smooth function. But I didn't get any response.

------
XR0CSWV3h3kZWg
Does anyone have a more substantive link?

------
masterponomo
They also need to start over on all that wiring behind Dr Hinton in the photo.

------
EGreg
I like this guy!

Yes, back-propagation can help classify and predict many things.

But semantic information needs more research. For now the state of the art is
Cyc.

