
Why Robot Brains Need Symbols - zwieback
http://nautil.us/issue/67/reboot/why-robot-brains-need-symbols
======
d--b
My case against hardcoding some kind of symbolic logic into the architecture
of an AI model is that there won't be a way to challenge the symbols the way
the brain does.

When I think "the house is red", I know what it means very well, but I'm also
able to doubt or modulate my understanding of the symbols.

These conversations would be hard to put in symbols:

    
    
        - This house is red
        - No! It's crimson!
        - But crimson is red!
    

Or

    
    
        - The house is red
        - No! it's green!
        - Nah, it's red, you're colorblind
    

Brain logic is MUCH fuzzier than symbolic logic. Symbols exist, sure, but
they're part of a bigger logical soup. And I believe that no low-level logical
circuit exists in the brain (which would also explain why humans are so slow
at logic, calculation, and so on).
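
As a toy illustration of that fuzziness (the hue values and membership curve
here are entirely made up), compare a crisp symbolic predicate red(x), which
must answer yes or no, with a graded version that can call crimson mostly red:

    # Crisp symbol vs. graded ("fuzzy") category membership.
    # Hues in degrees on a color wheel; thresholds are made up.

    def is_red_crisp(hue):
        """Classical symbol: red(x) is simply true or false."""
        return abs(hue) <= 20

    def red_membership(hue):
        """Fuzzy version: degree of membership in 'red', in [0, 1]."""
        return max(0.0, 1.0 - abs(hue) / 40.0)

    for name, hue in [("red", 0), ("crimson", 15), ("orange", 35), ("green", 120)]:
        print(f"{name:8s} crisp={is_red_crisp(hue)!s:5s} graded={red_membership(hue):.2f}")

Under the graded view, the crimson conversation above isn't a contradiction at
all: crimson is red to degree 0.62 or so, and both speakers can be right.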

Maybe symbolic logic could be an intermediate step for some applications. Or
maybe humans could be much smarter if they had access to some symbolic logic
processing unit. But as far as research is concerned, I think there is far
more to gain if we manage to have symbolic logic emerge from deep neural nets.

In fact, you could argue that AlphaGo definitely developed some kind of
symbolic logic, especially in the endgame, where there's little intuition and
far more calculation.

~~~
YeGoblynQueenne
>> My case against hardcoding some kind of symbolic logic within the
architecture of an AI model is that there won't be a way to challenge the
symbols as the brain does.

Who said anything about hardcoding anything? Marcus is advocating for the use
of gradient descent to _learn_ symbols; he even cites the DeepMind paper on
∂ILP, a differentiable Inductive Logic Programming system that learns symbolic
rules with deep learning.
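
To give a concrete flavour of "learning symbols with gradient descent" (this
is my own toy sketch, vastly simpler than ∂ILP and not its actual algorithm):
keep a differentiable weight over a few candidate symbolic rules and let
gradient descent discover which rule explains the data.

    import numpy as np

    # Examples: x = (is_parent, is_male), label y = is_father.
    X = np.array([[1, 1], [1, 0], [0, 1], [0, 0]], dtype=float)
    y = np.array([1, 0, 0, 0], dtype=float)

    # Candidate rules evaluated on each example:
    # r0 = is_parent, r1 = is_male, r2 = is_parent AND is_male.
    R = np.stack([X[:, 0], X[:, 1], X[:, 0] * X[:, 1]], axis=1)

    w = np.zeros(3)  # logits: a soft, differentiable choice of rule
    for _ in range(2000):
        p = np.exp(w) / np.exp(w).sum()      # softmax over rules
        pred = R @ p                         # blended rule output
        g_pred = 2.0 * (pred - y) / len(y)   # d(MSE)/d(pred)
        g_p = R.T @ g_pred                   # d(MSE)/d(p)
        g_w = p * (g_p - p @ g_p)            # softmax backward pass
        w -= 5.0 * g_w                       # plain gradient descent

    print("rule weights:", np.round(np.exp(w) / np.exp(w).sum(), 3))
    # The mass should concentrate on r2 (is_parent AND is_male), the
    # one rule that actually explains the labels.

The learned object is a discrete, human-readable rule; only the search for it
is continuous.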

~~~
d--b
From the article

> Their solution? A hybrid model that vastly outperformed what a purely deep
> net would have done, incorporating both back-propagation and (continuous
> versions) of the primitives of symbol manipulation, including both explicit
> variables and operations over variables

This is what I called "hardcoding some kind of symbolic logic". Is it not?

~~~
YeGoblynQueenne
I don't understand what you mean. Where's the hard-coding?

------
Esai
AI development has hit a plateau in terms of performance and funding
saturation. All investment is now expected to deliver some semblance of market
performance before the great Margin Call of 2019 hits, which means scaling out
AI to absurd domains and deploying vast armies of surplus humans to either
code or emulate profitable-enough behavior to pass the risk on to the next
investor round.

In the end, the Mechanical Turk has gone worldwide with no brakes.

~~~
buboard
What is the evidence for a plateau? The corporate temperaments are irrelevant
to the actual progress of the field. If anything, AI systems have not even
started being deployed, and the field is even missing a proper theoretical
framework. This should indicate that it's still at a very early stage.

~~~
Esai
There's a long history of cycles in this field.

[https://en.wikipedia.org/wiki/AI_winter](https://en.wikipedia.org/wiki/AI_winter)

The plateau comes from the problems with vanishing gradients.

[https://en.wikipedia.org/wiki/Vanishing_gradient_problem](https://en.wikipedia.org/wiki/Vanishing_gradient_problem)
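
The mechanics behind that second link fit in a few lines. A minimal numeric
sketch (mine, with made-up random scalar weights): in a deep chain of sigmoid
layers each backpropagated factor is at most 0.25 times the weight, so the
gradient shrinks roughly geometrically with depth.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    x, grad = 0.5, 1.0
    for layer in range(1, 31):
        w = rng.normal()               # one made-up scalar weight per "layer"
        a = sigmoid(w * x)
        grad *= a * (1.0 - a) * w      # chain rule through sigmoid(w * x)
        x = a
        if layer % 10 == 0:
            print(f"after {layer} layers, |gradient| ~ {abs(grad):.2e}")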

------
opless
He's right about LeCun. Yann really does not like Deep Learning being
criticized.

Please someone who knows much more than me about deep learning tell me how a
deep learning AI can explain how it came to an answer.

It can't. By definition.

You can say that it's an equation with a lot of terms, but that isn't really
showing your working out.

You need a symbolic system to show that working out.

This isn't to say that deep learning is worthless, just not worth the amount
of column inches it's getting at the moment.

~~~
ben_w
> Please someone who knows much more than me about deep learning tell me how a
> deep learning ai can explain how it came to an answer.

> It can't. By definition.

What definition are you using?

From my understanding, if you can explain your reasoning, then a deep
learning system can in principle learn how to explain its reasoning. If it
couldn’t, you couldn’t either.

~~~
opless
We're not deep learning systems. So reasoning by analogy there isn't correct.

(Aside: there have been experiments showing that many explanations for
[everyday] actions are, in fact, made up after the decision to execute an
action. Which gives some insight into the whole nature of free will - but
that's another discussion entirely. However, we are indeed able to show our
workings for higher-level tasks, like mathematical proofs.)

Deep learning systems are examples of supervised learning. Once trained, they
cannot adapt to new input. You have to retrain to accommodate new features
that you want to capture.

Deep learning systems are, in essence, a very long equation that hides all
meaning over the entire network - good luck finding out why it came to a
decision. You might get lucky and see a definitive triggering path through the
network, but that's an exception to the rule, as far as I understand. (I'm
willing to be proven wrong here.)

Symbolic systems can trivially be made to trace the code execution paths they
take for any action.
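
For illustration, a minimal sketch of that kind of tracing (a toy of my own,
not any particular system): a tiny forward-chaining rule engine that records
every rule it fires, so each conclusion carries its own derivation.

    rules = [
        ("has_feathers(x) -> bird(x)", {"has_feathers"}, "bird"),
        ("bird(x) & small(x) -> can_fly(x)", {"bird", "small"}, "can_fly"),
    ]

    def infer(initial_facts):
        facts, trace = set(initial_facts), []
        changed = True
        while changed:
            changed = False
            for name, premises, conclusion in rules:
                if premises <= facts and conclusion not in facts:
                    facts.add(conclusion)          # apply the rule...
                    trace.append(name)             # ...and record it
                    changed = True
        return facts, trace

    facts, trace = infer({"has_feathers", "small"})
    print("derived:", sorted(facts))
    for step in trace:
        print("  fired:", step)

The trace _is_ the explanation, for free; getting anything comparable out of a
trained network is the hard part.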

(also - hi Ben! I recognise your blog!)

~~~
FakeComments
I don’t think we’re able to show our workings for higher level things like
proofs.

For instance, I can’t tell you (accurately) how I composed the proof
conceptually, i.e. what strategy it takes, or even how I determined the
appropriate steps, i.e. why I chose a particular tactic.

I can tell you the premises lead to the conclusion via a chain of reasoning,
but that’s the artifact of my thought process, not anything about how it
works. The output of proving is that chain of steps.

So I think this is one place where DL systems may have an advantage: they can
actually freely introspect their thought process as part of their thought
process.

~~~
mindgam3
"I don’t think we’re able to show our workings for higher level things like
proofs."

I can't speak for proofs, but this is false in the case of higher level things
like chess. I'm not a grandmaster, but I was a chess master at age 10 and I am
ranked ~2400 in bullet chess. To the average person my ability to play chess
is "magic". But to me it's not magic at all.

I can explain my thought process at any time. It's all based on symbolic
manipulation at progressively higher levels, i.e. clustering pieces into
"chunks", connecting these chunks into higher level patterns like weak pawn
structure or forks, and ultimately deciding on the best course of action by
weighing all of the different high level patterns. Every step is rule-based
logic which I can readily explain to anyone, even a chess novice.

The part that appears "magic" is the ability to do all of these calculations
in the blink of an eye. But that too is simply due to having trained so many
of these patterns extensively at a young age. Anyone who can speak a language
is doing the same thing, manipulating complex symbolic objects in real time at
progressively higher levels (i.e. letters, words, sentences, paragraphs).
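
To show what one of those explicitly teachable patterns can look like as a
rule, here is a toy sketch (the board encoding and example position are my
own, purely illustrative): a knight fork is just "this knight attacks two or
more enemy pieces at once".

    # Squares as (file, rank) pairs, 0-indexed: (2, 2) is c3.
    KNIGHT_JUMPS = [(1, 2), (2, 1), (2, -1), (1, -2),
                    (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

    def knight_fork_targets(knight_sq, enemy_squares):
        """Enemy squares this knight attacks; two or more means a fork."""
        f, r = knight_sq
        attacked = {(f + df, r + dr) for df, dr in KNIGHT_JUMPS}
        return sorted(attacked & set(enemy_squares))

    # Knight on c3 forking a king on b5 and a rook on e4.
    targets = knight_fork_targets((2, 2), [(1, 4), (4, 3)])
    print("attacks:", targets, "-> fork!" if len(targets) >= 2 else "")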

~~~
FakeComments
There’s chess theory, which is rule based and what you start off describing.

But then you admit you don’t work directly with chess theory when selecting
moves; there’s a trained black box evaluator that selects candidate moves,
which you then select from via chess theory.

That’s how you’re finding chess moves in the blink of an eye: you run a fuzzy
approximation, then refine the results using higher level reasoning. But you
don’t have access to the network doing the evaluation and can’t describe
exactly how it operates, just that it was trained on chess theory.

It’s that fuzzy reasoning to speed up the process of actually finding
solutions that I was calling out as the source of the unknowns in our
processing — and at least from my exposure to board games (and their players),
it’s often the source of things like innovative moves.

~~~
mindgam3
"there’s a trained black box evaluator that selects candidate moves, which you
then select from via chess theory"

You're missing my point. There is no "trained black box evaluator." There is
indeed a trained evaluator, but it is not a black box. It is fully
understandable. If I gave you private chess lessons, I could teach you my
heuristics. And eventually you would understand them enough to be able to
teach them to others. This would not be possible if it were truly a black box.

------
afpx
This is something that’s been a challenge to my team for a long time. For us,
machine learning has produced very good inferences, but it doesn’t create
models of how the universe works. For example, if we input a lot of raw
weather data into a statistical model, we can predict how it may affect the
power output of a solar array. But, I can’t get the machine to ‘understand’
that clouds decrease solar availability.
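
A minimal sketch of what we see (with hypothetical data and feature names):
the fitted model recovers the cloud-output correlation as a coefficient, but
nothing in it "understands" that clouds block sunlight.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200
    cloud_cover = rng.uniform(0, 1, n)           # fraction of sky covered
    temperature = rng.uniform(5, 35, n)          # degrees C, irrelevant here
    output = 5.0 * (1 - cloud_cover) + rng.normal(0, 0.2, n)  # kW, toy data

    X = np.column_stack([np.ones(n), cloud_cover, temperature])
    coef, *_ = np.linalg.lstsq(X, output, rcond=None)
    print("intercept, cloud, temp coefficients:", np.round(coef, 2))
    # The cloud coefficient comes out near -5: an excellent predictor,
    # but just a fitted weight, not a causal model of the sky.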

------
laichzeit0
I really don't get Gary Marcus' fundamental problem. He ends the essay with
"All I am saying is to give Ps (and Qs) a chance." But who exactly is this
aimed at? Surely no one is being held academically hostage to Deep Learning.
If Ps (and Qs) is a better approach, then do the research and publish the
results. Instead he just seems to whine about it on Twitter all day and attack
people who are actually "doing the research and publishing the results."

~~~
YeGoblynQueenne
>> He ends the essay with "All I am saying is to give Ps (and Qs) a chance."
But who exactly is this aimed at?

It's a slightly garbled (and so not immediately recognisable, perhaps) pun on
the verse _"All we are saying / is give peace a chance"_ from John Lennon's
"Give Peace a Chance".

So it's not addressed to anyone; it's just his attempt at injecting a bit of
levity into the debate.

EDIT:

>> If Ps (and Qs) is a better approach then do the research and publish the
results.

Well, people have done that, yes. For example, the Evans and Grefenstette
paper Marcus' article cites towards the end has shown in a very clear manner
the power of combining symbolic with sub-symbolic approaches, as has the work
of, off the top of my head, Artur d'Avila Garcez (neuro-symbolic computation),
Luciano Serafini (a differentiable logic), and many others.

And yet, my intuition at least is that most people who have heard about Deep
Learning haven't heard about that work.

~~~
laichzeit0
This comes across a bit like the Tanenbaum–Torvalds monolithic/microkernel
debate in the 90s. Tanenbaum in this case is Marcus and Torvalds is LeCun.
There are so many well-established benchmarks out there. Just build whatever
this symbolic system is that you're advocating and beat the benchmarks. Or
maybe create your own benchmarks, and then we can also apply Deep Learning to
those problems too and see who wins? Until then, no one really has to pay any
serious attention to this clamouring.

~~~
YeGoblynQueenne
That's what the work I mentioned above does: it beats benchmarks and shows
impressive results. The work by Evans and Grefenstette in particular is all
out of chewing gum [1].

Somewhat depressingly, your comment makes me guess that you really think that
none of that work has actually been done, that it's somehow all theoretical or
speculative, or just a lot of twitter talk by Gary Marcus. That couldn't be
farther from the truth.

It's really the case that many people have no idea about AI beyond a quick
article in the popular press here and there, and despite this, they have
already formed very strong opinions about it.

_________________

[1] Here:

[https://deepmind.com/blog/learning-explanatory-rules-
noisy-d...](https://deepmind.com/blog/learning-explanatory-rules-noisy-data/)

Note that the above overview of their paper is published on DeepMind's website
(the two are DeepMind employees). You can rest assured that DeepMind, in
particular, would never support the publication of a paper that doesn't
demonstrate impressive results.

------
mindgam3
Robot brains need symbols because the rhetoric around deep learning is getting
out of control. In the video analysis below of the latest AlphaZero vs
Stockfish paper, AlphaZero is described as "DeepMind's general-purpose
artificial intelligence system".

Let's give credit where it's due. AlphaZero is an impressive algorithm with
results in 3 different types of perfect-information games: chess, go, shogi.
But to describe it as "general purpose AI" is simply absurd. Yes, this isn't
an official DeepMind video, but this is the kind of rhetoric they are putting
out there.

I'm open to the possibility that deep learning might one day solve some of its
core problems (like the elementary schoolbus-snowplow errors described in the
source article) and turn into a general purpose AI. But we aren't there yet,
and we're not even close.

[https://www.youtube.com/watch?time_continue=13&v=2-wFUdvKTVQ](https://www.youtube.com/watch?time_continue=13&v=2-wFUdvKTVQ)

~~~
soVeryTired
I think there's some ambiguity there regarding _what_ is general-purpose: if
you want to be charitable, you might interpret the description as saying it's
a general-purpose system rather than a general-purpose AI.

Who knows what they really meant though...

~~~
mindgam3
My point is, calling a system that learned how to play 3 types of board games
"general purpose" in any sense is a bit of a stretch. At best this is a
"general-purpose game-playing" system.

------
317070
No, they don't.

Now, feel free to _show_ me that they do, and I will gladly accept that I was
wrong. But this argument - that the current engineering solutions should be
abandoned because of some very theoretical view of the problem, without
actually providing good engineering alternatives - is weird.

I have a bit of a "not even wrong" feeling about the symbolist side of the
debate. Deep neural networks have serious flaws, but those flaws can at least
be pointed out, because the networks are already solving engineering problems.
It is easy to say a field should move in your direction, but it will probably
only do so when provided with tangible evidence. AI has moved into the realm
of application, and that means the scientific bar for purely theoretical ideas
has been raised compared to 20 years ago. And right now, the tangible evidence
for symbolism is simply not cutting it.

~~~
vanderZwan
> _And right now, the tangible evidence for symbolism is simply not cutting
> it._

In AI/ML perhaps. In the biological and psychological sciences the importance
of symbols to cognitive functioning is pretty well established.

Just look at how important the mastery of language is for complicated thought.
One of the main differences between linguistic thought and other forms of
thinking is all about symbol manipulation.

(This says nothing about the quality of this article of course)

~~~
FakeComments
It’s also pretty well known in the math and CS/SE communities.

Heck, the mathematical research into the power of language, and in particular
the ability to embed models of reasoning into equations so that you can
formalize your meta-reasoning, is what led to the birth of electronic
computers.

So I find it shallow skepticism to suggest there’s no evidence that it would
be useful to AI, because literally the most cursory glance would show mounds
of it.

The real challenge is and always has been how to combine or embed symbolic
reasoning capabilities efficiently within a fuzzier approximator, allowing for
“getting close” in a fuzzy way then fine tuning it using more advanced rules.

That mimics more closely how we think, where we broadly intuit a few potential
answers, then use higher order thinking to sift through them.
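
As a toy sketch of that propose-then-verify shape (entirely my own
construction, on a deliberately silly task): a cheap, noisy scorer shortlists
candidates, and an exact symbolic rule does the final check.

    import random

    def fuzzy_score(n, c):
        """Cheap, noisy 'intuition' for whether c divides n."""
        return 1.0 / (1 + n % c) + random.uniform(-0.1, 0.1)

    def find_divisors(n, budget=10):
        candidates = range(2, n)
        # Stage 1: the fuzzy approximator shortlists promising values.
        shortlist = sorted(candidates, key=lambda c: -fuzzy_score(n, c))[:budget]
        # Stage 2: an exact symbolic rule verifies each candidate.
        return sorted(c for c in shortlist if n % c == 0)

    random.seed(0)
    print(find_divisors(36))  # the exact check keeps only true divisors

The approximator buys speed by pruning the search; the symbolic stage buys
correctness on whatever survives.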

~~~
vanderZwan
> _The real challenge is and always has been how to combine or embed symbolic
> reasoning capabilities efficiently within a fuzzier approximator, allowing
> for “getting close” in a fuzzy way then fine tuning it using more advanced
> rules._

Well, maybe the problem is that we spent so much time basing models on the
visual part of the brain that we overlooked the rest:

[https://www.quantamagazine.org/new-ai-strategy-mimics-how-
br...](https://www.quantamagazine.org/new-ai-strategy-mimics-how-brains-learn-
to-smell-20180918/)

------
ajuc
This reads like a 19th-century engineer arguing we need articulated wings for
heavier-than-air flight because our current methods don't work and birds have
them :)

------
polkapolka
Symbolic AI is machine programming. Connectionist AI is machine learning.

Machine programming simply does not scale. It is also not biologically
plausible: it is not as if God put symbols into our brains; symbols formed
from, are prefaced by, and were learned from neural activity.

Just try to solve the spam problem using symbolic AI. It will keep you busy
(and paid) for a long time, while yielding subpar results.

Deep learning is the most promising direction AI has taken in a long time.
Finding flaws in DL does not point back to a programmer crafting handwritten
rules to correct it. It merely points to more DL research being needed to have
machines correct this themselves. Preferably in a differentiable way.

~~~
giardini
polkapolka says> _" Machine programming simply does not scale....Just try to
solve the spam problem using symbolic AI. It will keep you busy (and paid) for
a long time, while yielding subpar results."_

1. We can scale "machine programming" by making computers faster and more
complex.

2. The spam problem can be solved with Bayesian methods (which I consider to
be part of "machine programming"); a connectionist solution is not necessary.

Certainly connectionist deep learning is exciting and should be investigated
as far as possible but it is only one of many tools. We should avoid being the
man with a hammer who thinks that every problem looks like a nail.

~~~
polkapolka
Faster and more complex computers do not make you faster at manual
programming. Computer vision had this before the DL boom: engineers painfully
crafting feature extractors. It went nowhere.

Bayesian models underperform DL by a wide margin (though they are a step up
from handwritten rules: if DEAR FRIEND then spam score++).
~~~
giardini
polkapolka says _> "Faster and more complex computers does not make you faster
at manual programming. Computer vision had this before the DL boom: engineers
painfully crafting feature extractors. It went nowhere."_

Faster and more complex computers make manual programming faster and make
software faster, including DL software. Without the faster computers of today
we wouldn't be using or even discussing DL.

------
mindgam3
Previous discussion:
[https://news.ycombinator.com/item?id=18639359](https://news.ycombinator.com/item?id=18639359)

------
cs702
Actually, deep learning models do use symbols.

We call these symbols "embeddings" or "representations" in a vector or tensor
space... but these symbols may not have any known or readily interpretable
human meaning. People who are uneducated in basic vector (and tensor) math may
have difficulty understanding how this is possible, but it's a fact.

For example, the ELMo embedding of a particular word in a sentence is a
_symbol_ that represents the complex syntax, semantics, and polysemy of that
word given its location and use in the text.[a] Similarly, a trained WaveNet
model represents a sequence of text as a (fairly large) tensor embedding that
encodes the information necessary for generating the corresponding speech;
this embedding is a _symbol_ too.[b] As a final example, the next-to-last
layer's embedding of an image processed through a trained Detectron model
contains representations of every identified object in the image; those
representations are _symbols_ too.[c]

Moreover, the kind of automatic differentiation that is necessary for training
deep learning models with backprop reduces to _manipulation of symbols in a
tree_, viewed through the right lens. Witness, for example, the recent work
done by Mike Innes et al. for the Julia language.[d] We could very well be in
the early stages of adoption of "deep differentiable programming" for a
growing number of cognitive tasks that might not be solvable via _human_
manipulation of symbols.
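
To make the "symbols in a tree" point concrete, here is a toy sketch in
Python rather than Julia (nested tuples as the expression tree; an
illustration of the idea, not the Julia work itself):

    # Expressions: ("+", a, b), ("*", a, b), the variable "x", or a number.

    def diff(e):
        """Differentiate an expression tree with respect to x."""
        if e == "x":
            return 1
        if isinstance(e, (int, float)):
            return 0
        op, a, b = e
        if op == "+":   # sum rule: d(a + b) = da + db
            return ("+", diff(a), diff(b))
        if op == "*":   # product rule: d(a * b) = da * b + a * db
            return ("+", ("*", diff(a), b), ("*", a, diff(b)))
        raise ValueError(op)

    # d/dx of x*x + 3x -- the result is itself just a tree of symbols:
    print(diff(("+", ("*", "x", "x"), ("*", 3, "x"))))

Differentiation here never touches a number at runtime; it only rewrites
symbols, which is exactly the lens described above.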

The challenge for deep learning going forward, as Bengio says and Marcus
quotes, is in figuring out how to "_extend_ it to do things like reasoning,
learning, causality, and exploring the world."[e] As far as I can tell,
everyone in the field already agrees on these goals; many researchers _are_
thinking about and wondering how it might be possible to pursue them.

Does Marcus understand this? His complaining about the "lack of symbols" is...
misguided, I think.

[a] [https://allennlp.org/elmo](https://allennlp.org/elmo)

[b] [https://deepmind.com/blog/wavenet-generative-model-raw-
audio...](https://deepmind.com/blog/wavenet-generative-model-raw-audio/)

[c]
[https://research.fb.com/downloads/detectron/](https://research.fb.com/downloads/detectron/)

[d] [https://julialang.org/blog/2018/12/ml-language-
compiler](https://julialang.org/blog/2018/12/ml-language-compiler)

[e]
[https://twitter.com/GaryMarcus/status/1065280340669816832](https://twitter.com/GaryMarcus/status/1065280340669816832)

------
mrcoder111
A symbol is just a frozen weight in a neural network.

