Why Robot Brains Need Symbols (nautil.us)
88 points by zwieback 42 days ago | 66 comments



My case against hardcoding some kind of symbolic logic within the architecture of an AI model is that there won't be a way to challenge the symbols as the brain does.

When I think "the house is red", I know what it means very well, but I'm also able to doubt or modulate my understanding of the symbols.

These conversations would be hard to put in symbols:

    - This house is red
    - No! It's crimson!
    - But crimson is red!
Or

    - The house is red
    - No! it's green!
    - Nah, it's red, you're colorblind
Brain logic is MUCH fuzzier than symbolic logic. Symbols exist, sure, but they're part of a bigger logical soup. And I believe that no low level logical circuit exists in the brain (which also explains why humans are so slow at logic, calculation and so on).

Maybe symbolic logic could be an intermediate step for some applications. Or maybe humans could be much smarter if they had access to some symbolic logic processing unit. But as far as research is concerned, I think there is way more to gain if we manage to have symbolic logic emerge from deep neural nets.

In fact, you could argue that AlphaGo definitely developed some kind of symbolic logic, especially in the end-game, where there's little intuition and way more calculation.


>> My case against hardcoding some kind of symbolic logic within the architecture of an AI model is that there won't be a way to challenge the symbols as the brain does.

Who said anything about hardcoding anything? Marcus is advocating for the use of gradient descent to learn symbols - he even cites the DeepMind paper on ∂ILP, a differentiable Inductive Logic Programming system that learns symbolic rules with deep learning.
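
For a flavour of what "learning symbols by gradient descent" can mean, here's a very rough toy sketch (my own, nothing like the real ∂ILP system): attach a learnable weight to each candidate rule and let gradient descent decide which rule explains the facts.

    import math

    parent = {("a", "b"), ("b", "c")}          # ground facts: parent(a,b), parent(b,c)
    people = ["a", "b", "c"]

    def rule_chain(x, y):   # candidate: grandparent(X,Y) :- parent(X,Z), parent(Z,Y)
        return 1.0 if any((x, z) in parent and (z, y) in parent for z in people) else 0.0

    def rule_copy(x, y):    # candidate: grandparent(X,Y) :- parent(X,Y)
        return 1.0 if (x, y) in parent else 0.0

    # training labels: grandparent(a,c) is true, grandparent(a,b) is false
    examples = [(("a", "c"), 1.0), (("a", "b"), 0.0)]

    def sigmoid(v):
        return 1.0 / (1.0 + math.exp(-v))

    def predict(params, pair):
        w1, w2 = sigmoid(params[0]), sigmoid(params[1])
        r1, r2 = rule_chain(*pair), rule_copy(*pair)
        return 1.0 - (1.0 - w1 * r1) * (1.0 - w2 * r2)   # soft OR of the weighted rules

    def loss(params):
        return sum((predict(params, pair) - label) ** 2 for pair, label in examples)

    params = [0.0, 0.0]
    for _ in range(2000):                                 # plain gradient descent with
        grads = []                                        # finite-difference gradients
        for i in range(2):
            up, down = list(params), list(params)
            up[i] += 1e-5
            down[i] -= 1e-5
            grads.append((loss(up) - loss(down)) / 2e-5)
        params = [p - 0.5 * g for p, g in zip(params, grads)]

    # chain-rule weight climbs toward 1, copy-rule weight toward 0
    print([round(sigmoid(p), 2) for p in params])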


From the article

> Their solution? A hybrid model that vastly outperformed what a purely deep net would have done, incorporating both back-propagation and (continuous versions) of the primitives of symbol manipulation, including both explicit variables and operations over variables

This is what I called "hardcoding some kind of symbolic logic". Is it not?


I don't understand what you mean. Where's the hard-coding?


There is a small CS/AI community that works on "argumentation", putting exactly the problems you describe in symbols. For anyone who is interested in the details, there are several books on the topic, for example the "Handbook of Formal Argumentation" by Baroni et al. I concede that formal argumentation is--as far as I know--still waiting to be applied in large-scale scenarios.
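
For a flavour of it, here is a minimal sketch of a Dung-style abstract argumentation framework (a toy example of my own, loosely modelled on the colour dispute upthread; the Handbook covers far richer formalisms): arguments attack each other, and the "grounded extension" is the set of arguments that ultimately survive.

    arguments = {"red", "crimson", "colorblind"}
    attacks = {("crimson", "red"), ("colorblind", "crimson")}   # attacker -> attacked

    def attackers(x):
        return {a for (a, b) in attacks if b == x}

    def defended(candidate_set):
        """Arguments whose every attacker is itself attacked by the candidate set."""
        return {x for x in arguments
                if all(attackers(a) & candidate_set for a in attackers(x))}

    # grounded extension: least fixpoint of the 'defended' operator
    extension = set()
    while defended(extension) != extension:
        extension = defended(extension)

    print(extension)   # contains 'colorblind' and 'red': 'crimson' is defeated, 'red' is defended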


AI development has hit a plateau in terms of performance and funding saturation. All investment is now expected to deliver some semblance of market performance before the great Margin Call of 2019 hits, which means scaling out AI to absurd domains and deploying vast armies of surplus humans to either code or emulate profitable-enough behavior to pass the risk on to the next investor round.

In the end, the Mechanical Turk has gone worldwide with no brakes.


What is the evidence for a plateau? The corporate temperaments are irrelevant to the actual progress of the field. If anything, AI systems have not even started being deployed, and the field is even missing a proper theoretical framework. This should indicate that it's still at a very early stage.


There's a long history of cycles in this field.

https://en.wikipedia.org/wiki/AI_winter

The plateau comes from the problems with vanishing gradients.

https://en.wikipedia.org/wiki/Vanishing_gradient_problem
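
A toy illustration of the effect (my own sketch, a chain of one-unit sigmoid layers, not any real architecture): the gradient that backprop passes down shrinks by roughly a factor of w·σ′ ≤ 0.25·|w| per layer, so it collapses quickly with depth.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    x = rng.normal()              # scalar input
    grad = 1.0                    # d(output)/d(input), accumulated layer by layer

    for depth in range(1, 31):
        w = rng.normal()                      # one weight per "layer" in this toy chain
        a = sigmoid(w * x)
        grad *= w * a * (1.0 - a)             # chain rule through weight and sigmoid
        x = a
        if depth % 10 == 0:
            print(f"depth {depth:2d}: |d out/d in| = {abs(grad):.2e}")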


He's right about LeCun. Yann really does not like Deep Learning being criticized.

Please someone who knows much more than me about deep learning tell me how a deep learning ai can explain how it came to an answer.

It can't. By definition.

You can say that it's an equation with a lot of terms, but that isn't really showing your working out.

You need a symbolic system to show that working out.

This isn't to say that deep learning is worthless, just not worth the amount of column inches it's getting at the moment.


A trained neural network is a model, just like any other kind of model. If there is any overall property of the data it learned, you can coerce it from the network just as you can read it from any mathematical model. (Which is, for some properties, very easy, and for others very hard.)

An AI can always explain how it came to an answer. But the explanation can be like "there are way too many factors" or "things are like this just because of those other things, if those changed, the explanation would too". But those are not attributes of the neural network; they are attributes of the problem it is solving. If you get an explanation like that for a problem that has simple explanations, it means you failed at training your network.
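
For instance, one concrete way to coerce something out of a trained network (a minimal PyTorch sketch with a made-up toy model, not any particular system) is input-gradient saliency: ask which inputs the prediction is most sensitive to.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))  # stand-in "trained" model

    x = torch.rand(1, 4, requires_grad=True)
    logits = model(x)
    logits[0, logits.argmax()].backward()   # differentiate the winning class score w.r.t. the input

    print(x.grad.abs())   # larger entries = inputs the decision leans on more heavily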


> Please someone who knows much more than me about deep learning tell me how a deep learning ai can explain how it came to an answer.

> It can't. By definition.

What definition are you using?

From my understanding, if you can explain your reasoning, then a deep learning system can in principle learn how to explain its reasoning. If it couldn’t, you couldn’t either.


We're not deep learning systems. So reasoning by analogy there isn't correct.

(Aside: there have been experiments showing that many explanations for [everyday] actions are, in fact, made up after the decision to execute an action. Which gives some insight into the whole nature of free will - but that's another discussion entirely. However, we are indeed able to show our workings for higher-level tasks, like mathematical proofs.)

Deep learning systems are examples of supervised learning. Once trained, they cannot adapt to new input. You have to retrain to accommodate new features that you want to capture.

Deep learning systems are in essence a very long equation that hides all meaning across the entire network - good luck finding out why it came to a decision. You might get lucky and see a definitive triggering path through the network, but that's the exception to the rule, as far as I understand. (I'm willing to be proven wrong here.)

Symbolic systems can trivially be made to trace the execution paths they took for any action they might take.

(also - hi Ben! I recognise your blog!)


I don’t think we’re able to show our workings for higher level things like proofs.

For instance, I can’t tell you (accurately) how I composed the proof conceptually, ie what strategy it takes, or even determined the appropriate steps, ie how I determined to use a particular tactic.

I can tell you the premises lead to the conclusion via a chain of reasoning, but that’s the artifact of my thought process, not anything about how it works. The output of proving is that chain of steps.

So I think here is somewhere DL systems may have an advantage: they actually can freely introspect their thought process as part of their thought process.


"I don’t think we’re able to show our workings for higher level things like proofs."

I can't speak for proofs, but this is false in the case of higher level things like chess. I'm not a grandmaster, but I was a chess master at age 10 and I am ranked ~2400 in bullet chess. To the average person my ability to play chess is "magic". But to me it's not magic at all. I can explain my thought process at any time. It's all based on symbolic manipulation at progressively higher levels, i.e. clustering pieces into "chunks", connecting these chunks into higher level patterns like weak pawn structure or forks, and ultimately deciding on the best course of action by weighing all of the different high level patterns. Every step is rule-based logic which I can readily explain to anyone, even a chess novice. The part that appears "magic" is the ability to do all of these calculations in the blink of an eye. But that too is simply due to having trained so many of these patterns extensively at a young age. Anyone who can speak a language is doing the same thing, manipulating complex symbolic objects in real time at progressively higher levels (i.e. letters, words, sentences, paragraphs).


There’s chess theory, which is rule based and what you start off describing.

But then you admit you don’t work directly with chess theory when selecting moves, there’s a trained black box evaluator that selects candidate moves, which you then select from via chess theory.

That’s how you’re finding chess moves in the blink of an eye: you run a fuzzy approximation, then refine the results using higher level reasoning. But you don’t have access to the network doing the evaluation and can’t describe exactly how it operates, just that it was trained on chess theory.

It’s that fuzzy reasoning to speed up the process of actually finding solutions that I was calling out as the source of the unknowns in our processing — and at least from my exposure to board games (and their players), it’s often the source of things like innovative moves.


"there’s a trained black box evaluator that selects candidate moves, which you then select from via chess theory"

You're missing my point. There is no "trained black box evaluator." There is indeed a trained evaluator, but it is not a black box. It is fully understandable. If I gave you private chess lessons, I could teach you my heuristics. And eventually you would understand them enough to be able to teach them to others. This would not be possible if it were truly a black box.


I don't think this refutes the argument. You've just explained that the process by which you play chess is by calculation. In essence, you're doing something you could program a machine to do (without any learning component).

The question is: how do you explain how you have an original idea, or learn something you previously had no conceptualisation of.

I think chess is a poor example at a professional level, because players have learnt mainly from analysis of others' techniques, rather than mostly making their own inferences. How a novice plays chess, inventing as they go, because they cannot draw on extensive experience, would be a better analogy, in my opinion.


"how do you explain how you have an original idea, or learn something you previously had no conceptualisation of"

This is like "where does your chess intuition come from" - and what I'm trying to express is that my chess intuition isn't actually a black box. There is a set of heuristics which I follow in order to come up with these "original ideas". Some of these I learned from watching other great players. Others I learned through trial and error, by playing tons of games, just like AlphaZero. The difference between me and AlphaZero is I can describe these heuristics conceptually in a way that anyone can understand, without resorting to "it's because this move tended to work in a large portion of the million games I played".


> they actually can freely introspect their thought process as part of their thought process.

Do you mean that they can in principle? I don't know of existing systems that can inspect their own weights and output a vector of confidence scores that they can recognize classes A, B, C, ...


> output a vector of confidence scores that they can recognize classes A, B, C, ...

This is literally what neural networks do when classifying patterns.


They classify patterns, not their own ability to classify patterns.


Again, they literally produce confidence scores - probabilities that each prediction is correct.

For example, say there are 3 classes, and the network is shown two different examples of class 2. Say it outputs a class probability distribution {0.02, 0.45, 0.43} for the first example, and {0.02, 0.9, 0.08} for the second. Even though in both cases it correctly identifies class 2, it's a lot more confident in its prediction in the second case.
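
If it helps, here's roughly where such numbers come from (toy logits of my own, not a real trained network): a softmax layer turns the network's raw scores into a probability distribution over classes, which is read as confidence.

    import numpy as np

    def softmax(logits):
        z = logits - logits.max()       # subtract max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    print(softmax(np.array([0.0, 3.1, 3.0])))   # two classes nearly tied -> low confidence
    print(softmax(np.array([0.0, 3.8, 1.4])))   # class 2 dominates -> high confidence (~0.9)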


I'm just saying that there's no introspection in existing networks. There's no part that can take the weights of a convolutional part of a network and images of a certain class, and then output a confidence score for the convolutional part's ability to classify that class.

I'm not saying that it is impossible or useless. I said that I don't know of any deep learning systems which "[...] can freely introspect their thought process as part of their thought process."


> We're not deep learning systems. So reasoning by analogy there isn't correct.

That’s why I’m asking which definition you’re using. It’s broad enough that some models are biologically plausible even though there is criticism that many others are not: https://arxiv.org/abs/1502.04156

> Deep learning systems, are examples of supervised learning. Once trained, they cannot adapt to new input. You have to retrain to accommodate new features that you want to capture.

Supervised, unsupervised, and reinforcement; and there is work on both incremental learning (https://arxiv.org/abs/1712.02719) and avoiding catastrophic forgetting (https://arxiv.org/abs/1812.02464)

> (also - hi Ben! I recognise your blog!)

Oh no, I’m becoming famous. (Infamous?) :)


ooh interesting links -thanks!

I'll read them later.

Famous-ish :)

We probably have similar views on brexit.


Humans can almost always give an explanation for a judgement but, for intuitive judgements at least, the reasons we give are frequently wrong or incomplete. People seldom notice that we rate essayists with more symmetric faces as better at writing, but we have this bias and tons more besides.

But I don't think that there's any reason to doubt that, when using abstract reasoning (Kahneman's System 2), our explanations of our thought processes can be accurate. It's System 1 that's opaque to accurate introspection.


Aren't we - as a group - better at this? I'm thinking about the scientific method, and the common differentiation there between the context of discovery and the context of justification.

So you are free to discover a mathematical theorem by intuition or during daydreaming (like the molecular structure of benzene) - anything goes.

But in the context of justification, you have to be much more disciplined - even if you trust your own intuitions and don't need to persuade yourself (of course, generally you should not trust your intuition that much in a scientific context ;), you have to persuade others who are missing your insight.

edit: And of course - in the end - for us to accept something as a true A.I. system, it probably has to persuade us that its intuitions are right, using the types of justification we accept. (So even if it has better types of justification - if we don't get them, it has to dumb them down to our level of scientific understanding.)


We are verifying the results of our opaque brain processes. We cannot yet verify that our explanations of those processes are correct.


Given some results regarding brain damage, we might just be rationalizing afterwards, separately from the problem solving. A severed corpus callosum, I believe it was.

Or that effect might just be how a brain keeps functioning after losing some pieces - evolutionarily, it's better to have data integrity issues than to shut down if a checksum fails.

I believe that ML can technically give you an answer, but it amounts to a summation of its training set and how it parses the data. Which isn't too readable even to mathematicians, and leaves it vulnerable to stuff like adding a small pixel cloud to cause it to see a toy turtle as an assault rifle. Our errors in visual processing certainly exist, but they tend to be more like distortion than that.
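
The "pixel cloud" trick is roughly this (an FGSM-style sketch on a made-up, untrained toy model; nothing to do with the actual turtle/rifle work): nudge every pixel slightly in the direction that increases the loss.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    torch.manual_seed(0)
    model = nn.Sequential(nn.Flatten(), nn.Linear(8 * 8, 3))   # toy stand-in classifier
    image = torch.rand(1, 1, 8, 8, requires_grad=True)
    label = torch.tensor([0])                                   # the class we perturb away from

    F.cross_entropy(model(image), label).backward()

    epsilon = 0.25
    adversarial = (image + epsilon * image.grad.sign()).clamp(0.0, 1.0)

    # the perturbation is small per pixel, but the predicted class frequently flips
    print(model(image).argmax(dim=1).item(), model(adversarial).argmax(dim=1).item())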


Most people can’t explain why they make most of their decisions, really. It’s really only a very small subset of our activity that we are capable of explaining logically.

What’s the chain of reasoning behind me falling in love and marrying my wife? What’s the chain of reasoning behind why I watch the great British bake off compulsively? What’s the chain of reasoning that leads an alcoholic to drink time and time again. What’s the chain of reasoning behind why I look at a rainbow and see bands of distinct colors? What’s the chain of reasoning behind why I paused and scratched my nose just now?


> Please someone who knows much more than me about deep learning tell me how a deep learning ai can explain how it came to an answer.

A mouse can't do that either. Yet we are unable to mimic mouse intelligence with any of our current tools.


This is something that’s been a challenge to my team for a long time. For us, machine learning has produced very good inferences, but it doesn’t create models of how the universe works. For example, if we input a lot of raw weather data into a statistical model, we can predict how it may affect the power output of a solar array. But, I can’t get the machine to ‘understand’ that clouds decrease solar availability.
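
A stripped-down caricature of what I mean (made-up numbers, not our actual pipeline): the fitted model maps cloud cover to power output, but the negative slope is a correlation it found, not an understanding that clouds block sunlight.

    import numpy as np

    cloud_cover = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])     # fraction of sky covered
    power_kw    = np.array([5.1, 4.3, 3.2, 2.4, 1.5, 0.6])     # hypothetical array output

    slope, intercept = np.polyfit(cloud_cover, power_kw, deg=1)
    print(f"predicted output at 50% cloud cover: {slope * 0.5 + intercept:.2f} kW")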


I really don't get Gary Marcus' fundamental problem. He ends the essay with "All I am saying is to give Ps (and Qs) a chance.". But who exactly is this aimed at? Surely no one is being held academically ransom to Deep Learning. If Ps (and Qs) is a better approach then do the research and publish the results. Instead he just seems to whine about it on Twitter all day and attack people who are actually "doing the research and publishing the results."


>> He ends the essay with "All I am saying is to give Ps (and Qs) a chance.". But who exactly is this aimed at?

It's a slightly garbled (and so not immediately recognisable, perhaps) pun on the verse "All that we're saying/ is give peace a chance" from John Lennon's "Give Peace a Chance".

So it's not addressed to anyone- it's just his attempt at injecting a bit of levity in the debate.

EDIT:

>> If Ps (and Qs) is a better approach then do the research and publish the results.

Well, people have done that, yes. For example, the Evans and Grefenstette paper Marcus' article cites towards the end has shown in a very clear manner the power of combining symbolic with sub-symbolic approaches, as has the work of, off the top of my head, Artur D'avila Garcez (neuro-symbolic computation), Luciano Serafini (a differentiable logic), and many others.

And yet, my intuition at least is that most people who have heard about Deep Learning, haven't heard about that work.


This comes across a bit like the Tanenbaum–Torvalds monolithic/microkernel debate in the 90s. Tanenbaum in this case is Marcus and Torvalds is LeCun. There are so many well-established benchmarks out there. Just build whatever this symbolic system is that you're advocating and beat the benchmarks. Or maybe create your own benchmarks and then we can also apply Deep Learning to those problems too and see who wins? Until then, no one really has to pay any serious attention to these clamourings.


That's what the work I mentioned above does- it beats benchmarks and shows impressive results. The work by Evans and Grefenstette in particular is all out of chewing gum [1].

Somewhat depressingly, your comment makes me guess that you really think that none of that work has actually been done, that it's somehow all theoretical or speculative, or just a lot of twitter talk by Gary Marcus. That couldn't be farther from the truth.

It's really the case that many people have no idea about AI beyond a quick article in the popular press here and there, and despite this, they have already formed very strong opinions about it.

_________________

[1] Here:

https://deepmind.com/blog/learning-explanatory-rules-noisy-d...

Note that the above overview of their paper is published on DeepMind's website (the two are DeepMind employees). You can rest assured that DeepMind in particular, would never support the publication of a paper that doesn't demonstrate impressive results.


Isn’t the kind of programming that most of us do every single day just symbol manipulation of the kind he describes?

On this site, I click reply, the web server interprets my intention and presents me with a text box, etc.

There’s a level of intelligence in pretty much any computer program. I think we take it for granted now because it’s so ingrained in our day-to-day lives, but it’s remarkable, isn’t it?

Just look at how many decisions take place entirely within the ‘minds’ of computers when you try and buy a product on amazon and have it shipped to you.

Everything from the pricing of the goods, to approving your credit worthiness to the logistics of getting it to your house is decided on by computers.

All of that would have to have been done by human beings not very long ago.

If anything, symbol manipulation has become such a dominant paradigm for artificial intelligence that we don’t actually recognize any of those activities as requiring intelligence any more — which is always the case every time we mechanize thought.

The question is really whether either of those paradigms gets us to a “general” ai (whatever that is), or both, or neither.


Gary isn't a technologist, but he wants to set the stage with critique so that, when/if a fusion of the ideas does better, he can claim to have been important to that - regardless of the fact that the idea is obvious and the implementation is what is hard.


This is an article in a popular magazine, and there is no “problem” nor any “whining”. Except here in the comments, oddly!


As stated in the article, Marcus' fundamental problem is:

1) "the notion that deep learning is without demonstrable limits and might, all by itself, get us to general intelligence, if we just give it a little more time and a little more data, as captured in a 2016 suggestion by Andrew Ng"

and

2) "Leaders in AI like LeCun acknowledge that there must be some limits, in some vague way, but rarely (and this is why Bengio’s new report was so noteworthy) do they pinpoint what those limits are, beyond to acknowledge its data-hungry nature."


Robot brains need symbols because the rhetoric around deep learning is getting out of control. In the video analysis linked below of the latest AlphaZero vs Stockfish paper, AlphaZero is described as "DeepMind's general-purpose artificial intelligence system".

Let's give credit where it's due. AlphaZero is an impressive algorithm with results in 3 different types of perfect-information games: chess, go, shogi. But to describe it as "general purpose AI" is simply absurd. Yes, this isn't an official DeepMind video, but this is the kind of rhetoric they are putting out there.

I'm open to the possibility that deep learning might one day solve some of its core problems (like the elementary schoolbus-snowplow errors described in the source article) and turn into a general purpose AI. But we aren't there yet, and we're not even close.

https://www.youtube.com/watch?time_continue=13&v=2-wFUdvKTVQ


I think there's some ambiguity there regarding what is general-purpose: if you want to be charitable, you might interpret the description as saying it's a general-purpose system rather than a general-purpose AI.

Who knows what they really meant though...


My point is, calling a system that learned how to play 3 types of board games "general purpose" in any sense is a bit of a stretch. At best this is a "general purpose games-playing" system.


No, they don't.

Now, feel free to _show_ me that they do, and I will gladly accept that I was wrong. But this argument that, because of some very theoretical view on the problem, the current engineering solutions should be abandoned, without actually providing good engineering alternatives, is weird.

I have a bit of a "not even wrong" feeling about the symbolist side of the debate. Deep neural networks have serious flaws, but those flaws can at least be pointed out because the networks are already solving engineering problems. It is easy to say a field should move in your direction, but it will probably only do so when provided with tangible evidence. AI has moved into the realm of application, and that means the scientific bar for purely theoretical ideas has been raised compared to 20 years ago. And right now, the tangible evidence for symbolism is simply not cutting it.


> And right now, the tangible evidence for symbolism is simply not cutting it.

In AI/ML perhaps. In the biological and psychological sciences the importance of symbols to cognitive functioning is pretty well established.

Just look at how important the mastery of language is for complicated thought. One of the main differences between linguistic thought and other forms of thinking is all about symbol manipulation.

(This says nothing about the quality of this article of course)


It’s also pretty well known in math and CS/SE communities.

Heck, the mathematical research into the power of language, and in particular the ability to embed models of reasoning into equations to allow you to formalize your meta-reasoning, is what led to the birth of electronic computers.

So I find it shallow skepticism to suggest there’s not evidence that it would be useful to AI, because literally the most cursory glance would show mounds of it.

The real challenge is and always has been how to combine or embed symbolic reasoning capabilities efficiently within a fuzzier approximator, allowing for “getting close” in a fuzzy way then fine tuning it using more advanced rules.

That mimics more closely how we think, where we broadly intuit a few potential answers, then use higher order thinking to sift through them.
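
One possible shape of such a hybrid (my own sketch, not any published system): a fuzzy scorer proposes a shortlist, and a symbolic rule filter vets it.

    def fuzzy_propose(candidates, score):
        """Stand-in for a learned model: rank candidates by an approximate score."""
        return sorted(candidates, key=score, reverse=True)[:3]

    def symbolic_filter(candidates, rules):
        """Keep only candidates that satisfy every hard constraint."""
        return [c for c in candidates if all(rule(c) for rule in rules)]

    moves = list(range(10))
    shortlist = fuzzy_propose(moves, score=lambda m: -abs(m - 6.3))   # "intuition"
    legal = symbolic_filter(shortlist, rules=[lambda m: m % 2 == 0])  # "theory"
    print(shortlist, legal)   # [6, 7, 5] -> [6]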


> The real challenge is and always has been how to combine or embed symbolic reasoning capabilities efficiently within a fuzzier approximator, allowing for “getting close” in a fuzzy way then fine tuning it using more advanced rules.

Well, maybe the problem is that we spent so much time basing models on the visual part of the brain that we overlooked the rest:

https://www.quantamagazine.org/new-ai-strategy-mimics-how-br...


Getting tasks done has never been a ringing endorsement for the general applicability of the underlying algorithm.

Meanwhile, the entirety of human communication is built on symbols. If computers can’t deal with them, how useful can computers ultimately be? Roombas can only improve so much.


> But this argument that because of some very theoretical view on the problem the current engineering solutions should be abandoned

That's not the author's argument at all. Even the subtitle of the article says "We’ll need both deep learning and symbol manipulation to build AI.".


A compromise between the two would be to identify powerful abstractions over nodes and nets that are equivalent to symbol manipulation circuits. Self-assembly and self-modification are certainly not strange to symbolic logic, but this doesn't work so well for most programmers. So we need more geniuses, and they need to tailor neural nets to help them build symbolic logic (if only as intermediate debugging output).


I think they need symbols but they just need to be taught symbols.

If you kept a child stationary and showed it just static photos one at a time, you wouldn't be surprised that it can't tell a bus lying on its side in the snow from a snow plow.

What neural networks need is the same thing little children need: motion, rotation, things stacking, hitting, and occluding each other... and possibly even interaction.

We haven't taught NNs to see properly yet, and the guy wants to hardwire math into them because he thinks that's the only option. Humans learn symbols late and only with deliberate effort.


On the other hand humans have built-in special processing stages already. One that comes to mind is that the neurons in your eyes in effect run a sharpening filter on the incoming light before the data ever enters your brain. I don't think it's a stretch to propose that we should implement certain features as special-purpose processing stages that the neural network can use, with the goal of having a more efficient system.

The holy grail would be a neural network that has programming capabilities and can write bare metal programs for itself.
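
Something like this, as a sketch (a made-up "retina" stage of my own, not a model of real retinal circuitry): a fixed, hand-wired centre-surround sharpening filter applied before any learned layers see the image.

    import numpy as np
    from scipy.signal import convolve2d

    centre_surround = np.array([[ 0, -1,  0],
                                [-1,  5, -1],
                                [ 0, -1,  0]], dtype=float)

    def retina(image):
        """Fixed preprocessing stage: sharpen, then clip to the valid pixel range."""
        return np.clip(convolve2d(image, centre_surround, mode="same"), 0.0, 1.0)

    image = np.random.default_rng(0).random((8, 8))
    preprocessed = retina(image)   # this, not the raw image, is fed to the network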


> On the other hand humans have built-in special processing stages already.

Yes. But there have been cases where a human could perform a function elsewhere in the brain, one that was usually done in a specialized site, when that site got damaged.


If you're willing to accept a (possibly) inexplicable intelligence that cannot explain its decisions or reasoning processes, then you'd be fine w/o symbolism. In fact there may be an example for you (perhaps in your pocket):

The next time you have a decision to make, phrase it as a Heads/Tails question and flip a coin. Over its time of use, the coin will be right until it is wrong, putting you in exactly the same situation as you would be with an inexplicable intelligence.

So you see, intelligence is not that far away! But explicable predictive intelligence requires reasoned prediction and can answer "Why that prediction?" when asked and can learn if a prediction fails. The coin cannot but will be adequate, at least until it fails sufficiently to induce you to abandon it.


Or to frame your argument in terms of science rather than engineering, it's a falsifiable hypothesis. The author should go try to test it and then let us know if it works.


I'd say the bus picture classification is a pretty telling example that something's missing in the amount of knowledge deep learning is able to extract from training data sets.

Anybody who understands what a school bus is wouldn't mistake it for a snow plow in the other pictures.


This reads like a 19th-century engineer arguing we need articulated wings for heavier-than-air flight because our current methods don't work and birds have them :)


Symbolic AI is machine programming. Connectionist AI is machine learning.

Machine programming simply does not scale. It is also not biologically plausible: it is not as if God put symbols into our brains; these formed from, are preceded by, and were learned from neural activity.

Just try to solve the spam problem using symbolic AI. It will keep you busy (and paid) for a long time, while yielding subpar results.

Deep learning is the most promising direction AI has gone in for a long time. Finding flaws in DL does not point back to a programmer crafting handwritten rules to correct it. It merely points to more DL research being needed to have machines correct this themselves. Preferably differentiably.


polkapolka says>"Machine programming simply does not scale....Just try to solve the spam problem using symbolic AI. It will keep you busy (and paid) for a long time, while yielding subpar results."

1. We can scale "machine programming" by making computers faster and more complex,

2. The spam problem can be solved with Bayesian methods (which I consider to be part of "machine programming"): a connectionist solution is not necessary.
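
For anyone unfamiliar, the kind of Bayesian method point 2 refers to looks roughly like this (a minimal naive-Bayes sketch with toy data I made up, not a production filter):

    from collections import Counter
    import math

    spam = ["dear friend send money", "win money now"]
    ham  = ["meeting at noon", "send the report at noon"]

    def word_counts(docs):
        return Counter(w for d in docs for w in d.split())

    spam_c, ham_c = word_counts(spam), word_counts(ham)
    vocab = set(spam_c) | set(ham_c)

    def log_prob(msg, counts, prior):
        total = sum(counts.values())
        return math.log(prior) + sum(
            math.log((counts[w] + 1) / (total + len(vocab)))   # Laplace smoothing
            for w in msg.split() if w in vocab)

    def is_spam(msg):
        return log_prob(msg, spam_c, 0.5) > log_prob(msg, ham_c, 0.5)

    print(is_spam("send money now"))   # True
    print(is_spam("report at noon"))   # False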

Certainly connectionist deep learning is exciting and should be investigated as far as possible but it is only one of many tools. We should avoid being the man with a hammer who thinks that every problem looks like a nail.


Faster and more complex computers do not make you faster at manual programming. Computer vision had this before the DL boom: engineers painfully crafting feature extractors. It went nowhere.

Bayesian models underperform DL by a wide margin (though they are a step up from handwritten rules: if DEAR FRIEND then spam score++).


polkapolka says >"Faster and more complex computers does not make you faster at manual programming. Computer vision had this before the DL boom: engineers painfully crafting feature extractors. It went nowhere."

Faster and more complex computers make manual programming faster and make software faster, including DL software. Without the faster computers of today we wouldn't be using or even discussing DL.


Why do you think human-style learning can scale? Human brains are terrible at basic arithmetic like raising 3 to the 6th power.


Statistical learning with connectionist architectures is driving current AI at scale.

To me, this paradigm is also the most promising: learn from data bottom-up, not from experts top-down.



Actually, deep learning models do use symbols.

We call these symbols "embeddings" or "representations" in a vector or tensor space... but these symbols may not have any known or readily interpretable human meaning. People who are uneducated in basic vector (and tensor) math may have difficulty understanding how this is possible, but it's a fact.

For example, the ELMo embedding of a particular word in a sentence is a symbol that represents the complex syntax, semantics, and polysemy of that word given its location and use in the text.[a] Similarly, a trained WaveNet model represents a sequence of text as a (fairly large) tensor embedding that encodes the information necessary for generating the corresponding speech; this embedding is a symbol too.[b] As a final example, the next-to-last layer's embedding of an image processed through a trained Detectron model contains representations of every identified object in the image; those representations are symbols too.[c]

Moreover, the kind of automatic differentiation that is necessary for training deep learning models with backprop reduces, viewed through the right lens, to manipulation of symbols in a tree. Witness, for example, the recent work done by Mike Innes et al. for the Julia language.[d] We could very well be in the early stages of adoption of "deep differentiable programming" for a growing number of cognitive tasks that might not be solvable via human manipulation of symbols.
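
A toy illustration of that last point (my own sketch, not the Julia work cited): taking a derivative is a rewrite of an expression tree, which is pure symbol manipulation.

    def diff(expr, var):
        """expr is a nested tuple: ("const", c), ("var", name), ("+", a, b) or ("*", a, b)."""
        op = expr[0]
        if op == "const":
            return ("const", 0)
        if op == "var":
            return ("const", 1 if expr[1] == var else 0)
        if op == "+":
            return ("+", diff(expr[1], var), diff(expr[2], var))
        if op == "*":                                   # product rule as a tree rewrite
            a, b = expr[1], expr[2]
            return ("+", ("*", diff(a, var), b), ("*", a, diff(b, var)))
        raise ValueError(f"unknown operator {op!r}")

    # d/dx of x*x + 3  ->  (1*x + x*1) + 0, i.e. 2x before simplification
    print(diff(("+", ("*", ("var", "x"), ("var", "x")), ("const", 3)), "x"))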

The challenge for deep learning going forward, as Bengio says and Marcus quotes, is in figuring out how to "extend it to do things like reasoning, learning, causality, and exploring the world."[e] As far as I can tell, everyone in the field already agrees on these goals; many researchers are thinking about and wondering how it might be possible to pursue them.

Does Marcus understand this? His complaining about the "lack of symbols" is... misguided, I think.

[a] https://allennlp.org/elmo

[b] https://deepmind.com/blog/wavenet-generative-model-raw-audio...

[c] https://research.fb.com/downloads/detectron/

[d] https://julialang.org/blog/2018/12/ml-language-compiler

[e] https://twitter.com/GaryMarcus/status/1065280340669816832


A symbol is just a frozen weight in a neural network.



