
Differentiable Neural Computers - tonybeltramelli
https://deepmind.com/blog/differentiable-neural-computers/
======
rkaplan
This paper builds on DeepMind's previous work on differentiable
computation: Neural Turing Machines. That paper generated a lot of enthusiasm
when it came out in 2014, but not many researchers use NTMs today.

The feeling among researchers I've spoken to is not that NTMs aren't useful;
it's that DeepMind is simply operating on another level. Other researchers
don't understand the intuitions behind the architecture well enough to make
progress with it. But it seems like DeepMind, and specifically Alex Graves
(first author on NTMs and now this), can.

~~~
modeless
The reason other researchers haven't jumped on NTMs may be that, unlike
commonly-researched types of neural nets such as CNNs or RNNs, NTMs are not
currently the best way to solve any real-world problem. The problems they have
solved so far are relatively trivial, and NTMs are very inefficient,
inaccurate, and complex relative to traditional CS methods (e.g. Dijkstra's
algorithm coded in C).

That's not to say that NTMs are bad or uninteresting! They are super cool and
I think have huge potential in natural language understanding, reasoning, and
planning. However, I do think that DeepMind will have to prove that they can
be used to solve some non-trivial task, one that can't be solved much more
efficiently with traditional CS methods, before people will join in their
research.

Also, I think there's a possibility that solving non-trivial problems with
NTMs may require more computing power than Moore's law has given us so far. In
the same way that NNs didn't really take off until GPU implementations became
available, we may have to wait for the next big hardware breakthrough for NTMs
to come into their own.

~~~
svantana
They sure put a lot of focus on "toy" problems such as sorting and path
planning in their papers - perhaps because they are easy to understand and
show a major improvement over other ML approaches. IMHO they should focus more
on "real" problems - e.g. in Table 1 of this paper it seems to be state of the
art on the bAbI tasks, which is amazing.

~~~
TeeWEE
Once you have a learning machine that can solve simple problems, you can scale
it up to solve very complex problems. It's a first step to true AI, imho. A
lot of small steps are needed to reach this goal. Integrating memory and
neural nets is a big step, imho.

~~~
chriswarbo
> Once you have a learning machine that can solve simple problems, you can
> scale it up to solve very complex problems.

Nope. It's _really_ easy to solve simple problems; it can sometimes even be
done by brute-force.

That's what caused the initial optimism around AI, e.g. the 1950s notion that
it would be an interesting summer project for a grad student.

Insights into computational complexity during the 1960s showed that scaling is
actually _the_ difficult part. After all, if brute-force were scalable then
there'd be no reason to write any other software (even if a more efficient
program were required, the brute-forcer could write it for us).

That's why the rapid progress on simple problems, e.g. with ELIZA, SHRDLU, the
General Problem Solver, etc., hasn't been sustained, and why we can't just run
those systems on a modern cluster and expect them to tackle realistic
problems.

------
IanCal
Does anyone have a ReadCube link or similar for the paper?

[http://www.nature.com/nature/journal/vaop/ncurrent/full/natu...](http://www.nature.com/nature/journal/vaop/ncurrent/full/nature20101.html)

~~~
ericjang
here you go [http://rdcu.be/kXdj](http://rdcu.be/kXdj)

~~~
pmatev
This is great, thanks. Anyone know how those charts/graphs were generated?

~~~
hyperbovine
By hand, I imagine. There is an author credited specifically with the graphics
and nothing else.

------
dharma1
Waiting for Schmidhuber to pipe up that he wrote about something similar in
'93 and that Alex Graves was his student anyway.

~~~
singham
Exactly. I saw his webpage and was overawed until I read about him on reddit.
That guy is full of himself.

~~~
dharma1
He has done a lot of pioneering work, to be honest. I recommend seeing him
talk (or watching a video); I think his humour comes across better that way.

~~~
chriswarbo
It's interesting to think that Schmidhuber's actually applying machine
learning methods to the field of machine learning, e.g. see the opening of
[http://people.idsia.ch/~juergen/deep-learning-conspiracy.htm...](http://people.idsia.ch/~juergen/deep-learning-conspiracy.html)

If AGI is the goal and machine learning research is the search algorithm, then
Schmidhuber's attempting to perform backpropagation by pushing rewards back
along the connections :)

~~~
Senji
We should use backpropagation with government.

------
tvural
The idea of using neural networks to do what humans can already write code to
do seems a bit wrong-headed. Why would you take a system that's
human-readable, fast, and easy to edit, and make it slow, opaque, and very
hard to edit? The big wins for ML have all been things that people couldn't
write code to do, like image recognition.

~~~
JeffreyKaine
I think they just want to teach the system to crawl before it can walk, run,
and eventually fly. Doing something that would be easy for a human to code
makes it easy for a human to see what's going on and to help train the system
to think like a human.

------
orthoganol
It appears they are touting 'memory' as the key new feature, but at least in
the deep learning NLP world there already exist models with 'memory', like
LSTMs or RNNs with dynamic memory or 'attention'. I can't imagine this model
is too radically different from the others.

Maybe I just feel a bit uneasy with a claim such as:

> We hope DNCs provide a new metaphor for cognitive science and neuroscience.

~~~
modeless
The "memory" in a typical RNN is akin to a human's short term working memory.
It only holds a few things and forgets old things quickly as new things come
in. This new memory can hold a large number of things and stores them for an
unlimited amount of time, more like a human's long term memory or a computer's
RAM. It's a big difference, and the implementation is completely different
too.
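
For intuition, here's a minimal numpy sketch of the kind of content-based read
these external memories use. It's illustrative only: the names, shapes, and
beta value are mine, and the real DNC adds write heads, usage tracking, and
temporal links on top of this.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def content_read(memory, key, beta):
        # Cosine similarity between the query key and every memory slot.
        sims = memory @ key / (np.linalg.norm(memory, axis=1)
                               * np.linalg.norm(key) + 1e-8)
        # The sharpness beta turns similarities into a soft, fully
        # differentiable address distribution over all slots.
        weights = softmax(beta * sims)
        return weights @ memory  # read vector: weighted sum of slots

    memory = np.random.randn(128, 20)        # 128 slots, each 20 wide
    key = np.random.randn(20)
    r = content_read(memory, key, beta=5.0)  # read result, shape (20,)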

~~~
orthoganol
I was not referring to typical RNNs, but to LSTMs or RNNs with 'attention'.
They are designed to overcome vanishing/exploding gradient problems and to
hold memories over arbitrary lengths.

~~~
vintermann
They can technically be as long as you want them, but in practice there are
still severe constraints. LSTMs alleviate the gradient problems, but you still
get real trouble with long-term dependencies.

Alex Graves and some others in DeepMind have focused a lot in the past year or
so on developing practical differentiable data structures, so that the LSTM
can read and write to an external memory (and save its precious internal state
for more immediate needs) yet still be trainable via backpropagation.
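
Roughly, a differentiable write looks like the following sketch (NTM-style
erase and add vectors applied under a soft address weighting; the variable
names are mine, not DeepMind's). Because every step is smooth in its inputs,
gradients flow through writes as well as reads, which is what makes the
external memory trainable end to end.

    import numpy as np

    def memory_write(memory, weights, erase, add):
        # weights: (N,) soft address over slots, summing to 1
        # erase:   (W,) gate in [0, 1];  add: (W,) new content
        memory = memory * (1 - np.outer(weights, erase))  # partial erase
        return memory + np.outer(weights, add)            # blended add

    memory = np.zeros((128, 20))
    weights = np.full(128, 1.0 / 128)  # a maximally blurry address
    memory = memory_write(memory, weights,
                          erase=np.zeros(20), add=np.ones(20))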

------
tim333
I wonder how close these differentiable neural computers are functionally to
the cortical columns in the brain that "are often thought of as the basic
repeating functional units of the neocortex."
([https://en.wikipedia.org/wiki/Neocortex#Cortical_columns](https://en.wikipedia.org/wiki/Neocortex#Cortical_columns))

------
carapace
(What the hell with the thin grey sans-serif body text font? Seriously, do you
hate your readers' eyes that much?)

------
partycoder
I wonder if they will put this to use in their StarCraft bot.

~~~
saguppa
Yeah, games like StarCraft will probably need a working memory component. The
task that they solve here with RL is a simple puzzle game. It'll be
interesting to see if this works for Atari games or StarCraft.

------
outsideline
[https://en.wikipedia.org/wiki/Bio-inspired_computing](https://en.wikipedia.org/wiki/Bio-inspired_computing)

Present-day neuron models lack an incredible number of functional features
that are clearly present in the human brain.

NTMs = representing memory that is stored in neurons
[https://en.wikipedia.org/wiki/Neuronal_memory_allocation](https://en.wikipedia.org/wiki/Neuronal_memory_allocation)

Decoupled Neural Interfaces using Synthetic Gradients =
[https://en.wikipedia.org/wiki/Electrochemical_gradient](https://en.wikipedia.org/wiki/Electrochemical_gradient)

Differentiable Neural Computers = they won't specify what natural aspect of
the brain this derives from.

Pick an aspect of a neuron or the brain that isn't modeled, write a model...

 _Bleeding edge + Operating on another level_

The fact that someone is going out of their way to remove points from my posts
so that this doesn't see tomorrow's foot traffic, instead of replying and
critiquing me, just goes to show how truthful these statements are.

Anyone can create such models. No one has a monopoly or patent on how the
brain functions. Thus, expect many models and approaches, some better than
others.

You can down-vote all you want. The better model and architecture wins this
game. It would help the community if people were honest about what's going on
here, but people instead want to believe in magic and subscribe to the idea
that only a specific group of people are writing biologically inspired
software and are capable of authoring a model of what is clearly documented in
the human brain. Interesting that this is the reception.

~~~
gjm11
You're not getting downvoted for being mean about DeepMind, you're getting
downvoted for making overconfident pronouncements about things you don't
understand.

"Neural Turing machines" are not the same thing as neuronal memory allocation:
NTMs' memory is external and neuronal memory allocation is all about how
memory is stored _in neurons_ in the brain.

The "synthetic gradients" in that paper have _nothing_ to do with the
electrochemical gradients you mention other than the name.

No one is claiming that the DeepMind guys are "operating on another level"
_because they do bio-inspired things_. They are claiming that _because they
are getting more impressive results than anyone else_.

Now: are they really? If so, is that enough justification for such a
grand-sounding claim? I don't know. That would be an interesting discussion to
have.
But "Boooo, these people are just copying things present in the brain, there's
nothing impressive about that" is not, especially when the parallels between
the brain-things and the DeepMind-things are as feeble as in your examples.

~~~
outsideline
An overconfident pronouncement, by indicating that they are making
computational models of natural processes that no one can confidently state
are correct or the most efficient?

Making statements that allow people to see behind the curtains and maybe go
off and make their own competitive models... Yes, this is a disservice to the
advancement of A.I. and should be downvoted: removing the prestigious veil and
illusion from published works.

NTMs' memory is external in what sense? Please detail what this means in a
'functional' sense. It's biologically inspired. Neurons maintain memory beyond
synaptic weights. The neuron models of present-day A.I. were basic. Someone
comes along and sees the obvious: there is no computational model for how
neurons utilize memory, and suddenly they're thinking on another level? Give
me a break...

Synthetic gradients have everything to do with electrochemical gradients:
[http://www.nature.com/articles/srep14527](http://www.nature.com/articles/srep14527)
[http://www.pnas.org/content/110/30/12456.full.pdf](http://www.pnas.org/content/110/30/12456.full.pdf)
So where is your demonstration that I am incorrect? It is nowhere to be found.
Again, biologically inspired computational models.

Oh look, someone published a paper back in June that is an implementation of
Differentiable Neural Computers:
[https://arxiv.org/abs/1607.00036](https://arxiv.org/abs/1607.00036)

It's hype and that is a disservice to the community of people completing
similar work and taking similar approaches.

It would be an interesting discussion to have. That discussion was terminated
in favor of downvoting me.

The parallels only seem feeble to someone who isn't well informed on
neuroscience. Thus, you'd rather be wowed and believe in the fantasy that only
a small segment of people can write computational models of biology.

Continue believing the hype. Rarely will someone be truthful and honest about
where they got their ideas when hype follows. An interesting conversation
could have transpired. Enjoy the feels from the downvotes.

~~~
TeeWEE
Even if they did copy-paste from nature... even if they copied everything...

They are the first to have a machine learn to solve problems that require
memory. They are the first. These are the stepping stones to artificial
intelligence.

Note: the whole point of synthetic gradients is to train a network in
parallel. This allows Google to make computers learn to recognize things in
images even better, to recognize human speech even better, to make
self-driving cars even better...
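
To make that concrete, here is a toy numpy sketch of the synthetic-gradient
idea as I understand it: a small module predicts a layer's upstream gradient
so the layer can update without waiting for the rest of the backward pass. All
names and sizes are illustrative, not from the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(0, 0.1, (10, 10))  # the layer being decoupled
    M = np.zeros((10, 10))            # linear synthetic-gradient module

    x = rng.normal(size=10)
    h = np.tanh(W @ x)                # forward pass through the layer

    # 1. Update the layer immediately using the *predicted* gradient,
    #    without waiting for the rest of the network's backward pass.
    g_pred = M @ h
    W -= 0.01 * np.outer(g_pred * (1 - h**2), x)

    # 2. Later, when the true gradient dL/dh arrives from downstream,
    #    train the predictor to match it (a dummy value stands in here).
    g_true = rng.normal(size=10)
    M -= 0.01 * np.outer(g_pred - g_true, h)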

I don't know if they are copied from nature or not (it doesn't look like it).
The point is that they are improving mankind.

~~~
outsideline
> They are the first to have a machine learn to solve problems that require
> memory.

Incorrect. It was named a Neural (Turing) machine for a reason. Maybe people
should go back and dust off the papers from the '70s, like those who are
borrowing from that era and respectfully giving credit where credit is due.

They do great work and they are making great progress in artificial
intelligence. Many people are. Everything is a stepping stone. It does no good
to over-hype one person's stones over another's, or to ignore/downplay where
they were inspired from. Notable visionaries of a past time were visionaries
because they detailed the depths of their thinking and centered on the
hows/whys. It seems it is fashionable nowadays to do the exact opposite. This
is a disservice to learning and progress.

The whole point of the human brain is parallel processing. Extracellular
chemical gradients function the same way in the human brain and serve the same
purposes. Take a look at the papers I linked.

> I don't know if they are copied from nature or not (it doesn't look like
> it).

Extracellular chemical gradients. I linked to papers that explain how memory
is stored in them and shared across neurons. This is how it works in nature
and biology.

They named their approach 'Synthetic Gradients': an artificial form of the
biological gradient that is decoupled and lies outside of a neuron. They are
clearly giving credit to nature.

They and many other people are improving mankind. Many others could improve
mankind if there were less hype and more of a focus on where the ideas
originated.

That was my point.

The behavior of people regarding selective 'hype' is one of the big reasons
why a tremendous amount of deeply functional work that centers on hard
intuitions and ideas in this area will remain closed source when a real
breakthrough is made.

Enjoy the hype train I guess... They're operating on another level than anyone
else.

