
How to Code and Understand DeepMind's Neural Stack Machine in Python - williamtrask
https://iamtrask.github.io/2016/02/25/deepminds-neural-stack-machine/?i=4
======
syllogism
One thing you don't mention in your paper reading method is a stopping
criterion. I'm opening and doing some level of "looking at" --- not
necessarily reading! --- probably 10-20 papers a week. More if you count
revisits to papers that I've "looked at" before.

The question is, how do you decide which papers to promote to further
attention? It's always a ranking problem, because there's always other work
you could be looking at. You start your reading method at the point at which
you've already decided to read the paper. But how do you get to that point?

(I can describe how I decide, but it's not exactly easy to replicate. It's a
pretty insidery perspective. Here's how it played out for this paper.

I looked at this paper soon after it was published because I follow the first
author on Twitter. When I opened the paper I looked at who was on it. I know
Phil, and I know the sort of work Ed has been doing. I glanced over the paper
and looked at the experiments, and saw that it was similar to the Neural
Turing Machine and algorithm-learning line of work, which is still being
evaluated on toy problems. This work is interesting but I'm not doing research
on these things, so I won't commit to learning it while it's moved <1 mountain
empirically. I did read the idea enough to contrast it with Chris Dyer's stack
LSTM parser, which is a model that performs very well empirically. I checked
the related work for criticism/comment on that work, and the paper said it's
inspired by it. Cool. I'll watch for future empirical evaluations.

In total I probably spent about 10-15 minutes looking at this paper. This is
enough for me to remember I'd seen the work, and help me understand future
related ideas a little bit better.)

~~~
williamtrask
Totally agree with you on all of this. I think an update is in order to
include it.

------
kidgorgeous
Great post! Thank you for taking time out of your busy life to write this. I
think it's small gestures of kindness like this that advance the field of
artificial intelligence, just as much as the large discoveries. Bookmarked!

~~~
williamtrask
Thank you so much! Very kind of you to say that.

------
rkwasny
This is awesome. It also touches on a real problem: a scientific paper about
an algorithm, without the accompanying code that implements it, is hard to
use/reproduce.

------
Houshalter
This website crashed my browser, and I lost a few tabs I had been saving and
am unable to recover. It does this even with JavaScript off.

Stack machines are really cool. Are they computationally efficient, though?
As the stacks grow bigger, the number of possible stacks to keep track of
grows exponentially, doesn't it? Or do they only keep track of some of them?

~~~
williamtrask
Hey Houshalter! Thank you so much for responding... I think it's the trinket
demos... and I think some of them autorun. I'll look into it.

------
fizixer
I've heard of Neural Turing Machines, and now this is a Neural Stack Machine.
How are they the same/different?

~~~
williamtrask
Neural Stack Machines have a slightly more limited memory. Whereas in theory
Neural Turing Machines can learn anything, Neural Stack Machines focus on
algorithms that are conducive to stacks. Phil Blunsom addresses this a bit in
his recent talk in Russia.
[https://www.youtube.com/watch?v=-WPP9f1P-Xc](https://www.youtube.com/watch?v=-WPP9f1P-Xc)

~~~
williamtrask
actually it might have been his DLSS talk... not 100% sure the talks are very
similar...
[http://videolectures.net/deeplearning2015_montreal/](http://videolectures.net/deeplearning2015_montreal/)
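
If it helps make the difference concrete: where the NTM reads and writes an
addressable memory matrix, the neural stack just keeps one list of vectors,
each with a real-valued "strength", so push and pop stay differentiable.
Here's a rough numpy sketch of that mechanism as I read it from the
Grefenstette et al. paper (my paraphrase and my own names, not the post's
trinket code):

    import numpy as np

    class SoftStack:
        """Continuous stack (after Grefenstette et al. 2015, "Learning to
        Transduce with Unbounded Memory"): one list of vectors, each with
        a real-valued strength, so push and pop stay differentiable."""

        def __init__(self):
            self.V = []  # value vectors, one appended per timestep
            self.s = []  # scalar strength for each stored vector

        def step(self, v, d, u):
            """v: vector to push, d: push strength in [0,1],
            u: pop strength in [0,1]. Returns the soft read of the top."""
            # Pop: consume strength u working down from the top.
            self.s = [max(0.0, s_i - max(0.0, u - sum(self.s[i + 1:])))
                      for i, s_i in enumerate(self.s)]
            # Push: append the new vector with strength d.
            self.V.append(v)
            self.s.append(d)
            # Read: blend vectors from the top down until weight 1 is used.
            r = np.zeros_like(v, dtype=float)
            for i, v_i in enumerate(self.V):
                r += min(self.s[i], max(0.0, 1.0 - sum(self.s[i + 1:]))) * v_i
            return r

    st = SoftStack()
    st.step(np.array([1.0, 0.0]), d=1.0, u=0.0)  # push a
    st.step(np.array([0.0, 1.0]), d=1.0, u=0.0)  # push b
    r = st.step(np.zeros(2), d=0.0, u=1.0)       # pop b; reads a again

With d and u at exactly 0 or 1 this behaves like an ordinary stack; with
fractional values everything stays smooth, which is what lets the whole
thing be trained end-to-end with backprop.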

------
vmorgulis
Very interesting idea.

We could do superoptimization with that. For example, a superoptimized sort.

------
jaruche
I wonder what the difference is between a neural stack and an LSTM? Aren't
both keeping a state?

~~~
williamtrask
They are similar in that way. Typically an LSTM is used to control the neural
stack... so the stack sort of "sits on top of" the LSTM's memory... allowing
it to be (in theory) infinite.
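
To sketch what "control" means here (a toy of my own, with a plain tanh RNN
standing in for the LSTM and made-up weight names; the real controller also
sees the task input each step, which I'm leaving out for brevity): each step,
the previous stack read feeds back into the controller, and the controller
emits a push strength d, a pop strength u, and a vector v to push.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    H, M = 8, 3  # toy sizes: controller state H, stack value size M
    rng = np.random.default_rng(0)
    Wh = rng.normal(scale=0.1, size=(H, H + M))  # recurrence + stack read
    Wd = rng.normal(scale=0.1, size=H)           # -> push strength d
    Wu = rng.normal(scale=0.1, size=H)           # -> pop strength u
    Wv = rng.normal(scale=0.1, size=(M, H))      # -> vector v to push

    def controller_step(r_prev, h_prev):
        """The previous stack read feeds the controller; the controller
        emits the signals (d, u, v) that drive the stack this step."""
        h = np.tanh(Wh @ np.concatenate([h_prev, r_prev]))
        d = sigmoid(Wd @ h)  # push strength in (0, 1)
        u = sigmoid(Wu @ h)  # pop strength in (0, 1)
        v = np.tanh(Wv @ h)  # vector to push
        return h, d, u, v

    h, r = np.zeros(H), np.zeros(M)
    h, d, u, v = controller_step(r, h)
    # stack.step(v, d, u) would then produce the next read to feed back in.

So the controller's own state stays fixed-size, while the stack it drives
can grow without bound.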

------
m00dy
Nice post. Need to read it again, though.

~~~
williamtrask
Thank you!

