
Turing Machines Are Recurrent Neural Networks (1996) - jaybosamiya
http://lipas.uwasa.fi/stes/step96/step96/hyotyniemi1/
======
csantini
TLDR: Given any Turing Machine you can build a Recurrent Neural Network that
does the same thing.

You can then apply known facts about Turing machines, e.g. the Halting Problem,
so you can't generally predict whether (or when) the neural network will reach
its final state.
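
Concretely, the simulation loop would look something like this minimal sketch
(the names W, state and halt and the hard-threshold step are my own
assumptions, not the paper's construction):

    import numpy as np

    # Hypothetical sketch: W encodes the program as connection weights, `state`
    # holds one value per node, and `halt` is the index of the final-state node.
    # Whether this loop ever exits is exactly the halting problem for the
    # encoded program, so it is undecidable in general.
    def run_until_halt(W, state, halt):
        step = lambda x: (x > 0.5).astype(float)   # hard-threshold activation
        while state[halt] < 0.5:                   # final-state node not yet on
            state = step(W @ state)                # one synchronous update
        return state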

~~~
seiji
The simplest definition of Turing completeness is: a system capable of
manipulating symbols with no restrictions on recursion depth.

Basically, if you can write "while (true) { doStuff(); }" (or an equivalent
TCO version), your system is Turing complete.

So, all the "deep learning" fads aren't even Turing complete (if we don't
count RNNs as "deep learning"), because they are feedforward/backprop networks
of explicitly limited depth.

Humans are Turing complete, but only for limited time spans. Humans get tied
up in a version of the halting problem called "this is dumb, let's do something
else." It's difficult for humans to execute "find the highest prime number"
without other needs derailing the process over time (boredom, bathroom breaks,
biological death).

~~~
pygy_
_> "while (true) { doStuff(); }" [...] Humans are turing complete [...]_

The example I like to give when explaining algorithms and Turing completeness
to laypeople is "Stir until you get a smooth mix", and, by extension, cooking
recipes.

~~~
hobs
After thinking about that for a second, that is my new favorite analogy. Way
better than anything I have ever come up with to explain the same thing.

------
kazinator
The paper seems to show that recurrent networks are Turing Machines, not the
other way around.

They are finite state machines, unless the variables stored at the nodes are
of unlimited size (so that a pushdown stack or whatever of unlimited depth
could be represented in a variable). That goes against the "spirit", if you
will, of neural networks, which maintain some small amount of state at each
node.

All finite state machines are Turing Machines, of course. Just the reverse
isn't true.

I don't see the point of the Example, which clearly shows a representation of
a computation in the form of a finite sized matrix with bounded values,
iterating on a vector. Such a contraption is not a Universal Turing Machine no
matter what values you stuff into it and what initial vector you use.
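
Concretely: if each node only ever holds one of finitely many values, then
iterating any fixed update rule must revisit a state by the pigeonhole
principle, so the trajectory just falls into a cycle. Toy illustration (the
update rule is made up, not taken from the paper):

    # Three nodes quantized to 8 levels each: at most 8**3 = 512 distinct
    # states, so the trajectory is eventually periodic, i.e. finite-state.
    def find_cycle(step, state):
        seen = {}
        t = 0
        while state not in seen:       # states are tuples, hence hashable
            seen[state] = t
            state = step(state)
            t += 1
        return seen[state], t          # (time first seen, time it recurred)

    step = lambda s: tuple((2 * a + b) % 8 for a, b in zip(s, s[1:] + s[:1]))
    print(find_cycle(step, (1, 2, 3)))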

Ironically, the paper was _written on_ a computer built on sequential logic,
which is a network of combinational gates with feedback: essentially a
recurrent neural network. And the paper's purpose is evidently to argue that
such a thing can compute.

Ah, we have this flight of fancy toward the end:

 _The variables can be continuous, not only integer valued._

Come on, there is no such thing. Continuous variables are not _constructible_
(only certain specific values of them are). They are only an abstraction in
math. A continuous variable effectively consists of infinitely many bits. Even
values that are called computable, like the number pi, still cannot be
computed to completion. What "computable" means is that there is an algorithm
which makes progress toward a more and more accurate value: for any specific
integer n, the function "find the n-th digit of pi" completes in a finite
number of steps. That doesn't mean we can have a continuously valued variable
and stick pi in it.
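
For example, here is pi to a requested number of digits using Machin's
formula with plain unbounded integers; it terminates for any fixed n, which is
all that "computable" promises (an illustration of that point only, nothing
from the paper):

    # pi/4 = 4*arctan(1/5) - arctan(1/239), evaluated with scaled integers.
    def arctan_inv(x, scale):
        total = term = scale // x
        divisor, sign = 3, -1
        while term:
            term //= x * x
            total += sign * (term // divisor)
            divisor, sign = divisor + 2, -sign
        return total

    def pi_digits(n):
        scale = 10 ** (n + 10)                    # 10 guard digits
        pi = 4 * (4 * arctan_inv(5, scale) - arctan_inv(239, scale))
        s = str(pi // 10 ** 10)                   # drop the guard digits
        return s[0] + "." + s[1:]

    print(pi_digits(30))   # 3.141592653589793238462643383279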

Moreover, in this fantasy world where we have true continuous variables, why
can't we also have them in the original language whose elements are equated
with the neural network? After all, _"for each variable V in the program,
there exists a node Nv"_. If node Nv is continuous, so must be the variable V
in the program.

What the paper really shows is that the syntax of some abstract algorithmic,
imperative steps can be transliterated into the syntax of a network with
feedback which does the same thing.
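
As a toy version of that transliteration (my own sketch, not the paper's exact
encoding): the loop "while (a > 0) { a--; b++; }" becomes a fixed, gated
update applied once per synchronous step of the network state:

    # Toy transliteration: the guard "a > 0" becomes a threshold node g, and
    # the loop body becomes constant increments gated by g.
    def step(a, b):
        g = 1 if a > 0 else 0
        return a - g, b + g

    state = (5, 0)
    for _ in range(10):
        state = step(*state)
    print(state)   # (0, 5): a has drained into b, after which the state is fixed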

------
j2kun
> An interesting question arises, for example, whether the class of NP-
> complete problems could be attacked more efficiently in the network
> environment!

It is well known that training even a two-layer feed-forward neural network is
NP-hard, so I doubt that this is a reasonable direction. This was even known
in 1993. [1]

[1]:
[https://people.csail.mit.edu/rivest/pubs/BR93.pdf](https://people.csail.mit.edu/rivest/pubs/BR93.pdf)

