
Surprisingly Turing-Complete - telekid
https://www.gwern.net/Turing-complete
======
zokier
> This matters because, if one is clever, it provides an escape hatch from
> system which is small, predictable, controllable, and secure, _to one which
> could do anything_. It’s hard enough to make a program do what it’s supposed
> to do without giving anyone in the world the ability to insert another
> program into your program, which can then interfere with or take over its
> host

I take issue with the statement that TC means that something could "do
anything". It means it can theoretically _compute_ anything, but Turing
completeness does not in itself give access to any additional resources; being
TC does not magically allow something to talk to the internet or write to disk or
spawn new processes, or possibly not even allocate new memory. Computers do so
much more than just compute; TC might be the first step towards being able to
"do anything" but it certainly is not the final step.

From a security point of view, what TC can (but does not necessarily!) do is
open the host to denial of service by excessive resource consumption or by
non-terminating programs. But as another comment noted, non-TC systems can also
consume impractically large amounts of resources even if their resource
consumption is not theoretically unbounded (as it is, afaik, with Turing
machines).
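
A minimal sketch of that last point, assuming a made-up macro language (not anything from the article): macros may only reference earlier macros, so expansion always terminates and the language is clearly not TC, yet the output size doubles with every definition. The names `expanded_length` and `doubling_chain` are hypothetical.

```python
def expanded_length(definitions):
    """Return the expanded length of the last macro without materialising it.

    `definitions` is a list of macro bodies. Each body is a list whose items
    are either literal strings or the integer index of an EARLIER macro.
    Backward-only references guarantee termination (no recursion, no loops).
    """
    lengths = []
    for index, body in enumerate(definitions):
        total = 0
        for part in body:
            if isinstance(part, int):
                if not 0 <= part < index:
                    raise ValueError("macros may only reference earlier macros")
                total += lengths[part]
            else:
                total += len(part)
        lengths.append(total)
    return lengths[-1]


def doubling_chain(depth):
    """Macro 0 is 'x'; every later macro is two copies of the previous one."""
    return [["x"]] + [[i, i] for i in range(depth)]


if __name__ == "__main__":
    # 31 short definitions describe an output of 2**30 characters.
    print(expanded_length(doubling_chain(30)))
```

The input grows linearly with `depth` while the expanded output grows exponentially, which is the same shape as the classic "billion laughs" entity-expansion problem.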

~~~
tdullien
Author of one of the cited papers here. The author of the post falls into a
common misconception of the weird machine literature (which led me to write my
paper): conflating TC (the ability to compute any function worth computing)
with the ability to transition the victim machine into states that should be
unreachable via paths that should not exist (“weird machine programming”). It
is a bit unfortunate that this misunderstanding is pervasive in early WM
papers :-/ - this ensures perpetuation of the misunderstanding.

~~~
lidHanteyk
Complexity theorist here. In addition to your point, there's another commonly-
overlooked problem: TC isn't quite the actual top! There are ways to make
problems that, even with an oracle for solving Halting, are still hard [0].

It seems like folks are very quick to confuse the expressiveness of a machine
with the expressiveness of analyzing programs for that machine; usually, a
program is far harder to analyze than to run.

I think that a better way to view the post's author's point is by
_unpredictability_. Given a short program in a weak setting, we can not only
predict what the program will do, but what the program _cannot_ do, usually
because it is too short or too simple. In a Turing-complete setting, though,
there are short programs with very unpredictable behavior.

[0]
[https://en.wikipedia.org/wiki/Turing_degree](https://en.wikipedia.org/wiki/Turing_degree)

~~~
fnrslvr
I don't think this discussion thread is giving the writer's concern enough
credit. Universality _on its own_ obviously doesn't allow the instance to take
over its host, but it _can_ enable the bulk of the malicious payload to be
encoded as legitimate instances of whatever P-hard optimization problem the
cloud service solves, so that it need not be injected directly via the actual
vulnerability that the malicious actor uses to take over the host.

> TC isn't quite the actual top! There are ways to make problems that, even
> with an oracle for solving Halting, are still hard [0].

I'm not sure why you're raising the subject of the degrees here. Malicious
software doesn't need to have hypercomputational powers to be a threat.

~~~
lidHanteyk
I think that, if you're going to care about being exploited by real-world
payloads, then Turing-completeness is a red herring; I agree with the thread-
starter. For example, it is already bad enough to not be able to tell when
computations are in P vs. NP, for responsiveness under load. It is not good
when an NP database query halts a P Web server. For this reason, languages
like Pola [0] which are far weaker than Turing-completeness are valuable.

And, if you thought that it was easy to be accidentally Turing-complete, wait
until you see how easy it is to be accidentally NP [1]. The typical database
query is in NP, because constraint satisfaction problems are in NP. So is the
typical optimization problem.

[0]
[https://www.researchgate.net/publication/266217730_Pola_a_la...](https://www.researchgate.net/publication/266217730_Pola_a_language_for_PTIME_programming)

[1] [https://en.wikipedia.org/wiki/List_of_NP-
complete_problems](https://en.wikipedia.org/wiki/List_of_NP-complete_problems)

------
tdullien
Author of one of the linked weird machine papers here. The use of “Turing
complete” in both the ROP and the weird machine literature is both incorrect
and misleading; I wrote some comments on this here:
[http://addxorrol.blogspot.com/2018/10/turing-completeness-
we...](http://addxorrol.blogspot.com/2018/10/turing-completeness-weird-
machines.html)

This does not detract from this post being a good, fun, and interesting read,
but for anyone that is puzzled why “Turing complete” should imply “insecure”:
It doesn’t.

------
cryptonector
> If that’s not enough, the SVG standard is large and occasionally horrifying:
> the (failed) SVG 1.2 standard tried to add to SVG images the ability to open
> raw network sockets.

!!!

!!!!!!!!!!!

From the SVG 1.2 draft:

> Note that these interfaces expose possible security concerns. The security
> model that these interfaces work under is defined by the user agent.
> However, there are a well-known set of common security guidelines used by
> the browser implementations in this area. For example, most do not allow
> access to hosts other than the host from which the document was retrieved.
>
> The next draft of SVG 1.2 will clearly list the minimum set of security
> features that an SVG user agent should put in place for these interfaces.

"Possible security concerns". No kidding. At least they were going to address
them in the next draft version... though probably not by removing the ability
to open sockets. Words fail me.

------
moreati
Python pickle files are a sequence of op-codes that run on the pickle VM. By
default the VM allows calls to arbitrary Python functions. I'm still puzzling
whether Python pickles _without access to Python globals_ (e.g. using
[https://docs.python.org/3/library/pickle.html#restricting-
gl...](https://docs.python.org/3/library/pickle.html#restricting-globals)) are
Turing complete. I don't _think_ so, because the pickle VM has no branching or
looping, but it does have a stack and my understanding of automata theory is
not great.

My research/tinkering so far is [https://github.com/moreati/pickle-
fuzz](https://github.com/moreati/pickle-fuzz)
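
For anyone who wants to poke at the "no globals" setting, here is a minimal sketch based on the `find_class` override described in the linked Python docs section. The names `NoGlobalsUnpickler` and `no_globals_loads` are made up for illustration. It only shows how to forbid every global lookup; it does not settle whether what remains is Turing complete.

```python
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global lookup (hypothetical helper)."""

    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global such as
        # os.system; raising here blocks the lookup entirely.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def no_globals_loads(data):
    """Like pickle.loads(), but with all global lookups rejected."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    # Plain containers, numbers and strings need no globals, so they load.
    print(no_globals_loads(pickle.dumps([1, 2, {"a": (3, 4)}])))

    # A function is pickled by reference to a global, so loading it fails.
    try:
        no_globals_loads(pickle.dumps(len))
    except pickle.UnpicklingError as exc:
        print("rejected:", exc)
```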

------
joe_the_user
_" Peano arithmetic: addition & multiplication on natural numbers is enough to
be TC;"_

My head swims when the situation is described with this level of vagueness. I
mean, sure the task of proving a theorem using the modern version of the Peano
postulates is undecidable and so I'd assume a map from theorems in the Peano
system to proofs of theorems would be Turing complete.

But a computation system based on calculating the values of simple arithmetic
expressions isn't Turing complete. An expression involving just adding and
multiplying constant integer values will terminate.
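
To make the termination claim concrete, here is a small sketch under my own assumptions: expressions are nested tuples like `("+", 2, ("*", 3, 4))`, and `evaluate` is a hypothetical name. Every recursive call is on a strictly smaller subexpression, so evaluation of a finite expression always finishes.

```python
def evaluate(expr):
    """Evaluate a closed expression built from integers, '+' and '*'.

    An expression is either an int or a tuple (op, left, right). There are
    no variables, loops or conditionals, so the recursion depth is bounded
    by the nesting depth of the expression.
    """
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    if op == "+":
        return evaluate(left) + evaluate(right)
    if op == "*":
        return evaluate(left) * evaluate(right)
    raise ValueError(f"unknown operator: {op!r}")


if __name__ == "__main__":
    print(evaluate(("+", 2, ("*", 3, 4))))
```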

~~~
tgv
Perhaps he means that you could somehow abuse the induction axiom, although it
seems to me that would be in a way that's not what the axiom was meant for.

------
Complexicate
I love this...

"...mov, which copies data between the CPU & RAM, can be used to implement a
transport-triggered-architecture one instruction set computer, allowing for
playing Doom..."

Click on "Doom" link and read:

"The mov-only DOOM renders approximately one frame every 7 hours, so playing
this version requires somewhat increased patience."

------
mappu
If TrueType hinting is Turing complete - are outputs observable from a Web
Font context? Is it possible to write a WASM polyfill based on TrueType
hinting?

From [https://docs.microsoft.com/en-
us/typography/opentype/spec/tt...](https://docs.microsoft.com/en-
us/typography/opentype/spec/tt_instructions) it looks to have 32-bit words, a
dynamic heap, unrestricted JMP targets, a generous number of math functions,
...

~~~
kristianp
The article mentions tt fonts as being based on the postscript language.

------
saagarjha
> “return-into-libc attacks”: software libraries provide pre-packaged
> functions, each of which is intended to do one useful thing; a fully TC
> ‘language’ can be cobbled out of just calls to these functions and nothing
> else, which enables evasion of security mechanisms since the attacker is not
> running any recognizable code of his own.

Note that ROP attacks in general tend to jump into the middle of functions
because they have partially-cobbled together call states. ROP "chains" join
together a couple of instructions followed by a return into something useful,
but with "return-into-libc" it's usually to just jump straight midway into
system and spawn a shell.

> Pokemon Yellow: “Pokemon Yellow Total Control Hack” outlines an exploit of a
> memory corruption attack which allows one to write arbitrary Game Boy
> assembler programs by repeated in-game walking and item purchasing. (There
> are similar feats which have been developed by speedrun aficionados, but I
> tend to ignore most of them as they are ‘impure’: for example, one can turn
> the SNES Super Mario World into an arbitrary game like Snake or Pong but you
> need the new programs loaded up into extra hardware, so in my opinion, it’s
> not really showing SMW to be unexpectedly TC and is different from the other
> examples.

I fail to see the difference; as far as I understood it, the Super Mario World
examples were done by just playing the game? (By the way, I hear that Ocarina
of Time has something like this now, too.)

> This matters because, if one is clever, it provides an escape hatch from
> system which is small, predictable, controllable, and secure, to one which
> could do anything. It turns out that given even a little control over input
> into something which transforms input to output, one can typically leverage
> that control into full-blown TC. This matters because, if one is clever, it
> provides an escape hatch from system which is small, predictable,
> controllable, and secure, to one which could do anything.

You can still prove sandboxing guarantees about executing Turing-complete
programs.

------
pjscott
Stuck in an appendix is a fascinating mini-essay, "How many computers are in
your computer?"

[https://www.gwern.net/Turing-complete#how-many-computers-
are...](https://www.gwern.net/Turing-complete#how-many-computers-are-in-your-
computer)

~~~
pjc50
Indeed. Everything is a heterogeneous cluster now. Don't forget the input
devices and outputs; one of my more interesting jobs was very tangentially
being involved with
[https://www.flatfrog.com/inglass](https://www.flatfrog.com/inglass) . Every
dot around that screen has an ARM and a small DSP.

------
arithma
This is definitely interesting, but there's always JavaScript in the browser.
It's Turing complete by design, and it can be, and is, sandboxed with a lot of
success. The quick conclusion that TC in itself is dangerous is not
warranted, but when it's not intended, it can have unexpected consequences
that might have some security, or other (stability) implications. That's what
I take away from the article.

------
waynecochran
Turing Complete is not a very high bar. Add a second stack to a pushdown
automaton and it’s Turing Complete. Add two counters to an NFA and it’s Turing
Complete. I don’t think folks know what this means.
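
As a rough illustration of how little machinery that is, here is a sketch of a two-counter (Minsky-style) machine: a finite program plus two unbounded counters with increment and decrement-or-branch-on-zero. The instruction encoding, the `run_counter_machine` name and the `max_steps` budget are my own assumptions, added so the demo cannot hang.

```python
def run_counter_machine(program, counters, max_steps=10_000):
    """Run a counter-machine program and return the final counter values.

    Instructions (tuples):
      ("inc", c, nxt)               add 1 to counter c, go to nxt
      ("dec", c, nxt_nonzero, nxt_zero)
                                    if counter c > 0: subtract 1, go to
                                    nxt_nonzero; otherwise go to nxt_zero
      ("halt",)                     stop
    """
    counters = list(counters)
    pc = 0
    for _ in range(max_steps):
        instr = program[pc]
        if instr[0] == "halt":
            return counters
        if instr[0] == "inc":
            _, c, nxt = instr
            counters[c] += 1
            pc = nxt
        elif instr[0] == "dec":
            _, c, nxt_nonzero, nxt_zero = instr
            if counters[c] > 0:
                counters[c] -= 1
                pc = nxt_nonzero
            else:
                pc = nxt_zero
        else:
            raise ValueError(f"unknown instruction: {instr!r}")
    raise RuntimeError("step budget exhausted")


# Addition: drain counter 1 into counter 0.
ADD = [
    ("dec", 1, 1, 2),  # 0: if c1 > 0, decrement and go to 1; else halt at 2
    ("inc", 0, 0),     # 1: increment c0, loop back to 0
    ("halt",),         # 2
]

if __name__ == "__main__":
    print(run_counter_machine(ADD, [3, 4]))
```

The step budget is there because, as with any such machine, there is no general way to know in advance whether a given program halts.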

------
bertr4nd
A related idea that I’m interested in but find a bit hard to articulate is to
describe “simple” Turing complete languages, where simplicity is defined more
by ease of reasoning for a human than by any objective metric.

Basically, if I wanted to provide someone with a Turing complete language,
what’s the simplest/easiest thing I could provide, that would still be useful?

~~~
jfkebwjsbx
The simplest you could give someone is probably the Turing machine itself, the
Brainfuck language or the lambda calculus.

Simplifying, to have a TC programming language you need two things: RAM and
the ability to decide your next state based on the memory contents.
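
To show those two ingredients side by side, here is a small Brainfuck interpreter sketch: the tape is the memory, and `[` / `]` pick the next state based on the current cell. The function name, the 30,000-cell wrapping tape, the EOF-reads-as-0 convention and the `max_steps` budget are my assumptions, not part of any spec; with a fixed-size tape this is strictly a finite machine, so read it as an illustration only.

```python
def run_brainfuck(code, input_bytes=b"", max_steps=1_000_000):
    """Interpret Brainfuck `code` and return its output as bytes."""
    # Pre-compute matching bracket positions for the two jump instructions.
    jumps = {}
    stack = []
    for pos, ch in enumerate(code):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            if not stack:
                raise SyntaxError("unmatched ]")
            start = stack.pop()
            jumps[start] = pos
            jumps[pos] = start
    if stack:
        raise SyntaxError("unmatched [")

    tape = [0] * 30_000          # the "RAM"
    ptr = 0                      # data pointer
    pc = 0                       # program counter
    inp = iter(input_bytes)
    out = bytearray()
    steps = 0
    while pc < len(code):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step budget exhausted")
        ch = code[pc]
        if ch == ">":
            ptr = (ptr + 1) % len(tape)
        elif ch == "<":
            ptr = (ptr - 1) % len(tape)
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(tape[ptr])
        elif ch == ",":
            tape[ptr] = next(inp, 0)   # EOF reads as 0 (an assumption)
        elif ch == "[" and tape[ptr] == 0:
            pc = jumps[pc]             # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jumps[pc]             # repeat the loop body
        pc += 1
    return bytes(out)


if __name__ == "__main__":
    print(run_brainfuck("++++++++[>++++++++<-]>+."))  # 8 * 8 + 1 = 65
    print(run_brainfuck(",[.,]", b"hi"))               # echo the input
```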

~~~
bertr4nd
Right, so I think the suggestion of brainfuck illustrates the difficulty I’m
having articulating what I want, because while it’s TC and trivial to
implement, it is basically impossible to use as a language. I think I’m going
for simultaneous ease-of-implementation and ease-of-use rather than any
actual type of “minimalism”.

I’m probably just looking for Lisp, really. It’s easy to implement and usable
enough.

~~~
jfkebwjsbx
Yeah, a simple Lisp like Scheme is basically the lambda calculus.

Brainfuck is useless because the operations are useless. But if you give the
user a few more things, like addressing memory, basic arithmetic and a way to
define variables and functions, then you have something way more useful very
quickly.

------
hirundo
> Turing-completeness (TC) is ... the property of a system being able to ...
> compute any program of interest, including another computer in some form.

So Turing completeness implies _recursive_ Turing completeness. It is the
theoretical threshold at which a device is capable of reproduction, a sort of
Schwarzschild radius for complex, heritable behavior, aka life.

------
greesil
Does finding weird and unexpected ways to do computation always imply a
security risk?

~~~
woodruffw
I do research in this space professionally.

The answer is not always, but sometimes: discovering unintended states or
transitions in the execution contract of a program is a common building block
for exploits. However, proving that the execution contract of a program can be
coerced into representing computations in a TC language doesn't necessarily
prove that you can do anything interesting.

Complex formats like PDF are a good example of this: you can probably contrive
a PDF input such that the parser's state represents a minimal (and TC)
language while interpreting it (e.g. a language with a few mathematical
operators and a "store" primitive), but that doesn't magically get you network
access or arbitrary memory read/writes. You need to show that said language,
when programmed in, can affect the overall execution contract in a way that
violates high-level assumptions.

Some resources (FD: my company's) on the subject:

* [https://blog.trailofbits.com/2019/11/01/two-new-tools-that-t...](https://blog.trailofbits.com/2019/11/01/two-new-tools-that-tame-the-treachery-of-files/)

* [https://blog.trailofbits.com/2018/10/26/the-good-the-bad-and...](https://blog.trailofbits.com/2018/10/26/the-good-the-bad-and-the-weird/)

~~~
saagarjha
Just a heads up: I think the second post has a typo; the code has something
named "new_item" that is referred to as "item" throughout. (I'm also not sure I
understand the safety added by dynamic_cast.)

~~~
woodruffw
Thanks for the heads up! I'll fix it.

