
X86 mov is Turing complete: mov-only compiler - thejj
https://github.com/xoreaxeaxeax/movfuscator
======
M2Ys4U
Who needs instructions? Page faults are Turing complete:
[https://github.com/jbangert/trapcc](https://github.com/jbangert/trapcc)

------
pokle
HN discussion of the paper that this is based on:
[https://news.ycombinator.com/item?id=6309631](https://news.ycombinator.com/item?id=6309631)

~~~
jordigh
That paper is so funny:

    
    
        Finding Turing-completeness in unlikely places has long been a
        pastime of bored computer scientists.
    

And

    
    
        Removing all but the mov instruction from future iterations of the
        x86 architecture would have many advantages: the instruction
        format would be greatly simplified, the expensive decode unit
        would become much cheaper, and silicon currently used for complex
        functional units could be repurposed as even more cache. As long
        as someone else implements the compiler.
    

Well, I guess someone has implemented it now!

~~~
cwre
Also:

> Thus, while it has been known for quite some time that x86 has far too many
> instructions, we can now contribute the novel result that it also has far
> too many registers.

------
pslam
That's a fun program. Similar techniques to this are used to create "ROP"
(return-oriented programming) exploits. You get enough building blocks for
general purpose programming, and it's "just" a matter of mapping to those
building blocks.

X86 "mov" is a bit of a cheat, though. It's a single mnemonic for what's
really dozens of quite different instructions. For example, ARM has "mov" for
register-register operations, but "ldr/str" for register-memory, while X86
just uses mov for everything. Even ARM mov optionally puts a barrel shifter
into the path, so it's not that pure either.

~~~
userbinator
Even if they weren't a single mnemonic, if you read the paper you'll find that
reg = mem[reg + C], mem[reg + C] = reg, and reg = C appear to be the only 3
instruction types that are actually needed; these are only register-memory and
register-constant data transfers. Even register-register isn't needed.

~~~
awalton
register-register movs are needed because compilers are not the world's
smartest code generators, and we don't have an infinite number of registers.
The biggest problem arises when some calling convention expects certain things
to be in certain registers - unless you do some pretty extreme long term
planning in your register allocation code in the compiler, you're pretty
likely to eventually wind up with some piece of information in a register that
you need to move to another register before making or returning from a call.

Of course, modern processors don't really care about register-register moves
because they have hundreds of hidden intermediate registers and register
renaming which makes those types of moves "free." (e.g. Haswell has 168 GP
registers, only 16 of which are exposed through the AMD64 ISA.)

~~~
rdc12
Do you have a citation for the 168 GP registers? Just wondering if Intel has
"officially" stated the number they use in a given microarch.

~~~
pbsd
You can find an official Intel source for Haswell in here (page 15):
[http://pages.cs.wisc.edu/~rajwar/papers/ieee_micro_haswell.p...](http://pages.cs.wisc.edu/~rajwar/papers/ieee_micro_haswell.pdf)

------
Rexxar
Is there a way to estimate how many times slower a program compiled to use
only "mov" would be, compared to standard instructions? 3x, 10x, 100x, more?

~~~
DSMan195276
It depends. In this case, I would say it would be _at least_ 100x, probably
much more. But that's because this only compiles BF (it is basically a POC
after all), so you have to use BF as an intermediate language. BF is
extremely inefficient, and the compiler here spits out a completely
unoptimized one-to-one conversion of the BF source to assembly that uses only
`mov` instructions. The size of the assembly balloons extremely quickly
simply because it takes a lot of BF to express fairly simple things.
Branching using only `mov` is also impossible, so the code instead uses a
flag to control where the assembly writes its results. Because of this, even
after a loop's condition becomes false, the entire loop body still runs one
more time with its results thrown away. It's extremely wasteful.

Based on this POC alone, it's pretty much impossible to guess how much slower
a real `mov`-only compiler would be vs. a regular compiler. It depends a lot
on how much of this can be optimized. Directly compiling C to this via clang
or gcc would be a good start. I have no idea how much work that would take
though, and compiling anything more complex than BF is a real challenge in
itself because of the branching problem.

~~~
0xe2-0x9a-0x9b
"compiling anything more complex then BF is a real challenge in itself because
of the branching problem."

That's not true.

------
cordite
Sounds like if we make a JS to BF transpiler, then we can do this with any
language that can be transpiled to JS.

Just imagine the applications of this!

The next step would be to make this into an eventual Quine.

~~~
shaftoe
[http://copy.sh/v86/](http://copy.sh/v86/)

It's turtles all the way down.

~~~
mastax
I'm not sure if I'm more amazed by how slow that is or how fast it is.

Oh the world we live in: playing minesweeper in a graphical OS running in a
javascript x86 emulator.

------
Sir_Cmpwn
The Github description is misleading:

> The single instruction C compiler

I was seriously impressed until I got further down.

~~~
andrewflnr
Well, if it reaches a point where it can compile itself, you'll get your
compiler written with just one instruction (assuming that's what you were
hoping for).

~~~
Sir_Cmpwn
I meant that this software compiles brainfuck, rather than C. I was expecting
a C compiler from the description.

~~~
ackalker
From README.md:

    
    
        M/o/Vfuscator 2.0 is a complete C compiler, and will be available soon.
    

Just be patient, or submit patches :-)

~~~
masklinn
Would be an interesting LLVM code generator too.

------
billforsternz
Forgive my ignorance but I thought being Turing complete was important because
a Turing complete machine can in principle perform any arbitrary computation.
As far as I know the program counter cannot be a destination for an X86 MOV.
So you can't branch with only MOVs. How can you do an arbitrary computation if
you can't branch?

Maybe you have to reorganize if(a) { b; } else { c; } as a kind of permuted b;
followed by c; such that either the permuted b; or the permuted c; are
effectively no-ops depending on the value of a.

~~~
DSMan195276
You're right, they never actually do any jumping. From what I can tell, they
use a variable called `on` to control where every set of commands writes its
results to. They do some funky `mov` stuff to do a comparison, and then set
`on` with that comparison. They use a stack of `on` variables and pop the
value of `on` off that stack when you exit a loop.

They cheat to get a jump without using anything but `mov`s: they execute an
illegal write to `cs` using `mov`, and then use the SIGILL that generates to
execute a signal handler, which points right back at the spot where the code
starts. IMO, it doesn't really count, so a `mov`-only program requires at
least one jmp instruction to work.

I think loops are achieved by simply running the code to the end without
saving any of the results, then hitting the signal and going back to the
beginning - ignoring all the code until you hit the loop you're executing.
But the `mov` code is fairly hard to understand, so I could be wrong on that.

~~~
nitrogen
Could one use self-modifying code (such as rewriting the SIGILL handler) to
simplify this a bit?

------
bsder
It's a superset of the One Instruction Computer
[https://en.wikipedia.org/wiki/One_instruction_set_computer](https://en.wikipedia.org/wiki/One_instruction_set_computer)

------
spiritplumber
Could be good to defeat RF snooping attacks.

------
exabrial
"-O" enables optimizations

Replaces movs with less movs??

~~~
makeset
Fewer.

------
gnu8
This is almost surely a dumb question, but what would happen if someone made a
processor that only executed mov instructions, which would be extremely
simple, and presumably would be really goddamn fast. Would that be competitive
with today's fastest CPUs? If it were, what would be the advantages and
disadvantages of this design?

~~~
gizmo686
Well, Turing machines do not have random access memory (which changes the
big-theta complexity of algorithms), and operations such as multiplication
would be complicated and multi-cycle, whereas conventional CPUs can easily
parallelize them.

If you were comparing the performance of mov-only programs, a specialized CPU
would likely be faster. Otherwise, there is a good reason we do not actually
use Turing machines.

------
liotier
Extremely Reduced Instruction Set Computer.

Or is it a Ridiculous Instruction Set ?

~~~
__alexs
One instruction set computer.

[https://en.wikipedia.org/wiki/One_instruction_set_computer](https://en.wikipedia.org/wiki/One_instruction_set_computer)

------
raverbashing
When I read the title I thought "ah, of course, cmov is Turing complete", but
no, it's only mov apparently.

Which is quite amazing (and probably dog slow)

