
Movfuscator: A single-instruction C compiler - franzb
https://github.com/xoreaxeaxeax/movfuscator
======
thesz
Back in the FIDO days, I accepted a challenge to crack a program (find a
correct password) that more or less consisted of a loop simulating a
three-address MOV instruction.

The loop's jump address sometimes changed, either for effectful operations
such as printing or for optimizations such as executing an addition.
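To make the idea concrete, here is a minimal sketch of such a loop. The instruction format (`dst`, `base`, `idx`), the special `PC` and `OUT` cells, and the memory layout are all assumptions made up for illustration; the original program's encoding is not known.

```python
PC, OUT = 0, 1  # hypothetical special cells: program counter and output port

def run(code, mem, max_steps=1000):
    """Execute (dst, base, idx) triples: mem[dst] = mem[base + mem[idx]]."""
    output = []
    mem[PC] = 0
    for _ in range(max_steps):
        pc = mem[PC]
        if not 0 <= pc < len(code):
            break                      # running off the program halts the loop
        dst, base, idx = code[pc]
        mem[PC] = pc + 1               # advance first, so a MOV into PC acts as a jump
        mem[dst] = mem[base + mem[idx]]
        if dst == OUT:                 # the only effectful operation in this sketch
            output.append(chr(mem[OUT]))
    return "".join(output)

def demo():
    mem = [0] * 16
    mem[2] = 0                         # index cell holding 0
    mem[5] = 1                         # index cell holding 1
    mem[3], mem[4] = ord("H"), ord("I")
    code = [(OUT, 3, 2), (OUT, 3, 5)]  # load mem[3+0], then mem[3+1]
    return run(code, mem)
```

Calling `demo()` walks a two-entry table with nothing but indexed moves; the "print" happens only because the destination is the made-up `OUT` cell.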

It took me about four hours to find the correct password. In the course of
those hours I wrote 1) an executor that used the i386 debug registers to
look for current MOV addresses, 2) a tracer that produced a trace and 3) a
compactor that identified common instruction sequences and presented them as
macrocommands. It turned out the original source code had used macros in
the opposite direction. The final challenge was to write a brute-force
password finder, which is not that hard at all (for a 32-bit checksum).
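The compactor step can be sketched roughly like this. It is a simplified guess at the idea, not the original tool: it assumes the trace is a list of instruction strings and folds only the single most frequent fixed-length run into one macro. The function name and parameters are hypothetical.

```python
from collections import Counter

def compact(trace, length=2, min_count=2):
    """Replace the most frequent run of `length` consecutive entries with one macro."""
    grams = Counter(
        tuple(trace[i:i + length]) for i in range(len(trace) - length + 1)
    )
    if not grams:
        return trace, None
    best, count = grams.most_common(1)[0]
    if count < min_count:
        return trace, None             # nothing repeats often enough to be worth folding
    macro = "M(" + "; ".join(best) + ")"
    out, i = [], 0
    while i < len(trace):
        if tuple(trace[i:i + length]) == best:
            out.append(macro)          # fold the whole run into one macro entry
            i += length
        else:
            out.append(trace[i])
            i += 1
    return out, best
```

Running it repeatedly on its own output, with different lengths, gives progressively coarser "macrocommands" to read instead of raw MOVs.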

All in x86 assembler. I guess it was about 1995-96, somewhere there.

Now I'd use the same technique, but at a higher level. Instead of peephole
compacting I'd use graph analysis, but that's about it. You can get pretty
much everything from the program trace; I think you can get even more
information this way than from a disassembly.

So in my opinion, it is one hell of a cool experiment. But try not to use it
as a real obfuscation device.

~~~
userbinator
I also remember seeing a "forest of MOVs" obfuscation technique while
attempting to crack a protection back in the very late 80s, and I remember it
so well because it caused me to change my analysis strategy completely. The
fun part was that the "interesting" MOVs were hidden amongst other
instructions that seemed to perform useful computation, although the results
of that computation were just thrown away and the MOVs were doing all the
work. At the time I was fond of printing out code and inspecting/annotating it
manually, so I think it took several days and lots of careful documenting of
the algorithm before I realised that it was all useless. Upon tracing back the
source of the actual value used in the decision and crossing out the
irrelevant instructions, imagine my surprise when almost all that was left
were MOVs...!

That one taught me it was far better to start by working backward from the
result, although self-modifying code tends to be more difficult that way.
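Working backward from the result is essentially a backward slice over a trace. A minimal sketch, assuming each trace entry is a hypothetical `(dst, srcs)` pair naming the location written and the locations read:

```python
def backward_slice(trace, target):
    """Return indices of trace entries that the final value of `target` depends on."""
    wanted = {target}
    kept = []
    for i in range(len(trace) - 1, -1, -1):
        dst, srcs = trace[i]
        if dst in wanted:
            wanted.discard(dst)        # this write defines the value being chased
            wanted.update(srcs)        # now chase its inputs instead
            kept.append(i)
    return kept[::-1]

def demo():
    trace = [
        ("eax", ["input"]),
        ("ebx", ["eax"]),
        ("ecx", ["junk"]),             # decoy computation, never reaches the flag
        ("ecx", ["ecx", "edx"]),
        ("flag", ["ebx"]),
    ]
    return backward_slice(trace, "flag")
```

Everything not in the returned index list is the "crossing out the irrelevant instructions" part, done mechanically. It ignores control dependencies and self-modifying code, so treat it as a starting point only.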

~~~
thesz
And this is exactly why traces are better than assembly. You see instructions
that were executed, not the code. You can restore the decision tree (and then
the graph, most probably) and figure out what is going on.

~~~
_pmf_
> And this is exactly why traces are better than assembly. You see
> instructions that were executed, not the code. You can restore the decision
> tree (and then the graph, most probably) and figure out what is going on.

Do you have some illustrative example?

~~~
nialo
I think a good example is the last level (Hollywood) from
www.microcorruption.com, which has self-modifying code and is set up such that
the actual execution jumps into the middle of what the disassembler thinks are
actual instructions. Reading the code is pretty useless, but with a trace of
instructions executed and register state at each instruction it's easy to
start at the end and follow backwards to the interesting part of the program.

Getting the trace is the tricky bit; I had to write an MSP430 emulator.

(Actually seeing this example requires completing the rest of the levels, but
you should do that anyway, especially if this is the sort of thing you're
interested in)

------
cautious_int
I suggest taking a look at the slides, which show how much trickery is
involved:
[https://github.com/xoreaxeaxeax/movfuscator/raw/master/slide...](https://github.com/xoreaxeaxeax/movfuscator/raw/master/slides/the_movfuscator_recon_2015.pdf)

~~~
patio11
Strongest possible +1 for the slides if you are at all interested in low-level
alchemy. Also see slide 109 for the beginning of a shadow argument that this
might actually have some real-world utility, in that the long list of MOVs is
virtually immune to comprehension by existing reverse engineering tools and
practices.

~~~
ORioN63
On the other hand, you could compile it back to regular assembly and then use
those tools on the result.

------
pkaye
The Maxim Integrated MAXQ is one commercial processor that uses a MOV-based
instruction set.
[http://www.maximintegrated.com/en/app-notes/index.mvp/id/322...](http://www.maximintegrated.com/en/app-notes/index.mvp/id/3222)

I've always felt that calling these single-instruction sets is more of a
trick, because you are using some of the addressing bits to encode an opcode.

~~~
userbinator
That's a TTA
([https://en.wikipedia.org/wiki/Transport_triggered_architectu...](https://en.wikipedia.org/wiki/Transport_triggered_architecture)
), where effectively the ALU and other computation units become memory-mapped
devices. It's the logical extension of how a lot of microcontrollers whose
instruction sets lack a multiply instruction, e.g. the MSP430, instead
provide a multiplier unit that's accessed by reading/writing special memory
addresses.
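A toy model of that idea, with made-up addresses and a 16-bit result chosen only for illustration (this does not model the MAXQ or any real part):

```python
class Bus:
    """Toy memory where three addresses belong to a hypothetical multiplier unit."""
    MUL_A, MUL_B, MUL_RESULT = 0xF0, 0xF1, 0xF2   # made-up addresses

    def __init__(self):
        self.ram = [0] * 256

    def mov(self, dst, src):
        """The only instruction: copy ram[src] to ram[dst]."""
        self.ram[dst] = self.ram[src]
        if dst == self.MUL_B:
            # Writing the second operand is what triggers the multiply.
            product = self.ram[self.MUL_A] * self.ram[self.MUL_B]
            self.ram[self.MUL_RESULT] = product & 0xFFFF

def demo():
    bus = Bus()
    bus.ram[0], bus.ram[1] = 6, 7
    bus.mov(Bus.MUL_A, 0)
    bus.mov(Bus.MUL_B, 1)
    bus.mov(2, Bus.MUL_RESULT)         # fetch the product with another plain move
    return bus.ram[2]
```

The program is "all MOVs", but the arithmetic lives in the device behind the address, not in the move itself.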

That's somewhat different from the move-based code discussed here where the
MOVs are actually performing the computation.
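For contrast, here is a sketch of one well-known way plain loads and stores can compute on their own: an equality test via aliasing writes, plus a branch-free select by indexing. The function names are hypothetical, and this is not claimed to be exactly what the movfuscator emits.

```python
def mov_equal(x, y, size=256):
    """Return 1 if x == y, else 0, using only two stores and a load (no compare)."""
    scratch = [0] * size
    scratch[x] = 0                     # store 0 at address x
    scratch[y] = 1                     # store 1 at address y (overwrites if y == x)
    return scratch[x]                  # load from x: 1 only when the addresses alias

def mov_select(flag, if_zero, if_one):
    """Pick a value by indexing a two-entry table instead of branching."""
    cells = [if_zero, if_one]
    return cells[flag]
```

For example, `mov_select(mov_equal(5, 5), 10, 20)` picks the second value without any comparison or jump instruction being modelled.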

------
foobar2020
The x86 is actually Turing-complete without even executing a single
instruction. Page faulting is enough:
[https://github.com/jbangert/trapcc](https://github.com/jbangert/trapcc)

------
ishtu
The author is the same person who published the epic x86 vulnerability
[https://github.com/xoreaxeaxeax/sinkhole](https://github.com/xoreaxeaxeax/sinkhole)

~~~
ericfrederich
I thought those slides looked familiar. Thought it might have been a template
that the conference provided. Should have looked at the author ;-)

------
kazinator
Exact dupe:

[https://news.ycombinator.com/item?id=9751312](https://news.ycombinator.com/item?id=9751312)

------
agumonkey
Previously:
[https://news.ycombinator.com/item?id=9751312](https://news.ycombinator.com/item?id=9751312)
[https://news.ycombinator.com/item?id=6309631](https://news.ycombinator.com/item?id=6309631)

------
ape4
Write your program using the nearly Turing-complete C preprocessor and compile
into mov.

------
jschwartzi
Is the ARMv7 MOV Turing-complete?

