
Return-Oriented Programming without Returns (on the x86)  - tptacek
http://cseweb.ucsd.edu/~hovav/papers/cs10.html
======
tptacek
Ok, so security papers on Hacker News never get that much love, but take my
word for it that you want to read this one.

This paper is the moral equivalent of that post from yesterday about how you
could derive all of ASCII from []()+! in JavaScript. Here, the authors are
trying to evade runtime defenses that prevent memory corruption exploits from
uploading executable code into target processes. To control execution, they
analyze target binaries and synthesize a Turing-complete runtime out of scraps
of legitimate basic blocks.

From 34 instruction sequences, they essentially derive a VM with
load/store/move, add/sub/neg/and/or/xor, conditional branches, and call (read
the first couple of pages, then skip ahead to page 12 for the diagram of what
"call" actually looks like when you scavenge it out of the basic blocks of an
unsuspecting program).
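
If a toy model helps: here's the control structure of that VM sketched in
plain C, with gadgets as functions and the paper's trampoline reduced to a
dispatch loop. Everything here is invented for illustration; the real thing is
scavenged machine code, not C functions.

    /* Toy model only: gadgets as C functions, the chain as data, and a
       dispatch loop standing in for the paper's trampoline gadget. In the
       real attack the "chain" is bytes the attacker wrote into the victim,
       and the "loop" is a borrowed update-load-branch instruction sequence. */
    #include <stdio.h>

    typedef void (*gadget)(long *acc);

    static void load5(long *acc) { *acc = 5; }              /* "load imm" */
    static void add3(long *acc)  { *acc += 3; }             /* "add"      */
    static void emit(long *acc)  { printf("%ld\n", *acc); } /* "store"    */

    int main(void) {
        gadget chain[] = { load5, add3, emit, 0 };  /* the "program"      */
        long acc = 0;
        for (gadget *pc = chain; *pc; pc++)  /* trampoline: advance the   */
            (*pc)(&acc);                     /* chain PC, dispatch gadget */
        return 0;
    }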

Obviously, once you have such a VM defined, you can compile arbitrary exploits
to it. It's like having a scriptable debugger attached to the remote program.

You should read the paper because it's an excellent crash course in runtime
exploit design and mitigation, and a good way to jog your x86 memory. But I
should also note that I haven't done justice to the paper: its thesis is that
you can not only do all these things, but do them without the "ret"
instruction, or any other non-innocuous sequence of instructions.

~~~
arohner
IANA security expert, but stuff like this makes me believe it is _never_
possible to secure native code. If you want a secure system, build a VM in a
strictly typed language like Haskell.

Is that a fair statement?

~~~
ax0n
I am a security professional, but not a developer. You're half correct. Your
suggestion of a VM with Haskell would result in a system that could still be
made to fail. The methodologies would just work differently.

~~~
ramchip
Could you back up that claim? lambdabot on #haskell runs arbitrary Haskell
code very safely.

~~~
jrockway
I think he means fail as in allowing you to write a function like:

      deleteAllFiles user = if user == Root
                               then throwError "access denied"
                               else unsafeDeleteAllFiles

Even if memory access is safe, you can still type in the wrong algorithm.

~~~
ax0n
Most certainly. But further, it's possible to write unintentionally
exploitable code in any language.

------
dx
Can somebody describe what this document is all about in simple words, for the
average "high level" developer? Thanks in advance!

~~~
tptacek
You've probably heard the term "buffer overflow". A buffer overflow is one
example of a memory corruption flaw. In virtually all modern memory corruption
exploits, the goal of the attacker is to corrupt a victim program in such a
way that they can upload their own code into the victim and run it.
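
Concretely, the canonical bug looks something like this in C (deliberately
unsafe; the function and names are made up for illustration):

    #include <string.h>

    /* strcpy() has no idea how big buf is. Pass a name longer than 16
       bytes and the copy runs past the end of the buffer and over whatever
       the compiler put above it in the stack frame -- including,
       classically, the function's saved return address. */
    void greet(const char *name) {
        char buf[16];
        strcpy(buf, name);   /* unbounded copy: the whole bug */
        /* ... do something with buf ... */
    }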

For about 8 years ('95-'03, the "summer of worms"), the bulk of all the work
in stopping these attacks was directed at the broken code that allowed the
vulnerabilities (for instance, the use of strcpy() and sprintf(), which are
unbounded). The payoff for this work eventually went asymptotic: each
additional fix bought less and less.

Starting in the late '90s and really picking up in the '00s, research instead
went into hardening runtimes to break exploits. For instance, you can't simply
overwrite the activation record for a function to change the return address,
because there's a random cookie there. The stack and heap aren't executable,
and the text segment isn't writeable. The whole runtime is randomized, so you
can't predict addresses.
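
The cookie check, written out by hand so you can see the shape of it.
Compilers like gcc emit the equivalent automatically (the -fstack-protector
flag); treat this as a conceptual sketch, since the compiler, not your source
order, controls the real stack layout:

    /* Conceptual sketch of a stack cookie ("canary"). The guard is a
       random per-process value; __stack_chk_guard and __stack_chk_fail
       are the usual glibc names, but read this as pseudocode for what the
       compiler inserts, not something you'd write yourself. */
    #include <string.h>

    extern long __stack_chk_guard;       /* random value, set at startup */
    extern void __stack_chk_fail(void);  /* aborts the process           */

    void greet_protected(const char *name) {
        long cookie = __stack_chk_guard; /* lives between buf and the    */
        char buf[16];                    /* saved return address         */
        strcpy(buf, name);               /* same bug as before...        */
        if (cookie != __stack_chk_guard) /* ...but smashing the return   */
            __stack_chk_fail();          /* address tramples the cookie, */
    }                                    /* and we abort before "ret"    */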

What papers like this (and, as Dave Aitel pointed out on Twitter, papers like
John McDonald's from the '90s -- that guy is spooky) aim to show is that even
if attackers can't literally upload x86 instructions into their targets, they
can still take control of the process.

That's because even simple real-world programs are very complex at the runtime
level, and linked to very complex libraries. The CPU can be coerced into
executing any of the instructions in those libraries --- or, indeed, any
sequence of bytes comprising any fragment of those instructions.
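
That last part is the key x86-ism: instructions are variable-length, and the
CPU will happily start decoding at any byte offset you point it at. The
canonical example of this, from Shacham's 2007 ROP paper: the same ten bytes,
decoded at two different offsets:

    /* Ten bytes as the compiler meant them:
         f7 c7 07 00 00 00    test $0x00000007, %edi
         0f 95 45 c3          setnz -0x3d(%ebp)
       Start decoding one byte later and the CPU instead sees:
         c7 07 00 00 00 0f    movl $0x0f000000, (%edi)
         95                   xchg %ebp, %eax
         45                   inc  %ebp
         c3                   ret            <- a free gadget ending */
    const unsigned char bytes[] = {
        0xf7, 0xc7, 0x07, 0x00, 0x00, 0x00, 0x0f, 0x95, 0x45, 0xc3
    };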

So, if you do some basic binary analysis on your target, you can craft a
sequence of addresses in the target which, if executed in the right order,
will have the same effect as any program you could write (it's
"Turing-complete", but a more helpful way to think about it is that they're
synthesizing a simple VM running out of fragments of code).
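
To make "a sequence of addresses" concrete, the classic ret-based version
(this paper's contribution is doing the same thing without ret) looks roughly
like this; every address below is invented for illustration:

    /* Schematic ROP payload: just data, laid out where the stack pointer
       will end up. Each borrowed fragment ends in "ret", which pops the
       next entry, so this array *is* the program. */
    unsigned long chain[] = {
        0x08049f21UL,  /* gadget: pop %eax; ret  -> eax = next word     */
        0x0000002aUL,  /*   immediate consumed by the pop (a "load")    */
        0x0804b3c0UL,  /* gadget: mov %eax, (%edx); ret  (a "store")    */
        0x08051a07UL,  /* ...and so on; ret is the glue between steps   */
    };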

As far as we know, it is very hard to stop these kinds of attacks. An attacker
that can write 1-4 bytes to an arbitrary (or even restricted) address in
memory can trick the CPU into looking at attacker-controlled memory (say, the
1024-character login name they just gave you), and once the CPU is looking at
that memory, it starts running sequences of its own code against itself.
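
The first step of that trick, sketched (names and targets invented): with even
one small controlled write, you aim at a code pointer the program was going to
use anyway -- a function pointer, a GOT entry, a vtable slot -- and point it
at your chain.

    /* The write-what-where primitive the paragraph assumes, reduced to its
       essence. In a real exploit "where" might be a GOT entry and "what"
       the address of the first gadget in a chain like the one sketched
       above. Entirely schematic. */
    void write_what_where(unsigned long *where, unsigned long what) {
        *where = what;   /* the attacker's 4-byte write */
    }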

This is systems programming, compiler theory, computer architecture, and
computer security. If you're into serious practical CS, you're crazy if you
don't consider security as a career. There are very few other jobs that will
reliably expose you to so much CS (and even EE) at this level.

