
64-bit Linux Return-Oriented Programming - wglb
http://crypto.stanford.edu/~blynn/rop/
======
willvarfar
(Mill team)

Apologies to those suffering Mill fatigue ;)

The Mill CPU is actually immune to these ROP exploits by the simple virtue of
having a hardware-managed stack. The return address is not overwriteable.

We are also very firmly in the exploit mitigation camp generally, with our own
particular twists to champion when the time comes.

So this article is very nice but very definitely focusing on here-and-now
architectures. It'd be throwing the baby out with the bathwater to give up on
hardening the architecture when starting over.

~~~
runeks
How far has development come with the Mill CPU?

This video[1] mentions that "nothing with pins on the bottom" has been made
yet.

[1] [https://www.youtube.com/watch?v=QGw-cy0ylCc#t=13](https://www.youtube.com/watch?v=QGw-cy0ylCc#t=13)

~~~
Narishma
I've been hearing about it for years but it's still vaporware AFAICT.

~~~
reitzensteinm
Vaporware is a pretty harsh way to describe it. If everything went perfectly
for the Mill team, they'd still not be shipping physical CPUs - it's an
outrageously long and complex process. Giving them the benefit of the doubt
for now doesn't seem unreasonable.

~~~
renox
AFAIK they also don't have any FPGA prototype or any working compiler
backend, so of course no OS runs on it yet. Their presentations are full of
very nice ideas and I hope that they'll make it, but some people are quite
skeptical of their 'single address space' feature, for example.

~~~
XorNot
There's a long history of ideas that look really good in theory and in
publications but, when it comes to actual implementation, suddenly tear
themselves apart due to all the practical compromises that _always_ arise.

------
im2w1l
The call trick is no longer necessary in x64. It supports addressing relative
to the instruction pointer, so you can do:

    lea rdi, [rip+23]
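
To make the addressing concrete: a RIP-relative operand is resolved against the address of the *next* instruction, not the current one. A minimal sketch in Python, assuming the usual 7-byte encoding of `lea rdi, [rip+disp32]` (`48 8d 3d` followed by a little-endian 32-bit displacement). The function name and the load address are made up for illustration.

```python
import struct

# Assumed encoding of `lea rdi, [rip+disp32]`: REX.W prefix (48), opcode 8D,
# ModRM 3D (mod=00, reg=rdi, rm=101 meaning RIP-relative), then disp32.
LEA_RDI_RIP = bytes([0x48, 0x8D, 0x3D])


def lea_rip_target(code: bytes, insn_addr: int) -> int:
    """Return the address that `lea rdi, [rip+disp32]` at insn_addr loads."""
    if len(code) < 7 or code[:3] != LEA_RDI_RIP:
        raise ValueError("not a lea rdi, [rip+disp32] instruction")
    (disp,) = struct.unpack_from("<i", code, 3)  # signed 32-bit displacement
    next_insn = insn_addr + 7  # RIP already points past this instruction
    return next_insn + disp


if __name__ == "__main__":
    insn = LEA_RDI_RIP + struct.pack("<i", 23)  # lea rdi, [rip+23]
    print(hex(lea_rip_target(insn, 0x400000)))
```

With a hypothetical load address of 0x400000 the instruction ends at 0x400007, so the loaded address is 0x40001e. No call/pop pair is needed to discover where the code is running.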

------
xamuel
Minor nitpick:

>This flies in the face of the theories of Turing and von Neumann, which
define the basic principles of the stored-program computer. Code and data are
the same, or at least they can be.

Turing and von Neumann were very aware of the looseness of the code-data
distinction. That looseness is formally encoded in the S-m-n theorem, but
without even going to those extremes, it's a basic prerequisite for the
statement of the Halting Problem itself.

------
thefreeman
The article mentions that "Sadly, I know of no widespread Linux tool that
searches a file for a given sequence of bytes"

[https://github.com/packz/ropeme](https://github.com/packz/ropeme) offers a
fantastic set of gadget searching tools.

You need to install
[https://code.google.com/p/distorm/](https://code.google.com/p/distorm/)
manually to get it running.

~~~
anon4
Grep can also serve you pretty well most of the time. It will happily search
in binary files and the -b option will show you byte offsets.
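
If quoting raw bytes for grep gets annoying, the same search is a few lines of Python. A rough sketch, assuming the gadget is given as a hex string (for example `5fc3`, which encodes `pop rdi; ret`). `find_all` and the libc path in the comment are placeholders, not part of any existing tool.

```python
import sys


def find_all(data: bytes, pattern: bytes):
    """Yield every offset at which pattern occurs in data (overlaps included)."""
    start = data.find(pattern)
    while start != -1:
        yield start
        start = data.find(pattern, start + 1)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example invocation (path and pattern are illustrative only):
    #   python findbytes.py /lib/x86_64-linux-gnu/libc.so.6 5fc3
    path, hex_pattern = sys.argv[1], sys.argv[2]
    with open(path, "rb") as f:
        blob = f.read()
    for offset in find_all(blob, bytes.fromhex(hex_pattern)):
        print(hex(offset))
```

The offsets are file offsets, so they still have to be translated to addresses in the mapped library before they are any use in a chain.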

------
aninteger
Are functions with more than 6 arguments really forbidden on x86-64? How do
things like CreateFile or CreateWindow from the Windows API get compiled then?

~~~
kryptiskt
That's the syscall calling convention, not the C one (though it's more like a
platform ABI than something exclusive to C these days). In that, on Linux x64
all the args after the first six are pushed to the stack, on Windows it's
after four args.

~~~
nkurz
It's probably worth clarifying that on 64-bit Windows, the first four
arguments are passed in registers regardless of type, and the remainder (if
any) are pushed to the stack. But on 64-bit Linux/BSD/OSX, depending on the
mix of types, you might have as many as 14 register arguments. You get 6
"integer" registers, plus 8 "vector" registers, where "vector" registers are
also used for floating point.

[http://www.agner.org/optimize/calling_conventions.pdf](http://www.agner.org/optimize/calling_conventions.pdf)
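
A toy model of the two conventions, in case the difference is easier to see in code. This is only a sketch: it treats every argument as either a plain integer/pointer (`"int"`) or a scalar float (`"float"`) and ignores structs, varargs, and everything else the real ABIs specify. The function names are invented for the example.

```python
SYSV_INT = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
SYSV_VEC = [f"xmm{i}" for i in range(8)]
WIN_INT = ["rcx", "rdx", "r8", "r9"]
WIN_VEC = ["xmm0", "xmm1", "xmm2", "xmm3"]


def sysv_assign(kinds):
    """System V x86-64: integer and vector registers are counted separately."""
    ints, vecs, out = iter(SYSV_INT), iter(SYSV_VEC), []
    for kind in kinds:
        pool = vecs if kind == "float" else ints
        out.append(next(pool, "stack"))  # spill once that pool is exhausted
    return out


def win64_assign(kinds):
    """Windows x64: each of the first four positions gets one register slot."""
    out = []
    for pos, kind in enumerate(kinds):
        if pos >= 4:
            out.append("stack")
        else:
            out.append(WIN_VEC[pos] if kind == "float" else WIN_INT[pos])
    return out


if __name__ == "__main__":
    kinds = ["int"] * 6 + ["float"] * 8
    print(sysv_assign(kinds))
    print(win64_assign(kinds))
```

For six integers followed by eight floats, the System V model places all 14 arguments in registers, while the Windows model uses rcx, rdx, r8, r9 and sends the other ten to the stack.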

------
SixSigma
If you're not a fan of debuggers, you haven’t done Acid

[http://www2.informatik.hu-berlin.de/~apolze/LV/plan9.docs/ac...](http://www2.informatik.hu-berlin.de/~apolze/LV/plan9.docs/acidpaper.html)

------
AgentME
>Alas, stack smashing is much harder these days. On my stock Ubuntu 12.04
install, there are 3 countermeasures: ... GCC Stack-Smashing Protector (SSP),
aka ProPolice: the compiler rearranges the stack layout to make buffer
overflows less dangerous and inserts runtime stack integrity checks.

There are certain scenarios in which SSP is not used, such as with arrays that
are less than 8 items wide, if I remember right. You can also bypass SSP if
you find a separate memory disclosure vulnerability and learn the stack cookie.
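
The cookie check and the disclosure bypass can be mimicked with a toy frame in Python. This is an illustration only, not how GCC lays out a real frame: the 16-byte buffer, 8-byte cookie, and 8-byte return address are assumed sizes, and `Frame` is a made-up class.

```python
import os

BUF, COOKIE, RET = 16, 8, 8  # assumed sizes for the toy frame


class Frame:
    """Toy stack frame laid out as [buffer][cookie][return address]."""

    def __init__(self, ret_addr: int):
        self.cookie = os.urandom(COOKIE)  # secret value picked at startup
        self.mem = bytearray(BUF) + self.cookie + ret_addr.to_bytes(RET, "little")

    def unchecked_copy(self, data: bytes):
        """Mimic an unbounded copy: long input runs past the buffer."""
        data = data[: len(self.mem)]  # the toy frame has nothing beyond itself
        self.mem[: len(data)] = data

    def cookie_intact(self) -> bool:
        """Stand-in for the check the compiler inserts before returning."""
        return bytes(self.mem[BUF : BUF + COOKIE]) == self.cookie

    def ret_addr(self) -> int:
        return int.from_bytes(self.mem[BUF + COOKIE :], "little")


if __name__ == "__main__":
    target = (0xDEADBEEF).to_bytes(RET, "little")

    blind = Frame(0x401000)
    blind.unchecked_copy(b"A" * BUF + b"B" * COOKIE + target)
    print("blind overflow passes check:", blind.cookie_intact())

    leaky = Frame(0x401000)
    leaked = leaky.cookie  # pretend a separate bug disclosed the cookie
    leaky.unchecked_copy(b"A" * BUF + leaked + target)
    print("overflow with leaked cookie passes check:", leaky.cookie_intact())
```

In this model a blind overflow has to guess all 8 cookie bytes, while an overflow that writes the leaked cookie back in place passes the check and still controls the return address.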

>Therefore, I argue executable space protection is worse than nothing [because
ROP attacks can bypass it].

ROP attacks only work if you know what the program's memory space looks like.
If you're trying to attack a server and don't know which specific executables
it is running, then the attack becomes ridiculously more difficult. (There
are certain scenarios that can be brute-forced through:
[http://www.scs.stanford.edu/brop/](http://www.scs.stanford.edu/brop/)) If
there is no executable space protection, then you can do a more
straightforward blind attack by sending your executable payload to the server
many times so that the payload is in many places in the server's memory (where
it would not be executable if executable space protection were in place), and
then repeatedly use an exploit to jump to a random memory address until you get
lucky and land in the payload. (Well, this might be less workable on 64-bit
systems, which have ridiculously large address spaces. You'd probably need to
pair this with another vulnerability that disclosed some memory addresses so
you had an idea of what ranges to guess in.)
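
Some back-of-the-envelope numbers for the spraying approach. The model is deliberately crude and every figure in it is an assumption: each guess is a uniformly random address, any guess landing inside a sprayed copy counts as a hit (say, thanks to a NOP sled), and the search space is either a full 32-bit range or the 47-bit user half of a typical x86-64 Linux address space.

```python
def hit_probability(sprayed_bytes: int, space_bits: int) -> float:
    """Chance that one uniformly random jump lands in sprayed memory."""
    return sprayed_bytes / float(2 ** space_bits)


def expected_guesses(sprayed_bytes: int, space_bits: int) -> float:
    """Mean number of tries until the first hit (geometric distribution)."""
    return 1.0 / hit_probability(sprayed_bytes, space_bits)


if __name__ == "__main__":
    sprayed = 256 * 1024 * 1024  # assume 256 MiB of payload copies
    for bits in (32, 47):
        print(bits, "bit space:", expected_guesses(sprayed, bits), "guesses")
```

Under these assumptions, 256 MiB of sprayed payload needs about 16 guesses on average in a 32-bit space and about 524288 in a 47-bit one, which is why a leaked address helps so much on 64-bit targets.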

Executable space protection also often prevents not-intentionally-exploited
bugs from naturally wreaking too much havoc with the program's executable
memory, so that the program crashes fast instead of silently rewriting its own
code to fail in mysterious ways later.

>Aside from being high-cost and low-benefit, it segregates code from data.

Separating code from data is a very useful default. Many remote code execution
vulnerabilities come from mixing them together accidentally. (See SQL
injections and XSS attacks. The equivalent protections to executable space
protection against these attacks are prepared statements / parameter binding,
and the HTTP Content-Security-Policy header. Use these!) There are ways to
opt-out of it when it is actually necessary.

>But worse still are its implications for programmers. Executable space
protection interferes with self-modifying code, which is invaluable for just-
in-time compiling, and for miraculously breathing new life into ancient
calling conventions set in stone.

Programs can already allocate regions of memory with read-write-execute
permissions, or switch the permissions of existing memory regions.

>We just saw how trivial it is to stitch together shreds of existing code to
do our dirty work. We barely scratched the surface: with just a few gadgets,
any computation is possible.

Not so much a disagreement as a suggestion: If you want to do anything more
complicated than a syscall or two (an execve call will take down and
cannibalize the entire process!), then using rop gadgets to do all of your
dirty work is unnecessary pain. A good trick is to use rop gadgets to allocate
a memory region with read-write-execute permissions, copy some executable code
into it, and then jump into it. (Then you can have that code repair the
program's stack, accomplish whatever changes you desired, maybe even patch the
program's vulnerability to stop anyone from following you in, and then resume
the normal operation of the process.)

~~~
strcat
> There are certain scenarios that SSP is not used, such as with arrays that
> are less than 8 items wide if I remember right. You can also bypass SSP if
> you find a separate memory disclosure vulnerability and learn the stack
> cookie.

-fstack-protector-strong always adds the protector when a function has a local
array. Many distributions lowered the minimum array size for -fstack-protector
to 4 bytes too.

> A good trick is to use rop gadgets to allocate a memory region with read-
> write-execute permissions, copy some executable code into it, and then jump
> into it.

Note that this is why PaX's MPROTECT feature exists. It prevents injecting
code or bypassing RELRO by default and executables need to be marked with
exceptions for stuff like JIT compilation to work.

------
neo_optimus
From what I could understand from the document, isn't ASLR along with NX a
formidable defence? We need an address for gadgets to execute, but with ASLR
enabled we cannot run these gadgets at all. Can somebody elaborate on this?

~~~
AgentME
It's common for many libraries to not opt-in to ASLR (since it requires them
to be compiled as position-independent), and I don't believe ASLR applies to
the program's own code.

