
Retguard: An improved stack protector for OpenBSD - brynet
https://marc.info/?l=openbsd-cvs&m=152824407931917&w=2
======
nneonneo
Cool. If I understand the LLVM code correctly, it's inserting the following
instruction sequence into the code:

    
    
        mov r11, [cookie]
        xor r11, [rsp]
        ...
        xor r11, [rsp]
        cmp r11, [cookie]
        jeq 2
        int 3
        int 3
        ret
    

(where r11 might be some other suitable temp register as needed). cookie
points to an 8-byte chunk of .openbsd.randomdata, a section that is
initialized at binary load time by the kernel to contain random data. The
canary is one of 4000 possible values, named "__retguard_0" through
"__retguard_3999", presumably to avoid having the kernel generate an unbounded
amount of random data - the section is limited to 1MB in size.

This makes ret instructions fairly hard to use for rop purposes. Unlike the
original design, which xor'd [rsp] directly, this new approach preserves
return prediction so it should have a lesser effect on performance. With the
changes to reduce polymorphic gadgets in place, this should make ROP attacks
significantly less palatable. Also, in the original design, an arbitrary leak
made rop attacks feasible as you could just place xor-encrypted return
addresses on the stack. With the new design, you need repeatable register
control too, assuming the temp register isn't spilled, which raises the bar
quite a bit.

~~~
Annatar
What is preventing an attacker from overwriting the last xor and cmp
instruction sequence with nop instructions?

~~~
ben_bai
W^X (W xor X) which means memory is either writeable or executable. Guess
what, OpenBSD maps code only executable and not writeable.

------
brynet
It's also worth mentioning there's previous work to reduce the amount of
polymorphic gadgets in the instruction stream, including a new framework for
clang:

[https://marc.info/?l=openbsd-
cvs&m=151123332419798&w=2](https://marc.info/?l=openbsd-
cvs&m=151123332419798&w=2)

[https://marc.info/?l=openbsd-
cvs&m=152495643720502&w=2](https://marc.info/?l=openbsd-
cvs&m=152495643720502&w=2)

------
amenghra
The post from ~9 months ago by deraadt is worth reading:
[https://marc.info/?l=openbsd-
tech&m=150317547021396&w=2](https://marc.info/?l=openbsd-
tech&m=150317547021396&w=2)

~~~
brynet
That is the previous design which changed significantly from what's been
committed today (see other comments).

Github mirror, for easier review:
[https://github.com/openbsd/src/commit/e688c2b0648a80551cf735...](https://github.com/openbsd/src/commit/e688c2b0648a80551cf7353d9f455c7cbb98f13c)

------
Klasiaster
The underlying assumption is that the XORd value cannot be crafted by the
attacker. If read-access of [rsp] is possible, this scheme is vulnerable to
information leak of the return address (or another program pointer which lets
the attacker derive the return address), because cookie = retaddr^[rsp]. But
still this is better than the original RETGUARD which was easier to attack in
both ways, either a leak of the stack position to get the return address or
vice versa.

------
kernoble
What's meant by the term "gadget"? Does it mean an exploit mechanism? My
searches are mostly showing information about USB device drivers.

~~~
deegles
It's a reference to this class of vulnerabilities:
[https://en.wikipedia.org/wiki/Return-
oriented_programming](https://en.wikipedia.org/wiki/Return-
oriented_programming)

~~~
kernoble
Cool. Thanks!

------
DeepYogurt
Question; does this protection get applied to all languages that use llvm on
openbsd or is it C/C++ specific?

------
trillic
How is this different than any other stack canary? Is it because it is per-
function instead of per-stack like Linux?

~~~
brandmeyer
First off, the normal glibc/GCC stack canary is per-process, not per-stack.

(I started writing a more detailed reply based on the commit description, but
there was too much speculation without seeing their source code).

~~~
brynet
OpenBSD's is per shared library, i.e: libc has a different stack cookie than
your application and/or other libraries.

RETGUARD however, is per function.

------
tptacek
I don't pay enough attention to know how this is different from what PaX team
did with RAP 3 years ago:

[https://pax.grsecurity.net/docs/PaXTeam-H2HC15-RAP-RIP-
ROP.p...](https://pax.grsecurity.net/docs/PaXTeam-H2HC15-RAP-RIP-ROP.pdf)

~~~
PaXTeam
this is basically the xor canary approach originally pioneered by the
Stackguard guys (i'm pretty sure you were already around at the time though
probably forgot such old history as did the rest of the world apparently ;).
the OpenBSD implementation suffers from a few problems, mostly their own
making:

1\. if they can't find a register to load the cookie into, they'll silently
skip instrumentation (i'm not sure how that would happen in practice but the
silent treatment when omitting a security feature is a non-starter).

2\. if they can find such a register then it'll be spilled to the stack and
restored in the epilogue, so a normal buffer overflow can control both the
xor'd retaddr and the retaddr itself and the only thing standing in the way of
exploitation is the secret cookie value - not unlike with Stackguard/SSP.

3\. one would think that a per-function cookie is an improvement but...
they're shared among threads (in userland) or everything (in the kernel) so
infoleaks are just as catastrophic as before (it'd certainly help if someone
described a proper threat model for this defense). at least the kernel side
should use a per-syscall cookie to make it somewhat resemble an actual defense
mechanism (and there's some more described in my presentation).

4\. the int3 stuffing before retn must be someone's joke 'cos it sure as hell
won't prevent abusing the retn as a gadget. it does introduce a mispredicted
branch for every single function return however.

~~~
Mordak
Hey PaXTeam, thanks for having a look! I wrote the implementation, so I can
answer some of these.

1\. We don't silently skip instrumentation. If we can't find a free register
then we will force the frame setup code to the front of the function so we can
get one. See the diff in PEI::calculateSaveRestoreBlocks().

2\. We do spill the calculated value to the stack. This is unavoidable in many
cases (non-leaf functions). It would be an optimization to not do this in leaf
functions, but this would also mean finding a register that is unused
throughout the function. This turns out to be a small number of functions, so
we didn't pursue it for the initial implementation.

3\. I'm not sure what you mean by the cookies are shared. Do you just mean
that they are all in the openbsd.randomdata section? They have to live
somewhere. Being able to read arbitrary memory in the openbsd.randomdata
section would leak them, yes, though this doesn't seem to have been a problem
for the existing stack canary, which lives in the same section. I see that RAP
keeps the cookie on a register, which sounds like a neat idea. I'd be curious
to see how you manage to rotate the cookie arbitrarily.

4\. I'm glad you like the int3 stuffing. :-) We could always make the int3
sled longer if it turns out these rets are still accessible in gadgets that
terminate on the return instruction. Have you found any?

Anyway, I'm happy to see your commentary on this. You guys do some nice work!
If you have other suggestions for improvement I'd be happy to hear them. You
can email me at mortimer@.

~~~
PaXTeam
1\. both insertReturnProtectorPrologue and insertReturnProtectorEpilogue check
hasReturnProtectorTempRegister before proceeding with the instrumentation. so
either the changes to calculateSaveRestoreBlocks are not enough to prevent
that condition from ever triggering or these checks should be asserts at most
or just be eliminated altogether.

2\. sure but then this means that RETGUARD is not an improvement over
Stackguard/SSP which is not how it's marketed...

3\. shared means that entities of a class (threads in a process in userland,
every single process/thread in the kernel) see the exact same cookies so
leaking a cookie from one entity can allow exploitation by another. this is
especially detrimental to the kernel side protection. frequent enough cookie
rerandomization can help narrow this channel (RAP has a per-thread cookie in
the kernel that is updated on each syscall, and there's some more to reduce
infoleaks across kernel stacks, it's all in the presentation).

4\. any normal path leading up to the ret is a gadget and int3 stuffing does
nothing to prevent that (the underlying logic here is that if one can retarget
a return to arbitary addresses then he has already leaked enough information
so bypassing the cookie check is a no-brainer too). not only that but in the
bsd.mp kernel i just checked, of the 32199 ret (0xc3) bytes only 20236 are
actual retn insns, the rest are inside insns. so this int3 stuffing leaves
many other instances available. Red Hat tried similar gadget elimination a
while ago but noone's using the gcc feature as far as i can tell.

------
rurban
Anyone seen this new ReturnProtectorPass also upstream at clang? It was
written end of 2017 AFAIR, that when when we last discussed this here.

CFI still looks more promising to me though. Protecting the CALL part. But
this is better than the old gcc/clang stack cookie of course, protecting RET.

------
keyle
When I first read this, I thought how come no one thought of this before? It
feels like common sense.

Dumb question (potentially): will this make code that is not inlined, calling
a function many times, often, run a lot slower?

~~~
Mordak
It depends on the function. Many things will contribute. If your CPU can keep
the cookie in cache then loading it repeatedly will relatively fast compared
to hitting main memory. If your branch predictor can figure out the jmp over
the int3 instructions quickly then that will also be fast. If the function is
very short then the retguard stuff will add relatively more instructions to
the function so will have a larger impact than if the function was long, etc.

I found that the runtime overhead was about 2% on average, but there are many
factors that contribute.

------
atonse
While I didn't understand most of this, any reason why they're still using
CVS? Genuinely curious, I'm sure they have good reasons.

~~~
brynet
You must be jealous that we're self-hosted. ;-)

~~~
kazinator
I made a program called Meta-CVS in 2002 that stores a versioned directory
structure with permissions and symbolic links along with the files in an
ordinary CVS repository.

[http://www.kylheku.com/~kaz/mcvs.html](http://www.kylheku.com/~kaz/mcvs.html)

Meta-CVS has an import feature (mcvs grab) which detects renamed files. It
fixes up symlinks pointing to moved files too and such.

Meta-CVS didn't catch on widely because by the time I had it stable, CVS
itself was being side tracked by newly emerging projects like SVN.

Plus people were afraid of it being written in Common Lisp.

If you're still using CVS in 2018 and not Meta-CVS, you're ridiculously
backwards though. :)

~~~
nickpsecurity
The need to secure repos was identified by Karger in MULTICS evaluation. It
was a requirement in TCSEC security certification. A great summary of issues
is below by David A. Wheeler:

[https://www.dwheeler.com/essays/scm-
security.html](https://www.dwheeler.com/essays/scm-security.html)

On top of yours, the OpenCM and Aegis programs attempted to meet some of these
requirements. Most ignore them. Big, blind spot in software security.

------
mtgx
Speaking of OpenBSD, does anyone know what happened here and how true it is?

[https://arstechnica.com/information-
technology/2010/12/fbi-a...](https://arstechnica.com/information-
technology/2010/12/fbi-accused-of-planting-backdoor-in-openbsd-ipsec-stack/)

It would certainly jive with John Gilmore's story on how the NSA worked
through the standard bodies to keep IPSEC easily exploitable by making the
design too difficult to implement properly:

[https://www.mail-
archive.com/cryptography@metzdowd.com/msg12...](https://www.mail-
archive.com/cryptography@metzdowd.com/msg12325.html)

Their behavior around Simon & Speck and how they refused to reveal details on
how exploitable they could be also seems to be similar to their previous
tactics.

This is why it's worrisome that Google intends to implement Speck in Android
and have pushed it to the Linux kernel, too.

[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=da7a0ab5b4babbe5d7a46f852582be06a00a28f0)

More details on how the NSA has been sabotaging open source projects here:

[https://www.youtube.com/watch?v=fwcl17Q0bpk](https://www.youtube.com/watch?v=fwcl17Q0bpk)

~~~
admax88q
Don't have all my sources on hand, but the last time I looked in to this the
general conclusion I came to was that there's evidence to suggest that someone
was in fact paid to put vulnerabilities into the IPSec stack of OpenBSD. But
there was no evidence to suggest that those vulnerabilities ever got written
or if they were written that they ever made it into the source tree.

I believe OpenBSD conducted an audit of their tree when rumours of an IPSec
backdoor started and didn't find anything alarming.

~~~
sverige
Pretty ancient stuff to bring up, especially in this context. Here's the last
denial I recall by one of the people accused of planting backdoors in OpenBSD.
Note the date.

[https://www.itworld.com/article/2744922/open-source-
tools/op...](https://www.itworld.com/article/2744922/open-source-
tools/openbsd-fbi-allegations-denied-by-named-participants.html)

~~~
reacharavindh
Sorry to indulge in tin foil hattery. But this all seems in conclusive.. has
there been an independent audit of the components involved?

~~~
reacharavindh
Following up my own question. Just read the security and audit details on
OpenBSD page.

[https://www.openbsd.org/security.html](https://www.openbsd.org/security.html)

It appears that there is a continuous audit of source code. So, even if a
malicious hole was planted, it ought to be discovered in the years of repeated
auditing. Cheers to OpenBSD!

~~~
sverige
Yes, it is the best OS for security, and the audits have gone on for the
duration of the project.

