
The Stack Clash - fcambus
https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
======
0x0
This looks pretty brutal.

Quote from Kurt Seifried of RedHat [http://www.openwall.com/lists/oss-
security/2017/06/19/2](http://www.openwall.com/lists/oss-
security/2017/06/19/2)

" _I just want to publicly thank Qualys for working with the Open Source
community so we (Linux and BSD) could all get this fixed properly. There was a
lot of work from everyone involved and it all went pretty smoothly._ "

Debian security advisories rapid fire:

glibc [https://lists.debian.org/debian-security-
announce/2017/msg00...](https://lists.debian.org/debian-security-
announce/2017/msg00146.html)

linux [https://lists.debian.org/debian-security-
announce/2017/msg00...](https://lists.debian.org/debian-security-
announce/2017/msg00148.html)

exim4 [https://lists.debian.org/debian-security-
announce/2017/msg00...](https://lists.debian.org/debian-security-
announce/2017/msg00147.html)

libffi [https://lists.debian.org/debian-security-
announce/2017/msg00...](https://lists.debian.org/debian-security-
announce/2017/msg00149.html)

And just a sample from the Qualys announcement (out of many more), "local-root
exploit against Exim", "local-root exploit against Sudo (Debian, Ubuntu,
CentOS)", "local-root exploit against /bin/su", "local-root exploit against
ld.so and most SUID-root binaries (Debian, Ubuntu, Fedora, CentOS)", "local-
root exploit against /usr/bin/rsh (Solaris 11)", as well as proof of concepts
for OpenBSD, FreeBSD and so on.

~~~
kseifried
Yup, it is, Qualys did some amazing work, and I suspect there are still more
problems like this. So if you find stuff like this please let us (Red Hat /
etc.) help you help us help everyone (I think that parses correctly =).

------
jtchang
That is a crazy number of CVEs. At a quick glance I am seeing a lot of local
root exploits. Generally speaking, if an attacker has an account on your
system you are already hosed. But this doesn't bode well for more
vulnerabilities of this nature that don't require a local account.

~~~
bjornsing
> Generally speaking, if an attacker has an account on your system you are
> already hosed.

Isn't that just saying "I don't believe in multiuser systems and/or their
security models"...? If so, what specifically do you have against them?

~~~
fictioncircle
> Isn't that just saying "I don't believe in multiuser systems and/or their
> security models"...? If so, what specifically do you have against them?

[Core Services] + SSH is generally something you can harden effectively
against attacks.

[2903429034902323094230 binaries] is something you generally struggle to
maintain security patches/etc on.

The simple fact is, there is just too much attack surface on a vanilla Linux
box, once an attacker has an account, for you to reliably do _EVERYTHING_ you
need to do to secure it 24/7/365.

At least imho, given my time constraints/budget.

~~~
bjornsing
> [2903429034902323094230 binaries] is something you generally struggle to
> maintain security patches/etc on.

Does a modern Linux come with that many binaries SETUID?!?

Sorry, I couldn't resist. :P But yea, I see your point. It's a bit sad,
though, that all these people's hard work (e.g. OP's) is all in vain...

~~~
rodgerd
> that all these people's hard work (e.g. OP) is all in vain...

Part of the reason people's hard work is in vain is that any time the topic
of doing things better comes up, a cluster of developers will insist there's
no point in improving e.g. the filesystem because "once they're on the box,
you're screwed." So it becomes a self-fulfilling prophecy.

~~~
bjornsing
Yea, I get a similar feeling...

In general there seem to be quite diverse opinions out there about
"security", and a lot of the space seems occupied by "extreme pragmatists" (or
even "anti-intellectuals"). E.g. lots of people feel it's warranted to peddle
(simple) falsehoods instead of trying to understand (complex) problems. I can
understand that it may be the right approach from a day-to-day IT management
perspective, but I'm not so sure it's the most viable path towards better
security long-term.

~~~
fictioncircle
> I can understand it may be the right approach from a day-to-day IT
> management perspective, but I'm not so sure it's the most viable path
> towards better security long-term.

Yeah, this is why I had the caveat:

> At least imho, given my time constraints/budget.

The "best" long-term path is to have larger security budgets that allow for
the objective that you and the other folks who dislike my response want. The
problem, frankly, is we just aren't there yet.

For instance, our budget for maintaining security is ~5% of the IT budget. A
large portion of that goes to perimeter defense appliances (firewalls,
barracuda antispam/antivirus filters, etc.) as well as making sure ublock,
anti-malware, etc are installed on every machine. The other major chunk ends
up in securing WAN-facing services that can be exploited remotely. The last
major chunk is user training to get them to stop doing things like pay bills
for services we never purchased, clicking on strange links, running strange
attachments, etc.

After that, we have no resources to do more than run apt-get update && apt-get
upgrade -y for protecting the attack surface once an account is breached.
We've got a few things we had to recompile manually, which breaks that
process, so we moved them out of the OS package manager. Our
actual applications we build internally also likely have exploitable
vulnerabilities if attacked from a local account. Those items never have the
budget to be maintained and we certainly wouldn't survive someone taking over
a local shell account.

I suspect that, given this is (roughly) the situation at every place I've
worked, it's simply too common to be seen as an issue.

------
starmole
I'm trying my hardest to understand how this is a novel problem. Maybe
somebody can help me?

If I control the stack pointer and can write to where it points, I can write
to arbitrary in-process memory. Sure!

Is that just valuable as a ROP trick?

But if I have that, isn't just writing to the actual stack more valuable? Why
does stack growth matter at all, besides being a complication where one cannot
write to one specific page?

How does this get you a write to out-of-process memory?

~~~
caf
Firstly, it's not exactly a novel problem - as the advisory points out, there
were earlier public examples of the bug class in 2005 and 2010.

That aside, the issue here is that you can have a program that correctly
writes only to properly allocated objects on the heap, and to properly
allocated objects on the stack: but if you can get the stack to grow down
into the heap without it being detected, your properly-allocated objects on
the heap now alias part of the stack (and your properly-allocated objects on
the stack alias part of the heap). So now the correct writes done by the
program can write to unintended places, like return addresses on the stack or
function pointers in the heap.

The core trick is the _"without it being detected"_ part. What they've done
is find places - some of them in library code - where the stack is grown by
more than the size of the guard page in one go, and where writing to the
guard page itself can be avoided (or in the BSD case, they've found ways that
the guard page itself can be disabled). There are also some other clever
tricks around expanding the stack and the heap.

------
age_bronze
I'm very surprised. I was sure -fstack-check was on by default. The fact that
the stack isn't secure without it has been known for years. Windows compilers
have had that check for years. The bug isn't in any particular executable:
gcc and all other compilers should have -fstack-check on by default, with an
optional disable. I'm even more surprised that people who are supposed to
know what they are doing don't compile with it.

------
dgellow
Could someone explain the situation like if I was 5?

~~~
brohee
On Linux and most Unices (at least on ix86), the heap starts at the bottom of
the address space and its upper bound grows up as you allocate more memory,
while the stack starts at the top of the address space and grows downward as
you allocate more stack (stack allocation: mostly function calls and the odd
alloca()). malloc(), which manages the heap, has no idea of where the stack
currently ends, and thus can allocate memory at an address that is also
claimed by the stack. Writing to this address then permits smashing the stack
without needing an actual buffer overflow in the attacked code.

There is good news: this looks way harder to attack on 64-bit systems (all
the CVEs are on 32-bit OSes), possibly because the address space is so huge,
and standard OS countermeasures like ASLR, the stack gap, etc. all help
protect against the attack.

~~~
clarry
> malloc(), which manages the heap, has no idea of where the stack currently
> ends, and thus can allocate memory at an address that is also claimed by
> the stack

The other way around. The kernel has already mapped pages for the stack, and
would never hand such pages to malloc unless there's a serious bug in the vm
subsystem.

However, code that uses the stack has no idea where the stack ends. And
reaching out of the stack is only detected via page faults. If that access
happens to land on a mapped page with the right permissions, there is no fault
and the code can effectively grow the stack into heap region.

~~~
geofft
Right. The malloc heap has defined starting and ending points. (I say "points"
because it's common to use mmap to get a bunch of new pages, which might be
discontiguous from the existing heap, but you still know exactly where those
pages are.) The stack, on the other hand, is just a pointer.

In theory, you can always decrement the stack pointer for your variables. If
you hit an unallocated page, the kernel will notice that you're right below
the stack and give you more stack pages. There's no other way to request more
stack memory, the way you can use brk() or mmap() to request more heap memory:
you're _supposed_ to page-fault and let the kernel come up with more stack.

In slightly less theory, you can decrement the stack pointer, and if you try
to claim more memory than the kernel is willing to give you, the page fault
will turn into an actual segfault, because you'll hit a specially-defined
guard page that prevents you from infinitely growing the stack.

In practice, you can decrement the stack pointer by any arbitrary amount and
now you just have a pointer somewhere and you have to hope it's either within
the stack or in the guard page....

~~~
AnimalMuppet
Slightly more specifically, if the stack is growing into space that the heap
already owns, you don't get a page fault. So the OS assumes that the stack
already owns that memory, since there was no page fault. Now if you write to
the heap, you can corrupt the stack.

~~~
clarry
> Slightly more specifically, if the stack is growing into space that the heap
> already owns, you don't get a page fault.

Exactly. But the stack is only growing in the program's view of the world.

> So the OS assumes that the stack already owns that memory

Nah, the OS is blissfully unaware that the program has moved its stack pointer
to point off the stack. If anything, the OS wrongly assumes the program is
still operating within the stack space explicitly and rightly reserved for it.

And the program in turn wrongly assumes this memory newly referenced via the
stack pointer has been reserved for its stack, because it wasn't killed for
accessing it.

> Now if you write to the heap, you can corrupt the stack.

That is right. You will corrupt what the program believes to be stack.

~~~
AnimalMuppet
From TFA:

The user-space stack of a process is automatically expanded by the kernel:

- if the stack-pointer (the esp register, on i386) reaches the start of the
stack and the unmapped memory pages below (the stack grows down, on i386),

- then a "page-fault" exception is raised and caught by the kernel,

- and the page-fault handler transparently expands the user-space stack of
the process (it decreases the start address of the stack),

- or it terminates the process with a SIGSEGV if the stack expansion fails
(for example, if the RLIMIT_STACK is reached).

Unfortunately, this stack expansion mechanism is implicit and fragile: it
relies on page-fault exceptions, but if another memory region is mapped
directly below the stack, then the stack-pointer can move from the stack into
the other memory region without raising a page-fault, and:

- the kernel cannot tell that the process needed more stack memory;

- the process cannot tell that its stack-pointer moved from the stack into
another memory region.

~~~
e12e
This is crazy. I remember thinking when I first heard about stack and heap
growing towards each other: uh-oh. But the problem was so blindingly obvious
that I just assumed any system written for anything beyond co-operative
multitasking had a fix - because if not there was obviously no actual memory
safety...

~~~
tptacek
What would the fix be? The fact that you can point the stack pointer at
arbitrary memory and the CPU will treat that memory as the stack is a feature,
and an important one, of the ISA.

The real issue here is that writing programs in memory-unsafe languages is
inherently difficult and risky, and fewer programs should be written that way.

~~~
e12e
The design might be fundamentally broken, but if so, you're saying that even
with an MMU we can only have co-operative multitasking. That's not the
promise of a multi-user/multi-process system.

If "all programs are safe", we could just use the AmigaOS kernel, or a
similar design, and no longer need an MMU.

[ed: apparently windows NT takes steps to avoid this according to a sibling
comment. Not clear, but I assume it implies a performance hit for certain
heavy stack usage?]

~~~
geofft
I don't entirely follow what you're saying in your first sentence, but I'm
going to try to respond to what I think you're saying. If I'm off base please
let me know!

1. Memory-safe languages are about making sure that the programmer's intended
behavior matches the actual behavior, that is, eliminating a class of bugs
related to memory unsafety. They are a security scheme insofar as these bugs
are security bugs, but they're not an _interprocess_ security scheme. In
particular, you can write a memory-safe debugger that goes and makes arbitrary
modifications to other processes the OS gives it access to. You can even write
a memory-safe program in Rust that goes and edits /proc/self/mem. But in these
cases, the programmer is _intending_ to mess with process memory directly, so
the language isn't obligated to stop the programmer. It is obligated to stop
the programmer from, say, overflowing a string and overwriting the return
address.

2. It is certainly possible to design a memory-safe language that is usable
for interprocess memory protection. Microsoft had a research OS called
Singularity that did exactly this: [https://www.microsoft.com/en-
us/research/wp-content/uploads/...](https://www.microsoft.com/en-
us/research/wp-content/uploads/2016/02/osr2007_rethinkingsoftwarestack.pdf)
But it's another step _on top of_ memory safety.

3. Preemptive multitasking and protected memory aren't inherently related
(although, yes, in the market, most cooperatively multitasked OSes lacked
memory protection, and most OSes with memory protections were preemptively
multitasked). You can have a preemptively multitasked system with no MMU at
all; you just need to respond to timer interrupts and switch tasks.

~~~
e12e
You're right, I leapt over some points and landed slightly outside the
discussion - I guess I think of growing the stack and allocating heap memory
as something the kernel should be the arbiter of - and that the API should
never allow you to grow into your own (or another process') memory.

I suppose it's fine to say malloc will return memory, but it's up to the
process to check if there are any overlaps - but that sounds a little crazy?

~~~
caf
That _is_ basically how it works, with the caveat that if you want the kernel
to arbitrate your stack expansion you must only expand by a page at a time
(and that's what gcc's -fstack-check does).

If you decrement your stack pointer by a large value and then offset it -
which is essentially what's happening in these cases - the kernel _can 't_
arbitrate that because if the access lands in otherwise allocated memory it
doesn't fault and so the kernel _never sees the access at all_.

~~~
e12e
I meant that the equivalent for malloc would be that if you allocate a 1 MB
buffer and a 2 MB buffer, the kernel might return a 2 MB buffer _overlapping_
your earlier 1 MB buffer - and be all like: "you asked for 1, you asked for 2 -
and you've got 2 - if you wanted 3, you should've asked for 3". Afaik malloc
doesn't work like that - it assumes that you want more memory (and can fail or
succeed etc).

I can see how the current stack/heap thing evolved - but I still think it's
crazy :-)

~~~
caf
The stack doesn't work like that either.

All that's happening here is that userspace is moving its stack pointer into
the heap it had previously allocated. Note that "moving the stack pointer" is
not a kernel-mediated operation.

~~~
e12e
No, of course - but the fact that you can "ask for more memory" by growing
the stack onto your heap (rather than, say, having the two start somewhere
together and grow apart) means that there's an asymmetry: malloc will give
you more RAM or fail; growing the stack can make your allocated memory
overlap.

~~~
caf
Your stack has to grow towards _something_. Sure, you can have it grow towards
the bottom of the address space (which, due to wraparound, is also the top -
where it will safely collide with the kernel addresses), but that only works
for one stack - as soon as you create another thread, its stack has to grow
towards something else.

------
staticassertion
Take note of the Grsecurity section. We already have the technology necessary
to mitigate or significantly reduce the impact of these vulnerabilities.

Solid writeup.

------
cbhl
It's not clear to me why compiling all userland code with -fstack-check would
help. Couldn't you work around that by copying or creating an executable in
assembly that doesn't write every 4 KB?

~~~
sloppycee
The issue isn't that a user can run their own code that can stack smash, but
rather that a user can exploit the smash to run their code in a _privileged
context_ (setuid binaries, e.g. sudo, su, etc.).

------
0x0
On 32-bit x86, couldn't the SS segment selector be mapped to a completely
different set of memory compared to CS/DS/ES, and thus remove the possibility
of the stack and the heap clashing?

~~~
jwilk
How would you distinguish between stack pointers and non-stack pointers?
Making pointers longer than 32-bit doesn't sound appealing.

~~~
0x0
Good point, I forgot about C allowing passing around pointers to stack
variables. Hmm.

~~~
0x0
What if you set up SS to mirror DS, but with a limited range, so that any
attempt to access memory outside the stack via SS: causes a page fault?
Wouldn't any exploit running ESP down into the heap be thwarted by any stack-
related instruction (push, pop, or an interrupt)?

~~~
ajenner
The accesses causing issues here are not guaranteed to use SS - that only
happens for effective addresses [ebp+...] and [esp+...]. If ESP is copied into
another register first (which in practice will almost always be the case) then
the access will use DS. PUSH will always use SS but that's not the issue here
(that only moves ESP by 4 bytes so it'll always hit the guard page). And in
modern OSes, interrupts don't use the user mode stack at all - the CPU will
switch to kernel mode and use a kernel stack since the user mode stack isn't
guaranteed to be valid.

~~~
0x0
Interesting. I was just curious if it would be impossible to write shellcode
without triggering an SS:ESP access (via call,push,pop,ret) that would page
fault due to protection/selector limits, because that seemed like a neat way
to mitigate.

------
busterarm
A classic always worth reading:
[http://insecure.org/stf/smashstack.html](http://insecure.org/stf/smashstack.html)

