Quote from Kurt Seifried of Red Hat: http://www.openwall.com/lists/oss-security/2017/06/19/2
"I just want to publicly thank Qualys for working with the Open Source community so we (Linux and BSD) could all get this fixed properly. There was a lot of work from everyone involved and it all went pretty smoothly."
Debian security advisories rapid fire:
And just a sample from the Qualys announcement (out of many more): "local-root exploit against Exim", "local-root exploit against Sudo (Debian, Ubuntu, CentOS)", "local-root exploit against /bin/su", "local-root exploit against ld.so and most SUID-root binaries (Debian, Ubuntu, Fedora, CentOS)", "local-root exploit against /usr/bin/rsh (Solaris 11)", as well as proofs of concept for OpenBSD, FreeBSD and so on.
Keep in mind that software is moving towards a sandboxed world, especially with containers. Local attack surface is becoming increasingly valuable.
(That said, the same class of attack is in theory possible in the kernel, although the kernel tends to be way more disciplined about stack usage than userspace software.)
Another perfectly valid way of working around the exploits is to just have an installation with no setuid binaries or file capabilities. This is difficult in a general-purpose OS that needs to be backwards compatible with traditional UNIXy things, but usually totally achievable if you're building a machine to run some specific service. (And in particular it should be really easy within a container, such that these attacks would get you neither container root nor host root.)
A contained-process-to-actual-root exploit is basically going to be either an arbitrary code execution vulnerability in the kernel itself, which tends to be given a more dramatic name than "local root", or a logic flaw in the kernel's container security mechanisms.
It seems to me that "local root" is quite commonly used for kernel code execution vulnerabilities. There's also the possibility of a flaw in a service running in the init namespace that is exposed to the container.
I don't think these should be written off as "Well, you're screwed anyway" like we tend to do when discussing security against an attacker who has direct physical access to the hardware. Isolated user accounts are widely used as the model for running services because they provide significant (if non-total) benefit.
Isn't that just saying "I don't believe in multiuser systems and/or their security models"...? If so, what specifically do you have against them?
[Core Services] + SSH is generally something you can harden effectively against attacks.
[2903429034902323094230 binaries] is something you generally struggle to maintain security patches/etc on.
The simple fact is, once an attacker has an account, there is just too much attack surface on a vanilla Linux box for you to reliably do EVERYTHING you need to do to secure it 24/7/365.
At least imho, given my time constraints/budget.
Does a modern Linux come with that many binaries SETUID?!?
Sorry, I couldn't resist. :P But yea, I see your point. It's a bit sad though; that all these people's hard work (e.g. OP) is all in vain...
Part of the reason people's hard work is in vain is because any time the topic of doing things better comes up, a cluster of developers will insist there's no point in improving e.g. the filesystem because "once they're on the box, you're screwed." So it becomes a self-fulfilling prophecy.
In general there seem to be quite diverse opinions out there about "security", and a lot of the space seems occupied by "extreme pragmatists" (or even "anti-intellectuals"). E.g. lots of people feel it's warranted to peddle (simple) falsehoods instead of trying to understand (complex) problems. I can understand that it may be the right approach from a day-to-day IT management perspective, but I'm not so sure it's the most viable path towards better security long-term.
Yeah, this is why I had the caveat:
> At least imho, given my time constraints/budget.
The "best" long term path is to have larger security budgets that allow for the objective you and the other folks who dislike my response want. The problem, frankly, is we just aren't there yet.
For instance, our budget for maintaining security is ~5% of the IT budget. A large portion of that goes to perimeter defense appliances (firewalls, Barracuda antispam/antivirus filters, etc.) as well as making sure uBlock, anti-malware, etc. are installed on every machine. The other major chunk ends up in securing WAN-facing services that can be exploited remotely. The last major chunk is user training to get them to stop doing things like paying bills for services we never purchased, clicking on strange links, running strange attachments, etc.
After that, we have no resources to do more than run apt-get update && apt-get upgrade -y for protecting the attack surface once an account is breached. We've got a few things we had to recompile manually; they break with that process, so we moved them out of the OS package manager. Our actual applications we build internally also likely have exploitable vulnerabilities if attacked from a local account. Those items never have the budget to be maintained, and we certainly wouldn't survive someone taking over a local shell account.
I suspect, given this is (roughly) the situation every place I've worked at, it's simply too common to be an issue.
The other one I see touted everywhere far too often is "false sense of security." Because improving the security of one thing can't possibly help when there could be a dozen other things that may yet be vulnerable. No, instead of helping, fixing that one thing lulls you into believing you are safe and that alone makes it all less secure. :-)
If I control the stack pointer and can write to where it points, I can write to arbitrary in-process memory. Sure!
Is that just valuable as a ROP trick?
But if I have that, isn't just writing to the actual stack more valuable? Why does stack growth matter at all, besides being a complication where one cannot write to one specific page?
How does this get you to write to out-of-process memory?
That aside, the issue here is that you can have a program that correctly writes only to properly allocated objects on the heap, and to properly allocated objects on that stack: but if you can get the stack to grow down into the heap without it being detected, now your properly-allocated objects on the heap alias part of the stack (and your properly-allocated objects on the stack alias part of the heap). So now the correct writes done by the program can write to unintended places, like return values on the stack or function pointers in the heap.
The core trick is the "without it being detected" part. What they've done is find places - some of them in library code - that the stack is grown by more than the size of the guard page in one go, and where writing to the guard page itself can be avoided (or in the BSD case, they've found ways that the guard page itself can be disabled). There's also some other clever tricks around expanding the stack and the heap.
But you don't usually have that sort of control. Normally you'd use something like a buffer overflow on a stack allocated buffer.
This thing is a problem even if you have no buffer overflows on stack and do not have arbitrary write access to anywhere on stack.
What's happening here is that the program gets confused about how large its stack is, and keeps using more memory than it should for the stack. But that memory is allocated for heap objects, so a simple write to one of those (not requiring any sort of buffer overflow or other such bug) could be used to smash the stack.
"The Stack Clash is a vulnerability in the memory management of several operating systems. (...) It can be exploited by attackers to corrupt memory and execute arbitrary code."
"If you are using Linux, OpenBSD, NetBSD, FreeBSD, or Solaris, on i386 or amd64, you are affected. Other operating systems and architectures may be vulnerable too, but we have not researched any of them yet: please refer to your vendor’s official statement about the Stack Clash for more information."
There is good news: this looks much harder to attack on 64-bit systems (all the CVEs are on 32-bit OSes), possibly because the address space is so huge, and standard OS countermeasures like ASLR, the stack gap, etc. all help protect against the attack.
The other way around. The kernel has already mapped pages for the stack, and would never hand such pages to malloc unless there's a serious bug in the vm subsystem.
However, code that uses the stack has no idea where the stack ends. And reaching out of the stack is only detected via page faults. If that access happens to land on a mapped page with the right permissions, there is no fault and the code can effectively grow the stack into heap region.
In theory, you can always decrement the stack pointer for your variables. If you touch an unallocated page, the kernel will notice that you're right below the stack and give you more stack pages. There's no other way to request more stack memory, the way you can use brk() or mmap() to request more heap memory: you're supposed to page-fault and let the kernel come up with more stack.
In slightly less theory, you can decrement the stack pointer, and if you reach more memory than the kernel is willing to give you, the page fault will turn into an actual segfault, because you'll hit a specially-defined guard page that prevents you from infinitely growing the stack.
In practice, you can decrement the stack pointer by any arbitrary amount and now you just have a pointer somewhere and you have to hope it's either within the stack or in the guard page....
Exactly. But the stack is only growing in the program's view of the world.
> So the OS assumes that the stack already owns that memory
Nah, the OS is blissfully unaware that the program has moved its stack pointer to point off the stack. If anything, the OS wrongly assumes the program is still operating within the stack space explicitly and rightly reserved for it.
And the program in turn wrongly assumes this memory newly referenced via the stack pointer has been reserved for its stack, because it wasn't killed for accessing it.
> Now if you write to the heap, you can corrupt the stack.
That is right. You will corrupt what the program believes to be stack.
The user-space stack of a process is automatically expanded by the kernel:
- if the stack-pointer (the esp register, on i386) reaches the start of the stack and the unmapped memory pages below (the stack grows down, on i386),
- then a "page-fault" exception is raised and caught by the kernel,
- and the page-fault handler transparently expands the user-space stack of the process (it decreases the start address of the stack),
- or it terminates the process with a SIGSEGV if the stack expansion fails (for example, if the RLIMIT_STACK is reached).
Unfortunately, this stack expansion mechanism is implicit and fragile: it relies on page-fault exceptions, but if another memory region is mapped directly below the stack, then the stack-pointer can move from the stack into the other memory region without raising a page-fault, and:
- the kernel cannot tell that the process needed more stack memory;
- the process cannot tell that its stack-pointer moved from the stack into another memory region.
The real issue here is that writing programs in memory-unsafe languages is inherently difficult and risky, and fewer programs should be written that way.
It looks like the fix is going to involve adding some code to LLVM to probe each stack page when you make a large stack allocation, but once that happens, it's straightforward for clang to implement -fstack-check for C and C++ programs, too.
This is a weird emergent problem from the fact that stack and heap memory are part of the same address space and that the stack is designed to implicitly grow as needed. It's not clear that it's a language or compiler's fault for relying on the stack doing that, nor that it's the platform or kernel's fault for making that approach possible.
I'm not totally sure how you would design a language so that you don't have this problem. I guess you could forbid a function from using more than some small amount of stack, and make sure your stack guard area is at least that big, but that seems like an unfortunate restriction. Maybe something like the Stackless Python approach, where all variables are heap-allocated, would work?
Also, all that said, note that Windows gets this right: MSVC inserts calls to _chkstk() to do stack probing. I believe this entire class of vulnerability doesn't exist on Windows, and probably some people at MS who spend their entire lives in memory-unsafe languages are feeling very smug today.
At a source level, these programs may as well be 100% bug free.
This is nothing more than a quirk of implementation. And one that affects all languages that use the stack (and by which I mean the stack, not some heap-allocated structure the language provides stack-like operations on). Memory safety doesn't really enter the picture.
If "all programs are safe", we could just use the Amiga OS kernel and no longer need an MMU, or a similar design.
[ed: apparently windows NT takes steps to avoid this according to a sibling comment. Not clear, but I assume it implies a performance hit for certain heavy stack usage?]
1. Memory-safe languages are about making sure that the programmer's intended behavior matches the actual behavior, that is, eliminating a class of bugs related to memory unsafety. They are a security scheme insofar as these bugs are security bugs, but they're not an interprocess security scheme. In particular, you can write a memory-safe debugger that goes and makes arbitrary modifications to other processes the OS gives it access to. You can even write a memory-safe program in Rust that goes and edits /proc/self/mem. But in these cases, the programmer is intending to mess with process memory directly, so the language isn't obligated to stop the programmer. It is obligated to stop the programmer from, say, overflowing a string and overwriting the return address.
2. It is certainly possible to design a memory-safe language that is usable for interprocess memory protection. Microsoft had a research OS called Singularity that did exactly this: https://www.microsoft.com/en-us/research/wp-content/uploads/... But it's another step on top of memory safety.
3. Preemptive multitasking and protected memory aren't inherently related (although, yes, in the market, most cooperatively multitasked OSes lacked memory protection, and most OSes with memory protections were preemptively multitasked). You can have a preemptively multitasked system with no MMU at all; you just need to respond to timer interrupts and switch tasks.
I suppose it's fine to say malloc will return memory and it's up to the process to check if there are any overlaps - but that sounds a little crazy?
If you decrement your stack pointer by a large value and then offset it - which is essentially what's happening in these cases - the kernel can't arbitrate that because if the access lands in otherwise allocated memory it doesn't fault and so the kernel never sees the access at all.
I can see how the current stack/heap thing evolved - but I still think it's crazy :-)
All that's happening here is that userspace is moving its stack pointer into the heap it had previously allocated. Note that "moving the stack pointer" is not a kernel-mediated operation.
Conversely, if you grow your stack by some value on the order of gigabytes, you're basically coming up with a pointer that appears to have no relation to the stack, and dereferencing it. So the platform is going to do exactly what it does if you were to dereference the same pointer value with no stack involved: read/write memory if it's mapped and segfault if not.
You could totally imagine a platform where growing the stack were a more well-defined operation. You want to avoid each function call and local allocation having the overhead of a system call, though: the nice thing about the current scheme is that it's zero-overhead if there's a mapped stack page. So the scheme was designed (or probably emerged more than was intentionally designed) for the case where syscalls are very slow, MMUs work fine, and perfect memory safety isn't the goal, i.e., the original UNIX target audience. :-)
You could keep a thread-local variable somewhere indicating the current stack limit, and make a system call when you need to increment it. That doesn't require an MMU at all: the userspace API is that you call some system routine when you need to expand the stack, and it says yes or no (or it either says yes or kills you with a segfault, or whatever). In an MMU-less system, you can just keep track of the amount of heap allocation, and have the system routine fail when you're too close to your heap.
Or you could do stack probing, which works but requires an MMU.
If performance were a concern (and I don't really think it is), it should be possible to reduce the impact by assuming a sufficiently large stack guard that the majority of functions with fixed size stack frames could never leap over. Then these functions wouldn't need any runtime checks. The only checking you'd have to do is on code that uses VLAs, alloca, or such, along with the few outlier functions that use ridiculously large fixed size buffers. I don't see why you should need to touch every page.
Section IV.1.7 ("64-bit exploitation") mentions CVE-2017-1000379.
These are remarkable because they are conceptually straightforward, yet the exploits they enable are potentially quite powerful.
edit: original first sentence was "told Red Hat, Red Hat fixed them." Apparently that was wrong and people (appropriately!) want credit attributed to the right parties.