Hacker News new | past | comments | ask | show | jobs | submit login
There’s a Hole in the Bottom of the C: Effectiveness of Allocation Protection [pdf] (mit.edu)
122 points by ingve 4 months ago | hide | past | web | favorite | 88 comments

Short version: yet another hokey scheme for fixing buffer overruns in C fails for

   typedef struct {
        char buf[124];
        void* fptr;
    } mystruct;
because if you overrun "buf", you can overstore "fptr" without going outside the bounds of "struct". Exploit for Ngnix found.

We've got to get more code out of C and into something with subscript checking. None of the kludges for fixing C work.

One (non-generalizable) solution is to avoid using writable function pointers, to help maintain control flow integrity.

For example, in libopus we have platform-specific SIMD optimizations that are chosen at runtime.

The common solution is just to have a function pointer that points to the accelerated routine. But there are only a limited number of choices for each routine. So we put pointers to them in a static const array, which gets stored on a read-only page. Then instead of storing a pointer, we just store an index into this array. This index is chosen based on the available instruction sets detected at runtime. This lets us use the same index for every accelerated function.

Then, we pad the array size out to a power of two. To call an accelerated function, we mask off the upper bits of the index. Even if the index gets corrupted, we still call one of the functions in the array. So there's only so much damage it can do.

Obviously this doesn't apply if the set of functions you need to call is open-ended (e.g., a callback, like nginx was using). But it seems like a good pattern to follow when it applies.

Is there a reason why you did that that way other than to avoid writeable function pointers?

Also, that just sounds like how switch/case is sometimes implemented. Have you considered just using a switch/case statement instead of manually managing function pointers and the like?

That was the main reason.

It also means that you don't need to pass around (a pointer to) a big table of function pointers throughout all of your functions that need to use accelerated routines. Instead you just pass a single integer index. But that is pretty minor.

It can be pretty similar to a switch in practice, but I think there are a few differences. One is function argument handling. Each version of an accelerated routine has to be an independent function, compiled in a separate compilation unit, because you have to limit the instruction sets available. So you will duplicate all of the function call setup overhead in each branch of the switch. This probably gets optimized away by a decent compiler. Even if it does, you still wind up essentially duplicating the function table at every call site. That does not seem likely to get optimized away. You could dispatch to a common thunk which selects the accelerated routine to call, but now you have two function calls instead of one. I am also not sure that switch statements handle the default case as cheaply as doing a single bitwise AND with a small, compile-time constant.

But if you have a use of function pointers that can be replaced by a simple switch statement, that is probably the better approach.

Your approach is kind of how indirect function calls are implemented in WebAssembly, if I am not mistaken.

Using a switch requires that you statically determine and list every target. Implementing your own jump table let's you build it at runtime.

Even in the dynamic case, you can still use an possibly malloced array and verify the index is valid, which is a nice improvement over leaving function pointers lying around in the vicinity of buffers.

I don’t think they actually found an exploit for Nginx: they state that “we assume such a memory bug exists”, rather than saying outright what bug they exploited. It almost sounds like they planted a bug and then exploited it in a binary that was hardened (using this non-standard hardening technique, i.e. not exploiting a binary that was ever used in a real deployment of Nginx).

Exploit != vulnerability. You develop an exploit to take advantage of a specific vulnerability. An exploit for a hypothetical vulnerability is a valid concept, and their conjecture that such a vulnerability may exist is justified in the paper.

I always thought this is example of basic buffer overflow in C - what's different with this example? If somebody asked me to write buffer overflow example in C I'd write something that looks exactly like this, this must have been known for years, what am I missing here?

The "classic" buffer overflow involves overwriting the return address by overflowing an array on stack. Low-fat pointers can protect against that problem [1].

[1]: https://www.comp.nus.edu.sg/~gregory/papers/ndss17stack.pdf

It would be nice if the betterC mode of Dlang would get traction. That fixes the worst weaknesses of C without having to create a whole new ecosystem.

> without having to create a whole new ecosystem

But it does, doesn't it?

You're suggesting we replace C, which isn't necessarily a bad idea, but there are considerable hurdles, as it's a lingua franca. C has an excellent suite of compilers for every platform you can name, excellent development tools, every serious programmer already knows the language, and it easily hooks in to platform-specific functionality.

If we want to replace C, we have a few languages to choose from: BetterC, Zig, Rust.

If BetterC were a strict subset of C, things would be quite different, but it's not, so it has the just another language problem.

If we want to go the subset route, we already have MISRA C, which blocks various categories of bugs and undefined behaviour. Crucially, it's possible to mechanically verify that a program is MISRA-compliant.

No, it's more than that. While that struct might be a generic overrun, the question is how you check bounds on filling buf. The paper is discussing that schemes to constrain buffer size using a global hardware or software address-based size constraints ("low-fat") as bounds checks are less safe than claimed.

I don't know why they didn't put this in the abstract.

That will never happen with UNIX based OSes around, so what we really need, as much as I would like to nuke C, is to have developers accept some variant of "Safe C".

Solaris has it thanks to SPARC tagged memory.

Problem is when would we ever get it as mainstream, always enabled, in other platforms.

Rust is getting some traction, and much web infrastructure has moved away from C/C++ to C# or Go or Java. After decades of little progress, we're seeing forward motion.

True, myself have moved definitely into Java/.NET worlds back in 2006, although C++ is still an option for what is unavoidable to do without it. Maybe in the future their canonical implementations can even be fully bootstraped or have replaced the C++ layer with something else.

But even if all userspace developers move to safer languages, we still have the UNIX derived kernels, PIC and other microcontrollers, routers and plenty of other embedded stuff to worry about.

Even Microsoft with their Azure Sphere project, which is supposed to be all about secure IoT, still bundles a Linux kernel with it.

Hence why we really need to fix C somehow.

There's no way to "fix C" in a way that saves you from the cost of fixing all those broken programs written in C. The best a language-level solution can do for an existing project is make it easier to figure out what specifically needs fixing. Or blow up your code and run time with checks, I guess, but if that was technically feasible, to say nothing of political challenges, those projects wouldn't have been in C to start.

I feel like Elrond at the council of Rivendell: "you have only one choice." Except throwing a ring into a hole is a comparatively well-defined task relative to rewriting an unknown number of crappy pieces of software. Ed: also there's no conveniently situated group of people to whom to assign the project.

> Or blow up your code and run time with checks, I guess, but if that was technically feasible, to say nothing of political challenges, those projects wouldn't have been in C to start.

Most of the infrastructural projects you're referring to predated modern languages. There is no reason, for example, for an MTA to be written in C today. We use MTAs written in C because MTAs are old.

They had pascal and friends, didn't they? Which are safer, safe-ish, right? But I guess today you could eat the cost of automatically inserted runtime checks on software designed to push the limits of 30yo hardware. Still won't fly for Linux or embedded systems...

There is this expression, a language is a dialect with an army.

The same applies to system languages, just that the army is the OS that bundles them as THE official language.

Pascal based OSes unfortunately lost their place on the market.

To pick a more modern example, how many people would have bothered with JavaScript if it wasn't THE language on the Web OS?

Schemes for memory safety of C code have already been applied to UNIXy OS's. SVA-OS with SAFEcode used FreeBSD.

Far as embedded, there were some done in Ada and a Java subset. There's even real-time GC's for that kind of stuff. One or two companies sell Java processors that run it natively. CompSci stays doing CPU extensions that could work with MCU's.

The versions of Pascal that are practical aren't any safer than C.

With a properly designed programming environment, you don't need many bounds checks to begin with. Almost every embedded system can easily afford the cost of the few you can't eliminate.

On the contrary.

While they aren't Rust level safe, they eliminate like 80% of the common C safety errors.

Thanks to explicit conversions, proper enums, reference parameters, actual string and vector types, memory allocation based on types instead of arbitrary numbers, type safe linkage thanks modules.

So while use after free or double frees would are still an issue, many other classes of typical C errors would be gone.

Conversions and such aren't where the memory-safety-related vulnerabilities in C++ come from nowadays. They come from memory management problems, which Pascal doesn't help with.

Moving goal posts here? I thought we were speaking about C.

Yes Rust is great and all, sometimes getting people half-way is better than not at all.

He probably meant to type C/C++ or something but typo'd it. The set of vulnerabilities is exactly the same. If Pascal still lets you do heap corruption, it's only a matter of time before RCEs become common. So yeah, I'm convinced on that point.

Basically, until the world gets rid of UNIX derived OSes, we call it a defeat and go back home.

If you're trying to characterize what I said, no. I'm saying we need to work on convincing the world to frigging drop Unix-derived OSes already.

Agreed, but it seems we are loosing on that front from the last 20 years, right?

Including the commercial failure of Inferno.

How things are going even Microsoft could eventually have again their own version.

So from the business perspective it seems the only way to win is to fix the language UNIX is written on.

> userspace

Pure userspace software is the easiest to fix; there has already been a huge shift to {safer,easier} languages for most new(-ish) software. C/C++ is mostly used for existing software where other options we not {possible,practical}. Porting to another language will probably happen now that we have options like Rust, but it will take a massive amount of work.

However, one place where we need more (and better) alternatives to C is the library ABI. If you want to make a library that can be used by as widely as possible you either use C or something that can generate libraries with a standard C-style ABI. That's the FFI tools of most other languages expect. (C++ ABI is also used, but name-mangling complicates interfacing with other languages) Rust's #[no_mangle] helps, but what we really need is standardized ABI with a safer calling conventions and memory handling semantics that is still very generic, without any language specific baggage or other complications. Until that exists, key parts of the infrastructure will stay with C out of necessity.

> kernels

That's one of the infrastructure pieces I was referring to. Other parts that depend critically on a generic C-like ABI interoperability might include driver-based stuff like libGL.so.

> microcontrollers

While I needed the low-level properties of C (w/ assembly) when I was implementing {802.3,IP,TCP,HTTP,SNMP} on a Z80-clone with 16kB of RAM, modern "microcontrollers" often aren't very "micro" and basically a lightweight linux host. They will move to other languages following after regular userspace software.

However, this is assuming there is even a need for safer languages on the micro, which is true when it's accepting data from a hostile source like a network. A much better idea is to simply not connect your embedded device to a network. A safer language isn't necessary if you aren't controlling anything dangerous, your input is limited to reading buttons states with GPIO, and any bugs can be fixed with a reset from a hardware watchdog timer.

> other embedded stuff to worry about.

There is defiantly a lot of embedded crap that need to be junked and replaced asap, but that's not really a C problem. Buffer overrun safety is great but it doesn't do much when the device has a backdoor password clearly visible in the firmware, ships with the admin password "admin", or grants remote admin access when the password is "" (empty string).

We do need to fix C, but there are, unfortunately, far worse problems with embedded devices.

Regarding ABIs, ChromeOS, Android, Windows (COM/UWP/.NET) and the old guard mainframes are living proof that one can have a library ABI completely unrelated to C.

On the microcontrollers side, there are a couple of vendors that I usually advocate, which do sell Basic, Pascal, Oberon, Java implementations, however if it surprising and refreshing as well, that they still manage to keep in business, given how hard is to move most embedded devs away from C89 + extensions/Assembly duo.

Talks about how to advocate C++ to such devs are a now a recurring theme at C++ conferences.

Even Microsoft which uses a mix of C# and Rust for IoT Edge, they went with Linux and C on Azure Sphere, which is supposed to be their mark on secure IoT, while I was expecting that at very least they would use the opportunity to push their Checked C project, given that Azure Sphere is being pushed by the same MSR group from Singularity and friends.

Check out CHERI - it might get practical some day, if CPU vendors decide to implement it: https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/

(Disclaimer - I work there.)

Thanks, I do follow the project. Good luck keeping it relevant.

> We've got to get more code out of C and into something with subscript checking. None of the kludges for fixing C work.

shameless plug: https://www.c2rust.com

In such cases you can insert a barrier with a magic value to detect any array overflow.

The other thing to do is to enforce checks on array indices.

You're referring to canaries, which are special values used to detect if overflows occur.

Canaries are usually checked on heap deallocation (resp. stack canaries are checked on stack frame deallocation, i.e. function return). Checking them at any other time is expensive and incurs high overhead. They also don't catch all overflows - an overflow with an attacker-controlled index could easily skip the canary and overwrite whatever lies past it.

The problem of memory bounds checking is solvable if you decide to forego memory, processing time, and/or compatibility with existing code. Trying to achieve memory safety while maintaining those three properties is very hard.

There's no way around it.

What I described is what higher-level languages do if they provide this sort of protection and it's what must do done if such protection is required.

Most higher-level languages simply perform bounds checks rather than relying on canaries for safety. Although they can be combined, if your canaries are hit it means your HLL isn’t doing a good enough job of enforcing bounds.

That's what I wrote...

> None of the kludges for fixing C work.

Turn on ASAN in production. HWASAN may make that more palatable.

Honest question: Can't you just swap around the order of fptr and buf ?

Won't work if you have more pointers and buffers in the structure.

or you could move all pointers to the beginning of the struct, and could even make the pointer a const (read only).

typedef struct {

        const void* fptr; 

        char buf[124];

    } mystruct;

Actually the structure in Nginx that the authors attacked is laid out exactly in this way:

    struct ngx_http_request_s {
        uint32_t signature;
        ngx_connection_t connection;
        ngx_http_log_handler_pt  log_handler;
        u_char    lowcase_header[NGX_HTTP_LC_HEADER_LEN];
        unsigned http_minor:16;
        unsigned http_major:16;
They underflow lowcase_header (or rather, assume the existence of an underflow bug) to overwrite log_handler, a function pointer that is called when an error occurs.

I do find the authors' presentation a little disingenuous, because the purported underflow bug does not actually exist - the authors assume it does, then proceed as if they have full control over the function pointer (including the ASLR leak necessary to obtain proper gadget addresses).

The const keyword really doesn't mean anything. It is not checked at runtime, so a buffer-overrun could change the const member regardless

and even if it were, it is in the wrong position (as defined, fptr is a mutable pointer to const data).

What if mystruct ends up in an array? buf[125] will overwite fptr all the same. Furthermore, what if you need two arrays in your struct?

Wouldn't Valgrind catch this one?

Valgrind is basically an implementation of the defense they are saying doesn't work. It works on an allocation basis.

To my knowledge, valgrind doesn't catch overflow errors that overflow inside of an object.

Here's my proposal for fixing buffer overflows in C:


It's simple and it works, we've got 15 years experience with this in D.

I mean, sure, fat pointers are the "right" solution, but the point of "low-fat" pointers in the original paper [1] is for greater compatibility with existing C code, including at the ABI level. Adding fat pointers to C (which is what your proposal is) is incompatible with those goals. Page 2 of the paper discusses this.

[1]: https://www.comp.nus.edu.sg/~gregory/papers/ndss17stack.pdf

My proposal does not impact existing C code, and does not make existing C pointers into phat pointers.

It's purely additive, which means you can incrementally adopt it in places where it matters.

It makes a subset of what are currently C pointers—arrays in argument position—into fat pointers.

The phat pointers only appear when they are specifically declared as phat pointers. Existing syntax does not change behavior.

Yes, and that's a laudable goal, but it's not the same problem as the one the "low-fat" pointers proposal is trying to solve. The "low-fat" pointers proposal attempts to improve the safety of existing code.

Is your complaint that because we don't have a perfect automatic solution we shouldn't bother offering any solutions, not even incremental opt-in ones?

I never said fat pointers shouldn't be added to C. The point is that they're irrelevant to this paper, which demonstrates a weakness in a mitigation designed to improve safety of existing code.

No, pc is simply saying that the fat pointer solution is not in line with the goals of the solution presented in the paper.

I still think that the Deputy's project (from UC Berkeley) approach (https://barnowl.org/research/pubs/07-hotos-linux.pdf) is the most practical for adding bounds checks to existing C code. Basically annotate functions and data structures with the, usually available, bounds information, e.g.:

  f(int *a, int n) -> f(int *count(n) a, int n);

  struct { int len; int *values; }

  struct { int len; int *count(len) values; }
And other annotations for unions, etc.

This actually works quite well in practice - the cited paper involved us applying these annotations to a complete, bootable linux kernel. You do have to be willing to tolerate limited code changes and a few "trust me" casts

Disclaimer: I worked on the overall project, though not significantly on the Deputy part.

Depends on if one would rather write:

    int s[3];
    int *p;
    f(s, sizeof(s)/sizeof(s[0]));
    f(p, n);

    f(int *count(n) a, int n);
(which has a mistake, n should be size_t not int), or:

    int s[3];
    int* p;
    int[] a;
    f(p[0..n]);   // convert pointer to phat pointer
    f(a);         // or use existing phat pointer
    f(s);         // static arrays just work

    f(int[] a);   // the declaration of f()

The point was not to design the "nicest" way of adding safe arrays to C, but a way that captures existing uses and preserves binary compatibility, including preserving memory layouts. For instance:

    uint16_t count;
    uint16_t *COUNT(count) buffer;
    uint16_t *BND(buffer, buffer + count) pos; // pos is within the specified bounds
which captures that buffer is a known-size array, and that pos ranges within buffer's contents. Of course, writing from scratch, one might prefer

  uint16_t[] buffer;
  uint16_t pos; // An offset within buffer
but that's not much help when incrementally modifying an existing code base, or interacting with some external library one either doesn't have source for or doesn't wish to modify.

> that's not much help when incrementally modifying an existing code base

I am doing exactly that now that I converted the Digital Mars C++ and the DMD D programming language compilers to D - converting the pointer based array code to [] based arrays. Doing it incrementally is very doable and easy.

It's not viral like const is. A slice can be trivially created:

    a[] = p[0 .. length];
and the other way:

    p = a.ptr;
    length = a.length;

I agree with the framing here, and I agree a small bit of syntax would go a long way.

FWIW, you will see a class called StringPiece copied all over a lot of open source Google projects, including Chrome, Ninja, RE2, probably Android, etc.

That is essentially a pointer-length pair. I agree it would be nice to have a dedicated syntax for it, and not a class that everybody has to copy around.

C++ 17 names this string_view (long overdue!)

I just Googled and apparently there was a problem with array_view ? That's a shame.


The trouble with a library type is the compiler is not aware of it and cannot make semantic use of it.

The compiler does this with other “library” functions like malloc and printf. There’s no reason a compiler couldn’t convert to a built in for this new type.

C doesn't have member functions and doesn't do implicit conversions to/from struct types.

That’s obvious and also completely unrelated to what I was saying.

The optimizer can convert a library type to a builtin type if it still behaves as specified in the standard.

C isn't powerful enough to have the library type behave as desired (with implicit conversions, etc.).

aye, if it is defined in the same standard, there is really no technical difference between a language feature and a library feature.

I mean, C++ kind of solves this issue with templates:

  template<size_t N>
  void foo(int (&array)[N]) {}

How does this fix it? It automatically includes a size_t with all array types but does it also rely on the programmer setting the size_t value correctly? If so there will be the same problems. It seems like this is exactly the same as just passing a size argument along with all array-type arguments, which is already common practice.

> does it also rely on the programmer setting the size_t value correctly?

For static arrays, no, the compiler does it. For turning pointers into phat pointers, yes, the user has to supply the length. But he only has to do that once.

> exactly the same

It isn't, for the reasons:

1. The compiler knows about it, so it can be used for optimization purposes, and to optionally insert bounds checking or not.

2. Consider why C has structs. It's because it's a lot less error prone to group related values together, rather than having a collection of independent variables. It makes the code easier to read, too.

I wish somebody could bring Walter into C standard committee

I remember when C added function prototypes. It was an unmitigated and welcome improvement. If I was to add two features with the most impact, simple to implement, and no backwards compatibility issues, it would be:

1. phat pointers for arrays

2. nested functions

(Nested functions have a surprising ability to untangle ugly control flow, including getting rid of most uses of goto's.)

They do not want to break single-pass compilers. Nested functions with no forward declarations will require a multi-pass compiler. I think your D compiler is multi-pass already.

> They do not want to break single-pass compilers. Nested functions with no forward declarations will require a multi-pass compiler.

Why is that exactly?

Nested functions are done in one pass. As you might infer, they cannot be forward referenced within the function.

Just simply slowly substitute C code, piece by piece into Rust. Building userspace less depending from C ecosystem.

Yes, at least as long as you're working on a target supported by the Rust compiler.

> In this paper, we have illustrated a new type of attack that can bypass bounds checking spatial memory safety techniques that protect allocations.

No they have not. Bounds checking only outer structs doesnt do bounds checks on inner buffers. They haven't analyzed traditional bounds checks in C: asan, fortify, c11 Annex K, Intel MPX, only a broken application, which does none of these, and uses a wrong bounds check.

Without protection function pointers themselves, this only covers a small part of the issue.

I'm more interested in seeing how CPI and CPS pans out in the end:


tl;dr: the authors overflow from one struct field into another struct field, thus avoiding any allocation protection scheme that protects individual heap allocations from each other. Most of the paper is just spent implementing basic ROP-style gadget hopping.

I’m not actually clear on what’s new or novel here. Allocation protection mechanisms exist to prevent one allocated block from overflowing into another allocated block. They usually don’t protect allocated blocks from themselves.

Hard to read a paper that repeats “legacy languages” many times just on the first page.

There's a hole in the bucket... dear liza dear liza.

I suspect the circular never ending problem song might not have been what they were going for (or maybe it was).


Applications are open for YC Summer 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact