Hacker News new | past | comments | ask | show | jobs | submit login
Accidentally Overwriting Another Local Variable in C and C++ (petrzemek.net)
42 points by ingve on Dec 5, 2019 | hide | past | favorite | 51 comments

As I started reading the code my very first thought was "Well obviously the function we aren't being shown is free to wreak havoc". And it duly transpired that thirdparty_process() was indeed wreaking havoc. Any C or C++ function with a parameter can take the address of that parameter and use it to overwrite random stuff on the stack frame if it wants too. In this case it's even less surprising as the parameter is already the address of a stack frame variable. Nothing really surprising here.

Thinking about it more you don't even need the parameter, just declare a local variable, take its address and go crazy. With great power (it's a systems programming language) comes great responsibility.

Wow, this is the most Hacker News comment I think I've ever posted, maybe I need to quit Hacker News :-).

> Thinking about it more you don't even need the parameter, just declare a local variable, take its address and go crazy.

If you go one layer below with some assembly, you don't even need variables. Just take the stack pointer register (rsp on amd64) and start writing!

Or maybe no assembly needed, just cast an arbitrary hexadecimal value you think is likely to be in range to be your stack pointer [may vary by platform, ASLR, how many frames are on the stack, etc.] into a pointer and go to town.

This kind of stuff is why I'm glad that the brogrammer fad of "ergonomic" c library interfaces with heavily variadic function signatures has died down a bit.

You need either a static type system, or a runtime that'll catch weird accesses and throw immediately. A dynamic but happy to corrupt memory interface boundary is just crazy town.

I was looking for a small image processing library (as opposed to libMagickWand) and came across libvips [1]. I found it absolutely bizarre that the public API in a C library was full of variadic functions to support optional arguments. The last argument to most function calls is required to be NULL to denote the end of the argument list, e.g., https://libvips.github.io/libvips/API/current/using-from-c.h... (vips_image_new_from_file, vips_invert, etc.).

[1]: https://libvips.github.io/libvips/

A fun one relating to 64-bit and NULL in varargs:

If your system defines NULL as 0, which is legal in C++, then NULL in the vararg list only pushes an int-sized (4 bytes) zero, and the vararg reader pulls a pointer-sized (8 bytes) value, which may or may NOT equal zero because of the 4 extra bytes...

If using modern c++ you should use nullptr instead

If using modern C++, you shouldn't be using varargs either.

Yeah, that is unfortunate. Even if you want to do that (in C, the correct way, of course, is to name them differently), should use macro to select correct function prototype to dispatch to, that still enables type checking and avoid the variadic function non-sense. Variadic function is pure evil (seriously, who can remember these random rules that in variadic function, a char will be promoted to int and a float will be promoted to double ...)

Was it Motif, or some other UI system back in the day over-using this technique... There was time when I thought it was good idea, but rejected it for other reasons - no easy way to pass-through arguments (unless you've provided for va_list function beforehand, or use some trampoline hacks). But then the overall safeness... yet people to this day (me included) use "printf" - it's like beer & cigaretes - it's bad, but comfortable, and sexy :)

I always cringe when I encounter this since there are often far better workarounds. Static analysis tools can't really catch issues with these variadic interfaces easily either.

What does "brogrammer" mean, in this usage?

A pejorative for "I can't let any types or anything get in the way of my productivity as I crush some mad code, braugh" types.

> I can't let any types or anything get in the way of my productivity

Okay, sure.

> ...as I crush some mad code, braugh

What? I've never run into this. How does this have anything to do with opinions on types again? It sounds like you have an axe to grind.

This is pretty close to what you'll see in the first paragraph of the Wikipedia article on brogrammer [1]. Klout posted an advertisement at the Stanford career fair saying "Want to bro down and crush some code? Klout is hiring."

[1] https://en.wikipedia.org/wiki/Brogrammer

Hahaha, omg.

> The company later described it as a joke and as an unfortunate misstep.

"It's just a joke, bro"

lol, I'd say it is definitely an overgeneralization, but I've seen the culture they are referring to and it is very real and very irritating to deal with the messes they create. Don't read too much into it.

This isn’t exactly surprising. With functions like printf and scanf you always have to be super careful to pass the right data types. It may work with one OS or compiler but not with the other. You are actually lucky if your stuff crashes. Otherwise it will just keep going now with wrong values.

At least for printf, there's normally a way to tell the compiler that it's a printf style interface and get something approaching typesafety back. On gcc this is

  extern int
  my_printf (void *my_object, const char *my_format, ...)
  __attribute__ ((format (printf, 2, 3)));
(Now I like rust's solution better, a sane macro system so this interface doesn't have to be special cased by the compiler to be safe).

This is best-practice. Varargs will bite you if you don't have the compiler check your work, but fortunately they're happy to do so these days.

C++20 is adding a type-safe alternative to printf, called std::format, though it doesn't provide warnings at compile time.

Here's one you can use with (pretty much) any version of C++:

    __attribute__ ((format (printf, 1, 2))) 
    static inline std::string format(const char* fmt, ...) {
        va_list args;
        va_start(args, fmt);
        size_t bytes = vsnprintf(0, 0, fmt, args);
        va_list again;
        va_start(again, fmt);
        std::string result(bytes, '\0');
        vsnprintf(&result[0], bytes + 1, fmt, again);
        return result;

They should have gone step further and done string interpolation how C# and JS can do. Really nice to use and avoids errors.

A related problem harder for C++ to prevent is passing a pointer to a local (say a local int) and having the callee increment the pointer. This is a form of the sprintf bug.

C++ has references which won't suffer from this, and containers and such to avoid sprintf()-class bugs. But if you're still writing C++ code while thinking of C there isn't much C++ can do about it.

Yawn. Yet another person discovers how easy it is to create undefined behavior in C and C++.

I wonder if Address Sanitizer or Undefined Behavior Sanitizer would catch this.

Or if one of the canary options would help (-fstack-protector-full or whatever it's called).

I’m not confident that would help in this case, since the return address is not overwritten.

I'm pretty sure that valgrind would have at least.

According to the post, Valgrind seemed to miss the issue.

the default valgrind engine, memcheck, is a heap checker and doesn't find stack overruns.

In valgrind 3.7.0, you can use the experimental engine exp-sgcheck to look for stack and global overruns. ( valgrind command line parameter --tool=exp-sgcheck ). It is experimental and doesn't always work, but might have in this case.

Huh, crazy. It's def tracked accesses straddling a stack frame boundary before for me.

It's not a stack frame boundary, though; it's just jumping between two objects in the same frame.

that's surprising because there is a leaked pointer

my favorite flavor of this bug is: printf("%f %f %f",0,0,0);

As the author said: an oldie but goodie.

Yet another reason to cover your codebase in assertions. you would have caught this in a few seconds

What assertation would have caught this?

1. assert the integer's post condition. works best if the integer is filled with an uninitialized pattern first.

2. something like CRT or dmalloc to catch the leak. not perfect but a good hint

So you would call thirdparty_process("whatever", &res); then assert that res == 1? On little endian architectures, it is 1. On big endian architectures, it's 0 which seems like a reasonable value as well.

depends on context, but you could either assert a range of possible values or assert not dead

What do you mean "not dead". It's a stack allocated variable.

using a sentinel value in debug mode. in this case the assertion would probably not catch the sentinel, but you would recognize it in gdb

The integer was set correctly, so it wouldn't have triggered an assertation.

And valgrind didn't find the leak, so dlmalloc or CRT def would not have.

I'm very surprised that memcheckers wouldn't catch this leak

And yet, that's what it says on the article.

Reasons I hope I never have to write C(++) in the real world

Eh, like anything there are good C++ codebases and bad C++ codebases. As much as HN loves to hate on C++, it's definitely possible to write good, clean, modern C++ that avoids the majority of these issues. For instance, variadic templates in C++ wouldn't have the issue described in this article.

In practice, I've had far less issues working with modern C++ codebases than with modern Java codebases.

As the blog post points out, in C++ this is actually remedied with variadic templates (parameter packs). The idiomatic C++ equivalent would not have this bug.

Except libraries like the "third_party" example probably can't use parameter packs just yet. They tend to have to keep compatibility with older compilers.

I used to work on a library. You wouldn't believe how long some people held onto RHEL 5... And I had to maintain ABI compatibility with the standard system compiler on each supported OS.

At work we subcontracted a library. It's written in fairly idiomatic C. The person I replaced wasn't comfortable with C so he asked the sub to write a C++ wrapper.

Lots of stuff is just a typedef if the C struct in a namespace. Classes that do work all have private constructors and static member functions that return a new'd pointer to the object. They also have static member functions to deallocate the objects.

Thanks for the C++ wrapper assholes. Your C++ wrapper looks more like C than the C library it's wrapping.

Greenfield C++ (or anything that was greenfield with C++11 in mind) is a joy to write. Idiomatic C in C++ or idiomatic Java in C++ is terrible. The fact that you can write so many different styles is probably the reason the language is so ubiquitous, but it sure sucks a lot more to deal with other people's shitty code in C++ than in many other languages.

It seems like C programmers uncomfortable with C++ made a wrapper for C++ programmers uncomfortable with C.

I don't know why you call them assholes, they are just being themselves, limitations and all, and providing a fairly amusing story for you to tell.

Applications are open for YC Summer 2021

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact