The standards committee invented the concept of “undefined behavior” to capture this range of behavior. Essentially, it’s a license for the implementation to do anything it wants. And that’s reasonable; it’s hard to imagine nailing down the behavior any further without compromising performance or the fundamental nature of the problem. But given this hammer, the committee applied it far more broadly.
However, compiler authors got bolder over time, treating everything the standard allowed as fair game, while also growing more sophisticated about the optimizations that stronger assumptions made possible.
Are "compiler authors" and the "standards committee" really two distinct groups? I've presumed that the expansion of "undefined behavior" in C is an example of "regulatory capture". That is, the specifications of the language are being decided by those who see "undefined" as yet another opportunity for optimization, as opposed to the users of compilers who tend to be more worried about broken programs.
In other words, I wouldn't blame the standards committee; they just set out to define a minimal "subset" language which would be suitably portable so as to be widely applicable, and left the rest to the implementers. They basically said "here is what we have standardised. The rest is up to you to define what is reasonable."
The "adversarial" attitude is entirely attributable to dogmatic compiler authors who treat the standard as gospel in their theoretical world, completely ignoring the real-world implications, including the users of their products. What they don't seem to understand is that the standard is merely a starting point, not the be-all-and-end-all of C. "Just because the standard allows it doesn't mean you should do it."
Linus has some more (less ranty) posts on the topic here: https://yarchive.net/comp/linux/gcc.html
Said another way, the committee doesn't just come up with useful features and then leave it to the compiler authors to implement them: the accepted features usually already have a specific high-level implementation in mind, and you'll sometimes find various restrictions and quirks in a feature's definition that are specifically accommodations for how compilers are going to implement it. There are lots of examples, but one that comes to mind: RTTI simply doesn't work for objects without a virtual method. Since there is no obvious link between the two, it seems like an arbitrary restriction, but it is a reflection of the fact that compilers mostly all want to implement RTTI as a special entry in the vtable, so allowing RTTI on objects w/o a virtual method would require a vtable in every object or some other expensive mechanism. So you get the RTTI you can practically implement.
I think that's all fine and reasonable: a total disconnect between specification and implementation just wouldn't work here (this is true even in higher-level languages, but there is arguably more wiggle room there because the performance baseline is different).
> is an example of "regulatory capture". That is, the specifications of the language are being decided by those who see "undefined" as yet another opportunity for optimization, as opposed to the users of compilers who tend to be more worried about broken programs.
My view is different. At least most of the classic examples of "annoying" or "unexpected" undefined behavior seem to be quite old and have underlying reasons based on the hardware of the day:
1) signed overflow being weird due to non 2s complement machines
2) signed shifts being weird for the same reason
3) aliasing rules due to all sorts of interesting concerns about storing objects in different memory spaces, segmentation, different pointer representations, alignment (this one is the most iffy)
4) pointer arithmetic rules same as (3) but less iffy
5) reading uninit memory: hardware with trap representations, etc
The list goes on, but most of those have been with us for a while. The most aggressive use of most of these things, that we hear about now as some kind of terrible abuse of the rules by the compiler authors has mostly occurred many years after the rules were in place.
Certainly there is probably resistance to changing some of these rules to become defined, from the compiler authors - but I don't think it explains (mostly) their existence in the first place. Some of the rules are kind of papering over things that could actually be done more efficiently if higher level semantics were available. For example, many of the "signed overflow allows loops to be optimized/vectorized more efficiently" arguments would not be needed if there were some built-in way to iterate over things like arrays (arrays largely don't even exist in C, as you mostly end up carrying around a decayed pointer and size). Another example is the aliasing system: with stronger rules about aliasing memory areas, or better support for first-class arrays, or even just earlier introduction of restrict into the language, you wouldn't need to lean on the aliasing rules as heavily to get the effect through the strict aliasing "backdoor".
I see most of the problems with UB in C and C++ as stemming from three main factors:
1) C is more than 40 years old, and C++ is more than 30. Many of the UB rules have been with us, formally or informally for a long time, and a long time ago there was both more weird hardware in actual use, and more uncertainty about what type of hardware would dominate in the future.
So a lot of the things that were left as UB, as well as lots of implementation-defined things like the sizes of all the various primitive types, may actually have made sense decades ago as a strategy for wide adoption and future trends.
More recent languages have the advantage of a dramatic stabilization in hardware: everyone has really settled on 2's complement, non-weird word sizes, etc. It is very hard (I think), practically and ego-wise, for C to roll back the very old decisions which may have made sense when they were made.
2) C and C++ are both designed by committee. The committee is full of representatives from various hardware and software companies. These representatives are obviously going to "talk their own book" and make sure the language works well on their hardware. This is why we have all sorts of concessions to Itanium in the standard - do you think the committee could have said "no" to what seemed like the future architecture at the time? Itanium failed but we are left with its legacy in the standards. Similarly we have the "consume" stuff in the C++11 (and C11?) memory model. Probably half of the complexity in the entire memory model is just related to this concept. The model is _much_ more difficult to understand, and many concepts that otherwise wouldn't have existed needed to be introduced to support it. Not only that, it affected all sorts of stuff outside of the memory model, such as the whole concept of "carries dependency" which needs to be pervasive in the language spec.
That's _all_ there to support the POWER memory model in some scenarios. It seems like a shame to me. If you consider the amount of code in the world, on a cycle-weighted basis, that is running on POWER and would be affected by leaving this out, it must be a small fraction of 1% - but how can the committee say "Sorry IBM, you just aren't important enough anymore, you'll have to live without this crazy consume stuff"? It's just not going to happen. As a consensus-based body, rather than, say, a private, profit-driven business, a weakness is that it's hard to say "no", and everyone's little idea or weakness gets accommodated.
3) C and C++ were very reluctant to make things "implementation defined" and almost always favored performance over complexity. I think there is some method to that madness: implementation defined is kind of ugly. How do you ensure compilers really define it? Will they use the same kind of precise language as the standard? What happens to portability in a language with many impl-defined aspects? Every compiler would essentially define their own sub-language.
Still, I think it was worth trying to lean on this a bit more. For one, you could define some "default" way of doing things, like 2's complement arithmetic or pointers-that-are-just-integers-and-otherwise-work-like-you-think, and have some standard macro you can check to see if that's the case. Almost like C did with `__STDC_IEC_559__`: if it is defined, you are using standard IEEE 754 floats (it didn't work out that well, because deviations from the large IEEE standard mean that many compilers don't define it).
Given that, you could simply write programs that check the expected macros and error out if not supported and still have "portable" code as long as the target architecture is non-weird. You could just up-front check that you aren't running on a deathstation 9000.
If you wanted, you could also have conditional compilation: do the fast/normal thing if the hardware worked as expected (e.g., sign-extending shifts), and then fall back to some slower hack if not.
Personally I think this was a mistake. A lot of things could be left as implementation-defined instead of undefined: most arithmetic operations, for example. The issue with undefined behavior is that you can’t have it in your program, period. And if you do, your entire program is technically invalid, and the compiler could do unexpected things to it without warning. However, with implementation defined behavior, you can use language constructs (e.g. #ifdef) to choose the correct implementation.
That said, I think defining things as implementation-defined is only really useful when you have specific feature-check macros you can use to determine the behavior.
Otherwise you are kind of just trading one type of portability for another: rather than relying on the way your compiler handles UB directly, you are relying on the way your compiler does some ID (implementation defined) behavior: if you run your code on a different compiler, or a different version of the current compiler, the ID could change completely and the unexpected result probably quickly leads to either UB or some functional bug anyways.
    int add(int a, int b) {
        return a + b;
    }
Overall though, this isn't how compilers work: they aren't "fucking around" with code they know for sure is broken and will be executed. They are just applying a large series of optimization passes, which includes passes like removing unreachable code, and that interacts with an analysis of paths that cannot be taken for various reasons (including paths that end in UB) to remove "big chunks of code".
The same passes that screw you over in any of these famous examples are the ones that help code generation in the 99% of remaining cases including many "obvious" optimizations that you'd want the compiler to perform.
I know the situation with undefined behavior is distressing and the examples make it look like the compiler writers are out to get you, but that's not really the case.
Hmm. I’m ignorant of the actual history of the standardization process, but these days wouldn’t memory_order_consume benefit ARM systems equally (that is, if compilers were able to actually implement it)? The ARM architecture similarly provides an ordering guarantee for loads whose address depends on previous loads, in what is otherwise a weak memory model.
Suppose I've got some nasty DSP algos that depend on undefined behavior of type-punning in C. To address this I use the union member access trick which is supported by a gcc extension.
If my only target is WebAssembly, then what is the reason I should care that my program is actually written in a gcc dialect of C and not standard C?
But if you're being careful, you can do type-punning. The "modern" way is generally with memcpy, but union can be valid if done carefully. The clearest statement I was able to find of this is the one quoted below. I find it frustrating that it's so hard to get a clear read on this. For example, if, as Regehr says, union-based punning is reliable, then what was the concern that the linked Linux kernel patch was trying to address? I haven't dug into it in super detail.
> The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above works as expected.
Also, it adds a denormal and inf check which I don't need in this particular hot loop.
To answer the second part of your response, the correct way to do this yourself would be to memcpy the double and then inspect the result directly. Any compiler worth its salt will end up generating the assembly code you want (namely, no call to memcpy, just a couple of arithmetic shifts and masks).
What does a compiler "worth its salt" do with the memcpy at "-O0"?
So I only see two options:
1. Use memcpy, blindly hope that gcc optimizes it to the same bitmath I'm currently using, and blindly hope that gcc supports that optimized path for the foreseeable future.
2. Use the union type-punning trick, and blindly hope gcc supports their extension for the foreseeable future.
#1 above requires me to blindly hope twice, whereas #2 requires only one instance of blind hope. Furthermore, I already know how the blind hope of gcc optimizing memcpy into bitmath turns out: gcc does not optimize it in this way.
So I cannot see any alternative to relying on the union type-punning trick in this case.
How much do you trust the GCC implementors to a) maintain the behaviour that's relevant to your program b) maintain GCC at all? How much do you trust that the rest of the ecosystem will continue to support GCC and/or this GCC dialect?
Think of "the gcc dialect of c" the same way you'd think of any semi-obscure programming language, and ask the same questions you would: what kind of debuggers/profilers exist for it? How easy is it to hire people with experience in it? How much momentum is there behind it?
The T-shirt is good though :)