
With Undefined Behavior, Anything Is Possible - kibwen
https://raphlinus.github.io/programming/rust/2018/08/17/undefined-behavior.html
======
nkurz
Nice article! From the outside, it seems so obvious that the fix for C is to
reduce the scope of undefined behavior, either by simply defining it or making
it "implementation defined". But it seems equally obvious that this is not
going to happen.

 _The standards committee invented the concept of “undefined behavior” to
capture this range of behavior. Essentially, it’s a license for the
implementation to do anything it wants. And that’s reasonable; it’s hard to
imagine nailing down the behavior any further without compromising performance
or the fundamental nature of the problem. But given this hammer, the committee
applied it far more broadly._

 _However, compiler authors got bolder over time, feeling that everything
allowed in the standard was fair game, at the same time getting more
sophisticated in what optimizations could be done with stronger assumptions._

Are "compiler authors" and the "standards committee" really two distinct
groups? I've presumed that the expansion of "undefined behavior" in C is an
example of "regulatory capture". That is, the specifications of the language
are being decided by those who see "undefined" as yet another opportunity for
optimization, as opposed to the users of compilers who tend to be more worried
about broken programs.

~~~
BeeOnRope
I think the "compiler authors" and "standards committee" are in fact two
mostly distinct groups. Now, there is some overlap, but more importantly,
feedback certainly flows both ways between the standards group and the
compiler authors, and the standards committee is usually well aware of the
concerns of the compiler authors. If it were any other way, it probably
wouldn't work since it would be easy to ask for a feature that couldn't be
implemented efficiently.

Said another way, the committee doesn't just come up with useful features, and
then leave it to the compiler authors to implement them: the accepted features
usually already have a specific high-level implementation in mind, and you'll
sometimes find various restrictions and quirks in a feature definition that
are specifically accommodations for how compilers are going to implement it.
There are lots of examples, but one that comes to mind: RTTI simply doesn't
work for objects without a virtual method. Since there is no obvious link
between the two it seems like an arbitrary restriction, but it is a reflection
of the fact that compilers mostly all want to implement RTTI as a special
entry in the vtable, so allowing RTTI on objects w/o a virtual method would
require a vtable in every object or some other expensive mechanism. So you get
the RTTI you can practically implement.

I think that's all fine and reasonable: a total disconnect between
specification and implementation just wouldn't work here (this is true even in higher level
languages, but there is arguably more wiggle room there because the
performance baseline is different).

> is an example of "regulatory capture". That is, the specifications of the
> language are being decided by those who see "undefined" as yet another
> opportunity for optimization, as opposed to the users of compilers who tend
> to be more worried about broken programs.

My view is different. At least most of the classic examples of "annoying" or
"unexpected" undefined behavior seem to be quite old and have underlying
reasons based on the hardware of the day:

1) signed overflow being weird due to non-2s-complement machines
2) signed shifts being weird for the same reason
3) aliasing rules due to all sorts of interesting concerns about storing objects in different memory spaces, segmentation, different pointer representations, alignment (this one is the most iffy)
4) pointer arithmetic rules, same as (3) but less iffy
5) reading uninit memory: hardware with trap representations, etc.

The list goes on, but most of those have been with us for a while. The most
aggressive use of these rules, which we now hear about as some kind of
terrible abuse by the compiler authors, has mostly occurred many years after
the rules were put in place.

Certainly there is probably resistance from the compiler authors to making
some of these rules defined - but I don't think that (mostly) explains their
existence in the first place. Some of the rules are kind of papering over
things that could actually be done more efficiently if higher level semantics
were available. For example, many of the "signed overflow allows loops to be
optimized/vectorized more efficiently" cases would not be needed if there were
some built-in way to iterate over things like arrays (arrays largely don't
even exist in C, as you mostly end up carrying around a decayed pointer and a
size). Another example is the aliasing system: with stronger rules about
aliasing memory areas, or better support for first-class arrays, or even just
earlier introduction of restrict into the language, you wouldn't need to lean
on the aliasing rules as heavily to get the effect through the strict aliasing
"backdoor".
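
To make the signed-overflow case concrete, here's a minimal sketch
(illustrative, not code from the article) of the kind of loop where the UB
rule does the work that a better iteration construct could have done:

    #include <stddef.h>

    /* Because signed overflow is UB, the compiler may assume `i` never
       wraps, so `i <= n` eventually becomes false and the trip count is
       known up front -- which is what lets the loop be unrolled or
       vectorized. If wrap-around were defined, the compiler would have to
       consider the case where the loop never terminates (n == INT_MAX). */
    void saxpy(float *dst, const float *src, float a, int n)
    {
        for (int i = 0; i <= n; i++)
            dst[i] = a * src[i] + dst[i];
    }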

I see most of the problems with UB in C and C++ as stemming from three main
factors:

1) C is more than 40 years old, and C++ is more than 30. Many of the UB rules
have been with us, formally or informally, for a long time, and a long time
ago there was both more weird hardware in actual use and more uncertainty
about what type of hardware would dominate in the future.

So a lot of the things that were left as UB, as well as lots of
implementation-defined things like the sizes of all the various primitive
types, may actually have made sense decades ago as a strategy for wide
adoption and future trends.

More recent languages have the advantage of a dramatic stabilization in
hardware: everyone has really settled on 2's complement, non-weird word sizes,
etc. It is very hard (I think) for C, practically and ego-wise, to roll back
the very old decisions which may have made sense when they were made.

2) C and C++ are both designed by committee. The committee is full of
representatives from various hardware and software companies. These
representatives are obviously going to "talk their own book" and make sure the
language works well on their hardware. This is why we have all sorts of
concessions to Itanium in the standard - do you think the committee could have
said "no" to what seemed like the future architecture at the time? Itanium
failed but we are left with its legacy in the standards. Similarly, we have
the "consume" stuff in the C++11 (and C11) memory model. Probably half of the
complexity in the entire memory model is related to just this concept. The
model is _much_ more difficult to understand, and many different concepts that
wouldn't otherwise have existed needed to be introduced to support it. Not
only that, it affected all sorts of stuff outside the memory model, such as
the whole concept of "carries a dependency", which needs to be pervasive in
the language spec.
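
To illustrate what "consume" and "carries a dependency" buy, here is a
hypothetical C11 sketch (not code from the thread; the names are made up):

    #include <stdatomic.h>

    struct config { int value; };
    static _Atomic(struct config *) g_config;

    /* Writer: publish a fully initialized object. */
    void publish(struct config *c) {
        atomic_store_explicit(&g_config, c, memory_order_release);
    }

    /* Reader: with memory_order_consume, only operations that carry a data
       dependency on the loaded pointer are ordered after the load; on
       POWER/ARM this can avoid the fence memory_order_acquire would need. */
    int read_value(void) {
        struct config *c = atomic_load_explicit(&g_config, memory_order_consume);
        return c ? c->value : -1;   /* the dereference carries a dependency on c */
    }

In practice most compilers simply treat consume as acquire, which is part of
why it reads as complexity without much payoff.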

That's _all_ there to support the POWER memory model in some scenarios. It
seems like a shame to me. If you consider the amount of code in the world, on
a cycle-weighted basis, that is running on POWER and would be affected by
leaving this out, it must be a small fraction of 1%, but how can the committee
say "Sorry IBM, you just aren't important enough anymore, you'll have to live
without this crazy consume stuff"? It's just not going to happen. As some kind
of consensus-based body, rather than, say, a private, profit-driven business,
a weakness is that it's hard to say "no" and everyone's little idea or
weakness gets accommodated.

3) C and C++ were very reluctant to make things "implementation defined" and
almost always favored performance over complexity. I think there is some
method to that madness: implementation defined is kind of ugly. How do you
ensure compilers really define it? Will they use the same kind of precise
language as the standard? What happens to portability in a language with many
impl-defined aspects? Every compiler would essentially define their own sub-
language.

Still, I think it was worth trying to lean on this a bit more. For one, you
could define some "default" way of doing things, like 2s complement arithmetic
or pointers-that-are-just-integers-and-otherwise-work-like-you-think, and have
some standard macro you can check to see if that's the case. Almost like C did
with `__STDC_IEC_559__`: if the implementation defines it, you are using
standard IEEE 754 floats (it didn't work out that well because some deviations
from the large IEEE standard mean that many compilers don't define it).

Given that, you could simply write programs that check the expected macros and
error out if not supported, and still have "portable" code as long as the
target architecture is non-weird. You could just check up front that you
aren't running on a deathstation 9000.

If you wanted, you could also have conditional compilation: do the fast/normal
thing if the hardware worked as expected (e.g., sign-extending shifts), and
then fall back to some slower hack if not.
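
A minimal sketch of what those checks could look like (the specific checks and
the `asr` helper are illustrative, not something from the thread):

    #include <limits.h>

    /* Up-front checks: refuse to build on a deathstation 9000. */
    #ifndef __STDC_IEC_559__
    #error "This program assumes IEEE 754 floating point."
    #endif

    #if INT_MIN + INT_MAX != -1
    #error "This program assumes two's complement integers."
    #endif

    /* Conditional compilation: fast path when right-shifting a negative
       value sign-extends (implementation-defined), slower fallback if not. */
    static int asr(int x, int n)
    {
    #if ((-1) >> 1) == -1
        return x >> n;                         /* arithmetic shift */
    #else
        return x < 0 ? ~(~x >> n) : x >> n;    /* portable fallback */
    #endif
    }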

~~~
saagarjha
> C and C++ were very reluctant to make things "implementation defined" and
> almost always favored performance over complexity. I think there is some
> method to that madness: implementation defined is kind of ugly. How do you
> ensure compilers really define it? Will they use the same kind of precise
> language as the standard? What happens to portability in a language with
> many impl-defined aspects? Every compiler would essentially define their own
> sub-language.

Personally I think this was a mistake. A lot of things could be left as
implementation-defined instead of undefined: most arithmetic operations, for
example. The issue with undefined behavior is that you can’t have it in your
program, period. And if you do, your entire program is technically invalid,
and the compiler could do unexpected things to it without warning. However,
with implementation defined behavior, you can use language constructs (e.g.
#ifdef) to choose the correct implementation.

~~~
catamorphismic
Wouldn't a better solution be compiler support for rejecting compilation of
constructs that have UB?

~~~
the_why_of_y

      int add(int a, int b)
      {
        return a + b;
      }
    

Do you want the compiler to reject this because (assuming it has external
linkage) there's no guarantee that there is no call with arguments such that
the addition would overflow?

~~~
iforgotpassword
This discussion came up recently, but my stance still is that if you _know for
sure_ at compile time that some code _always_ results in UB or is just non-
conforming, you should not fuck around and just remove a big chunk of code
because the standard allows you to, but instead either simply refuse
compilation (my preference) or at least just emit code representing what was
written (effectively the implementation-defined behavior route).

~~~
BeeOnRope
Sure, and compilers do now sometimes emit warnings like "shift amount is
always out of range".

Overall though, this isn't how compilers work: they aren't "fucking around"
with code they know for sure is broken and will be executed. They are just
applying a large series of optimization passes, which includes passes like
removing unreachable code, and that interacts with an analysis of paths that
cannot be taken for various reasons (including paths that end in UB) to remove
"big chunks of code".
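
A minimal sketch of that interaction (the classic illustration, not code from
the thread):

    /* The dereference lets the compiler assume p != NULL (otherwise the
       program already has UB), so the now-"unreachable" null check below
       may be deleted by the same passes that remove genuinely dead code. */
    int first_or_minus_one(int *p)
    {
        int x = *p;
        if (p == NULL)
            return -1;
        return x;
    }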

The same passes that screw you over in any of these famous examples are the
ones that help code generation in the 99% of remaining cases including many
"obvious" optimizations that you'd want the compiler to perform.

I know the situation with undefined behavior is distressing and the examples
make it look like the compiler writers are out to get you, but that's not
really the case.

------
jl6
Could one write a C compiler that intentionally caused noisy havoc in every
situation left undefined by the language spec? As opposed to trying to do
something reasonable. Then at least you’d know that you were on shaky ground?

~~~
raphlinus
That is more or less what the sanitizers do, and it's a reasonable approach to
trying to discover UB.
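
For example, building with clang or gcc's `-fsanitize=undefined` makes many of
these cases fail loudly at runtime instead of silently feeding the optimizer
(hypothetical snippet for illustration):

    /* ub.c -- compile with: cc -fsanitize=undefined ub.c && ./a.out
       UBSan reports a runtime error on the signed overflow below instead of
       letting the optimizer silently assume it cannot happen. */
    #include <limits.h>

    int main(void)
    {
        int x = INT_MAX;
        return x + 1;   /* signed overflow: undefined behavior */
    }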

------
jancsika
Here's something I was wondering about:

Suppose I've got some nasty DSP algos that depend on undefined behavior of
type-punning in C. To address this I use the union member access trick which
is supported by a gcc extension.

If my _only_ target is WebAssembly, then what are the reasons I should care
that my binary is actually a gcc dialect of C and not standard C?
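
For reference, the union trick in question looks something like this
(illustrative sketch):

    #include <stdint.h>

    /* Write one union member, read another. GCC documents this as
       supported; C99/C11 permit it (via a footnote), standard C++ does not. */
    static uint64_t bits_of(double d)
    {
        union { double d; uint64_t u; } pun;
        pun.d = d;
        return pun.u;
    }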

~~~
raphlinus
That puts you pretty firmly in what I call the "semi-portable" category (I
would say "unportable" but wasm is very likely going to have flavors, such as
a 64 bit pointer variant).

But if you're being careful, you can do type-punning. The "modern" way is
generally with memcpy, but union can be valid if done carefully. The clearest
statement I was able to find of this is [1]. I find it frustrating that it's
so hard to get a clear read on this. For example, if, as Regehr says, union-
based punning is reliable, then what was the concern that the linked Linux
kernel patch was trying to address? I haven't dug into it in super detail.

[1]
[https://blog.regehr.org/archives/1307#comment-18418](https://blog.regehr.org/archives/1307#comment-18418)

~~~
saagarjha
IMO it’s pretty iffy and you should steer clear of it, since the rules around
it seem to be murky. I think the actual rule is that it’s illegal in C++, and
valid in C99 and C11, but only because it was included in a footnote as an
afterthought in the standard.

~~~
jancsika
So what is the well-defined and performant way to fetch the exponent and
mantissa of a double in standard C?

~~~
saagarjha
Use frexp from <math.h>.

~~~
jancsika
The implementation for frexp uses type-punning.

Also, it adds a denormal and inf check which I don't need in this particular
hot loop.

~~~
saagarjha
Does it? On my computer it's a builtin (__builtin_frexp). But even if it _did_
use type-punning, that doesn't make it legal for your program to do it. I
don't believe there are any requirements for the C standard library
implementation to follow the standard itself.

To answer the second part of your response, the correct way to do this
yourself would be to memcpy the double and then inspect the result directly.
Any compiler worth its salt will end up generating the assembly code you want
(namely, no call to memcpy, just a couple of arithmetic shifts and masks).
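
Concretely, something like this (a sketch assuming an IEEE 754 binary64
double; the helper name is made up):

    #include <stdint.h>
    #include <string.h>

    /* Copy the representation into an integer (well-defined type punning),
       then shift and mask out the biased exponent and fraction fields. */
    static void split_double(double d, int *exponent, uint64_t *mantissa)
    {
        uint64_t bits;
        memcpy(&bits, &d, sizeof bits);
        *exponent = (int)((bits >> 52) & 0x7FF);   /* 11-bit biased exponent */
        *mantissa = bits & ((1ULL << 52) - 1);     /* 52-bit fraction field */
    }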

~~~
jancsika
> To answer the second part of your response, the correct way to do this
> yourself would be to memcpy the double and then inspect the result directly.
> Any compiler worth its salt will end up generating the assembly code you
> want (namely, no call to memcpy, just a couple of arithmetic shifts and
> masks).

What does a compiler "worth its salt" do with the memcpy at "-O0"?

~~~
saagarjha
See for yourself: [https://godbolt.org/z/nsg9nt](https://godbolt.org/z/nsg9nt)

------
fulafel
It's curious that this 30-year-old thing continues to be such a big problem
with no fixes in sight, while most C code continues to have UB bugs. And many
programmers have attitudes along the lines of "that's just your opinion, man"
(e.g. the Linus vs. signed overflow rants).

The T-shirt is good though :)

