
A Guide to Undefined Behavior in C and C++ (2010) - peeyek
http://blog.regehr.org/archives/213
======
tedunangst
> undefined behavior in C/C++ is that it simplifies the compiler’s job, making
> it possible to generate very efficient code in certain situations.

I've been growing more skeptical of this claim over time. First, more evidence
would be nice instead of "some loops". What is a program (not a benchmark)
that has an observable performance difference when compiled with and without
-fwrapv?

Second, it seems counterproductive. The word is now out that if you use
signed ints, the compiler is going to fuck you over. So now everybody is (or
should be) using unsigned ints, and whatever optimization the compiler was
performing, now it can't. A lot of grief for no progress.
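
For reference, the example usually cited for that claim involves loop
induction variables on 64-bit targets. A sketch, not a measurement:

    /* The textbook example: on a 64-bit target, the compiler may widen
       the signed 32-bit index to a 64-bit register, because signed
       overflow is UB and therefore assumed never to happen. */
    void scale_signed(float *a, int n) {
        for (int i = 0; i < n; i++)
            a[i] *= 2.0f;
    }

    /* With an unsigned index, wraparound at 2^32 is defined, so the
       compiler must preserve modulo semantics; whether that actually
       costs anything depends on the compiler and the loop shape. */
    void scale_unsigned(float *a, unsigned n) {
        for (unsigned i = 0; i < n; i++)
            a[i] *= 2.0f;
    }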

~~~
pcwalton
> I've been growing more skeptical of this claim over time. First, more
> evidence would be nice instead of "some loops". What is a program (not a
> benchmark) that has an observable performance difference when compiled with
> and without -fwrapv?

Google brings up this old message: -fwrapv seems to affect the SPEC2000 suite
fairly badly. SPEC2000 mostly measures real applications, such as gzip, and is
not a series of microbenchmarks.

[http://www.archivum.info/autoconf-patches@gnu.org/2007-01/00...](http://www.archivum.info/autoconf-patches@gnu.org/2007-01/00027/Re:-changing-quot-configure-quot-to-default-to-quot-gcc-g-O2-fwrapv-...-quot.html)

~~~
tedunangst
Interesting, though I still wonder to what extent this is a result of
optimizations chasing SPEC. Alternatively, could similar optimizations be made
to work with defined behavior? (How will Rust fare with "less undefined"
behavior?) As I mentioned, it's kind of crazy that I might substantially
impair the performance of gzip by changing a few ints to unsigned.
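
At least the semantic difference -fwrapv makes is easy to state, even if its
cost is hard to measure. A minimal sketch:

    /* Without -fwrapv, signed overflow is UB, so the compiler may fold
       this whole function to "return 1". With -fwrapv, INT_MAX + 1
       wraps to INT_MIN, so it must return 0 when x == INT_MAX. */
    int precedes(int x) {
        return x + 1 > x;
    }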

------
jasode
Fyi... Chris Lattner (LLVM, Apple Inc) wrote some companion articles on the
same topic and refers to these John Regehr posts.

[http://blog.llvm.org/2011/05/what-every-c-programmer-should-...](http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html)

------
jrbrtsn
Isn't complaining that (INT_MAX + 1) is undefined the same as complaining that
(8/0) is undefined? What meaningful functionality could be gained by defining
overflow behavior? If you really want to know what INT_MAX + 1 is, try:

    printf("INT_MAX+1= %lld\n", (long long)INT_MAX + 1LL);
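
And if the goal is detecting overflow rather than defining it, the portable
idiom is to test against the limits before adding. A sketch:

    #include <limits.h>

    /* Returns 1 and stores the sum if a + b fits in an int;
       returns 0 if the addition would overflow. No UB either way. */
    int checked_add(int a, int b, int *sum) {
        if ((b > 0 && a > INT_MAX - b) ||
            (b < 0 && a < INT_MIN - b))
            return 0;
        *sum = a + b;
        return 1;
    }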

~~~
Verdex
I'm only speaking for my personal preferences here.

My two problems with undefined behavior are that 1) it is not always obvious
that your program is guilty of undefined behavior and 2) when a C program
contains undefined behavior, the compiler can generate surprising machine code.

I don't have any problem with 8/0 being undefined mathematically, but I do
have a problem with a C compiler generating code that will delete my hard
drive if it ever occurs (an unlikely outcome, but a theoretically possible one
for a conforming C compiler).
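
A sketch of the kind of surprise I mean, using a null pointer rather than
division (variants of this transformation have bitten real code):

    /* The dereference makes p == NULL undefined behavior, so the
       compiler is allowed to assume p is non-null and delete the
       check below, even though it appears "before" the return. */
    int read_flag(int *p) {
        int v = *p;
        if (p == NULL)
            return -1;   /* may be optimized away entirely */
        return v;
    }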

INT_MAX + 1 is problematic because most people expect wraparound (and in fact
they may be using a compiler that does this). It's more problematic because
UINT_MAX + 1 is defined to wrap around. And finally there's a bunch of
difficult-to-remember implicit integer conversion rules. Taken together, it's
not surprising that many people find it difficult to determine if their
integer usage is correct.
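
To make the conversion-rule point concrete, here's the classic example (it
prints the "surprising" branch on every mainstream platform):

    #include <stdio.h>

    int main(void) {
        unsigned int u = 1;
        /* The usual arithmetic conversions turn -1 into UINT_MAX,
           so -1 < u is false. */
        if (-1 < u)
            printf("what most people expect\n");
        else
            printf("what actually happens\n");
        return 0;
    }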

One solution is of course to use a safer language than C (which is
unfortunately where I end up). But I personally object to the idea that you
_have_ to have undefined behavior in order to get the performance of C (I'm
waiting to see how the development of languages like Rust goes to determine
if I'm right).

~~~
jrbrtsn
There exists an enormous body of reliable and fast code written in "C"
(e.g. the Linux kernel). I've been coding in C for 27 years, C++ for 20. I
think the idea that a language can permit only useful work to occur is naive.
In order to code well, the programmer must understand how the computer works
and have a great deal of focus. If you have those, "helpful" languages only
seem to get in the way.

