
What every compiler writer should know about programmers (2015) [pdf] - mpweiher
http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf
======
pcwalton
The author seems to be saying, in section 4, that compilers should just stop
doing some optimizations (such as converting int to size_t where appropriate)
and should expect programmers to do those optimizations instead, in order to
reduce undefined behavior.

The problem with this argument is that it doesn't match economic reality.
Compiler authors wouldn't disagree that it would be nice to not have to do
these optimizations! (Chris Lattner has said as much to me, for example.)
Unfortunately, there is a lot more application code out there than there is
compiler code. So it makes economic sense to add optimizations to the compiler
rather than to individual programs, so that the huge universe of C and C++
programs can benefit without having to be optimized by hand at the source
level. This is why companies like Google and Apple employ compiler engineers:
so that their large codebases can become faster automatically.

It's trendy to complain about compiler engineers because they're an easy
target, and because of nostalgia for the days of Turbo Pascal when
optimizations weren't done and programs were slow. It's much less trendy to
analyze the complex circumstances that have led to UB exploitation in C and
C++. In my opinion, if I had to assign blame to one thing, it would go to the
C language itself, for e.g. encouraging use of int for array indices.

P.S. As always, Fabian Giesen's description of why compilers exploit signed
overflow is a must-read:
[https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759...](https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759de5a7)

Note that the conclusion is _not_ as simple as "compilers should stop doing
this optimization in C". There in fact isn't a great solution precisely
because int is 32 bits for popular ABIs in C; the best options are "hope the
programmer used size_t right" or "use another language".

~~~
petergeoghegan
> The problem with this argument is that it doesn't match economic reality.
> Compiler authors wouldn't disagree that it would be nice to not have to do
> these optimizations!

I think that the extent to which that's actually true is pretty debatable.
Note that both Linux and Postgres use -fwrapv/-fno-strict-overflow and
-fno-strict-aliasing. I'm not a compiler engineer, but if I were I might think
to ask why that's the policy of these projects, and address the criticism on
something closer to its own terms.
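For context, a sketch of the kind of overflow check those flags are protecting (hypothetical code, not taken from Linux or Postgres):

```cpp
#include <climits>

// Without -fwrapv, signed overflow is UB, so a compiler may fold the
// "x + 1 < x" test to false and delete the check entirely. This pattern
// is the kind of code -fwrapv keeps honest.
bool unsafe_inc_overflows(int x) {
    return x + 1 < x;  // relies on wrapping: UB unless -fwrapv is in effect
}

// The portable check, well-defined under any flags:
bool safe_inc_overflows(int x) {
    return x == INT_MAX;
}
```

Projects with decades of such checks scattered through them find it cheaper to define the behavior with a flag than to audit every comparison.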

The author makes a clear distinction between "Optimization*" and
"Optimization". That distinction might be a bit squishy, but not so much that
it isn't still useful. Even the Fabian Giesen thing that you linked to says "A
lot of people (myself included) are against transforms that aggressively
exploit undefined behavior", right at the start. It's trendy to blame the C
language itself, but surely we can expect C compilers to care about the C
language.

~~~
pcwalton
Compiler developers do care about the C language. They just care about it as
it's _actually used_. In real-world C, people write "for (int i = 0; i <
array->length; i++)". The Linux kernel might not, but that's a choice the
Linux kernel developers have made. Lots of C code uses different idioms from
the Linux kernel, and users expect that code to run fast. Slowing down that
code out of a notion of language purity (especially when "Friendly C" isn't
even defined) does a disservice to users.

I can't speak for anyone but me, but I read the opposition to doing these
kinds of optimizations in that Gist as preferring compiler warnings to suggest
users use size_t when appropriate. This would be great, but I'm skeptical
that, for example, Linux distros would be happy about that. Most Linux
distributions consist of packages of vast amounts of unmaintained or mostly
unmaintained C code. There simply isn't the manpower available to fix all that
code to be properly 64-bit optimized (which is in fact why int is 32 bits on
x86-64 Linux in the first place).
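A sketch of the two idioms being contrasted (the `Array` struct here is hypothetical, just to give the loops something to index):

```cpp
#include <cstddef>

// Hypothetical array type, for illustration only.
struct Array { size_t length; int *data; };

// Idiomatic real-world C: a 32-bit signed index on a 64-bit target.
// The compiler may keep `i` in a 64-bit register without re-extending
// it each iteration only because signed overflow is UB, so it can
// assume `i` never wraps.
long sum_int_index(const Array *a) {
    long s = 0;
    for (int i = 0; i < (int)a->length; i++)
        s += a->data[i];
    return s;
}

// The "fixed at the source level" version: size_t matches the pointer
// width, so no overflow assumption is needed for good code generation.
long sum_size_t_index(const Array *a) {
    long s = 0;
    for (size_t i = 0; i < a->length; i++)
        s += a->data[i];
    return s;
}
```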

~~~
brigade
> Most Linux distributions consist of packages of vast amounts of unmaintained
> or mostly unmaintained C code

Well, that's exactly the kind of code I'd expect to simply _break_ when the
compiler becomes able to prove new optimizations based on undefined behavior.

When that happened to me, I decided there was no way I was trawling through
our massive legacy codebase to fix undefined behavior that ubsan can't catch.
That resulted in -fwrapv -fno-strict-aliasing being added, and any speed loss
was noise, under 0.5%.

Then in our modern codebase, I did my absolute best to ensure we built it
with no undefined behaviour, including contortions to avoid undefined left
shifts... and we still got screwed over by a new version of clang.
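For what it's worth, one such contortion might look like this (a sketch assuming the usual two's-complement targets, not code from any particular codebase):

```cpp
#include <cstdint>

// Left-shifting a negative signed value is UB in C and C++ before
// C++20. A common workaround: do the shift in unsigned arithmetic,
// where it is fully defined, then convert back.
int32_t shl(int32_t x, unsigned n) {
    return (int32_t)((uint32_t)x << (n & 31));  // mask keeps the shift in range
}
```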

For all the worry about "for (int i = 0; i < array->length; i++)"
specifically, how often is that actually measurable whole-program? It's one
additional sign extension in a tight loop, plus a store/load in a complex
loop, which a modern CPU would completely hide 99.9% of the time.

~~~
pcwalton
> It's one additional sign extension in a tight loop, plus a store/load in a
> complex loop, which a modern CPU would completely hide 99.9% of the time.

Not true. It's a store/load in the _tight, simple_ loops that are the problem.
Sometimes making your loop too big will cause you to fall out of the uop cache
on x86, which is a significant performance hit. Modern CPUs don't hide this
inefficiency, which is why compiler developers implemented the optimizations
in the first place. (LLVM and GCC benchmarking infrastructure is excellent;
they test this stuff continuously and don't land changes that don't
demonstrate benefits.)

~~~
brigade
My suspicion is that loops that need even a dozen registers are in general
either not tight enough on modern CPUs that a spill is measurable (maybe they
_were_ that tight 10-15 years ago), or were optimized after 64-bit was common.
Or maybe I'm just disappointed from all the times I've eliminated spills and
other extraneous µops in small tight loops with no measurable gain even in
microbenchmarks.

I kinda wish that benchmarking infrastructure were reflected in my
experience; I've spent _entirely_ too much time fixing performance regressions in various
inner loops caused by newer compilers. Though to be fair, only once has that
regression exceeded 2x.

To give some context: if a loop is large enough that it's on the verge of
falling out of the µop cache, I'd consider that a massive loop. And from my
experience any code that complex I _definitely_ wouldn't trust newer compiler
versions to not randomly add a dozen or two instructions.

Anyway... exactly how much does -fwrapv impact LLVM or GCC's benchmarking
testsuite?

------
WalterBright
"Compiler writers are sometimes surprisingly clueless about programming."

Sure, but there's another aspect, which is trying to guess what minimum
level of knowledge the user of the language should be expected to have. For
an interesting debate on whether an "isOdd(int i)" function, as opposed to
just using (i & 1), should exist as a standard library component, see:

[https://digitalmars.com/d/archives/digitalmars/D/food_for_th...](https://digitalmars.com/d/archives/digitalmars/D/food_for_thought_-_swift_5_released_-_bottom_types_string_325712.html#N325778)

~~~
earenndil
> For example, if I saw "isOdd" in code, I'd wonder what the heck that was
> doing. I'd go look up its documentation, implementation, test cases, etc.,
> find out "oh, it's just doing the obvious thing", and be annoyed at the
> waste of my time.

Huh. That is not at all what I would do. I would think 'oh, that's checking if
the thing is odd' and keep going. If I see i&1, then I have to unpack that,
parse it, figure out why it's there and that it checks oddness. Also relevant
is that i&1 checking if i is odd is a detail of i's internal representation.
There's nothing intrinsically odd-related about the operation &1; that it
checks for oddness is a coincidental side effect, just like how making a
truth table of (0, 0) => 0, (1, 0) => 1, (0, 1) => 1, (1, 1) => 1 _happens_ to
correctly simulate the behaviour of an or logic gate, even though there's
nothing inherently or-y about it.
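For concreteness, the two spellings side by side (a sketch of the hypothetical helper, not anyone's actual standard library):

```cpp
// The helper under debate: a named predicate that states intent.
bool isOdd(int i) {
    return i % 2 != 0;  // phrased in terms of the value, negatives included
}

// The terse spelling it would wrap. On two's complement (which C++20
// mandates) the low bit is set exactly for odd values, negative or not.
bool isOddBit(int i) {
    return (i & 1) != 0;
}
```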

------
whatshisface
This entire issue comes from "poor" (by modern standards, it was great for its
time) language design. The compiler writers have to fight with the compiler
users because the language that was meant for communication between them does
not include all of the information that the compiler needs to know. Instead,
it depends on implicit things that "everybody knew" in 1980 but which are not
all true today.

In order to know what your C program is going to do, you have to have some
idea of how it is compiled. That's bad, because if they go and change how it
is compiled your expectations will be wrong. Instead languages should be
designed so that you write down what you want to have happen, without having
to think about how the compiler will implement it.

~~~
cle
Very rarely have I gotten deep enough into a project and not had to pop open
the hood. How the lower level abstraction implements things is often important
and often leaves you with no practical choice but to depend on implementation
details. These are just unavoidable realities. “Hiding implementation details”
is one of the great lies of abstraction.

~~~
nikofeyn
> “Hiding implementation details” is one of the great lies of abstraction.

do you have some examples? i would imagine it is simply a matter of (poor)
documentation or something not doing something it was documented to do, or
vice versa.

if something that has its implementation details abstracted away correctly and
is documented fully and accurately, it seems to me the only reason to "pop
open the hood" is to ask if it could be faster. i suppose you could want to
abuse the abstraction by relying on implementation details, but that doesn't
seem like a good thing to do.

basically, i wouldn't say hiding implementation details is a lie of
abstraction. i would wager it's developers' inability to abide by the laws of
abstraction.

~~~
WalterBright
> some examples?

Exception handling. EH is often sold as a "zero cost abstraction". The reality
is far different. To see what I'm talking about, create a simple D (or C++)
struct with a constructor and destructor. Write a simple piece of code that
constructs an instance, calls some function where the implementation is hidden
from the compiler, and then expect the destructor to run.

Compile it with and without EH turned on. Take a look at the code generated.
You might be quite discouraged.

The existence of exceptions also significantly degrades flow analysis by the
compiler optimizer, to the point where large swaths of optimizations are
simply abandoned if exception unwinding may occur.
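A minimal sketch of that experiment (`mystery` is a hypothetical opaque call; in the real experiment its definition would live in another translation unit so the optimizer can't see it):

```cpp
// Compile with and without -fno-exceptions and compare the assembly.
// With EH enabled, the compiler must emit an unwind path that runs
// ~Guard() in case mystery() throws; with EH off, that machinery
// disappears.
struct Guard {
    int *flag;
    Guard(int *f) : flag(f) {}
    ~Guard() { *flag = 1; }  // must run on normal and unwind paths alike
};

void mystery();  // in the real experiment, defined in a separate file

int run(int *flag) {
    Guard g(flag);
    mystery();   // may throw, for all the compiler knows
    return 7;
}

// Definition provided here only so this sketch links and runs.
void mystery() {}
```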

------
userbinator
Good to see a detailed study on this --- it's been my experience that Intel's
compiler (icc) is far less eager to exploit undefined behaviour, yet generates
code that is just as competitive as GCC's, and sometimes significantly better.

It's also worth mentioning the note that the standard has about UB (emphasis
mine): "Possible undefined behavior ranges from ignoring the situation
completely with unpredictable results, to _behaving during translation or
program execution in a documented manner characteristic of the environment_
(with or without the issuance of a diagnostic message), to terminating a
translation or execution (with the issuance of a diagnostic message)."

That emphasised part is what the majority of programmers using C expect.

~~~
colejohnson66
Could that be because of ICC’s dispatch code? It compiles code for different
feature sets and then uses the CPUID feature list to know which to run.

~~~
userbinator
That's for autovectorisation, but my experience has been that icc's regular
scalar code generation is what makes it perform so well. Instruction
selection, register allocation, etc.

~~~
conistonwater
Aren't those common to all compilers? So it just does everything that
clang/gcc do also, but does it better?

~~~
naikrovek
Depending a bit on what you are doing, stuff compiled with `icc` runs about
20% faster than the same code compiled with `gcc` or MSVC's compiler on Intel
CPUs.

~~~
earenndil
Hmmm. The one time I tried it (for a crypto miner), it ran about 60% the speed
of gcc and clang (which were comparable; one was slightly ahead, I forget
which but I think it was gcc). Granted, I may have gotten the flags wrong
(there were a lot of them), but I turned all the ones that looked relevant.

------
mannykannot
In section 2.2, the author claims that the intent of a programmer is knowable
when the programmer writes something that could result in undefined behavior,
but this claim does not hold up for the example given. When a program attempts
to read from beyond the end of an array, it is rarely something that a
programmer intended to happen, and it is rarely the case that any action the
program takes as a response to that event (even if it is a segmentation or
general protection fault) is in conformance with what the programmer intended
to happen after the sequence of inputs that led up to this event. It is much
more likely that the programmer did not even think that this could happen, and
expected that the program would continue to run and produce results conforming
to the purpose they intended the program to serve.

If I can divine what the author of this article actually intended to say, I
think it is that, in at least some cases of undefined behavior, programmers
commonly have intuitions about what would happen if that behavior occurred
during the execution of their programs. I don't know if the rest of the thesis
can be salvaged by substituting that reading.

~~~
abcdef123xyz
I think it is clear they mean what the programmer would expect /even if they
knew the code was wrong/:

"And that’s what programmers expect: In the normal case a read from an array
(even an out-of-bounds read) performs a load from the address computed for the
array element; the programmer expects that load to produce the value at that
address, or, in the out-of-bounds case, it may also lead to a segmentation
violation (Unix), general protection fault (Windows), or equivalent; but most
programmers do not expect it to “optimize” a bounded loop into an endless
loop."
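One classic shape of the problem the quote describes (a hypothetical example, not from the paper):

```cpp
// Off-by-one: when i == 4, the loop reads table[4], one past the end.
// Because that access is UB, a compiler may reason "i can never reach
// 4", and e.g. drop the bound entirely. The programmer, per the quote
// above, expects a plain load (or at worst a segfault) instead.
static int table[4] = {2, 3, 5, 7};

bool buggy_contains(int v) {
    for (int i = 0; i <= 4; i++)   // <= should be <
        if (table[i] == v) return true;
    return false;
}

// The correct bound, leaving no UB to exploit:
bool contains(int v) {
    for (int i = 0; i < 4; i++)
        if (table[i] == v) return true;
    return false;
}
```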

~~~
mannykannot
Well, that's what I wrote in my second paragraph, and the article would have
been more straightforward if the author had not drawn an unjustified
conclusion from it.

For example, he goes on to say that some programs capable of undefined
behavior are nevertheless correct if compiled in a certain way, but how is the
compiler-writer to determine that? The unstated assumption is that there is a
preferred compilation under which all programs are 'more correct', but when it
comes to undefined behavior, then, by (non)definition, there is no 'more
correct' way to compile it. Therefore, saying there is a 'more correct' way is
equivalent to saying that at least some undefined behavior should be defined
in a particular way. The article would have been simpler if the author had
realized that this is what he is proposing, and skipped the confused
correctness arguments.

------
chrisseaton
If you're interested in the weight that the author puts on 'conforming' and
don't know exactly what it means - a 'conforming' program is one that runs
correctly on some 'conforming' implementation. So it's a program that at some
point ran correctly on some compiler somewhere. If your C program is only
'conforming' then it isn't necessarily portable between compilers or even
versions of compilers.

------
bigcheesegs
A major issue with removing UB exploitation is that you've now created C*,
but each compiler implements a different C*. It becomes harder to port code,
and you end up with bugs that can be harder to catch, especially if you don't
consider what you're doing an error, as now you can't even use the sanitizers.

What my team has been looking at to help developers that have these issues is
providing low overhead ways to give defined, but likely to crash, behavior for
these cases. This helps protect against UB without creating C*. An example
would be variable auto-init, which intentionally doesn't support zero init,
as that is likely to hide bugs:
[https://reviews.llvm.org/D54604](https://reviews.llvm.org/D54604)

I'm fine with fully defining some things that are currently UB or IB. I'm also
fine with keeping things as UB. What I'm not fine with is creating this
implicit C* that programmers think they understand, but that isn't actually
defined anywhere, and that not everyone working on the compiler agrees on.

~~~
perl4ever
To me, whenever something is undefined at one level, there are still
constraints on it at another level. We go about our daily lives without laws
and regulations explicitly detailing our every interaction, and rely on custom
and ingrained behavioral rules to fill in the gaps. Every level of abstraction
has another layer behind it. When I read someone arguing about undefined
behavior allowing a compiler writer to do anything, it makes me imagine
someone who, if they could afford the finest lawyers and found the appropriate
loophole, would consider it not just their right, but their duty, to kill and
eat me. It's not terrifying to imagine there are monsters here and there;
there always have been a certain number. What terrifies me is the idea that
this mentality is infecting society and altering norms and interactions. It
doesn't feel like evil as such, it feels more like a metaphorical prion
disease. There's something fundamental to being a human being in being able to
deal with undefined situations by breaking out of a given context into a
broader one.

------
virgilp
I'm tempted to write a "what every programmer should know about compiler
writing" article. In particular, that they're not alone in the universe. The
price that you have to pay for the incredible combo "performance +
portability" is undefined behavior. Why do people assume that if they don't
care about some sorts of portability and exotic platforms, nobody else does?

~~~
anfilt
Probably because a lot of programmers are just used to the fact that at the
desktop level it's pretty much all x86/64.

However, when you start talking microcontrollers for instance the
architectures are all over the place.

C covers both these use cases.

~~~
dumael
Indeed, the x86 little-endian assumptions of programmers (even compiler
engineers) can be hilarious (in the sad-clown way) when targeting something
like big-endian MIPS64.
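A sketch of one common little-endian assumption and its portable fix (hypothetical deserialization code):

```cpp
#include <cstdint>

// Non-portable idiom: reinterpreting the bytes as a u32 silently
// assumes a little-endian host (and suitable alignment), so it
// produces the wrong value on big-endian MIPS64:
//
//     uint32_t v = *(const uint32_t *)p;   // don't do this
//
// Endian-agnostic version: assemble the value from individual bytes,
// so a little-endian wire format reads correctly on any host.
uint32_t load_le32(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```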

------
DannyBee
We have this discussion every few years :)
[https://news.ycombinator.com/item?id=11219874](https://news.ycombinator.com/item?id=11219874)

I have the same comment now i had then.

~~~
nkurz
And reading it once again, I'm still sympathetic to Anton's argument and
frustrated by your response. Although reading the exchange between the two of
you at the bottom, I do see why you might have difficulty looking afresh at
his argument. But in the hope of moving the discussion ahead at least a
little, could you perhaps respond here to forthy's comment from the last
thread?
[https://news.ycombinator.com/item?id=11226391](https://news.ycombinator.com/item?id=11226391).
In particular, why does the C standard seem to prefer undefined behavior over
the easier to reason about implementation defined behavior?

~~~
bigcheesegs
Implementation defined behavior isn't actually any easier to reason about.
Compilers are free to choose "assumes this never happens" as their behavior,
as there is no limit on what can happen. This is different from implementation
defined value or unspecified behavior (which generally has a list of options).
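A small illustration of the distinction (assuming the usual 32-bit `int`, two's-complement targets, which C++20 now mandates):

```cpp
#include <climits>

// Implementation-defined *value*: converting an out-of-range unsigned
// to int yields some documented result (C++20 pins it to modular
// wrapping), so the program's behavior is still bounded.
int narrow(unsigned u) {
    return (int)u;
}

// Undefined *behavior*: the compiler may assume the overflow below
// never happens and fold the whole test to `true`, or anything else.
bool naive_no_overflow(int x) {
    return x + 1 > x;   // UB when x == INT_MAX
}
```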

------
jasonhansel
I really think that we need a standard variant of C in which all undefined
behavior is (somehow, even if arbitrarily) defined. (Also: a variant with some
restricted level of polymorphism would make C much less painful.)

~~~
dooglius
That's not possible in general. For instance, if you write a block of memory
and then execute it, you can't really define what's going to happen without
including an ISA in your language spec. Another example would be executing a
syscall with a not-yet-used number. The ability to do both of these things is
necessary for a low-level language such as C.

~~~
userbinator
_you can't really define what's going to happen without including an ISA in
your language spec._

That's called "implementation-defined".

~~~
dooglius
True, but there are a lot of things that are also undefined at the ISA level.

~~~
earenndil
Don't specify that at all in the standard. Let compilers implement it as an
extension, like inline asm. They'll _have_ to make it have sane behaviour
because they don't have the excuse of 'the standard says it's UB'.

