
What every compiler writer should know about programmers [pdf] - pjmlp
http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf
======
DannyBee
One of the things this person misses is that a lot of these are undefined _with
the explicit goal of letting compiler authors take advantage of them_.

So his real argument is with the C committee for not defining these things the
way he wants them. His claim that what the behavior should be is obvious is
pretty strange (i.e., "it should do exactly whatever the compiler is trying to
prove it doesn't").

It's essentially a complaint that compiler authors should follow a mystical
spirit of C that he seems to have in his head.

(Plus, you know, he's kind of a dick about it all)

I read things like "We had plans to turn this optimization into a production
feature, but eventually realized that the GCC maintainers only care about
certain benchmarks and treat production programmers as adversaries that
deserve getting their code broken. ... The current GCC and Clang maintainers
suggest that we should convert programs to “C”. How would that turn out for
Gforth?"

I was intrigued, so I went looking for the mailing list discussion where the
gcc or clang people suggested this or where any of this happened.

I can't find it. On either set of mailing lists. I searched for gforth, I
searched for anton, etc. Nothing.

~~~
kazinator
> _lot of these are undefined with the explicit goal to let compiler authors
> take advantage of them._

Constructs should be undefined because the current landscape of C
implementations has been surveyed, and those constructs have been found to be
nonportable to such an extent that their behavior cannot even be
"implementation-defined" or "unspecified".

That's the beginning and end of it, period.

That's why certain things were undefined or unspecified in the first ANSI C
standard. For instance, different compilers had different evaluation orders for
function arguments, without documenting which, or guaranteeing that it would
always be the way it is.

The committee didn't sit there and decide "let's make evaluation orders
undefined for the sake of blazing speed, even though all C compilers today are
doing it left to right. Let's have it so that after 1989, they can use
whatever order they want!"

It was, "Ooops, we can't codify a specific evaluation order, because looking
at what's out there, there isn't one. If we were to codify an order,
implementations not having the order that we codify would have to be modified
to conform with our standardized language."

As time goes on, these stupid things should be _tightened_. Over the course of
thirty years, it's okay to codify some _gratuitous_ lack of definedness in a
language area and make everyone straighten it out in their implementations.

~~~
loup-vaillant
I believe the order of evaluation of function arguments is _unspecified_
(though it might lead to undefined behaviour in some cases, I'm not sure).

\---

Signed integer overflow is more interesting. On most platforms it is very
well defined: it either wraps around or it saturates. On one's complement
machines I don't know, though I'd expect a deterministic (yet mostly
nonsensical) result.

So it might sound reasonable to have signed integer overflow be
_implementation defined_. Except that another reasonable behaviour is to
trigger a trap, just like divide by zero. So they put it into the _"anything
can happen"_ bucket, that is, undefined. And now we have compilers taking
advantage of this, by assuming signed overflow simply _never happens_. Which
sometimes lets them deduce interesting things, such as that `k<k+1` evaluates
to `true` no matter what.
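
A minimal illustration of that deduction (my own sketch, not from the paper;
the folding is typical of gcc and clang at -O2, and goes away with -fwrapv):

        #include <limits.h>
        #include <stdio.h>

        /* Intended as an overflow check. Because signed overflow is
           undefined, an optimizing compiler may assume k + 1 cannot wrap
           and fold the comparison to 1. */
        static int still_increasing(int k)
        {
            return k < k + 1;
        }

        int main(void)
        {
            /* Typically prints 1 at -O2; with -fwrapv (wrapping
               semantics) it prints 0 for INT_MAX. */
            printf("%d\n", still_increasing(INT_MAX));
            return 0;
        }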

I'm not sure the standard bodies anticipated that. I bet they were thinking
something along the lines of _"the runtime can assume it never happens, and
fail to perform the relevant checks in the name of performance"_. I'm not sure
they originally interpreted this as _"the compiler can assume it never
happens, and treat the alternative as dead code"_.

Overall, I see 3 kinds of undefined behaviour: the most innocuous one is when
the runtime is allowed to trigger a trap, and crash the program right away.
The second one is when the runtime is allowed to not perform the relevant
check, and go bananas because of mangled memory or something. This one is more
serious, and the source of many security vulnerabilities. The third one is
when the _compiler_ can assume the relevant undefined behaviour doesn't even
happen, and make lots of interesting deductions from this assumption. This one
is mostly insane, though I can understand why it would be nice in some limited
cases.

We need 3 buckets for undefined behaviour. "Anything can happen" is too coarse
grained.

------
pklausler
I'm a compiler writer. I'm not actually unsympathetic to these points, and
agree that compilers intended for practical use had better have strong
justification for surprising behavior.

But the pervasively nasty tone of the paper is not going to make many friends
among the people whose work the authors are trying to influence.

~~~
WalterBright
I agree about the unpleasant tone of the paper: putting "facts" and
"optimizations" in quotes to imply they are not real, talking about the
"claims" of GCC maintainers, and making cracks about psychic powers.

But there is a good underlying point to the paper, and one I tend to agree
with - a compiler's behavior should give considerable weight to how the
language is actually used, rather than taking a purely literal view of what
the words in the spec mean.

Taking things to an absurd extreme to make the point, "undefined behavior"
implies that a compiler writer can insert code to reformat the hard disk when
the compiler encounters such code. Granted, the point of a spec is to define things so
that compiler writers don't have to divine intent, but perfection is never
attained and hence compiler writers have to use their judgment.

I know in my compiler work, I have backed out some optimizations that were
allowed by the spec, but broke too many programs. The inconsequential speed
improvement was just not worth it.

~~~
tytso
It may be unpleasant, but the reality is that many programmers do feel that
way. I can say that as a kernel programmer, I approach using a new version of
gcc with a particular version of dread. What sort of random bugs that might
end up causing kernel crashes or data corruption will arise when I try using a
new compiler?

As a result, I'm quite sympathetic to the attitude of "any sufficiently
advanced compiler is indistinguishable from a malicious adversary"....

~~~
eru
Just use a more defined language than C..

~~~
ScottBurson
Yes. I think, if I had to do embedded work, I would use Ada.

~~~
klibertp
I immediately thought of the same language. It's clean, safe, fast, and
explicit, even if a bit verbose. It had quite a nice compiler, last time I
checked (some 10 years ago), and no "undefined behaviour".

------
AndyKelley

        int d[16];
        int SATD (void)
        {
            int satd = 0, dd, k;
            for (dd=d[k=0]; k<16; dd=d[++k]) {
                satd += (dd < 0 ? -dd : dd);
            }
            return satd;
        }
    

This gets optimized into an infinite loop, and the paper says that this is
incorrect. I'm on the side of the compiler. This code is broken; if you do an
out-of-bounds access, then an infinite loop is the least of your worries.
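
For what it's worth, a bounds-safe rewrite of the same loop (a sketch, not
taken from the paper) keeps the index inside the array and computes the same
sum:

        int d[16];
        int SATD (void)
        {
            int satd = 0, k;
            for (k = 0; k < 16; k++) {
                satd += (d[k] < 0 ? -d[k] : d[k]);
            }
            return satd;
        }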

~~~
kzhahou
What's wrong with this code? I don't see it.

(disclaimer: I am a programmer, not a compiler)

~~~
Someone
This:

    
    
        k<16; dd=d[++k]
    

, whenever the code reaches the end of the loop, effectively becomes

    
    
        dd=d[++k]
        if(k<16) goto startOfLoop;
    

That, in turn, is equivalent to:

    
    
        k+=1
        dd=d[k]
        if(k<16) goto startOfLoop
    

In the second line, k is used to index into array d. Conforming programs do
not access memory outside of arrays, so the compiler can infer that

    
    
      0 <= k < 16
    

Given that, it is clear that the _goto startOfLoop_ branch will always be
taken. Inside the loop, the program accesses some memory, but that data is
never visible to the outside world, so the compiler can optimize away the
memory reads and writes.

The compiler is that aggressive because it has to be in real-world programs.
For example, compilers using C as their target instruction set may generate
bounds checks for every array access, trusting the C compiler to remove as
many of them as possible. Macros are another common cause of superfluous
source statements that programmers will expect the compiler to optimize away.
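
To make that concrete, here is a hypothetical sketch (mine, not from the
thread) of what such a translator might emit; the guard in front of the access
is provably true inside the loop, so the C compiler can drop it without
changing observable behaviour:

        #include <stdlib.h>

        /* Sketch of machine-generated C from a bounds-checked source
           language: every access is guarded, and the C compiler is
           trusted to remove the guards it can prove redundant. */
        int sum16(const int a[16])
        {
            int s = 0;
            for (int i = 0; i < 16; i++) {
                if (i >= 0 && i < 16)
                    s += a[i];
                else
                    abort();    /* bounds-check failure path */
            }
            return s;
        }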

I think it will be very tricky to detect cases where compilers should be wary
of applying 'obvious' optimizations.

~~~
AntonErtl
I would be pretty miffed if I wrote a, say, Pascal compiler with array bounds
checks, and the compiler "optimized" the checks away just because accessing
a[i] would be undefined behaviour. OTOH, in a loop like

        for (i = 0; i < 16; i++) {
            if (i >= 0 && i < 16) { /* do something with a[i] */ }
            else { /* report an error */ }
        }

you can certainly optimize the if to if(1). That's an optimization*.

~~~
Someone
No, it would not. Compilers can remove boundary checks that provably happen
after accessing an item of an array, not those before the array is accessed.

For your example, the Pascal code (at least, I hope this is Pascal; it has
been a while since I wrote any):

    
    
      if (i >= 0) and (i < 16) then
      begin
        x := a[i]
      end
      else
      begin
        ReportError;
      end
    

is equivalent to the C code:

    
    
      if (i>=0 && i<16)
      {
        if(i>=0 && i<16)
        {
           x = a[i];
        } else {
          RuntimeAbort("Array access outside bounds");
        }
      } else {
        ReportError();
      }
    

A good Pascal compiler, like a good C compiler, would optimise away that
second boundary check and the call to RuntimeAbort. Neither compiler is
allowed to optimise away the first check.

~~~
AntonErtl
The point of my posting was that you don't need "optimizations" to optimize
away redundant bounds checks; optimization* is able to do it just fine. Sorry
if I did not get that across clearly. What does your "No, it would not" refer
to?

------
aidenn0
The author doesn't understand the market demands for a compiler:

1) Customers who want their text-size to be as small as possible so they can
shave $0.0015 off their BOM by using smaller ROM parts

2) Chip vendors who have 7-figure sales that will fall through if they can't
show a 0.7% higher Dhrystone MIPS (yes Dhrystone, the benchmark that was
obsolete 20 years ago).

3) Managers who make buying decisions purely based on a set of their own pet
benchmarks, with no regard to anything in this paper.

Damn right a compiler will take advantage of every optimization opportunity
afforded by the standard. Good compilers will have ways of being much less
aggressive about optimizations; I bet gcc and clang each have some -f flags
that will disable certain optimizations (not to mention just good old -O0); I
know most commercial compilers do.

[edit]

I would like to see some less-optimizable C standard catch on, so that there
could be a mode common to multiple compilers that makes programmers happy.
Unfortunately attempts to create such a dialect have failed to get off the
ground.

[edit2]

Part of this is because there is less of a consensus on what are
optimizations* and "optimizations" than the author would believe.

~~~
AntonErtl
As for the market demands on compilers, I hope that this paper influences that
in more reasonable directions. In the meantime, I have nothing against having
an option (or more) for "optimizations". Enabling new "optimizations" by
default however has the effect of miscompiling existing programs and is not
such a good idea. At the very least there should be one flag for disabling
"optimizations" without disabling optimizations*.

Concerning what are optimizations*: If the observable behaviour does not
change (even when there is undefined behaviour), it's an optimization*. Also,
register allocation and rearranging stuff in memory is an optimization*
(because programs that rely on that not changing break on maintenance). Other
transformations are, at best, "optimizations". If you have any questions about
specific transformations, please ask.
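
To make the distinction concrete, a small sketch of my own (not an example
from the paper or from this thread):

        /* optimization*: computing x + x once and reusing it (common
           subexpression elimination) yields the same observable result
           whether or not the addition overflows. */
        long quadruple(long x)
        {
            return (x + x) + (x + x);
        }

        /* "optimization": assuming p and q cannot alias because their
           types differ. If a caller does pass aliasing pointers, the
           compiled code may return the stale value 1, so observable
           behaviour changes exactly when the undefined behaviour occurs. */
        long store_and_reload(long *p, float *q)
        {
            *p = 1;
            *q = 2.0f;
            return *p;
        }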

------
kazinator
Optimizations based on undefined behavior are actually based on the belief
that the code in question doesn't have undefined behavior (that the programmer
is taking care of any possibility of undefined behavior, where that
possibility exists).

This assumption should be made only if the programmer who wrote that code is a
machine: a machine translating something which itself doesn't have undefined
behavior, into C that doesn't have undefined behavior.

That assumption should not be made when the programmer is a person. Or, if the
assumption is used, it should be accompanied by a diagnostic. "I'm expressing
what appears to be your intent using a more efficient approach, which breaks
in a surprising way if you made a mistake."

For instance, consider

    
    
       int compare(int x, unsigned char y)
       {
         return x + y > x;
       }
    

This is always true, or else the behavior is undefined. So it can be replaced
with:

    
    
       int compare(int x, unsigned char y)
       {
         (void) x; /* somewhat accepted de facto idiom for ignoring arguments */
         (void) y;
    
         return 1;
       }
    

And GCC does turn it into "return 1". But why the hell would a _human_ write a
complicated function in place of just constant 1?

If a machine generated this due to following a template of code, that is
understandable, and making it return 1 is helpful --- assuming that the
generator is carefully written to avoid relying on undefined behavior.

Even if the generator actually relies on undefined behavior, the issue can be
fixed in the generator, which can be run to regenerate all the code; but there
is no such easy band-aid for human-written code. (A code generator can even
have a catalog of what version of what compiler brand supports what undefined
behavior in what way, and tune its output accordingly to what compiler is used
in the subsequent pass.)

What is unhelpful on the part of compiler writers is supporting some undefined
behavior as a _de facto_ extension for many years, and then backpedaling on
it.

One day your code that depends on signed integers having two's complement
"odometer wraparound semantics" stops working after years or decades because
of some compiler cowboy who was in millennial kindergarten when that code was
written.

~~~
rcfox
(x + 0 is not greater than x.)

~~~
kazinator
Thanks for that observation; it should be >=.

I will leave that as it is so that your comment makes sense; let this be the
erratum.

------
bluecalm
Out of all the ridiculous claims in that .pdf, many pointed out by others, I
find this one especially entertaining:

 _But do “optimizations” actually produce speedups for benchmarks? Despite
frequent claims by compiler maintainers that they do, they rarely present
numbers to support these claims. E.g., Chris Lattner (from Clang) wrote a
three-part blog posting about undefined behavior, with most of the first part
devoted to “optimizations”, yet does not provide any speedup numbers. On the
GCC side, when asked for numbers, one developer presented numbers he had from
some unnamed source from IBM’s XLC compiler, not GCC; these numbers show a
speedup factor 1.24 for SPECint from assuming that signed overflow does not
happen (i.e., corresponding to the difference between -fwrapv and the default
on GCC and Clang). Fortunately, Wang et al. [WCC+12] performed their own
experiments compiling SPECint 2006 for AMD64 with both gcc-4.7 and clang-3.1
with default “optimizations” and with those “optimizations” disabled that they
could identify, and running the results on a Core i7-980. They found
speed differences on two out of the twelve SPECint benchmarks: 456.hmmer
exhibits a speedup by 1.072 (GCC) or 1.09 (Clang) from assuming that signed
overflow does not happen. For 462.libquantum there is a speedup by 1.063 (GCC)
or 1.118 (Clang) from assuming that pointers to different types don’t alias.
If the other benchmarks don’t show a speed difference, this is an overall
SPECint improvement by a factor 1.011 (GCC) or 1.017 (Clang) from
“optimizations”._

I mean, the speed-ups of 7.2%, 9%, 6.3%, and 11.8% for specific cases are
huge. They will make a real time and money difference for people. He is
advocating against having that just because he thinks what signed integer
overflow should do is obvious. Entitlement, ignorance, incompetence. I am not
sure which one it is. It takes 1 minute to figure out that signed integer
overflow is undefined and act accordingly.

~~~
DominikD
Developer time is a zero-sum game. If one of these optimizations bites me and
I spend one day figuring out what happened, I don't spend that day doing
optimization somewhere else, where a much greater speedup can be achieved. Or,
to put it in your money-difference terms, I could be adding value to the
program somewhere else, adding or polishing features.

It's a balancing game. Looking at it through the lens of % speedup and
ignoring everything else is just stupid. In a perfect world we'd make
decisions based on numbers. He provided his. Where are yours?

------
j2kun
I half expected the document to present statistical evidence about how
programmers tend to behave, and use that to generate advice for compiler
writers.

The article is a nice read (and doesn't present that kind of statistical
evidence), but I'm still wondering, _has_ anyone done a significant study of
how programmers work? I would be really fascinated to read an analysis of,
say, data taken from full-time software engineers at a given company where all
of their actions are recorded (key-strokes, compiler runs, error messages,
etc.). Similar to how online competitive gamers have their actions recorded to
the microsecond to identify weak points.

It would even be interesting to know, e.g., what is the net average number of
lines of code created/deleted per day?

~~~
conceit
The average doesn't matter; specific problems are reported on the mailing
lists, for example. You wouldn't sway everyone's writing style with some
statistical evidence, not in a giant ecosystem like C/unix, and it's not a
democracy either, where 51% would just ignore the other 49%'s existence.

~~~
j2kun
What? No, I mean that's how to allocate your resources well: you prioritize
the problems that will affect most of the people most of the time.

------
kazinator
What? "tested and production" programs might be conforming according to the C
standard, but that's only because it's a largely useless term: _[a] conforming
program is one that is acceptable to a conforming implementation._ [C99, 4]

If the implementation is conforming (which might not be the case) and has
accepted the program (which could be chock full of cruft), then the program is
conforming. The "conforming" designation simply doesn't assure us of any
measure of quality, other than that the program isn't so bad that it doesn't
compile and link (with that one toolchain which produced its executable).

If the program is tested and in production using a compiler that is operated
in a nonconforming mode (often the default on many compilers), then acceptance
of the program doesn't speak to its conformance at all.

This kind of twaddle in the abstract is a very bad way to start out a paper.
(And I'm one of the "converted" people; you don't have to work hard to
convince me about the dangers of optimizing based on "no UB" assumptions,
without a single diagnostic squawk.)

~~~
AntonErtl
"Conforming program" may be a largely useless term, but it's one of the two
conformance levels for programs that the C standard defines. The other is
"strictly conforming program", and it does not include any terminating C
program, so it's also a largely useless term.

Now C compiler maintainers justify "optimizations" with the C standard, and
indeed, a compiler that behaves like that is a "conforming implementation"
(the only conformance level that the C standard defines for compilers). Is
that a useful term?

Yes, we should be exploring the fringes of the C standard (in particular,
"optimizations") less, because that is not useful territory; instead, we
should orient ourselves by looking at the C programs that are out there.

But anyway, my thinking when I wrote that was: if somebody comes at you with
the frequent stance of "your program is non-standard crap full of bugs", when
it worked as intended with all previous versions of the compiler, you can
shoot back at them with "it's a conforming program". Of course, if the other
side is a language lawyer, that approach will not help, as you demonstrate,
but if not, maybe you can get the other side to take a more productive stance.

~~~
kazinator
"Strictly conforming program" is a set that certainly includes terminating
programs; I think you mean that it has a shortcoming because it doesn't
exclude non-terminating programs? But a program that keeps running forever,
servicing commands entered by a user, is nonterminating, yet possibly correct
and useful. Nontermination doesn't constitute misuse of the language as such.

> _you can shoot back at them with "it's a conforming program"_

Not really; retorting with a useless term from ISO C doesn't help in any way.
You can only effectively shoot back by demonstrating that all the claims that
the program is buggy are rooted in constructs and uses which are in fact
defined on all the platforms which the program explicitly supports. (Just not
ISO-C-defined.)

If you're doing something that isn't defined by ISO C, and isn't documented by
your compiler or library either, then you may in fact have a real problem.

But portability to platforms which are not specified for the program is a
specification issue, not a coding issue. If the specification is broadened to
encompass those platforms, then it becomes a coding issue.

I wouldn't listen to anyone who complains that, say, some program of mine only
works on two's complement machines, not ones with sign-magnitude integers. My
response would not even be "patches welcome" (a patch to "fix" that would
definitely not be welcome).

On the other hand, merely calling a function which isn't in the C program and
isn't in ISO C is undefined behavior, as is including a nonstandard header
file like #include <unistd.h>. We can make a conforming implementation in
which #include <unistd.h> and a call to fork() reformats the disk.

Simply accusing a program of undefined behavior isn't damning in and of
itself; hardly any useful program can escape that blame.

Basically, I can out-language-lawyer anyone who criticizes my code from that
perspective, so I'm safe. :)

~~~
AntonErtl
"Strictly conforming program" excludes implementation-defined behaviour, and
AFAICS all ways to terminate a C program are implementation-defined behaviour,
so terminating C programs are not strictly conforming.

"C" is actually a little bit wider than "strictly conforming", because it
includes implementation-defined behaviour (I did not know that when I wrote
the paper).

So "C" does not actually correspond to a conformance level defined in the C
standard. So while the "C" advocates like to support their stance with
language lawyering, they actually pick those pieces of the C standard that
suit them and ignore the others. That is fine with me, but if the standard is
not the guideline, what is? For me it is the corpus of existing working
programs; "C" advocates seem to be more interested in benchmarks.

------
conceit
The diagram on page 13 has linear interpolation plotted between the
datapoints. That's one step away from fitting an arbitrary polynomial to the
points. Don't do it; implying there are measurements between point releases
isn't very sensible.

~~~
Dylan16807
It's easier to follow the data points with nice visible connections.

------
wirrbel
Click-bait.

