
The compiler is always right - nkurz
https://blog.mozilla.org/nfroyd/2014/05/09/the-compiler-is-always-right/
======
tikhonj
The compiler is always right... except when it isn't.

Really, the compiler is _usually_ right. Fair enough. But I've run into
compiler bugs, and that's just in internships! (And I'm not even counting the
compiler I helped write...)

And sometimes the issue isn't even a bug but a design deficiency. Those are
hard to deal with because they're _real_ problems, but also exist for _real_
reasons. I ran into one of these when compiling OCaml to JavaScript and
getting stack overflows—turns out the compiler couldn't handle _mutual_
recursion properly. And that's because it's a hard problem! And yet, it also
meant the compiler _was_ wrong.

Thinking about it, perhaps the title could be read as "the compiler is always
right, even when it's wrong". Just like the ultimate arbiter in a car crash is
momentum, not the rules of traffic, if you hit a compiler bug, "fixing" your
program is probably much easier than fixing the compiler—even if you're in the
right! That's what we ended up doing with js_of_ocaml, by choosing a different
backend for our parser generator that didn't create mutually recursive code.
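The failure mode described above can be sketched in plain JavaScript (the function names here are made up for illustration): without tail-call elimination across function boundaries, every hop between two mutually recursive functions consumes a stack frame, so deep inputs overflow even though each call is syntactically in tail position.

```javascript
// Hypothetical sketch of the mutual-recursion problem: each call from
// isEven to isOdd (and back) is a tail call, but without cross-function
// tail-call elimination every hop still pushes a stack frame.
function isEven(n) { return n === 0 ? true : isOdd(n - 1); }
function isOdd(n)  { return n === 0 ? false : isEven(n - 1); }

console.log(isEven(10)); // shallow inputs are fine: true

try {
  isEven(1e7); // ~10 million frames: overflows on engines without TCO
} catch (e) {
  console.log(e instanceof RangeError); // V8/Node reports overflow as RangeError
}
```

A compiler can sidestep this by turning the mutually recursive pair into a single loop with an explicit state variable (a "trampoline"), which is roughly what a backend that handles mutual recursion properly has to do.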

~~~
foxhill
i think in this context, the author is referring to mainstream (gnu, ms,
intel) compilers, which are very mature and have thousands of tests that they
must pass to be eligible for release.

 _writing_ a compiler is tricky, and it is almost an absolute certainty that
it will have a bug :)

~~~
jevinskie
Nope, even mature compilers are buggy.

I find it amusing how often compiler authors have to workaround bugs in
_other_ compilers. See:
[http://search.gmane.org/?query=MSVC+2012&author=&group=gmane...](http://search.gmane.org/?query=MSVC+2012&author=&group=gmane.comp.compilers.llvm.cvs&sort=date&DEFAULTOP=and)

I don't mean to pick on MSVC (and many of those are not-yet-implemented
features rather than bugs); it is just the easiest to search for. I've found
enough bugs in LLVM/Clang (and even fixed a few) to realize that no compiler
is perfect or even close! =)

~~~
foxhill
well, of course they can be buggy, they are written by humans, after all :)

but in my time as a programmer, i think i've seen one bug that was a genuine
compiler bug. and this was in floating point arithmetic optimisations.

llvm/clang, well, i'd hesitate to call them mature (even though apple insists
on doing so).

------
mehrdada
Having done research on finding miscompilations in production compilers just
recently[1], hunting a couple hundred bugs in GCC and LLVM, I feel much more
skeptical these days about this matter.

Modern compilers are big, complex systems, and naturally have bugs (to be
fair, given the complexity and aggressiveness of the optimizations, the
quality of GCC is extremely admirable).

[1]:
[http://mehrdadafshari.com/emi/paper.pdf](http://mehrdadafshari.com/emi/paper.pdf)
(check out the example bugs in the paper. They are amusing.)

P.S. The "it's your code, not the compiler" mindset does not generally apply
to compilers targeting embedded platforms.

~~~
makomk
Oh wow, those example compiler bugs are really interesting. Turns out
aggressive removal of code that invokes undefined behaviour can combine with
common, safe optimisations to cause misoptimisation: even though the code
output by the earlier optimisation passes was safe due to implementation
details, a subsequent pass decided that, because the optimised code's
behaviour wasn't defined by the C standard, it could assume the code never
executed and remove chunks of it. Nasty.

------
imslavko
Someone who has been participating in algorithm competitions like TopCoder or
Codeforces would know this - compiler optimizers can have bugs pretty often.

In algorithmic programming competitions, the tasks usually make it very hard
to come up with a good asymptotic solution, but the solutions are fairly easy
to implement and the code can be rather short (usually 70-300 LOC written in
<1h).

Given that high ratio of program complexity to program size, the competitive
programming community manages to find bugs in the GCC optimizer several times
a year.

Sometimes compilers do have bugs :) Not very often though.

[http://codeforces.ru/blog/entry/1059?locale=en](http://codeforces.ru/blog/entry/1059?locale=en)

[http://codeforces.ru/blog/entry/11450?locale=en](http://codeforces.ru/blog/entry/11450?locale=en)

[http://codeforces.ru/blog/entry/2068?locale=en](http://codeforces.ru/blog/entry/2068?locale=en)

[http://codeforces.ru/blog/entry/1993#comment-40700](http://codeforces.ru/blog/entry/1993#comment-40700)

[http://codeforces.ru/blog/entry/1840?locale=en](http://codeforces.ru/blog/entry/1840?locale=en)

[http://codeforces.ru/blog/entry/3742](http://codeforces.ru/blog/entry/3742)

etc

------
nitrogen
_For beginners_ , it is probably best to assume the compiler is right. _For
the rest of us_ , the compiler is usually right, except when it's not. See the
extensive work of John Regehr on automated compiler bug discovery and related
compiler work:
[http://blog.regehr.org/archives/category/compilers](http://blog.regehr.org/archives/category/compilers)

~~~
wglb
My favorite part is where his process found a non-trivial number of bugs in a
research compiler _that was proved correct_.

~~~
twic
Which part is that? I had a look through the archives but couldn't spot a post
which looked like that.

~~~
wglb
Here is one paper that describes it:
[http://www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf](http://www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf)
It is the CompCert compiler, which is formally verified. It looks as if the
errors found were in the unverified portion; they have since been fixed, and
the verified surface has been expanded.

------
ekidd
In my experience, if I've eliminated every other possibility, it's worth
actually reading the assembly generated by the compiler. Most compilers are
pretty good, but they do contain bugs, and if I completely rule out the
possibility that the compiler is generating bad code, I can waste days
searching for the problem. Some examples:

1\. When compiling certain highly-optimized routines in Quake II, Microsoft
Visual C++ generated tail calls that popped the current stack frame before
recursing, even though a pointer to a stack variable had already escaped into
a global variable. Now, I'm sure there's some technical reason why a C
compiler is allowed to generate garbage code here (there usually is—Google
"nasal demons"), but it's still easier to debug if you assume the compiler is
untrustworthy.

2\. The Rubinius FFI had problems passing doubles to external functions:
[https://github.com/rubinius/rubinius/commit/1122c1c26b81c969...](https://github.com/rubinius/rubinius/commit/1122c1c26b81c9691049c8355c809b132b8c8f9c)

3\. I've broken quite a few in-house and experimental compilers. My favorite
error message came from a Lisp compiler: "Lost value of variable 't' in
anonymous lambda in anonymous lambda in top-level form."

Now, this sort of thing is rare. If you're using a mature production compiler
in CS class, you can assume with near-100% certainty that any problems are
your fault. But if you're maintaining a 250K-line program, you upgrade your
compiler, and you see some mysterious new backtraces showing up in the crash
reporter, then it's entirely possible your compiler is buggy.

As a handy rule of thumb, if you've been staring at a function in a debugger
for more than an hour, and you _know_ it's correct, and something weird is
still happening, it's time to choose "Show Assembly" and see what's actually
going on. It may still be your fault, but it's time to stop trusting the
abstractions provided by the compiler and debug your _real_ program—the one
the computer actually runs.

~~~
munin
> you upgrade your compiler, and you see some mysterious new backtraces
> showing up in the crash reporter, then it's entirely possible your compiler
> is buggy.

IME, there are two more likely causes of this, both rooted in the same
underlying issue:

1\. Your code depended on a compiler bug to function, the bug is fixed, and
now your code is broken

2\. Your code depended on an undefined part of the language semantics, the
compiler changed from doing one undefined thing to another, and now your code
is broken

------
buro9
The assumption any dev should have is:

"The bug is yours"

That isn't a true statement, as there are exceptions. It's just that you are
unlikely to be the exception.

Whether you're relying on a compiler, network, some other application, a
library, an API... the bug is nearly always yours and you should more
rigorously check your code without making any assumptions, until you fully
understand the root cause of a bug.

If you actually have got to the root cause of a bug, you'll find in nearly all
cases the problem was with your code.

~~~
mratzloff
Yes. Similar to the hyperbole in the article title, I say "All bugs are logic
bugs" (in your code). Now, it's not strictly always true--sometimes it's
environmental or in a library you depend on, or it's a garbage collection bug
with the language, or what have you.

But some developers (myself included, sometimes) reach for those excuses first
because their code couldn't _possibly_ be exhibiting that strange behavior on
its own. After all, it worked perfectly in (insert situations).

No, it's almost certainly a logic bug: a race condition, a parallel execution
problem, something doesn't get set correctly in a very uncommon case, etc.

------
acqq
Is the author using a poor debugger? The code looked wrong to me until I
understood that, instead of writing the label inside the line, the line
contains the value 0 and the unconventional decoding, and the label then
follows on the next line, e.g.

    
    
        2:	mov    0x0(%rip),%rax
        5:  R_X86_64_GOTPCREL NSS_ERROR_NOT_FOUND+0xfffffffffffffffc
    

was probably supposed to mean something like

    
    
        mov  addr NSS_ERROR_NOT_FOUND, %rax
    

That is, the next line is actually the content of part of the instruction on
the previous line, and that instruction was printed incomplete, with 0 in
place of the value. If the value on the second line appeared after the end of
the instruction on the first line, then during execution the second line
would be executed as an instruction; that this isn't the case is visible from
the offsets (5 vs. 2) at the start of the instruction. I'm more used to Intel
than AT&T notation, but I don't believe this is an effect of the notation.
Does anybody know more about what produces such strange code?

This weirdness

    
    
        2:	48 8b 05 00 00 00 00 	mov    0x0(%rip),%rax
    	5: R_X86_64_GOTPCREL	NSS_ERROR_NOT_FOUND+0xfffffffffffffffc
         9:	8b 38                	mov    (%rax),%edi
         b:	e8 00 00 00 00       	callq  10 <NSSTrustDomain_GenerateSymmetricKeyFromPassword+0x10>
    	c: R_X86_64_PLT32	nss_SetError+0xfffffffffffffffc
    

probably just means

    
    
         mov addr NSS_ERROR_NOT_FOUND, rax
         mov (%rax),%edi
         call nss_SetError
    

So what produces the longer and stranger form, and why?

~~~
cnvogel
That's objdump's default format when outputting relocations together with the
disassembly. As you have already found out, the code is compiled with dummy
values for the actual addresses being accessed or jumped to, and the
relocation table instructs the linker to overwrite these dummy values with
the actual addresses, once they are known.

R_X86_64_GOTPCREL is a constant defined in /usr/include/elf.h (I think it's
from libbfd, the library dealing with the different file formats binutils
understands).

    
    
         #define R_X86_64_GOTPCREL       9       /* 32 bit signed pc relative
                                                    offset to GOT */
    

Here's another example, with the invocation of objdump:

    
    
        $ cat hackernews.c
        int
        doit(int a)
        {
           return blah(a);
        }
        $ cc -Os -c hackernews.c
        $ objdump -r -S hackernews.o
        hackernews.o:     file format elf64-x86-64
        Disassembly of section .text:
        0000000000000000 <doit>:
           0:       31 c0                   xor    %eax,%eax
           2:       e9 00 00 00 00          jmpq   7 <doit+0x7>
                            3: R_X86_64_PC32        blah-0x4
    

on ARM (raspberry pi, the relocation looks a little bit different)

    
    
        $ objdump -r -S hackernews.o
        hackernews.o:     file format elf32-littlearm
        Disassembly of section .text:
        00000000 <doit>:
           0:       eafffffe        b       0 <blah>
                            0: R_ARM_JUMP24 blah
    
        #define R_ARM_JUMP24                29      /* PC relative 24 bit
                                               (B, BL<cond>).  */
    

To only see the relocation table, use "objdump --relocs":

    
    
        hackernews.o:     file format elf32-littlearm
        RELOCATION RECORDS FOR [.text]:
        OFFSET   TYPE              VALUE
        00000000 R_ARM_JUMP24      blah

~~~
gsg
By the way, I believe the zeros are addends and not actually dummy values.
This is a bit strange because ELF has existing provisions for relocations with
addends, but there it is.

------
4ad
What a silly premise. I find bugs in compilers all the time. I don't like
these absolute statements; the industry loves them, but I don't think they
are useful.

Also, the hardware is not always right. I used to write drivers and, if
anything, the hardware is always wrong. Hardware is full of bugs, drivers work
hard to hide these bugs from the user.

~~~
amboar
So at the bottom of the article the author points to bugzillas for GCC and
LLVM, and points out that compilers actually do have bugs. It feels like an
admission that the title is really just for attracting clicks, and that the
first para of the article could've just been ignored or dropped.

------
hyp0
Do not think the compiler is wrong. That's impossible. Instead... try to
realize the truth. _What truth?_ There is no compiler. (for non-compiled
languages)

For compiler authors, it's the _other_ compiler that's right.

I have to admit the only time I ever thought there was a compiler bug, there
actually was a compiler bug (javac). Same for hardware bug (on aligned
memory). And a standard library (in C++). I think it's because when I go
through my code I can _tell_ if my understanding is clear enough - and don't
look elsewhere until it is.

------
robert_tweed
In _many_ years of programming, I don't think I've ever run into a genuine
compiler bug. It's a fallacy to assume they don't exist, but they tend to be
so rare and esoteric that you probably have a better chance of winning the
lottery than finding one.

I have however, run into numerous bugs in standard libraries. These are
definitely much more common in proprietary languages like ActionScript (Flash)
or Lingo (Director), or rapidly-changing (I'll be nice and not say poorly
designed) languages like PHP than in say, C, C++ or Java. Platform-specific
bugs in x-platform code also seem to be the most common.

I agree with the general premise that "it's probably your own fault". I can
probably count more times that I've suspected a compiler/OS/stdlib bug and
found after extensive testing that it was my own fault than I can count
genuine library bugs. On average, I probably hit one genuine language bug
every two years at most.

The trick is to start with the assumption that it _is_ a compiler/OS/stdlib
bug. Next, go create a minimal proof-of-concept to demonstrate the bug you
believe exists. More than 50% of the time, you'll figure out what was
_really_ wrong with your code in the process. The other times, you end up
with a nice minimal test case you can submit to the language maintainers'
mailing list.

It's also surprising how much you can learn about how a language or feature
works by trying to methodically prove that it is broken. The process forces
you to think about all the edge cases you don't normally consider, but most
of the time, the language designer already did.

------
wglb
When I was writing the code generation part of a compiler in a previous
lifetime, there were a number of conversations of the following sort early on
in the production life of the compiler:

Programmer: The compiler is producing the wrong code for this case.

Compiler team: Ok, let's look at the code and step through it.

Programmer: So it is putting this quantity in the register pair here.

Compiler team: Right.

Programmer: And it is taking out the high byte and putting it there.

Compiler team: Right.

Programmer: And then it is . . . Oh. Wow, that is a weird way to do this.

Compiler team: Right. Saves two bytes of code.

The programming team had previously used assembler and had developed
conventions of code patterns. Our boss pointed out that "A compiler should
produce assembly code that an assembly programmer would be fired for writing."

Not that we didn't have bugs, but the previous conversation was more common
than the code generation bugs we had.

And, Team Compiler, if you think that your compiler is always right, John
Regehr [http://blog.regehr.org/](http://blog.regehr.org/) has some news for
you.

------
yk
The post somewhat misses the point of the title (or I overthought it). The
anecdote is interesting, but not exactly an example of an arguable compiler
bug. On the other hand, I think that _The compiler is always right_ is a
rather good general guideline for programming.

There are three models [1] involved in programming: the one in the head of
the programmer, the one in the language specification, and the one in the
compiler. Looking at them individually: it would be nice if the first one
were right, but of course it is pretty much by definition wrong. (Until we
finally have the tools to program via telepathic link.) The second is the
language specification, which should be the authoritative one. And the third
is the one in the compiler, which for pragmatic reasons wins. This is of
course just a complicated way of saying that I am usually more interested in
working code than in the standard.

[1] Here understood as the correspondence between source code and program.

------
golergka
That's only true when you work with established, trusted compilers. Every
worthy Unity developer is familiar with the AOT bugs on iOS that render one
of the best instruments in the hands of a C# programmer, LINQ, completely
unusable.

------
dap
While I appreciate the general principle, which many others have also
observed, that one should assume bugs are in one's own code before blaming
the underlying system, I find that it is often an excuse to avoid
understanding the problem and applying the solution where it belongs. The
operating system, the network, remote APIs, and other infrastructure often
_are_ the problem, and if you don't invest in building and understanding
tools that can truly explain what's going on, you often end up building
crappy workarounds for shortcomings in the underlying infrastructure.

~~~
mratzloff
 _> you often end up building crappy workarounds for shortcomings in the
underlying infrastructure._

Which, to be fair, you would often have to build regardless.

------
SideburnsOfDoom
See also an older proverb to the same effect: "SELECT isn't broken"

[http://blog.codinghorror.com/the-first-rule-of-programming-its-always-your-fault/](http://blog.codinghorror.com/the-first-rule-of-programming-its-always-your-fault/)

[http://pragmatictips.com/26](http://pragmatictips.com/26)

------
sanxiyn
The compiler is mostly right for tested configurations. If you are using less
frequently used configurations, all bets are off.

For example, unusual target architectures, using optimize options not enabled
by default, using non-default modes of compilation such as PGO and LTO, etc.

------
ikusalic
Here's one absurd bug in Java that really surprised me:

    
    
        Math.abs(-2147483647); // 2147483647
        Math.abs(-2147483648); // -2147483648
    

When you know this behavior, it's kinda obvious why, but still...
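For context, a small sketch of the "obvious why" (the class name is mine): two's complement has one more negative value than positive, so negating Integer.MIN_VALUE overflows right back to itself. Widening to long first, or using Math.absExact (Java 15+), avoids the silent wrap.

```java
public class AbsDemo {
    public static void main(String[] args) {
        int min = Integer.MIN_VALUE;               // -2147483648
        // -min would be 2147483648, which doesn't fit in an int,
        // so the negation wraps back around to -2147483648.
        System.out.println(Math.abs(min));         // -2147483648
        // Widening to long before taking abs yields the true magnitude.
        System.out.println(Math.abs((long) min));  // 2147483648
        // Since Java 15, Math.absExact(min) throws ArithmeticException
        // instead of silently returning a negative result.
    }
}
```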

~~~
rwallace
Well, not just kind of obvious why, but kind of obvious it's not a bug in
Java! If you're dealing with quantities that don't fit in a 32-bit twos
complement integer, you need to use a larger datatype.

------
peterbotond
and sometimes a few lines of extra code are needed to help the compiler
figure out the programmer's intent. the simplest example is extra code that
will be optimized out in the final product; another is code added so that
certain aliasing is not optimized out. the compiler is a friend who sometimes
disagrees and misbehaves, but who is right nearly all along: a good friend.

------
frozenport
Sometimes you need to rebuild.

~~~
Ono-Sendai
Indeed. Visual Studio will often link together a severely broken executable,
which will be fixed upon a full rebuild.

