
Do C compilers disprove Fermat's Last Theorem? - jlangenauer
http://blog.regehr.org/archives/140
======
tetha
Welcome to the world of compiler optimization. A compiler optimization has to
preserve all side effects of a program. This is pretty similar to partial
correctness.

Partial correctness means: This code computes the right thing whenever it
terminates. It is the weaker part of total correctness, which states: This code
terminates given the precondition and also computes the right thing whenever it
terminates.

In his case, the first loop had a simple side effect: None. Thus it was a
correct optimization to replace the code with nothing. Fixes would include
adding volatile to a variable, returning the variables as he did, printing
something, ...
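
To make the "return the variables" and "print something" fixes concrete, here is a minimal sketch. It is not the code from the article: `fermat_search` and `report_fermat` are hypothetical names, the exponent is fixed at 3, and the search is bounded by `max` so that it actually finishes. The point is only that the loop's result now escapes, through the return value and the `printf`, so the compiler can no longer treat the loop as having no effect.

```c
#include <stdio.h>

/* Hypothetical helper, not from the article: bounded search for
   a^3 + b^3 == c^3 with 1 <= a, b, c <= max.
   Returns 1 if a counterexample is found, 0 otherwise. */
int fermat_search(int max)
{
    for (int a = 1; a <= max; a++) {
        for (int b = 1; b <= max; b++) {
            for (int c = 1; c <= max; c++) {
                long long lhs = (long long)a * a * a + (long long)b * b * b;
                long long rhs = (long long)c * c * c;
                if (lhs == rhs)
                    return 1;  /* the result escapes via the return value */
            }
        }
    }
    return 0;
}

/* Printing is a side effect (it modifies a file), so the value passed
   to printf has to be the one the search really produces. */
void report_fermat(int max)
{
    printf("counterexample found: %d\n", fermat_search(max));
}
```

The compiler is still free to compute the answer however it likes; it just has to be the right answer.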

~~~
barrkel
Termination is an observable side-effect of a program. It is not a correct
compiler optimization to turn a non-terminating program into a terminating
one. Turing completeness is the black box beyond which a compiler's optimizer
cannot see past, and it _must_ , for correctness' sake, _must_ give up when it
cannot prove termination.

~~~
neilc
_Termination is an observable side-effect of a program_

Not per the C99 spec, at least (5.1.2.3: "Accessing a volatile object,
modifying an object, modifying a file, or calling a function that does any of
those operations are all side effects, which are changes in the state of the
execution environment.")

Termination is "observable", but so is runtime, memory consumption, code size,
and myriad other properties that a compiler is at liberty to manipulate.

 _Turing completeness is the black box beyond which a compiler's optimizer
cannot see past, and it must, for correctness' sake, must give up when it
cannot prove termination._

Correctness according to what standard?

~~~
barrkel
Re C99 - the existence of an execution environment is certainly observable
(it'll disappear with respect to the program when the program terminates),
unless you try to argue that exit(0) has no observable side-effects according
to the C standard, and thus we should expect code following exit(0) to also be
executed, where I think we would be entering absurdity.

Re correctness: That the compiler optimizer cannot see past the halting
problem is a general fact about compiler analysis, which you should be
familiar with from most introductions to compiler theory. That an optimizer
assumes without proof a solution to the halting problem for some sub-program
would mean that it's an unreliable translator, because it is making unsound
assumptions.

Of course, if you define your language such that it's OK to make unsound
assumptions, that's another matter.

~~~
neilc
_That an optimizer assumes without proof a solution to the halting problem for
some sub-program would mean that it's an unreliable translator_

The optimizer doesn't need to solve the halting problem for this example. It
merely needs to prove that any iteration of the loop has no side effects;
hence, an unbounded number of loop iterations also has no side effects, and
the loop can be removed.

The real issue is whether termination is considered a side effect. The canon
is the C spec; if you can point to some language in the spec that would
prevent the optimization, I'd love to see it.

Update: on reflection, I think you're right. The point isn't that the loop has
side-effects (it obviously doesn't); but by changing the termination behavior
of the loop, the compiler is likely to induce additional side effects (when it
runs the subsequent code).

~~~
barrkel
But a loop that never exits will never continue on to the rest of the program,
which will never have _its_ side-effects.

This loop has no side-effects:

    
    
        for (;;)
            /* do nothing */;
    

So the compiler, according to the spec, can remove it, right?

    
    
        for (;;)
            /* do nothing */;
        destroy_the_world(); /* has side-effects */
    

So will the world be destroyed? Or is the following code dead? In which case,
it's _not_ OK for the compiler to remove loops that have no side-effects.

~~~
dkersten
The C spec states[1] that code without side-effects may be removed. The C spec
does not list non-termination as a side-effect. The C spec also states that
anything not explicitly stated is _undefined behaviour_. That means that in
the abstract machine that the C spec defines, it is valid to remove the loop.

If you don't want the loop to be removed, then you are working outside of the
defined abstract machine, which means that it's _up to you_ to make sure it
operates as desired. The embedded systems guys mentioned in the article did
this by compiling the necessary code without optimisations (on the llvm bug
tracker, they also said it "wasn't much of an issue" for them, because they
did, in fact, tell the compiler to do what they wanted, whereas the author
stated that it was). Other valid approaches would be to tell the compiler
that the loop may, in fact, have side-effects _which cannot be known in the
context of the code_ by declaring the variables volatile (the C spec defines
reading volatile variables as side-effects). If you see non-termination as a
side-effect, then you should tell your compiler somehow.

The C programming language runs in a well defined abstract machine. Anything a
compiler does which is not defined by the abstract machine (so long as it
doesn't contradict the abstract machine, ie the abstract machine functions as
stated) is not a bug, but is considered undefined behaviour and cannot be
relied upon - therefore relying on an infinite loop which does not contain
side-effects is _undefined behaviour_.

\--

As for whether a programming language in general (rather than specifically C)
should remove infinite loops: I think it's up to each language to define this.
If a language states that non-termination is a side-effect, then compilers
cannot optimise away code which _may_ not terminate, unless the user invokes
undefined compiler-specific behaviour by telling it to (eg compiler options)
or by changing the code to make the intent obvious to the compiler. Or by
solving the halting problem.

So, in summary: The C compilers are _NOT_ wrong to eliminate the
side-effect-free infinite loops. Other languages _MAY_ be wrong to do so - it
depends on what the language specs state. You _CAN_ get around this by telling
the compiler what your intent is, by either changing the code (eg by
introducing side-effects) or by invoking non-standard behaviour (eg, through
compiler switches).

[1] 5.1.2.3 (Execution Environment), paragraphs 2 and 3.

------
chime
Proggit had a lively discussion on this earlier today:
[http://www.reddit.com/r/programming/comments/byilp/c_compile...](http://www.reddit.com/r/programming/comments/byilp/c_compilers_disprove_fermats_last_theorem/)

------
rbabich
I can't say I've tried it, but shouldn't this be enough to avoid the
optimization?

    
    
       volatile int i=1;
       while (i) { }

~~~
tetha
Being volatile, i can change in ways which cannot be deduced from the program
code.

Thus, the loop loops until something which cannot be deduced from the program
code happens.

Thus, yes.
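
As a rough illustration, here is a variant of that loop that can actually be run. It is a sketch under assumptions: `count_volatile_spins` is a made-up name, and the loop counts `i` down to zero itself (for `n >= 0`) instead of waiting for something outside the program to change it. Because `i` is volatile, every read and write of it counts as a side effect under C99 5.1.2.3, so the compiler has to perform each iteration instead of folding the loop away.

```c
/* Hypothetical variant of the loop above: counts down so it terminates.
   Assumes n >= 0. */
int count_volatile_spins(int n)
{
    volatile int i = n;   /* every access to i is a side effect */
    int spins = 0;

    while (i) {           /* volatile read on each test */
        i = i - 1;        /* volatile read followed by a volatile write */
        spins++;
    }
    return spins;         /* one spin per decrement */
}
```

With a plain `int i` the compiler could replace the whole function body with `return n;`. With `volatile` it still has to touch `i` on every pass.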

------
antirez
Make sure to read the first comment of the post; the poster wrote a number of
nonsensical things ;)

The compiler does not try at all to analyze the loop to understand whether it
can exit, because it can do that only for very trivial cases. Instead, since
the variables are not referenced after the loop, the loop is ignored: it does
not matter what happens inside it, because the outcome of the computation will
never be used, so it can be skipped altogether.

