
Xcode 4 ships a buggy compiler - FooBarWidget
http://blog.phusion.nl/2011/12/30/xcode-4-ships-a-buggy-compiler/
======
pkaler
First of all, all compilers are buggy because all software is buggy. Software
has bugs. Deal with it.

Second of all, if you are using alloca to allocate memory on the stack you are
asking for trouble. There are so many filthy corner cases to deal with it is
not worth it. Figure out another way to do what you are trying to accomplish.
This has nothing to do with C or LLVM. This has to do with how a Von Neumann
architecture works.
[http://stackoverflow.com/questions/1018853/why-is-alloca-not...](http://stackoverflow.com/questions/1018853/why-is-alloca-not-considered-good-practice)

Third of all, alloca() is a compiler intrinsic. The compiler may choose to
implement it in any way it wants to.
<http://en.wikipedia.org/wiki/Intrinsic_function>

Fourth of all, passing a size_t of zero to a system function is undefined
behaviour. This is not a bug. Undefined behaviour means that the compiler may
do anything. <http://en.wikipedia.org/wiki/Undefined_behavior>

In general, LLVM does the most optimal thing when it hits undefined behaviour
because it assumes that the Clang static analyzer will warn the programmer.

~~~
FooBarWidget
> First of all, all compilers are buggy because all software is buggy.
> Software has bugs. Deal with it.

This is a _compiler_ we're talking about. A "deal with it" attitude is not a
good thing to have. If there's a kernel bug that causes a crash, would you
also say "deal with it", or would you ask the creators to fix it?

> Figure out another way to do what you are trying to accomplish.

This is part of the code for the conservative garbage collector which is
supposed to scan the stack for pointers. If you know a better way to do that,
by all means let's hear it.

~~~
jey
> This is part of the code for the conservative garbage collector which is
> supposed to scan the stack for pointers. If you know a better way to do
> that, by all means let's hear it.

Why does scanning the stack for pointers need to involve alloca(0)?

~~~
FooBarWidget
alloca(0) is used as a way to get the end of the stack. There's a bunch of
fallback #ifdefs in the code: on platforms that don't have alloca, it detects
the end of the stack by calling a forced-non-inline function which returns the
address of its sole local variable, but that assumes that the compiler
supports the 'noinline' keyword. In any case, all of the versions depend on highly
platform-specific behavior. See:

[https://github.com/FooBarWidget/rubyenterpriseedition187-330...](https://github.com/FooBarWidget/rubyenterpriseedition187-330/blob/mbari/gc.c#L567-570)

[https://github.com/FooBarWidget/rubyenterpriseedition187-330...](https://github.com/FooBarWidget/rubyenterpriseedition187-330/blob/mbari/rubysig.h#L210-269)
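
For readers who haven't seen the trick, the no-alloca fallback described above looks roughly like the sketch below. This is not the code from gc.c or rubysig.h: the names `NOINLINE`, `record_stack_end` and `stack_span_bytes` are made up for illustration, the `noinline` spelling assumes GCC or Clang, and the whole approach assumes a single contiguous stack, which the C standard does not promise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC or Clang. Other compilers need their own spelling. */
#if defined(__GNUC__) || defined(__clang__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Hypothetical helper: stores the address of its sole local variable, as an
 * integer, which approximates the current end of the stack. Writing it
 * through an out-parameter avoids returning a pointer to a dead local. */
NOINLINE static void record_stack_end(uintptr_t *out)
{
    volatile char marker = 0;

    assert(out != NULL);
    *out = (uintptr_t)&marker;
}

/* Hypothetical helper: distance in bytes between a local in this function
 * and the probe's local. A conservative collector would scan that range for
 * pointers; here it is only measured. */
size_t stack_span_bytes(void)
{
    volatile char base = 0;
    uintptr_t start = (uintptr_t)&base;
    uintptr_t end = 0;

    record_stack_end(&end);
    /* The stack may grow up or down, so take the absolute difference. */
    return (size_t)(start > end ? start - end : end - start);
}
```

The addresses are kept as `uintptr_t` on purpose: comparing or subtracting pointers to unrelated objects is itself not defined by the standard, so even this sketch leans on platform behaviour.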

------
forrestthewoods
"Calling alloca(0) should return a pointer to the top of the stack"

Eh? My understanding is that it's undefined behavior and varies per platform
and compiler. Relying on it to return a stack pointer seems like a pretty
terrible idea even if it should work.
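
For comparison, GCC and Clang document a builtin for this purpose, `__builtin_frame_address(0)`, which returns the frame address of the current function. The sketch below is only an illustration of that alternative, not what the article's code does; `current_frame_address` is a hypothetical name, and the fallback branch (address of a local) is an assumption for compilers without the builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: an approximate "where is the stack right now" probe
 * that does not depend on what alloca(0) happens to return. */
uintptr_t current_frame_address(void)
{
    uintptr_t addr;

#if defined(__GNUC__) || defined(__clang__)
    /* Documented GCC/Clang builtin: frame address of this function. */
    addr = (uintptr_t)__builtin_frame_address(0);
#else
    /* Assumed fallback: the address of a local is somewhere in this frame. */
    volatile char marker = 0;
    addr = (uintptr_t)&marker;
#endif

    /* A zero result would mean the probe is unusable on this platform. */
    assert(addr != 0);
    return addr;
}
```

This is still compiler-specific, but it is at least behaviour the compiler vendors describe, rather than a side effect of a zero-sized allocation.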

~~~
saurik
As much as I hate LLVM, and as many times as I've been burned by bad code it
has generated, I at least have to agree about this: alloca(0) tends to do
random stuff on different systems (in fact, it often seems to be special-cased
to clear the alloca area in library implementations).

However, assuming the actual compilation and test results reported in this
article are true, I personally don't care what the function does: if
(alloca(0) != NULL), then alloca(0) /should not/ return NULL. ;P

~~~
eschaton
You lost me at "As much as I hate LLVM."

What's to hate?

~~~
saurik
I am often seen arguing against LLVM on purely philosophical grounds from my
asserted position in the world of open devices and jailbreaking; specifically
that it, and Clang, are now products heavily funded (nearly owned) by Apple
with the goal of decreasing their reliance on a project (gcc) that was
relicensed under GPLv3, an event that caused Apple to immediately retract all
of their engineers from contributing code, or even merging newer versions.
This opinion has been strongly forged after numerous dealings with Apple's
open source release department, having to pester them over and over again to
get updated versions of gcc, gdb, and WebCore for their various systems (most
specifically the iPhone, for which they like to redact all open-source code).

~~~
w0utert
So you are basically saying you hate Apple and the way they contribute to OSS
(which IMO has been a very significant contribution benefiting many other OSS
projects), not LLVM or Clang. Good to have that sorted out... :-/

------
Xuzz
The Clang build shipped (called "LLVM Compiler" in Xcode) is also buggy: it
has issues with some floating point operations on armv6 (edit: I think; it
might only be THUMB, or THUMB _and_ armv6) builds (for the iPhone). There
isn't a non-LLVM GCC build for the iPhone anymore, however, so you'll probably
have to choose between this bug and the floating point one when building for
iPhone.

~~~
zyb09
It helps to turn Thumb off if you encounter that problem on armv6 builds.

------
mahyarm
Blocks in Objective-C are also buggy. If you chain 3+ in a row, the compiler
craps out

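      // Ex: completion blocks take a BOOL argument, and each message send
      // needs a terminating semicolon.
      [UIView animateWithDuration:1 animations:^{...} completion:^(BOOL finished) {
        [UIView animateWithDuration:1 animations:^{...} completion:^(BOOL finished) {
          [UIView animateWithDuration:1 animations:^{...} completion:^(BOOL finished) {
            // Compiler error here
          }];
        }];
      }];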

------
chj
I use gcc for code gen.

The idea to use undefined behavior for optimization is really terrible.
Somehow they forget this is an engineering project, not research. Breaking
existing code costs a lot of time and money.

By the way, I don't see that their optimizations have any impact on the real
project I am working on (highly CPU intensive).

~~~
jtc331
"The idea to use undefined behavior for optimization is really terrible.
Somehow they forget this is an engineering project, not research. Breaking
existing code costs a lot of time and money."

Saying that using undefined behavior for optimization is really terrible is
living in a pipe dream if you code in C at all. Every performant C compiler
uses undefined behavior to optimize its resulting code. In fact, GCC does a
fair amount of this too.

The real value in using undefined behavior to optimize code is that it leads
to a fair amount of layering of optimizations. And real-world tests show a
significant improvement in program running speed because of the interaction of
optimizations.

The real problem is that most C programmers expect far more defined behavior
than the standard actually gives. For a long explanation and lots of examples
of this, see the series of blog posts at:
[http://blog.llvm.org/2011/05/what-every-c-programmer-should-...](http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html)
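
A standard illustration of the point, as a minimal sketch: signed integer overflow is undefined in C, while unsigned overflow wraps, and that difference is exactly what lets an optimizer simplify one comparison but not the other. The function names here are made up for the example, and whether a given compiler actually folds the signed case depends on its version and flags.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Signed overflow is undefined, so a compiler may assume x + 1 never wraps
 * and fold this whole function to "return true". Calling it with INT_MAX is
 * the undefined case, so the checks below never do that. */
bool next_is_greater_signed(int x)
{
    return x + 1 > x;
}

/* Unsigned arithmetic wraps by definition (UINT_MAX + 1u is 0), so the
 * comparison has to be evaluated for real. */
bool next_is_greater_unsigned(unsigned int x)
{
    return x + 1u > x;
}

/* Only inputs with defined behaviour are checked here. */
void check_defined_cases(void)
{
    assert(next_is_greater_signed(41));
    assert(next_is_greater_unsigned(0u));
    assert(!next_is_greater_unsigned(UINT_MAX));
}
```

A programmer who writes `x + 1 > x` as an overflow check on an `int` is expecting more defined behavior than the standard gives, which is the mismatch described above.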

BTW, new compiler releases break existing code all the time, with both LLVM
and GCC.

A side note on GCC: back in school we actually had very simple programs that
used only STL data structures (all calls being legal according to the lib
specs) that would cause seg faults with certain levels of optimizations in
GCC. What's worse: it didn't happen in the previous major X.Y version (i.e.
the difference between 4.2 and 4.1.) But even on the breaking build it worked
with -O1 but not any higher. That's inconsistent behavior in optimizations if
I've ever seen it. But to be fair to the compiler: it's likely that the STL
made assumptions that you technically can't make according to the C spec. Just
like the OP's code. I love Ruby, but using MRI as an example of code that is a
correct program is rather extreme. MRI makes tons of assumptions that often
lead to noticeable bugs in the interpreter.

------
zitterbewegung
I tried compiling the code they have in the example with llvm-gcc and it
doesn't compile and returns an error?

