

Compiler Writers Gone Wild: ARC Madness - mpweiher
http://blog.metaobject.com/2014/06/compiler-writers-gone-wild-arc-madness.html

======
xenadu02
The complaints about clang optimizations are misplaced: the example relies on
undefined behavior, and regardless of what you think about that, all compilers
are free to do such things.

This is also a classic case of optimizing for the wrong and uncommon things
(billions of objc_msgsend calls in a tight loop), instead of the really common
stuff (like leaks/crashes due to a mistake in memory management, lower
productivity across your entire team due to time wasted thinking about
retain/release or tracking down aforementioned leaks, etc).

~~~
mpweiher
Two points.

1\. Breaking reflexivity is not OK. Ever. a==a should hold.

2\. The compiler knows that this is undefined behavior, and uses this to
"optimize" to a wrong result.

However, it does _not_ warn me about the use of undefined behavior. Not with
-Wall, not with -Wpedantic.

    
    
       marcel@localhost[tmp]cc -Wall -Wpedantic -Os -o undefined-realloc undefined-realloc.c 
       marcel@localhost[tmp]./undefined-realloc 
       1 2
    

I can't think of any reason for this to be OK. Either warn me, or don't use
it. Silently breaking reflexivity...did I mention "not OK"?

Of course, that's only a minor point in the article, but it does capture the
theme that's explored in the main point.

~~~
ridiculous_fish
1\. a==a is false for NaN, so it's not quite "Ever"

2\. The compiler does not know this is undefined behavior. Dereferencing a
pointer that has been previously passed to realloc is not necessarily
undefined behavior. realloc may fail and return NULL; in that case the passed-
in pointer remains valid.
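
To make the failure case concrete, here is a minimal C sketch of the usual safe realloc idiom. The helper name `grow_ints` is hypothetical and not from the article; the only behaviour assumed is what the C standard says about realloc, namely that a failed call returns NULL and leaves the original block allocated.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the article): grows *buf to new_count ints.
 * Returns 1 on success, 0 on failure. On failure *buf is left untouched
 * and still valid, because a failed realloc does not free the old block. */
int grow_ints(int **buf, size_t new_count)
{
    /* Reject a zero size and a byte count that would overflow size_t. */
    if (new_count == 0 || new_count > (size_t)-1 / sizeof **buf)
        return 0;

    int *tmp = realloc(*buf, new_count * sizeof **buf);
    if (tmp == NULL)
        return 0;       /* the old pointer in *buf remains usable */

    *buf = tmp;         /* only after success is the old pointer dead */
    return 1;
}
```

After a successful call, only the pointer stored back in `*buf` should be used; any copy of the old pointer kept elsewhere has to be treated as invalid, even if it happens to compare equal to the new one.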

~~~
tarpherder
Just a note: "a==a is false for NaN" is only guaranteed if you compile with a
precise or strict floating point mode. In fast mode, which is often what is
used, the comparison may return true. Do not rely on this behavior for
checking NaNs; use isnan instead.
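
A small C sketch of the two checks, for comparison. The function names are made up for illustration; `isnan` is the standard C99 macro from `<math.h>`.

```c
#include <math.h>

/* Returns 1 if x is NaN, using the standard C99 classification macro. */
int is_nan_portable(double x)
{
    return isnan(x) ? 1 : 0;
}

/* Returns 1 if x compares unequal to itself. Under strict IEEE 754
 * semantics this is true only for NaN; a compiler running in a relaxed
 * floating point mode may fold the comparison to 0. */
int differs_from_self(double x)
{
    return x != x;
}
```

With strict floating point semantics both functions agree. The point of the note above is that the second one stops being a reliable NaN test once the compiler is allowed to assume ordinary algebraic identities for floats.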

------
Stratoscope
For anyone like me who wonders what ARC is (my first thought was "pg's Lisp
dialect?") it sounds like they are talking about Objective-C Automatic
Reference Counting:

[http://clang.llvm.org/docs/AutomaticReferenceCounting.html](http://clang.llvm.org/docs/AutomaticReferenceCounting.html)

Tip to writers: Even if you think your audience knows what an acronym stands
for, spell it out or provide a link the first time you use it. You never know
when other people will see your article who are interested in the general area
but unfamiliar with your specific topic.

~~~
mikeash
It's kind of ironic that you use "pg" just a few words after the acronym
you're complaining about, although I understand that a blog post and a comment
aren't quite the same. Still, is it really necessary to define acronyms when
they're easy to find and pretty fundamental to the language? A Google search
for "objective-c arc" turns up nothing but hits that discuss Automatic
Reference Counting. Should he also expand shorthand like "x86" or "WWDC"?

~~~
Stratoscope
Ah, but how would I know that I need to search for "objective-c arc"? There's
no mention of Objective-C until well past the midway point of the article.

And of course a search for "ARC" turns up all kinds of unrelated topics. In
fact, the only software-related reference on the first page of a Google search
is Paul Graham's Arc.

It would have been fine if the first mention of ARC in the article was more
specific: "Objective-C ARC".

This is different from x86 or WWDC, where the very first match on a Google
search is exactly what you'd be looking for.

(FWIW, I upvoted your comment because I always appreciate interesting
feedback.)

~~~
mikeash
Fair point. But on the other hand, Objective-C is pretty much all he writes
about. Is it really necessary to mention it every time? I understand that it
makes life harder if you come in for the first time, but you did figure it
out, and I imagine he writes for his usual audience. Personally, I would say
that this is a good case for making the HN title different from (more explicit
than) the page title, although I know that that way often lies madness.

Really, I think this is just another example of how we should name everything
using UUIDs rather than pronounceable chains of letters.

------
userbinator
_Turn on the "standard" optimization -Os and we get the following, much more
reasonable result_

That's still 250% more instructions than the necessary "xor eax, eax; ret". I
find it a bit disappointing that the compiler would miss this trivial
optimisation opportunity, while at the same time doing subtle and often
unwanted things with undefined behaviour. C was conceived as a "portable
assembly language", and as much as the language lawyers, theoreticists, and
compiler writers love to exercise their "undefined behaviour" rights and try
to dissuade others from thinking of it that way, that's what people use it for
and that's what they'll expect.

Rather, undefined behaviour should be interpreted as "do the obvious thing,
the results may differ on different platforms, and turn off all optimisations
exploiting it because it's either intentional or the programmer has made a
mistake so the compiler should also generate obviously wrong code." A warning
would be nice too. Instead of blind adherence to the standard, consider what
behaviour programmers actually expect, and write compilers accordingly. One
example of this is casting integers to pointers of various types, and using
them to access hardware. Device drivers would be impossible to write if the
compiler decided that such undefined behaviour meant it could remove the
accesses completely (which is completely legal according to the standard.)

More examples: dereferencing a null pointer should generate code that accesses
address 0. Signed integers should wrap around on overflow. Shifts should do
whatever the shift instruction of the architecture does. Trying to access
beyond the end of the array should try to access memory there. Division by
zero will do what the machine instruction would do. Etc.
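
As an illustration of the signed overflow case, here is a hedged C sketch of how wraparound can be requested explicitly today. The helper `wrapping_add` is hypothetical. It relies on unsigned arithmetic, which the standard defines to wrap, and on the conversion back to int wrapping as well, which is implementation-defined rather than undefined (assumption: a two's-complement target and a compiler such as GCC, which documents that conversion as wrapping).

```c
#include <limits.h>

/* Hypothetical helper: adds two ints with two's-complement wraparound.
 * The addition happens in unsigned arithmetic, which wraps modulo
 * UINT_MAX + 1 by definition. Converting an out-of-range result back
 * to int is implementation-defined; this sketch assumes it wraps. */
int wrapping_add(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}
```

For example, `wrapping_add(INT_MAX, 1)` yields `INT_MIN` under those assumptions, whereas plain `INT_MAX + 1` is undefined. GCC and Clang also accept `-fwrapv`, which asks the compiler to treat signed overflow as wrapping throughout a translation unit.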

~~~
foodevl
The 250% more instructions are the frame pointer, which is how your debugger
figures out the stack trace when you happen to break execution at the "xor
eax, eax". Use -fomit-frame-pointer to get rid of that. That has nothing to do
with optimization.

~~~
userbinator
Those extra instructions shouldn't be necessary even when debugging, since
that's what the -g option is for: to generate information the debugger can
use:
[http://yosefk.com/blog/getting-the-call-stack-without-a-frame-pointer.html](http://yosefk.com/blog/getting-the-call-stack-without-a-frame-pointer.html)

When optimisation is enabled, all extraneous instructions should disappear.

------
cjensen
The author's surprise at the size of the ARC code seems to have led him astray
from the actual bug in his code. The unoptimized version performs refcount
manipulation of 'id' and crashes. The optimized version avoids refcount
mucking and does not crash. Seems to me this strongly implies that an invalid
'id' is being passed into the function!

------
mmastrac
I was a little disappointed that we didn't get a resolution here, but it's
interesting to see a little more of ARC under the hood:

"It isn't clear why those retains/releases were crashing, all the objects
involved looked OK in the debugger, but at least we will no longer be puzzled
by code that can't possibly crash...crashing, and therefore have a better
chance of actually debugging it."

~~~
e28eta
I'd be surprised if suppressing the crash actually makes this easier to debug.
I'd much prefer to debug with the crash in place, than to try to determine
what's wrong _later_ in the execution of the program when the bug has had the
opportunity to spread.

I'm also not very impressed with trotting out the assembly without using it to
diagnose the crash. Particularly not if it's being used to disparage ARC
(wrongly, I believe).

~~~
mikeash
It's pretty weak. He doesn't tell us which line of assembly actually crashes,
nor (presuming it's one of the calls to objc_storeStrong) what object is being
manipulated when it crashes. There's no point in looking at your assembly code
unless you're going to actually use it to investigate what's going on.

------
rqebmm
Why were they compiling without the optimizer (-O0) in the first place?

~~~
krakensden
Presumably to aid in debugging; that's why _I_ do it with native languages...

------
szatkus
It's annoying when people write things like "13" MBPR" instead of naming a
concrete CPU model...

------
malkia
I really thought it was going to be about Arc Lisp.

