
Hello. I'm a compiler - andremedeiros
http://stackoverflow.com/a/2685541/5646
======
cobrausn
This was linked in the comments (by the author):

<https://www.facebook.com/sedatk/posts/10151240841812644>

 _Sedat Kapanoglu · 2,372 followers

3 hours ago near Maslak, Istanbul

Today I was at the Istanbul courthouse third time this year. I attended a
trial defending myself to a judge. Then I bore my testimony to a prosecutor
about a different case. Both cases were about the free speech platform I own
in Turkey. Meanwhile in the world, one of my older posts in Stackoverflow
became Hackernews #1 & reddit/programming #1. I wish it was Turkey which made
me feel better about myself, not the rest of the world._

------
Xcelerate
The poor compilers do the best with what they have. And by "what they have", I
mean the things they can't make assumptions about. Which turns out to be a
crapload. Until that happens, very highly-tuned assembly will continue to
outperform the best compilers.

Of course, a more typical scenario is hand-tuning a few loops where 99% of the
clock cycles occur and letting the compiler take care of the rest.

(Also, I was attempting to send a Morse code message by flashing the vote
counter between up and down, but I don't think anyone got it :( )

~~~
codeflo
Most of all, the compiler can't change the layout of your data. Many of the
use cases for hand-crafted assembly involve SIMD instructions, which perform
several operations in parallel. These vector instructions can be incredibly
fast, but their limitations often mean that you need to carefully design the
data structures of the whole program around the optimization of one single
critical loop.

~~~
colanderman
I've written lots of code that either GCC has autovectorized into SIMD
instructions, or that I've represented using GCC's vector types which produce
SIMD instructions. While it is true that a poor choice of data structure will
preclude SIMD optimizations, use of a compiler does not preclude the same.
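
For anyone who hasn't seen GCC's vector types, here is a minimal sketch of what that looks like. The `vector_size` attribute is a GCC extension (Clang accepts it too); the names `v4f` and `add_arrays` are made up for illustration, and the sketch assumes the length is a multiple of 4.

```c
#include <stddef.h>
#include <string.h>

/* GCC/Clang vector extension: four floats treated as one 16-byte value. */
typedef float v4f __attribute__((vector_size(16)));

/* Adds two arrays element-wise; n must be a multiple of 4 in this sketch. */
void add_arrays(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i += 4) {
        v4f va, vb, vr;
        /* memcpy avoids making alignment assumptions about the inputs */
        memcpy(&va, a + i, sizeof va);
        memcpy(&vb, b + i, sizeof vb);
        vr = va + vb;   /* typically compiled to a single SIMD add */
        memcpy(dst + i, &vr, sizeof vr);
    }
}
```

Whether this actually becomes SIMD instructions depends on the target and flags, so checking the generated assembly is still worthwhile.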

~~~
pmr_
At the same time I have seen autovectorization fail on things that would
look like obvious candidates, or generate still sub-optimal code. I would
prefer autovectorization over hand-written SIMD code any day, but the key
here is reliability. If I cannot guarantee that the performance is going to
stay the same across different compiler releases (which is often the case
with autovectorization), it is much more convenient to just write the
optimization myself and be sure that it is going to happen regardless of the
environment.

------
gilgoomesh
Cute. Although I think my compiler has different things to say.

Hello, I'm a C compiler that still can't handle C99. I hope you're wearing
waterproof clothing because I'm gonna throw up on you. Also, I've been
drinking heavily so your C++ code is going to take a while to compile and when
it's done, it's gonna smell funny.

~~~
valdiorn
Visual C++ ? :)

~~~
pjmlp
C++ compiler != C compiler

~~~
spatulon
Visual C++ is a C compiler - it compiles C90. The issue is that they still
haven't added support for C99. You can compile your C code as C++, but then
you're restricted to the common subset of C and C++.

~~~
pjmlp
No it is not. It is two compilers bundled together and called by the same
driver application.

Microsoft will not update the C compiler and will focus only on the C++ one.

As explained by their main architect:

<http://herbsutter.com/2012/05/03/reader-qa-what-about-vc-and-c99/>

Bundling C and C++ compilers together was a way for C++ vendors to ease the
migration into C++ land, especially in the early days. There is no law that C++
compiler vendors are required to offer a C compiler as well.

Given that at Build 2012 it was mentioned that the Windows team is making
their code compilable under C++, and given the current position on C++ vs C at
Microsoft, the C compiler might even be dropped from Visual Studio.

~~~
simias
I think you're in violent agreement with the GP.

------
qompiler
Hello. I'm a programmer.

I noticed you couldn't optimize my code to use SIMD so I went ahead and used
inline assembly. It will probably take another 30 years before you can
actually think like a human and perform optimizations like this.

~~~
Scaevolus
30 years? ICC already does automatic vectorization pretty well. LLVM and GCC
have implementations that need more tuning. I bet they'll be solid within 5
years.

You still beat them with inline assembly, but you probably won't accelerate
the code vectorwidth times anymore.

~~~
raverbashing
Exactly

ICC is the best one, but GCC from 4.0 could do it "automatically" (being very
loose about the term)

And no one does inline assembly for that, they use intrinsics
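
To show what "intrinsics" means here, a rough sketch using the SSE intrinsics from `<xmmintrin.h>`. The function name `scale` is made up for illustration, and the `__SSE__` guard is only there so the same file still builds (as a plain scalar loop) on targets without SSE.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics */
#endif

/* Multiplies every element of x by k, in place. */
void scale(float *x, size_t n, float k)
{
    size_t i = 0;
#if defined(__SSE__)
    __m128 vk = _mm_set1_ps(k);             /* broadcast k to all 4 lanes */
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(x + i);     /* unaligned load of 4 floats */
        _mm_storeu_ps(x + i, _mm_mul_ps(v, vk));
    }
#endif
    for (; i < n; i++)   /* scalar tail, or the whole loop without SSE */
        x[i] *= k;
}
```

The compiler still does register allocation and scheduling around the intrinsics, which is most of the appeal over inline assembly.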

~~~
cwzwarich
> And no one does inline assembly for that, they use intrinsics

Depends on your architecture. With ARM, compilers will often not set the
alignment bits in vector loads and stores, and this can be a big performance
hit depending on the microarchitecture. In general, compilers sometimes deal
poorly with instructions that have particularly strange register constraints,
or loads/stores with address writeback.

~~~
raverbashing
Good point

------
oliland
This answer perpetuates a myth that compilers are magical black boxes, the sum of
millions of hours of intense academic research that "you will never
understand".

Replace "compiler" with "computer". Doesn't that make you angry? Answers like
these do nothing but prevent people from learning about them.

If you are interested in compilers, here's Edinburgh University's notes from
the course "Compiling Techniques", probably a good place to start. Don't let
internet tough-guys stop you from learning.

<http://www.inf.ed.ac.uk/teaching/courses/ct/>

~~~
kami8845
>computers are magical black boxes, the sum of millions of hours of intense
academic research

Isn't that awesome? I can use the thing without ever understanding what it
does :) I'm so glad my car works without me ever having to know anything about
how it works. I know it requires some form of money (gas) to work. That's it.
Money in -> Transport out. Perfect.

The linked post never proclaimed no one "will ever understand" compilers.
That's just what you're reading into it. You try to get upset at it, so you
do. It merely proclaims that compilers are incredibly complex, which is the
nice thing about them. If compilers were stupid they'd be a lot less useful.

~~~
endlessvoid94
He's just saying the answer makes compilers sound intimidating.

~~~
Marazan
Compilers are intimidating.

I mean, the basics of parsing and lexing are easy enough to understand & do
and forming an AST is straightforward but all the "clever stuff" after that,
all the hundreds and thousands of optimisation tests that are done - they are
mind boggling.

Each one by itself is pretty straightforward but start adding them up and
layering them on top of each other and it gets crazy pretty quick.

------
Breakthrough
And on the Fourth Day, God proclaimed "Thou shalt have the ability to use
inline assembly in thy C/C++ code for performance-critical tasks".

I can think of absolutely zero reason to write an entire program in x86
assembly, let alone any other kind of assembly (GCC spits out some pretty
optimized code for my little Atmel MCU)... It's a lot nicer to write
everything in a high-level, and then write any performance specifics in inline
assembly.
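
A tiny sketch of what that looks like with GCC/Clang extended asm. The function name `add_ints` is made up, the instruction is x86-only, and the `#if` guard falls back to plain C elsewhere; a real use case would of course be something the compiler can't already do on its own.

```c
/* Returns a + b. Uses one x86 add instruction when built with GCC/Clang
   for x86; falls back to plain C everywhere else. */
int add_ints(int a, int b)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __asm__("addl %1, %0"   /* AT&T syntax: a += b */
            : "+r"(a)       /* a is read and written, kept in a register */
            : "r"(b)        /* b is read-only */
            : "cc");        /* the add modifies the flags register */
    return a;
#else
    return a + b;
#endif
}
```

The constraint lists are what let the surrounding C code and the assembly share registers safely.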

The really cool thing to see is how other newer languages have adopted this
scheme (e.g. PyASM for Python, or the ability to edit the instructions for
interpreted languages that run in their own VM). And as always, great power
comes with great responsibility ;)

~~~
ohwp
_"I can think of absolutely zero reason to write an entire program in x86
assembly,"_

I can think of 2 reasons: study and fun.

~~~
tedajax
For x86 I'd say study but I don't think there's any fun to be had there.
Something like MIPS32 would be pretty fun because that instruction set
actually makes sense to a human.

~~~
bitwize
Funny. x86 was designed to be understandable to humans. MIPS32 was designed to
be more suitable for compilers.

~~~
mikeash
8086, maybe, but I don't think we can say that modern x86 was "designed" for
any one thing at all. Modern x86 is an excellent example of what you get when
you evolve something for decades with backwards compatibility as a paramount
requirement and little other consistent guidance, for all the good and bad
that implies.

------
ck2
Are they pushing those realtime vote counts with websockets? Pretty slick.
Beats polling by far.

Looks like IE is holding it back as usual:

<https://developer.mozilla.org/en-US/docs/WebSockets#Browser_compatibility>

~~~
adieulot
Yes that’s websockets it seems. Must be quite an overhead, they use it on all
answers.

~~~
naiquevin
Yes, it's websockets. You can check it in Chrome by keeping the console open
and refreshing the page. Would be interesting to test in IE.

BTW, as much as I am amazed to see websockets in action, the real time
updating of vote count feels creepy and distracting at the same time :-)

~~~
ck2
It makes me feel more connected though.

As long as it's not abused I think it's a clever use.

Have to research how to set up a websocket server.

This looks like an impressive project <http://socket.io/>

------
ck2
Isn't Steve Gibson ( <http://www.grc.com/stevegibson.htm> ) still coding
large, complex programs in pure assembly?

His work on spinrite is legendary, for those born before IDE hard drives were
invented.

~~~
B-Con
From his podcasts I hear that it is pure assembly. He's borderline machine,
though.

------
Swizec
Fascinating, while I was reading that, it got 5 upvotes! Mindblowing.

This is a really awesome description too. The only thing I know about
compilers is that I implemented one for class (without fancy optimisations)
and I am surprised any software works, ever. Compilers are just ...
mindbogglingly complex things. Almost as much voodoo dark magic as
engineering.

~~~
perlgeek
One of the reasons software still works is that compilers are (mostly (*))
very easy to test. You just need some input source code and an expected outcome
(either compiler error, or the result when running the code). No intermediate
state to consider, no concurrency, no test database to set up etc.

Oh, and many serious compilers can compile themselves, so bootstrapping is a
pretty good test too.

Another reason is that compilers simply _must_ work for software development
to continue, so people have spent the necessary amount of energy to get
implementation and tests "right".

(*) There are always exceptions, like when a compiler auto-parallelizes code
and introduces a race condition that is rarely triggered. But those cases are
blessedly rare.

~~~
Swizec
Testing was the hardest thing I came across.

You have on your hands a compiler for which you can no longer think of a
program that would produce an incorrect result. You're happy.

Then your friend comes along and the first thing they try behaves incorrectly.

The problem isn't even that the compiler would crash and burn, it lovingly
compiles the input program into something that _does the wrong thing_!

~~~
michaelt
There are tools available that can generate random (valid) C programs and
compare the results generated by code from different compilers [1]. The Csmith
authors say bugs have been found in every tool tested [2] - from GCC to costly
commercial compilers.

[1] <http://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=regehr_gcc_summit_2010.pdf>
[2] <http://embed.cs.utah.edu/csmith/>

------
Arjuna
Anytime I read about the topic of assembly language, I can't help but think of
Michael Abrash. For example, check out Chapter 22 [1] from his _Graphics
Programming Black Book_ entitled _Zenning and the Flexible Mind_ for a
pleasant stroll down Optimization Lane.

You might also enjoy his book entitled _The Zen of Assembly Language_ which
features the Zen Timer (a timing tool for performance measuring).

[1] <http://downloads.gamedev.net/pdf/gpbb/gpbb22.pdf>

------
davidroberts
The best comment: "Thank you compiler, but perhaps if you weren't commenting
on StackOverflow, you could get me a drink and play some nice music while
you're working?"

------
friendly_chap
You can see the upvote counter constantly being updated ATM :)

~~~
mjs
Ugh, it's being edited constantly too! Someone just removed the "Hello" from
the first line, which was part of what made it great...

~~~
friendly_chap
Stackoverflow is plagued by overzealous editing IMHO. I understand it's hard
to balance it, but a question of mine recently got 3 downvotes and a close, to
be reopened later! And now it even has 2 upvotes.

A guy answered it immediately, while some other peeps started whining about
how it is not a proper question! I only wanted a quick fix, and I got it too,
while the others nitpicked.

Hilarious.

~~~
mctx
I think it's great - it's not about solving your quick fix, it's about
providing future value for googlers.

~~~
zapdrive
Then where does one go for quick fixes?

~~~
mctx
I treat SO like a technically superior colleague - if I'm stuck with something
and have exhausted my other options (documentation, examples, research, SO
search), then I'll phrase my question clearly, showing my progress and what
I'm stuck on. This makes me look good to my colleague, and enables him to
understand where I'm coming from. A lot of the time by writing out the
question in full I'll be able to solve it myself by getting my thoughts
organised.

------
dragontamer
Hey, my name is ICC, and I'm one of the most respected compilers in the
industry. I also sabotage your code so that it works poorly on AMD CPUs, while
making sure that Intel CPUs run my code at full speed. After all, Intel likes
to establish market dominance.

<http://www.agner.org/optimize/blog/read.php?i=49#49>

Blind trust in the compiler is bad, people. Good luck discussing this issue
without any assembly programmers who can fully understand what is going on
here.

------
mmphosis
Hello, I am a programmer.

I have little idea what modern day compilers are doing, or what the CPU, or
the operating system is doing for that matter. Often, way too often, compilers
fail, hardware fails, operating systems fail, lots of things fail. I am not
going to read the millions of lines of code written by other programmers (in
f-ing emacs no less) in any number of differing complex beasts, the
compilers. It seems crazy-making to me, that other programmers would create
compilers that would use millions of possibilities of optimizing a single line
of mine using hundreds of different optimization techniques based on a vast
amount of academic research that I won't be spending years getting at. I do
feel embarrassed, yes very icky, that I have little to no idea what a three-
line loop will be compiled as, but bloat would be my guess. There is risk in
going to great lengths of optimization or doing the dirtiest tricks. And if I
don't want the compiler to do this, I have no idea how to stop this behavior,
nor do I want to invest in the specific knowledge of the nuances of any
particular compiler. The compiler does not allow me easy access because the
compiler itself is an overly complex piece of software written by other
programmers. I couldn't care less about how a compiler would make my code
look in assembly, on different processor architectures and different operating
systems and in different assembly conventions. Transformation comes with how
we as programmers write code, not in compiler-fu.

P.S. Oh, and by the way if I really wasn't using half of the code I wrote, I
would throw the source code away.

~~~
cliffbean
> I have little to no idea what a three-line loop will be compiled as, but
> bloat would be my guess.

This is unfortunately too often true. Some compilers are tuned too much for
looking good on artificial benchmarks, in which turning 3-line loops into
thousands of instructions sometimes helps, even if it hurts on most real-world
code.

> And if I don't want the compiler to do this, I have no idea how to stop this
> behavior, nor do I want to invest in the specific knowledge of the nuances
> of any particular compiler.

The -O0 option, or its equivalent, is pretty easy to find in many compilers.
If you're happy with the performance of your code without all those fancy
techniques being applied, feel free to use it. Most people aren't ;-).

> P.S. Oh, and by the way if I really wasn't using half of the code I wrote, I
> would throw the source code away.

Only if you were aware of it ;-). I wish that compilers would focus a little
more on helping me make my code better, rather than so much on magically
making things better under the covers.

~~~
JasonFruit
If you want your compiler to do more to help you improve your source, you
might like Go, which offers a lot of (sometimes unwelcome) mandatory
suggestions that force you to do the right thing even when you don't want to.
(Ada and Pascal do the same thing according to a very different philosophy.)

~~~
rthomas6
I have come to love the strong typing of VHDL (very similar to Ada but for
hardware design). After using it for a while, in my uninformed opinion, I
think it can drastically reduce bugs because it assumes nothing and makes the
programmer define exactly what they mean.

------
johncoltrane
This is the first time I see the vote count grow by 10 in real time.
Impressive stuff.

------
dragontamer
Hello. I'm an assembly programmer. I used a compiler to generate the majority
of the code, and can hand-craft any assembly that comes out of it. I understand
that SIMD instructions can be more easily compiler-generated if I make a
"struct of arrays" instead of "an array of structs".

TLDR: Real performance programmers need to understand the assembly a compiler
generates if they hope to tune the compiler to generate optimal assembly.
Also, GCC -O3 is prone to removing too much code and reordering it, causing
memory barrier issues and the like. All multi-threaded programmers need to
understand how the compiler generates assembly (ie: by reordering your code),
and how it can generate new bugs if you don't use the right compiler flags.
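
The "struct of arrays" versus "array of structs" point can be sketched roughly like this. All names (`particle`, `particles`, `shift_x_aos`, `shift_x_soa`) and the size `N` are made up for illustration.

```c
#include <stddef.h>

#define N 1024

/* Array of structs: x, y, z of one particle sit next to each other, so a
   loop that only touches x strides through memory. */
struct particle { float x, y, z; };

/* Struct of arrays: all x values are contiguous. */
struct particles { float x[N], y[N], z[N]; };

void shift_x_aos(struct particle *p, size_t n, float dx)
{
    for (size_t i = 0; i < n; i++)
        p[i].x += dx;       /* stride of sizeof(struct particle) bytes */
}

void shift_x_soa(struct particles *p, size_t n, float dx)
{
    for (size_t i = 0; i < n; i++)
        p->x[i] += dx;      /* unit stride: a simpler auto-vectorization target */
}
```

Both functions do the same arithmetic; only the memory layout differs, which is why the data structure has to be chosen with the hot loop in mind.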

~~~
colanderman
_Also, GCC -O3 is prone to removing too much code and reordering it, causing
memory barrier issues and the like._

Whoah, that's what __sync_synchronize() and volatile are for. If you're trying
to write order-dependent multithreaded code without those, the bug's in your
code, not in compiler flag juju.
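
A rough sketch of the pattern, using the GCC builtin `__sync_synchronize()` as a full memory barrier. The names `payload`, `ready`, `publish` and `try_consume` are made up, and the threads themselves are left out; this is the legacy GCC idiom, and C11's `<stdatomic.h>` is the standard way to express the same ordering today.

```c
/* Shared between a producer and a consumer thread (threads not shown). */
static int payload;
static volatile int ready;

void publish(int value)
{
    payload = value;
    __sync_synchronize();   /* full barrier: payload is written before ready */
    ready = 1;
}

/* Returns 1 and stores the payload in *out once it has been published. */
int try_consume(int *out)
{
    if (!ready)
        return 0;
    __sync_synchronize();   /* full barrier: payload is read after ready */
    *out = payload;
    return 1;
}
```

Without the barriers, the compiler (or the CPU) would be free to reorder the two stores in `publish`, at any optimization level.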

------
yen223
It isn't closed or locked? Huh.

~~~
ygra
Only questions are closed or locked, I think. And only if they are not very
good questions that will just lead to debate or opinions but should stay
around out of historical interest. This here is a question that can be
answered reasonably but has a single whimsical answer. So no need of closing
or locking here.

And even though the tone of that answer is humorous it still is a good
answer, explaining why we don't all write Assembly instead of HLLs.

~~~
BitMastro
Aaaand it's closed. I hate that SO can close a question as "non constructive"
while I find it constructive instead.

------
orangethirty
As someone who is working on a project where the CTO demands more LOC, this
makes me warm inside.

~~~
eru
That should be hard to believe, save for my honed cynicism. Could you please
elaborate?

~~~
orangethirty
I'm working on a project where the CTO of this _huge_ company is criticizing
the lack of code I'm writing. Doesn't matter that my code meets every
criterion. No functionality is being left out. But, he wants more code. It has
been very difficult for me to deal with this, because it's the first time I
have ever faced such idiocy.

~~~
ncallaway
That sounds terribly frustrating.

Have you tried including war and peace in a comment
(<http://www.gutenberg.org/cache/epub/2600/pg2600.txt>)?

~~~
orangethirty
It is quite frustrating. More so when the project demands the system I'm
building be very lean and fast. I'm talking 100K requests per second here.

------
jordanwallwork
Am I the only one just enjoying watching the upvotes rocketing up on SO?

~~~
shared4you
I upvoted when the counter was 520. Now it is 1500! (or more, by the time you
read this :)

EDIT: Looks like it's locked at 1788!

------
kaffeinecoma
I never noticed that the Stackoverflow js pulls updates for vote tallies in
real-time. Browsing this answer while HN is sending lots of traffic there is
almost like watching a car odometer.

------
_kst_
<http://imagemacros.files.wordpress.com/2009/07/hianteater.jpg>

------
opminion
Ah, the irony of Knuth being the proponent of Literate Programming, and at the
same time the last man standing in general purpose assembly.

------
jawerty
It's even better when you read what the question was.

------
Nikolas0
Now I can't wait to get a message from the kernel :D

------
majmun
Compiler got AI human language capabilities?

------
detay
Yet another spark of genius from ssg.

~~~
sonergonul
Yeah..

------
mrleinad
Really? HN is now including witty posts in Stackoverflow as news?

------
dschiptsov
It is funny how people want to believe in tools. In fact, the optimizations a
compiler does are incomparable with those a programmer could do by choosing an
appropriate data structure with corresponding algorithms and by being aware of
the strengths and weaknesses of a particular CPU architecture.

The JVM, which is nothing but a stack-based byte-code interpreter, is the most
famous case.) People seem to believe it can do wonders, especially in memory
management and data throughput.

It is so strange to see how people are trying to create a whole world inside a
JVM. What is it called when people build models of ships inside a
bottle?)

btw, now, it seems, they are trying to build a whole world inside a V8
bottle.)

~~~
lmm
My experience in the industry is just the opposite; people will do anything to
avoid admitting that the tools can do better than they can. I've had senior
programmers sneeringly declare that anyone who uses a debugger must be an
inferior programmer, and then watched them spend a full day figuring out a bug
that could be found in half an hour with the appropriate tool.

You know why it's worth recreating the world inside a JVM? Because its
behaviour is so much more thoroughly specified and predictable. Prior to Java,
C++ didn't even _have_ a memory model; every new release of GCC breaks a whole
raft of C programs that didn't realise they were invoking undefined behaviour
by having integer overflow. By the C standard, even a single instance of
undefined behaviour invalidates your whole program, making it virtually
impossible to predict the behaviour of any nontrivial-sized codebase.
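
The integer overflow case is easy to sketch. Both function names below are made up for illustration; the first relies on wraparound that the C standard does not promise for signed integers, the second asks the same question without overflowing.

```c
#include <limits.h>

/* Relies on wraparound: signed overflow is undefined behaviour in C, so an
   optimizing compiler may assume x + 1 > x and fold this to "return 0". */
int next_overflows_ub(int x)
{
    return x + 1 < x;
}

/* Well-defined: compares against the limit before doing any arithmetic. */
int next_overflows(int x)
{
    return x == INT_MAX;
}
```

Code like the first function can appear to work at `-O0` and then change behaviour under optimization or a newer compiler, which is the kind of breakage described above.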

~~~
dschiptsov
In my opinion it is a computer architecture that has a memory model, not Java
or C++.

For some strange reasons nginx - a mere C program - works for millions. So
does postgresql or redis or whatever.

~~~
colanderman
_In my opinion it is a computer architecture that has a memory model, not Java
or C++._

Unfortunately that opinion is by definition incompatible with portable
computer languages.

~~~
dschiptsov
What is wrong with Schemes or Lisps?)

Well, even Python (cpython) is _more_ portable than Java. All it needs is
a working C compiler. It is ported to Plan9 and Android.)

~~~
colanderman
Schemes and Lisps _have_ a memory model, it's just one that's radically
different enough from processor memory models that no-one realizes it's a
memory model and thus no-one gripes about it being different from processor
X's memory model.

If you're wondering what the memory model _is_, you should ask yourself: What
happens when I mutate a cell? Are those changes visible in references to that
cell? Are those changes visible across thread boundaries? What happens when I
dereference a cell? Does it continue to take up memory or is it garbage-collected?
What if it's part of a cycle?

But I digress; your point was actually about languages that don't talk about
memory, such as the lambda calculus. A degenerate memory model is still a
memory model, and an inherently portable one at that. My point is, for
languages which _do_ talk about memory, it's necessary that they specify _how_
that memory works in order to be a portable language.

