

Win32/64 C Compiler Benchmarks - Tatyanazaxarova
http://willus.com/ccomp_benchmark2.shtml

======
DarkShikari
If the author is reading this, x264 now officially supports the Intel
compiler, which should make it much easier to benchmark.

Additionally, x264 should probably be categorized under "no significant
floating point calculations".

------
nimrody
Besides performance, Intel's compiler offers somewhat better error messages
and warnings.

(Not affiliated with Intel. Just very satisfied with their performance on
numeric-heavy workloads).

------
Ralith
Disappointed to see no clang results, especially as he discusses compilation
speed.

~~~
octopus
While you can install Clang on Windows under MinGW and you can compile C
code, it runs really slowly (this is only from my own experience).

~~~
Ralith
Really? Any idea why, when it excels so well speed-wise on Linux and OSX?

~~~
octopus
I don't know why it is slower; I've just noticed that it is considerably
slower than gcc and VC.

As a side note, on my Mac, Clang is really fast for compiling code.

------
justincormack
If you follow the email thread, you can see that the big math differences are
based on libraries, not part of the compiler per se. Not to say they do not
need improving.

<http://gcc.gnu.org/ml/gcc/2012-01/msg00215.html>

~~~
berkut
It's not just the math libraries (although these are indeed much better -
especially under Linux); the Intel compiler's loop unrolling and vectorization
using intrinsics are much better than those of other compilers in my
experience.

However, no compiler I've found can get close to hand-crafted intrinsics in
non-trivial cases.
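
For readers unfamiliar with the term, here is a minimal sketch of what
"hand-crafted intrinsics" look like next to the plain loop a compiler would
have to vectorize on its own. It is a deliberately trivial case (element-wise
float addition), the function names are made up for illustration, and the SSE
path is only compiled where SSE2 is available.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define HAVE_SSE2 1
#endif

/* Plain loop; whether it gets vectorized is up to the compiler. */
void add_scalar(float *dst, const float *a, const float *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* Same operation written by hand with SSE intrinsics, four floats per step.
   Hypothetical helper, not taken from any of the benchmarked programs. */
void add_simd(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    assert(dst != NULL && a != NULL && b != NULL);
#ifdef HAVE_SSE2
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned loads */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
#endif
    for (; i < n; i++)                     /* tail, or everything without SSE2 */
        dst[i] = a[i] + b[i];
}
```

Both functions compute the same values; the non-trivial cases mentioned above
(shuffles, reductions, mixed widths) are where the hand-written version tends
to diverge from what a compiler produces.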

~~~
justincormack
That may well still be so, but these benchmarks did not seem to pick that up.
I don't think any of the code was significantly vectorizable anyway...

------
AshleysBrain
I've heard Intel's compiler makes optimisations that best favour the specifics
of Intel CPUs. All other compilers are likely to make CPU-neutral
optimisations. Do you think the results would be much different if run on an
AMD chip?

~~~
DarkShikari
Last I heard (according to Agner), Intel's compiler still intentionally
incorrectly detects AMD's CPUs and throws them to the CPU-generic code. x264
has a hacked loader function (borrowed from Agner) to avoid this, though I
don't know if the test used it.

Now, I'm not sure how much this would actually matter. The autovectorization in
Intel's compiler is weak and, at least on Win64, SSE2 is allowed in normal
code without CPU dispatching. It does affect library functions like math and
memcpy, but that'll only matter if your program spends a ton of time in them.
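
A rough sketch of the difference between dispatching on the vendor string and
dispatching on reported features. The struct and function names are
hypothetical; this is not Intel's dispatcher or x264's loader, just the
general idea, with the CPU description passed in as data where a real program
would read it from CPUID.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical description of a CPU; a real program would fill this from CPUID. */
struct cpu_info {
    const char *vendor;   /* e.g. "GenuineIntel" or "AuthenticAMD" */
    int has_sse2;         /* nonzero if the CPU reports SSE2 support */
};

enum code_path { PATH_GENERIC, PATH_SSE2 };

/* Vendor-based dispatch: the fast path is taken only for one vendor,
   even if another vendor's CPU supports the same instructions. */
enum code_path pick_by_vendor(const struct cpu_info *cpu)
{
    assert(cpu != NULL && cpu->vendor != NULL);
    if (strcmp(cpu->vendor, "GenuineIntel") == 0 && cpu->has_sse2)
        return PATH_SSE2;
    return PATH_GENERIC;
}

/* Feature-based dispatch: only the reported capability matters. */
enum code_path pick_by_feature(const struct cpu_info *cpu)
{
    assert(cpu != NULL);
    return cpu->has_sse2 ? PATH_SSE2 : PATH_GENERIC;
}
```

With an SSE2-capable "AuthenticAMD" entry, the first function falls back to
the generic path while the second picks the SSE2 path.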

------
rogerbinns
He should have disabled the "Turbo" setting on the processor; otherwise results
will be somewhat randomized. It can take a while for the processor to switch
into turbo mode, and the decision to do so can be based on other work being
done. Additionally, prior thermal load can result in turbo being disabled until
temperatures reduce to normal levels.
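
Where Turbo cannot be switched off, one partial workaround, offered here as an
assumption and not as something the benchmark did, is to run a few untimed
warm-up iterations and report the median of several timed runs. The helper
names below are hypothetical, and clock() is used only because it is standard
C; a real harness would likely want a higher-resolution timer.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* qsort comparator for doubles, ascending. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n samples; sorts the array in place. */
double median(double *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    qsort(samples, n, sizeof samples[0], cmp_double);
    return (n % 2) ? samples[n / 2]
                   : 0.5 * (samples[n / 2 - 1] + samples[n / 2]);
}

/* Call work() `warmup` times untimed, then time `runs` calls and return the
   median in seconds, or -1.0 if memory for the samples cannot be allocated. */
double time_median(void (*work)(void), int warmup, int runs)
{
    assert(work != NULL && warmup >= 0 && runs > 0);
    double *samples = malloc((size_t)runs * sizeof *samples);
    if (samples == NULL)
        return -1.0;
    for (int i = 0; i < warmup; i++)
        work();
    for (int i = 0; i < runs; i++) {
        clock_t start = clock();
        work();
        samples[i] = (double)(clock() - start) / CLOCKS_PER_SEC;
    }
    double m = median(samples, (size_t)runs);
    free(samples);
    return m;
}
```

The median is less sensitive than the mean to a single run that happened to
land in (or out of) a turbo window, but it does not remove the effect.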

------
michaelhoffman
These results are useless for me since they turned on compiler "optimizations"
that may lead to incorrect results in floating point calculations.
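
To illustrate the kind of problem meant here (a minimal sketch, not taken from
the benchmark, and without claiming which flags it actually used):
floating-point addition is not associative, so an option that lets the
compiler regroup sums, such as GCC's -ffast-math, can change the result. The
input values are picked purely for illustration.

```c
#include <assert.h>

/* Sum three doubles grouped as (a + b) + c. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;   /* volatile keeps the grouping as written */
    return t + c;
}

/* Same three values grouped as a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}

/* Self-check with values chosen so the rounding is easy to follow. */
void check_grouping(void)
{
    assert(sum_left(1e20, -1e20, 1.0) == 1.0);   /* 0 + 1 */
    assert(sum_right(1e20, -1e20, 1.0) == 0.0);  /* the 1 is lost next to -1e20 */
}
```

A compiler that is allowed to treat the two groupings as interchangeable may
emit either one, which is why such options are opt-in.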

------
richurd
I found this pretty inane. He used MinGW instead of native gcc on GNU/Linux.

~~~
markokocic
MinGW _is_ native gcc on Windows. Even more native than Visual C, since it
doesn't need an additional C runtime redistributable, as MSVC does.

