SBCL quicker than C? (lbolla.wordpress.com)
69 points by lbolla on Feb 8, 2011 | 49 comments



SBCL = Steel Bank Common Lisp

More precisely, if you tell the SBCL compiler to trust that all data types are as declared and omit type checks, it gives you code that's faster than gcc with the options at the bottom of this page: http://shootout.alioth.debian.org/u32/benchmark.php?test=spe...

These are "inner loop only" compiler settings, at least the way I'd use them --- but it's still nice to see concrete demonstrations that you don't have to drop down to C code to get maximum performance.

EDIT: (declaim (optimize (safety 0))) also omits array bounds checks and checks for undefined variables.


>>it gives you code that's faster than gcc with the options at the bottom of this page<<

1) Here are the timings for the C program you linked (x86 Ubuntu one core) -

spectral-norm C GNU gcc #4

        N     CPU  Elapsed
      500    0.09     0.10
    3,000    3.31     3.32
    5,500   11.13    11.14

2) Here are the corresponding timings "if you tell the SBCL compiler..."

spectral-norm Lisp SBCL #2

        N     CPU  Elapsed
      500    0.06     0.16
    3,000    4.64     4.70
    5,500   15.69    15.72

3) Is the program on the page you linked to faster or slower than the SBCL program?

http://shootout.alioth.debian.org/u32/performance.php?test=s...


Bolla says that alioth blew it, by not asking SBCL for full optimization --- and he does in fact get different timing with the optimization in place.

From the original post:

    N=500
    gcc 0.15u 0.00s 0.17r
    sbcl 0.08u 0.02s 0.21r

    N=3000
    gcc 5.60u 0.00s 5.69r
    sbcl 5.18u 0.01s 5.41r

    N=5500
    gcc 18.81u 0.01s 19.12r
    sbcl 17.42u 0.02s 17.76r

I mentioned the alioth page mainly so people could see how gcc was being run.


>>Bolla says that alioth blew it, by not asking SBCL for full optimization<<

No he doesn't.

The spectral-norm Lisp SBCL #2 program is Lorenzo Bolla's program - look at the program source code

http://shootout.alioth.debian.org/u32/program.php?test=spect...

>>he does in fact get different timing with the optimization in place<<

He gets different timings, but the only explanation he provides is "So, different numbers on different boxes, which is not at all unexpected."


>>Bolla says that alioth blew it, by not asking SBCL for full optimization<<

Maybe you should ask Lorenzo Bolla if he was trying to create misunderstanding by posting one of his old (December 5th, 2010) blog entries to HN ;-)

The benchmarks game website has been showing Lorenzo Bolla's spectral-norm Lisp SBCL #2 since December 8th 2010.


I just started reading the thread linked from the blog post, and it felt like reading House of Leaves. Here are some choice quotes from various authors:

Re Clojure: "This is a 'babel' plot to destroy lisp."

"Pocket Forth is a free Forth interactive-interpretor that runs fine on my Macintosh "Performa 600" (68030-CPU) System 7.5.5."

"The Mac is a desktop-publishing 'appliance' --- considering that you don't have a laser-printer, a Mac is about as useful to you as a bicycle is to a fish. Besides that, you don't seem like the desktop-publishing type of guy --- that is mostly a marketing-department girl thing."

"I really foresee the collapse of civilization. The majority of people in America are motivated entirely by hate, fear, greed and envy, and this situation can't continue indefinitely. This is what I describe in my book, 'After the Obamacalypse,' which is included in the slide-rule package on my web-page."

    Another time I was sitting in my van in a parking lot. A skinny 
    Jew walked up to the van, peered inside, then tried to open the door 
    but discovered that it was locked, so he walked away. I got out and 
    walked over to him, and I said: "What the hell do you think you're 
    doing?" He also said that he thought it was his friend's van, but he 
    didn't apologize at all, but became prideful and belligerent. When I 
    said, "I think you're a thief," he said: "Look at the way you're 
    dressed; you're the thief!" (I was wearing a hoodie). He told me that 
    if I continued bothering him, he was going to call the police, and he 
    got out his cell-phone. When I said, "I think you were looking for 
    something to steal," he said: "There is nothing in your van worth 
    stealing!" I beat him thoroughly with my fists and left him face down 
    on the sidewalk in his own blood. Somewhat belatedly, be began to cry: 
    "I'm sorry! I'm sorry!"

It ends shortly after "Discussion subject changed to 'Whining (was Re: ordered associative arrays)' by John Passaniti."


One of the reasons I left comp.lang.lisp (I used to be part of that community) was because there were so many people there who would, at the slightest provocation, fly off the handle and explain to you their alternative theory of whatever. A lot of crazy people.

It seemed like for every Peter Seibel or Kenny Tilton, there were 8 people who had 10% of 100 projects done, were happy to tell you about the anti-lisp conspiracy, and also had alternative health advice.


First, what the hell are you talking about??

Second, someone upvoted this??


1. The blog post links to a thread on comp.lang.lisp, which contains 3 or 4 posts about the shootout but was otherwise just nuts.

2. Eh, cataloging the type of wackos that hang out on comp.lang.lisp is something a lot of people enjoy.


1: Edited my post to have it make more sense

2: I just did not expect to be reading this sort of thing on a Tuesday morning on Hacker News. It caught me completely off guard.


I also enjoy comp.os.linux.advocacy for complete insanity. The fun part is that there's exactly 0 constructive conversation going on. It's all flames, all the time.



[deleted]


Robert Maas is mentally ill.


He specifically asked gcc for optimisation for code size (-Os). For speed, he should be using -O3 only. He used "-Os -O3". This invalidates the benchmark.


Have you timed this? Instruction caches are finite, and overflowing them hurts performance so badly that some more loop unrolling may not help.


This is true for real software, not microbenchmarks. All of the shootout benchmarks will fit in their entirety in L1i cache -- which makes reducing the executable size pointless.

Incidentally, this is probably the largest reason why so many people still use -O3 -- it wins in exactly the kind of programs that are used as simple and common benchmarks. It solidly loses on almost everything else.


> It solidly loses on almost everything else.

Do you have any data on that? Most CPU bound programs should have pretty good instruction locality, negating the effects of smaller code. But without some numbers this is pointless guesswork.


I've not timed anything, but asking gcc to optimise for size is the wrong thing to do when benchmarking for speed. I can think of lots of ways that this would cripple performance. Why not let gcc make its own decision?

The only justification for using -Os in a speed benchmark is "I tried it both with and without the flag, and it was faster this way". I don't see any such assertion.


> The only justification for using -Os in a speed benchmark is "I tried it both with and without the flag, and it was faster this way". I don't see any such assertion.

Really? It seems to me that this is quite enough:

> I’ve just re-run the C benchmark without -Os (only -O3) but the results are the same.



If this holds true, I'll concede this specific point.

As we know, however, benchmarking can often come down to tuning. If this most basic of compiler options has not been set to the obvious choice for speed, how can we have any confidence that the C code as written is written in an efficient way?

Are we comparing language against language here, or somebody's implementation in one language against somebody's implementation in another?

I note that there appear to be hand optimisations in the C code. Were these done well, or would the compiler have done a better job?


Of course we are comparing implementations; languages do not have a speed. My (purely hypothetical, unfortunately) language that builds on 'Principia Mathematica' may need a 10000 page program to compute 1+1, but its compiler could, in theory, produce the same executable as C (or Fortran, or whatever) would from their one-liners that do the same thing.


There is no difference at all, since the last -O option on the command line takes precedence (read the man pages):

    $ gcc -c -Q -Os -O3 --help=optimizers > Os-O3-opts
    $ gcc -c -Q -O3 --help=optimizers > O3-opts
    $ diff Os-O3-opts O3-opts
    $


I didn't realise that. He actually uses -Os after -O3.


For N >= 3000 C is significantly faster. My guess is the initial slowness is caused by OMP initialising.


Is this really still news? Yes, we know you can get great performance in some tasks with languages other than C. I swear, if I see ANOTHER article with the linkbait title of "X faster than C"...

The decent ones posted at least bother to do a comparison with several pseudo-representative tasks. This one just goes "hey, I played around with this ONE SPECIFIC TASK NOBODY GIVES A CRAP ABOUT and IT RAN 0.006 MILLISECONDS FASTER THAN IN C! WOOOOOOOOOOOO!"


"We beat C" is a claim that goes hand in hand with "we are viable for scientific computing", so I'm always interested in hearing it (although more benchmarks would be nice).


I recall at least one old FORTRAN guy wandering into comp.lang.c who had very few good things to say about C's handling of floating-point calculations...


Floating point is not the problem; it's mostly memory issues, because C allows aliasing by default. C99 has the `restrict` keyword, so you can generally get identical object code from both languages. SSE intrinsics are only available from C, and you will use either them or assembly any time you care a lot about the performance of tight kernels (very few nontrivial kernels are vectorized adequately by any of today's compilers).


As I recall, this had very little to do with this guy's complaints - it was a mixture of C allowing use of x87 80-bit-wide doubles and not allowing sufficient reordering of operations.

That said, yes, restrict was added for this kind of thing.


x87

I'm sure he cackled into his pocket protector at the recent PHP/Java/gcc floating point parsing fiasco.


I said "viable".

Unfortunately, "Am I FORTRAN?" is the question that goes hand in hand with "Am I realistic for scientific computing".

c'est la vie.


        N     CPU  Elapsed
      500    0.07     0.22
    3,000    2.34     2.41
    5,500    7.86     8.01

Intel Fortran

http://shootout.alioth.debian.org/u32/performance.php?test=s...


True, but it's not just the fact that it's a common (and often garbage) claim--it's also about the relative worth if it's proven true.

I mean, if I could really get C performance out of SBCL (and for my purposes, I can't), I'd sure as hell want to know.

Think cold fusion. Sure, "wolf" has been cried a lot of times, but you're still going to want to know as soon as it happens "for real".


That's true, but these posts aren't ever of the "language Y is actually as good as or better than C, always!" variety, are they? Instead what we get is the results of (in the best case) a couple of micro-benchmarks that happen to show comparable performance to C.

If someone could show me that "yes, your Python programs are now AS FAST AS C!" then of course I'd be ecstatic to hear that; but the posts letting me know that "Python is as fast as C when approximating solutions to problem X, for some X you've never heard of and never will" get kind of old after the 137th time I read them.

For me this is comparable to someone posting about yet another problem in NP that is REALLY FRICKIN' HARD, so probably P=/=NP. I know many problems in NP are hard - you're not adding anything to the discussion by showing me yet another one. Let me know when you have an actual proof that P=/=NP.


It is interesting to note that a beautiful Python program from the "interesting alternative" category [1] beats the C program, and LuaJIT is always impressive [2] on these sorts of microbenchmarks (beating SBCL, with one third the source code).

[1] http://shootout.alioth.debian.org/u32/program.php?test=spect...

[2] http://shootout.alioth.debian.org/u32/program.php?test=spect...


[1] 12.54s < 11.14s ?


My bad. I must have been confusing the SBCL time for the GCC time.


If you are such an expert on what makes a benchmark representative of "real-world" problems, you're welcome to make a contribution to the Shootout. I'm sure they'll be glad to accept it, and everyone will be relieved to find out all the other benchmarks are worthless and everyone has been wasting their time.


Actually, slavak, inner-loop C performance is all that's necessary to make many of these tools viable.

If I can write my entire program in LANGUAGEX and just compile the inner loop a magic way and voila, the program runs at 85% C speed, we have a winner. We can use it in long-running programs which have a fierce compute time bounding.

This is an article explaining the magic way for a flavor of lisp.


At least in the past the Shootout code wouldn't have explicit (declaim (optimize ...)) in the source files, but the command used to compile the files would have it. Did it really get removed from the command line?


Alexey Voznyuk wanted it removed: 'My point is that obligatory "(optimize (speed 3) (safety 0) (debug 0) (compilation-speed 0) (space 0))" is totally wrong.'


Thanks for the explanation. I think he's totally wrong, though; he could have just overridden whatever setting he was unhappy with in his own program. It seems crazy to pessimize every other program for the sake of one solution, especially when the approach taken is considered cheating and the solution is marked as "interesting alternative".

And no criticism implied on the Shootout maintainers, I'm sure that dealing with the submitters is like herding cats :-)


Alexey Voznyuk contributed several Lisp programs, and a couple still show as the fastest Lisp program - maybe you can do better?

(The project name changed nearly 4 years ago http://groups.google.com/group/haskell-cafe/msg/61e427146c8d...)


I did years ago, and a few of them still seem to be around. But I won't be doing it again, both for reasons we have discussed before and because the implementations seem to have totally dived off the deep end of complexity by now and don't really look like they'd be much fun anymore.


Well, for fun you might do what no one else has done and contribute a Lisp program for meteor-contest - solve as you please, there's no cat herding for that one

http://shootout.alioth.debian.org/u64q/performance.php?test=...


Yes, there are several that include a credit to you in the source code comments.

The last program Alexey Voznyuk contributed seemed mostly to be inline assembler with a hint of Lisp - off the deep end :-)


This question is incorrect, C is a language, SBCL is a CL compiler. Kindly amend that.


After reading all these interesting and enlightening comments (no pun intended here, they are all really useful), the blog post should really be titled: "A particular SBCL-compiled LISP-implementation of a specific algorithm gives comparable results to an analogous GCC-compiled C-implementation, when run on particular boxes."



