
Is Scheme Faster than C? - fogus
http://www.cs.indiana.edu/~jsobel/c455-c511.updated.txt
======
dkarl
This makes me so nostalgic. Ah, to be back in school when the hardest thing I
had to do was math, and I didn't have to worry about anyone wanting to modify
my code or coming to me saying, "Hey, dkarl, we just signed a huge contract,
and part of what we promised is that a 2 in the hundreds place of the left
operand will be treated as a 3 when we do multiplication for this customer.
What base? I don't know, rebel base alpha. I'm just the project manager. The
product manager answers those kinds of questions. And make sure it's
configurable, because the customer has been going back and forth a bit about
the specifics, and we don't think we'll be able to nail them down before we
deploy."

~~~
swannodette
I would hope that your work is a _healthy_ mix of the kind of elegant design
thinking covered in the post and real world pragmatics. Also your example
problem doesn't even sound particularly bothersome to implement in Scheme.

~~~
dkarl
It wouldn't be hard to implement in Scheme -- I'd actually love to work in
Scheme -- but if another developer asked me how to update the code for a
second customer and I replied, "Start with this Scheme source code, make your
changes, then apply this series of transformations and translate the resulting
code into C," I would get doused in gasoline and set on fire.

~~~
pmiller2
Of course you would be. That's because everything after "make your changes" is
supposed to be applied by the Scheme compiler, possibly with assistance or
annotations from the programmer (e.g. macro expansion, tail recursion
elimination). Doing it manually is the Wrong Way (TM) unless you're in a
PL/compilers class.

~~~
btilly
Did you read the story that this whole discussion is about? Everything after
"make your changes" was the procedure that was followed in the story. The
final result was a C program that was substantially faster than anything which
the Scheme compiler could ever produce.

No Scheme compiler would be likely to do those transformations for you.

~~~
silentbicycle
Read the Orbit thesis.

~~~
mahmud
Orbit was ground-breaking, but it's Stalin that's _the_ most powerful
optimizing compiler for Scheme, or perhaps any other language.

~~~
silentbicycle
Sure, but it's mentioned in-thread already, while the Orbit thesis explains CPS and the other transformations well.

~~~
mahmud
Yep. I responded before reading the whole thread.

------
scott_s
Neat story - he was systematically applying optimizations that compilers could
do if they had full knowledge of the semantics of the program. By transforming
the program into continuation-passing style, he performed a step very similar
to what modern compilers do when they transform a program into static single
assignment form, _then_ optimize it. For a full treatment of what I mean,
see A Correspondence between Continuation Passing Style and Static Single
Assignment:
[http://www.cs.purdue.edu/homes/suresh/502-Fall2008/papers/ke...](http://www.cs.purdue.edu/homes/suresh/502-Fall2008/papers/kelsey-ssa-cps.pdf)

------
stcredzero
_Out of a class of about 15 students, only one person beat me (and only
barely), and he wrote significant portions of his program directly in Assembly
Language. The next fastest after me took nearly twice as long to do the same
work.

...A runtime profile of my program revealed that the majority of time was
spent in the C routines "malloc" and "free."_

For all of the beginning programmers and most of the student programmers out
there: yes, a fast JIT VM can often beat your C, or get pretty close! If you
ever become a good programmer, then you can do much better than the JIT in key
situations.

EDIT: There's a self-evaluation method here for aspiring C programmers. Start
benchmarking your code against the same thing in LuaJIT. You're not "good"
until you know how to routinely beat it by a factor approaching 2. And that's
necessary, but not sufficient.

~~~
scscsc
He did not use a JIT; he translated his code into C by hand.

~~~
stcredzero
I never said _he_ did. I think it's good to point out that there's a lot of
beginner C code out there involving significant memory management that's no
better than, or even slower than, the best JITs.

EDIT: Added "involving memory management"

EDIT: To students doing C programming: Be honest, now, but have you ever
checked?

------
srean
Wish the code were available to take a peek at. Talking about optimizing
transformations, the Stalin Scheme compiler is quite freakily good at that. It
would have been interesting to run this code through Stalin and see what
happens.

From the author Jeffrey Mark Siskind's research statement:

    
    
      It uses the results of flow analysis to perform life-time analysis,
      escape analysis, points-to analysis, and must-alias analysis. ...
      It also uses the above analysis to support flow-directed region-based
      storage management, where run-time garbage collection is replaced
      with static allocation and deallocation on a per-abstract-value
      and per-program-point basis. It also performs flow-directed
      lightweight CPS conversion ... to support extremely efficient
      first-class continuations.
    

It is quite remarkable that even without any type annotations Stalin can hold
its own against hand-written C, and often beat it as well.

<http://en.wikipedia.org/wiki/Stalin_(Scheme_implementation)>

(Has escaped chrismonsantoization)

~~~
igouy
Stalin download doesn't seem to have changed since 2006.

Is "Stalin can hold its own against hand-written C and often beat it" just
your paraphrase of Wikipedia's quote from a bare assertion in Jeffrey
Siskind's Purdue research statement, or do you have some numbers to share?

~~~
srean
The SourceForge version might be more up to date. But I doubt it will differ by
much from what you have; the changes are only in the minor version numbers.

Numbers circa 2008: take a look at section 6 in
ftp://ftp.ecn.purdue.edu/qobi/TR-ECE-08-02.pdf and
ftp://ftp.ecn.purdue.edu/qobi/TR-ECE-08-03.pdf. These are numbers from
StalinGrad, an automatic differentiation engine written in Stalin (hence the
Grad).

Also the thread
[http://groups.google.com/group/comp.lang.lisp/browse_frm/thr...](http://groups.google.com/group/comp.lang.lisp/browse_frm/thread/ee481e354a7c8ccc/9801ba2edd839daf?hl=en&lnk=gst&q=integrate-1D#9801ba2edd839daf)
that got me interested in Stalin in the first place.

With all that said, is there any reason to distrust research statements? Those
are usually taken seriously in academia. One only jeopardizes one's position by
bluffing.

------
silentbicycle
Really, what he's saying is that it saves a lot of time to prototype code in a
_convenient_ language, and then translate parts to optimized code in a _fast_
language (where necessary), once good algorithms have been worked out. Many
fast languages get their speed by trading away flexibility.

This is not specific to Scheme, though - while it's a good prototyping
language, so are Lua (my favorite, and LuaJIT reduces the need for C), awk,
Python, etc.

Knowing how to implement the prototyping languages' constructs efficiently in
the "fast" language is important, though.

Of course, "fast" and "convenient" can also be the same language, such as
prototyping in Common Lisp and then adding declarations. OCaml is also quite
good.
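To make that last point concrete: once you know that a closure is just code plus its captured environment, translating one to the "fast" language becomes mechanical. Here is a minimal, purely illustrative sketch in C (the `adder` struct and all names are my own, not from the post):

```c
#include <assert.h>

/* A closure is code plus captured environment; in C, that is a
 * function pointer bundled with a struct holding the captured data. */
typedef struct {
    int (*call)(void *env, int x);
    int captured;                      /* the closed-over variable */
} adder;

static int adder_call(void *env, int x) {
    return ((adder *)env)->captured + x;
}

/* Plays the role of (lambda (n) (lambda (x) (+ n x))) in a
 * higher-level language. */
static adder make_adder(int n) {
    adder a = { adder_call, n };
    return a;
}
```

The environment is stack-allocated here; a real translation would decide case by case whether captures can live on the stack or must be heap-allocated.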

~~~
dkarl
He did say the semantics-preserving transformations were easier in Scheme:

 _This is where Scheme really won. Because of its extremely algorithmic---
almost mathematical---nature, Scheme can be easily manipulated in a sort of
algebraic style._

------
copper
The conclusion is the most inspiring part of this:

> Real efficiency comes from elegant solutions, not optimized programs.
> Optimization is always just a few correctness-preserving transformations
> away

~~~
colomon
Unless you define "elegant" as "conceptually optimized" or something like
that, this is just nonsense. There are many cases where a superior algorithm
will run circles around an elegant but not so efficient algorithm -- and no
number of "correctness-preserving transformations" will bridge that gap.

~~~
flogic
I believe that's the point. I can imagine, though, that there are instances
where it is easier to code up a better algorithmic solution in a higher-level
language than it would be in C. Lazy data fetches would be a prime example.
Obviously, it can be done in C, as in any Turing-complete language. However,
it's easier in something higher level.

~~~
colomon
Wouldn't that be much better expressed as something like "It's easier to do
elegant optimizations in a high-level language", then? Finding the best
algorithm to use is almost always the single best optimization you can do to a
piece of code. And that certainly is easier to do in a higher level language.

On the other hand, the original quote seems to regard elegance and
optimization as opposed. (And it really glosses over the step of porting your
elegant routines by hand to C!)

------
jfm3
_Is Scheme Hand-Compiled Into A Single Main Routine in C Faster than C?_

Fixed that title for you.

------
iwwr
Scheme employs tail-call optimization. An unoptimized recursive
algorithm would indeed run faster in Scheme.

But I'd be interested in how the port to C actually happened. That is, whether
he translated the functional aspects of the algorithm, or just copied the
intermediate C code that a Scheme compiler can generate.

~~~
andjones
I had the fortune of working with Jonathan Sobel at IU and hearing this story.

We were taught to translate a recursive program to continuation passing style
in a way that many variables could be translated to registers directly. If I
remember correctly, Jonathan translated this from Scheme to C and just
compiled that result.
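The actual class program isn't available, so as a rough stand-in, here is what that translation pipeline can look like on factorial in C: direct recursion, then accumulator-passing style where every call is a tail call, then tail calls replaced by a goto so the formal parameters become plain mutable variables a compiler can keep in registers:

```c
#include <assert.h>

/* Step 0, direct style: each call pushes a stack frame. */
long fact_rec(long n) {
    return n <= 1 ? 1 : n * fact_rec(n - 1);
}

/* Step 1, accumulator-passing style: the pending multiplication is
 * carried in `acc`, so the recursive call is a tail call. */
long fact_acc(long n, long acc) {
    return n <= 1 ? acc : fact_acc(n - 1, n * acc);
}

/* Step 2: the tail call becomes a jump, and the two parameters
 * become two locals that map directly onto registers. */
long fact_goto(long n) {
    long acc = 1;
loop:
    if (n <= 1) return acc;
    acc = n * acc;
    n = n - 1;
    goto loop;
}
```

Each step is a correctness-preserving rewrite, which is the sense in which the story's hand-compilation was "just about blindly" following rules.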

~~~
nikster
I'd be interested in the actual application of that. Scheme and Lisp don't
seem to lend themselves very well to powerful object-oriented design. Since
that is IMO the highest-order task in programming, I have never been that
interested in them.

I guess if you only write number crunching code, then yeah. But if you write
an app (web or client) - what good is Lisp?

~~~
leif
Clearly, you have never been exposed to CLOS, the most powerful object-
oriented system of our time.

<http://en.wikipedia.org/wiki/Common_Lisp_Object_System>

There is also Arc, the language Paul Graham and Robert Morris wrote for the
web. This site is written in it.

<http://arclanguage.org/>

~~~
stcredzero
Hint: If you're stuck for something to publish and present at an OO
conference, just reimplement something done in previous decades with CLOS/Lisp
or Smalltalk in a newer 'hipper' language. This is only a half-joke. This
actually works!

~~~
derleth
And then denigrate Lisp for all the features you haven't stolen from it yet.

------
sambeau
Every language should be faster* than C in some way, otherwise no one would
use it.

Scripting languages are faster to code in.

Fortran is faster at complex matrix maths.

Lisp is faster to code in _and_ faster to run for a large set of computing
problems.

C wins the trade-off of speed-to-create vs. speed-of-execution for most
systems programming tasks, but usually only when compared to bare assembly
code. But there are a large number of problems that it is horribly unsuited
for. Hence we end up with C programs with accidentally embedded Lisp
implementations.

* - where faster can also mean faster to code to a secure standard.

~~~
koenigdavidmj
> Hence we end up with C programs with accidentally embedded Lisp
> implementations.

See Greenspun's Tenth Rule: Any sufficiently complicated C or Fortran program
contains an ad hoc, informally-specified, bug-ridden, slow implementation of
half of Common Lisp.

And of course the famous corollary: Including Common Lisp.

<http://en.wikipedia.org/wiki/Greenspun%27s_Tenth_Rule>

~~~
silentbicycle
XML is proof that Greenspun's Tenth Rule applies to data, as well. :)

~~~
derleth
I've said this before, but JSON is actually more like sexprs than XML is.

~~~
silentbicycle
Agreed, but XML does a far better job of re-implementing sexprs _badly_.

------
abrenzel
I thought this was going to be another one of those benchmarking articles, but
it was really a great exercise in how lateral thinking can produce clean
solutions.

The more I hear about the Lisp family of languages, the more I want to give
them a try sometime...

~~~
nikster
Neat as this may be - optimizing a trivial algorithm is not something you'll
encounter out of school.

The lesson learned is a good one though: Good design always wins. In real
life, that's obvious almost immediately, yet the concept seems to elude the
vast majority of programmers.

In reality the difficulty in programming is creating powerful, flexible
designs. If you can look at your own code a year later and say "wow - this is
good" - then you win.

~~~
RiderOfGiraffes

      > ... optimizing a trivial algorithm is not
      > something you'll encounter out of school.
    

Sometimes it is. For me, it happens a few times a year.

More importantly, the techniques, methods and mind set are critical daily in
the code I and my employees write.

But we aren't doing web development.

~~~
dwc
I agree.

There's also another thing that happens, which is to look at a big mess of
twisty code and think about it, finally seeing that it should have been done
as a simple algorithm. Then you get to delete a lot of code and replace it
with a small bit of clear code.

There are many more ways the simple things learned in school help in the real
world.

------
dfj225
FYI, this approach, generalized and completely automated, is taken by the
Spiral project's DSP generator: <http://spiral.net/index.html>

DSP transformations are represented in a Scheme-like language. The optimizer
performs operations on this representation, then compiles it to C code.

I worked on the Spiral project on things tangential to the DSP generator, but
wrote my own optimizing DSP compiler based on Spiral technology for a class at
university.

------
malkia
I think this is rather useless and pointless.

My daily work for the last few weeks has been optimizing our pipeline for
speed.

Usually what we do is OpenMP (openmp.org) the code where it's possible (we
deal with lots of DXT compression, mipmaps, normal-map calculations, etc.).

Then, where possible, we decrease the floating-point accuracy to the point
where it's acceptable, and use SSE2 through some veclib.

It's all in C (C++), and the difference can be 1:10 or even more.

All I'm saying is this: you sit down, find your bottleneck, and do something
about it. But simply saying this is faster than that is pointless.

That said, the language shootout is how it should be done: you have people
trying to use the same algorithms in different languages, while allowing
certain language (implementation) optimizations or available libs.

Yes, I've seen LuaJIT and LispWorks (which I love dearly) producing better
code than C++, especially when it comes to std::string put in a
stdext::hash_map<>, simply because it's harder to intern stuff in C++ (unless
you do it manually). In Lua every string is interned, and in LispWorks (and in
Common Lisp in general), as long as it's a symbol it can be interned.
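To make the interning point concrete, here is a toy sketch of doing it manually in C (linear scan instead of a hash table, and all names are my own, purely illustrative). Once strings are interned, equality checks collapse to pointer comparisons, which is exactly the win Lua and Lisp symbols get for free:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A toy intern table. Interning guarantees one canonical copy per
 * distinct string, so equal contents imply equal pointers. */
enum { MAX_INTERNED = 1024 };
static const char *interned[MAX_INTERNED];
static size_t interned_len = 0;

const char *intern(const char *s) {
    for (size_t i = 0; i < interned_len; i++)
        if (strcmp(interned[i], s) == 0)
            return interned[i];             /* seen before: reuse */
    char *copy = malloc(strlen(s) + 1);     /* first time: keep a copy */
    strcpy(copy, s);
    interned[interned_len++] = copy;
    return copy;
}
```

A production version would use a hash table and handle allocation failure, but the contract is the same: hash-map keys can then be compared and hashed by address.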

~~~
igouy
Hint - say "benchmarks game" to show you've checked the website sometime in
the last couple of years ;-)

------
asarazan
I was in this program at IU and later worked with Jonathan at Cisco. I
actually remember this exact story being told by one of my AIs in class.

Guess it's become part of IU lore :-)

~~~
mkarazin
Seems to be that way. I never worked with Jonathan, but I did do my undergrad
at IU. Both 311 and 422 (compilers?) were a ton of fun. I may still have the
Scheme-to-C code lying around somewhere... If you are a programmer and have
never used Scheme, I'd recommend having fun with it one weekend.

------
chalst
The kind of hand transformation from a Lisp-like language into C/asm is
described very well in Dan Friedman's 2001 talk, _The Role of the Study of
Programming Languages in the Education of a Programmer_
<http://www.cs.indiana.edu/hyplan/dfried/mex.ps>

Friedman is at Indiana, so the similarity in the transformation is no
surprise. I'm surprised the article doesn't mention Friedman.

------
morphir
"This is where Scheme really won. Because of its extremely algorithmic---
almost mathematical---nature, Scheme can be easily manipulated in a sort of
algebraic style. One can follow a series of rewrite rules (just about blindly)
to transform a program into another form with some desirable property. This
was exactly what I needed."

Can anyone explain this with an example? I have been pondering what he meant
with this for a long time.

~~~
plinkplonk
" 'One can follow a series of rewrite rules (just about blindly) to transform
a program into another form with some desirable property. This was exactly
what I needed.'

Can anyone explain this with an example? I have been pondering what he meant
with this for a long time. "

see the cps-transform rewrite rules in "Essentials of Programming Languages".
The first edition has the densest and most extensive explanation. The third
edition has the clearest, but a _relatively_ attenuated exposition.
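Until you get to the book: the flavor of the CPS transform can be sketched even in C, though it is far clumsier there than in Scheme. In this illustrative example (all names my own, not from EoPL), the implicit call stack of a direct-style function is reified as an explicit chain of continuation records, so that no function ever returns a value upward; each one passes its result forward:

```c
#include <assert.h>
#include <stddef.h>

/* Direct style: the "rest of the computation" lives implicitly in
 * the call stack. */
long sum_to(long n) {
    return n == 0 ? 0 : n + sum_to(n - 1);
}

/* A continuation: a code pointer plus its captured data. */
struct k;
typedef long (*k_fn)(const struct k *self, long result);
struct k { k_fn fn; long saved_n; const struct k *next; };

/* The empty continuation: just hand the result back. */
static long k_halt(const struct k *self, long r) { (void)self; return r; }

/* "After summing to n-1, add my saved n and resume my caller." */
static long k_add(const struct k *self, long r) {
    return self->next->fn(self->next, self->saved_n + r);
}

/* CPS version: every result flows forward into k, never upward. */
static long sum_cps(long n, const struct k *k) {
    if (n == 0) return k->fn(k, 0);
    struct k frame = { k_add, n, k };
    return sum_cps(n - 1, &frame);
}
```

In Scheme the `struct k` machinery is just a lambda, which is why the rewrite rules there can be followed almost blindly.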

~~~
morphir
if this proves to be true, I'm most grateful for your tips. Maybe you could be
even more specific providing a page number/chapter?

[http://books.google.com/books?id=GtPoyremy4EC&printsec=f...](http://books.google.com/books?id=GtPoyremy4EC&printsec=frontcover&dq=essentials+of+programming+languages&hl=en&src=bmrr&ei=a_ZgTcjFH8XxsgaD78y1CA&sa=X&oi=book_result&ct=result&resnum=1&ved=0CCwQ6AEwAA#v=onepage&q&f=false)

------
16s
The use of gotos rather than function calls is interesting... man, that's
gutsy programming! I'd never have thought to do that when considering
optimizations.

~~~
chalst
If you were to hand transform your high-level code into assembler, you would
have little choice.
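For the curious, the trick looks roughly like this (an illustrative sketch, not the author's actual code): two mutually recursive functions merged into one routine, where each call site becomes a goto and the argument lives in a mutable local, so there is no call overhead and no stack growth at all:

```c
#include <assert.h>

/* Mutual recursion
 *   is_even(n) = n == 0 ? true  : is_odd(n - 1)
 *   is_odd(n)  = n == 0 ? false : is_even(n - 1)
 * compiled by hand: each tail call becomes a goto to the other
 * "function", which is now just a label. */
int is_even_goto(unsigned n) {
even:
    if (n == 0) return 1;
    n -= 1;
    goto odd;
odd:
    if (n == 0) return 0;
    n -= 1;
    goto even;
}
```

This is what a Scheme compiler's proper tail calls buy you automatically; doing it by hand is gutsy precisely because you are acting as the compiler.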

