
Timing data comparing CClasp to C++, SBCL and Python - OopsCriticality
https://drmeister.wordpress.com/2015/07/30/timing-data-comparing-cclasp-to-c-sbcl-and-python/
======
jlarocco
People are missing the point with the Python code. The C++ code and the Lisp
code aren't particularly optimized, either. The point of the benchmark is to
compare the relative speeds of roughly the same Fibonacci code out of the box.

On my machine, for example, a memoized recursive Fibonacci is about twice as
fast as the given Lisp code when both are run on SBCL.
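For readers unfamiliar with the trick, memoization just means caching each result so the exponential call tree of the naive recursion collapses to a linear number of calls. A minimal sketch in Python (the name `fib_memo` is mine, not from the benchmark):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_memo(n):
    # naive doubly recursive Fibonacci; the cache ensures each
    # fib_memo(k) is computed only once
    if n < 3:
        return 1
    return fib_memo(n - 1) + fib_memo(n - 2)

# fib_memo(78) == 8944394323791464, the value from the benchmark
```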

I'm more suspicious about why the C++ code is so slow.

Edit: I wrote my own C++ Fib code and tried benchmarking it outside of Clasp
([https://gist.github.com/jl2/4d74958b02b3caea2f5c](https://gist.github.com/jl2/4d74958b02b3caea2f5c)).
It routinely ran in less than 0.005 seconds, which seemed too fast, so I
looked at the assembly output. AFAICT, the compiler is smart enough to realize
fib() has no side effects and is "pure", and is computing the value at compile
time, reducing the function call to essentially be myfib = 8944394323791464.
It almost seems like an unfair comparison, but since it's comparing compiler
performance, I think it's relevant information.
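That folded-in constant is easy to double-check with a quick iterative version mirroring the benchmark's inner loop (sketched in Python as a language-neutral check; `fib_iter` is my name for it):

```python
def fib_iter(n):
    # iterative Fibonacci: start from F(1) = F(2) = 1 and
    # advance the pair (F(k), F(k-1)) n - 2 times
    p1, p2 = 1, 1
    for _ in range(n - 2):
        p1, p2 = p1 + p2, p1
    return p1

# fib_iter(78) == 8944394323791464, matching the compile-time value
```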

~~~
drmeister
Thank you, you summed it up better than I could.

------
Fede_V
I was a bit curious, so I did a short experiment:

[http://nbviewer.ipython.org/urls/dl.dropbox.com/s/l9naqibqyt...](http://nbviewer.ipython.org/urls/dl.dropbox.com/s/l9naqibqytv8vjt/Numba_Fib.ipynb)

Using numba and adding a one-line decorator to the function (without any other
changes whatsoever), we get around two orders of magnitude of speedup.

Note - this doesn't involve any fancy re-writing, annotating, etc, you
literally just add a decorator.

Writing really fast numerical code in Python is very easy. There's absolutely
no reason not to use numba if you have small functions that just do number
crunching. numba can even inline other numba functions - so you don't even pay
the function call overhead.
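The decorator approach can be sketched like this (the function body is illustrative, not the notebook's exact code, and the sketch falls back to plain Python when numba isn't installed):

```python
try:
    from numba import njit  # assumes numba is installed: pip install numba
except ImportError:
    def njit(f):  # graceful fallback so the sketch still runs without numba
        return f

@njit
def fib_jit(n):
    # numba JIT-compiles this loop to machine code on first call;
    # the Python source is unchanged apart from the decorator
    p1, p2 = 1, 1
    for _ in range(n - 2):
        p1, p2 = p1 + p2, p1
    return p1
```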

~~~
lqdc13
I love numba, but I am never successful at compiling it.

Using the Anaconda distribution results in some issues as that distribution is
not compatible with some other libraries and requires installing everything
through conda.

Most packages use Cython for this reason instead.

~~~
ycnews
I failed to get numba installed on a Raspberry Pi 2, and I'd been wanting to
try Nim, which turned out to be easy to compile on the RPi running Ubuntu.

    
    
      nim -r -d:release c rw_fibn.nim
      ...
      Hint: operation successful (12152 lines compiled; 2.632 sec total; 8.870MB; Release Build) [SuccessX]
      /home/rw/git/Nim/examples/rw_fibn

      Result = 8944394323791464
      elapsed time: 5.376946926116943

My first try at translating Python to Nim:

$ cat rw_fibn.nim

    
    
      import times
    
      proc fibn(reps: int64, num: int): int64 =
        var z: int64
        for r in 1..reps:
          var
            p1, p2: int64 = 1
            rnum: int = num - 2
          for i in 1..rnum:
            z = p1 + p2
            p2 = p1
            p1 = z
    
        return z
    
      var start: float = times.epochTime()
      var res = fibn(10_000_000, 78)
      var finish: float = times.epochTime()
    
      echo("Result = ", res)
      echo("elapsed time: ", finish - start)

------
robmccoll
In Python 2.7, shouldn't you be using xrange() rather than range()? xrange()
produces values lazily, whereas range() will actually create the entire list
in memory and then iterate over it.

In case anyone isn't aware: in Python 3, range()'s implementation was
effectively replaced with that of xrange().
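The Python 3 behaviour is easy to verify: a range object's size is constant no matter how many elements it spans, while materializing it as a list is not. A quick sketch (the element count is arbitrary):

```python
import sys

lazy = range(10**6)       # Python 3 range: a lazy sequence, like 2.x xrange
eager = list(lazy)        # materializes all one million ints

print(sys.getsizeof(lazy))   # a few dozen bytes, independent of length
print(sys.getsizeof(eager))  # several megabytes for the list alone
```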

~~~
0942v8653
For me, range (the posted source) took 95 seconds, and xrange was only a
little better at 85 seconds.

I think most of the benefit of xrange comes from the decreased memory usage,
not from lower CPU usage. But xrange is definitely closer to what the other
code is doing.

~~~
dekhn
That's almost a 10% improvement, not just a little better!

------
wtbob
CClasp is looking really, really interesting. I still enjoy SBCL, but on the
right project, why not try CClasp?

------
rch
Cython would make sense in this context (~19 seconds for me).

~~~
igouy
>>would make sense in this context<<

Not really -- _" I don’t want to start an argument about the speed of SBCL vs
C++ here, my point is that CClasp has come a long way from being hundreds of
times slower than C++ to within a factor of 4."_

~~~
rch
I meant only that Cython results would tell me more about relative CClasp
performance than a comparison with CPython does.

I didn't want to get into the optimization game either, but I'm happy someone
here reminded me to try numba.

