
Our experience using Clojure to speed up Beanstalk - dsabanin
http://blog.beanstalkapp.com/post/23998022427/beanstalk-clojure-love-and-20x-better-performance
======
zackzackzack
I wonder how much of the speed change came from the difference in languages
versus the difference in experience when writing both. The ruby version was
written prior to the clojure one and so anything that was learned about
git/programming during the ruby writing would have been available during the
clojure writing. I can believe that clojure would be faster; using concurrency
well guarantees this somewhat. I still wonder what would have happened if they
had written the clojure version first without having written the ruby version.
Or wrote a ruby version after the clojure version.

~~~
asparagui
According to the article, they were using ruby to bridge to a svn module that had
git capability. They replaced that module with a native git library, called
from Clojure.

In summary: benchmark, profile, find hotspots, optimize. Works in every
language. >:3

~~~
dsabanin
Sorry if it sounded confusing, but we had two modules: one for svn, one for
git, there was no intersection between those.

But I agree, we've used Clojure (and the JVM) at the exact point where it
would bring the most speedup to our app.

------
spullara
This is more about the Ruby VM being much slower than the JVM than anything to
do with Clojure. It would be somewhat interesting to see how JRuby 1.7 running
their Ruby code but using the Java libraries would fare.

~~~
arohner
JRuby is still going to be slower than clojure, due to the semantics of ruby.
All function calls are resolved at runtime, while clojure resolves as many as
possible at compile time.

Foo::Bar.method() vs (foo.bar/method)

The ruby version does three loads from the heap, at runtime, while clojure
does zero.

invokedynamic in JDK 7 will help narrow the gap, but the fact remains that
Clojure was designed with more performance in mind than ruby.

~~~
lemming
I don't think this is quite right, although I'd be delighted to be corrected.
Clojure still has to deref the var in order to call the function, and afaik
this happens at runtime (hence the fact that you can redefine a function in
the REPL).

~~~
rsanders
In recent versions of Clojure, vars are not by default dynamically bound.
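
A minimal REPL sketch of the two points above, assuming default compiler settings (no direct linking); the names `greet`, `call-greet`, `*level*` and `level` are made up for illustration. Redefining a var still changes what existing callers see, while thread-local rebinding with `binding` needs an explicit `^:dynamic` since Clojure 1.3:

```clojure
(defn greet [] "hello")
(defn call-greet [] (greet))   ; the call goes through the var #'greet

(call-greet)                   ; => "hello"

;; Redefining the var, as one would at the REPL, is visible to call-greet
;; without recompiling it.
(defn greet [] "goodbye")
(call-greet)                   ; => "goodbye"

;; Vars are not dynamically rebindable unless marked ^:dynamic.
(def ^:dynamic *level* 1)
(defn level [] *level*)

(binding [*level* 2] (level))  ; => 2
(level)                        ; => 1
```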

------
dons
[http://shootout.alioth.debian.org/u64q/benchmark.php?test=al...](http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=clojure&lang2=yarv)

Median speedup for the same problem in Clojure was 12x over Ruby, 14x for
Haskell, 25x in Java, and 35x in C++.

~~~
swannodette
Alioth benchmarks are out of date as far as Clojure is concerned. These days
it's not much work to get Clojure to deliver performance identical to Java's.
Better versions of the same benchmarks, which show the same performance as
Java and which you can verify yourself, are here: <http://github.com/clojure/test.benchmark>

~~~
igouy
>>out of date as far as Clojure is concerned<< What a pity that Clojure
programs, written months ago to do benchmarks game tasks, have not been
contributed to the benchmarks game for all to see!

~~~
swannodette
All in good time Isaac :)

~~~
igouy
Let's hope the benchmarks game website is still being updated when that time
arrives ;)

~~~
heretohelp
What are you two plotting?!

~~~
dons
Programming language benchmark data.

------
michaelbuckbee
I wonder to what extent the speed increase was due to Clojure vs the JVM
itself.

An interesting comparison might have been Clojure vs their previous code on
JRuby.

~~~
jwr
In my code, there are two kinds of places where I get a performance boost by
using Clojure:

1\. Micro-optimizations, mostly due to JVM and its excellent JIT (the garbage
collectors are quite impressive, too, if what you need is predictable response
time).

2\. Architectural gains: thanks to Clojure's excellent concurrency support
I can make much better use of multiple cores. I get more parallelism, hence
better performance on the same hardware.

The first kind is cool, because you get it "for free". The second kind is the
real game-changer, because non-parallel software only gets you so far in terms
of performance, and writing concurrent software is Hard. Clojure makes it
much, much easier.

But overall I wouldn't say that Clojure is a performance demon on a single
CPU. You can get performance similar to carefully written Java code. This is
good, but you can always do better with C or hand-written assembly on critical
sections. But that's not the main advantage: the big thing is that I can write
correct Clojure code _fast_, it runs well enough, and I can easily make use
of multiple cores. You can debate micro-benchmarks all you want, but what
really counts for me is how quickly (and correctly) I can get from zero to
production code that runs fast enough.
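
As a rough illustration of the second kind of gain, here is a minimal sketch that swaps `map` for `pmap`. The function `slow-square` and the 100 ms sleep are invented stand-ins for real work, not anything from the article:

```clojure
;; Hypothetical slow task, invented for this sketch.
(defn slow-square [n]
  (Thread/sleep 100)   ; stand-in for real work
  (* n n))

(def inputs (range 8))

;; Sequential: roughly 8 x 100 ms of wall-clock time.
(time (doall (map slow-square inputs)))

;; Parallel: pmap runs the calls on a thread pool and returns the
;; same results in the same order, in less wall-clock time.
(time (doall (pmap slow-square inputs)))
```

The only change is one letter, which is the point being made about ease of use. `pmap` only pays off when each call is expensive enough to outweigh the coordination overhead, and a standalone script may want `(shutdown-agents)` at the end so the JVM exits promptly.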

------
DanI-S
I've been playing with 4clojure[1] in my off time for weeks, and it's been a
great introduction to the language, although it obviously doesn't help with
learning how to package and deploy an application or service. How did you make
the first steps from playing around in the REPL to writing production code?

[1] <http://www.4clojure.com/>

~~~
dsabanin
Have you seen Leiningen[1]? It pretty much solves the packaging and
deployment problem. Awesome tool.

Also, if you're into Web development, I can recommend playing with the Noir[2]
framework and Korma[3] for SQL abstraction. Heroku also supports deploying
Clojure apps out of the box, so you can easily use a free tier to get
something out there.

[1] <https://github.com/technomancy/leiningen>

[2] <http://webnoir.org/>

[3] <http://sqlkorma.org/>

~~~
cmelbye
I've got a question about Noir. Is it necessary for me to write HTML with
Clojure, or can I separate it and write plain HTML somehow? Writing HTML in
the middle of my Clojure code doesn't appeal to me coming from an MVC
background...

~~~
drostie
Of course you could probably write an s-expression serialization of XML as a
macro, and then you could start writing things like:

    
    
        (html
            (head (title "My page")
                (script :type "text/javascript" :src "/static/page.js"))
            (body (p "Stuff")))
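
A minimal sketch of how that form could be turned into markup, written as a plain function over quoted data rather than the macro suggested above. The name `render` is hypothetical, and the sketch does no escaping and does not handle void elements:

```clojure
(require '[clojure.string :as str])

(defn render
  "Renders a quoted form such as '(p :class \"x\" \"text\") to an HTML string."
  [form]
  (if-not (seq? form)
    (str form)                                    ; text node
    (let [[tag & more] form
          ;; leading keyword/value pairs are treated as attributes
          attrs    (take-while #(keyword? (first %)) (partition 2 more))
          children (drop (* 2 (count attrs)) more)]
      (str "<" tag
           (str/join (for [[k v] attrs]
                       (str " " (name k) "=\"" v "\"")))
           ">"
           (str/join (map render children))
           "</" tag ">"))))

(render '(html
           (head (title "My page")
                 (script :type "text/javascript" :src "/static/page.js"))
           (body (p "Stuff"))))
```

The call returns a single string with nested `<html>`, `<head>` and `<body>` elements; the quote keeps Clojure from trying to evaluate `html` as a function.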
    

Actually for a while I had a JS tool which I was using called "build.js" which
would just build DOM components like this. (It was therefore JSON-serializable
too, but I never really had an occasion to use that.)

There is something nice about HTML which is a bit lost here: HTML (and LaTeX
for that matter) allows unquoted text, with fewer escape characters. They are
_markup_ languages, a quality which s-expressions crucially lack. (On the other hand,
XML & family lack the ability of Lisps to make the first token anything other
than a symbol.) There was briefly a plot called NML / Enamel and some others
-- DTML and TML I believe -- which would instead write:

    
    
        <html | 
            <head | <title | My page >
                <script <type|text/javascript> <src|/static/page.js>>>
             <body | <p | Stuff>>>
    

This is actually an emulation of a C-type syntax with a Lisp-type semantics:
the idea is that you have in some sense two syntactically different channels
into your expressions, one which comes before the pipe character | and one
that comes after. The stuff that comes after is allowed to be marked-up text;
the stuff that comes before is some sort of node list, and perhaps has certain
conventions (one could imagine instead using a Clojure-style `:type
"text/javascript"`, which would limit you to what XML attributes can do --
short text only, symbolic keywords).

That is, one could hypothetically rewrite this in some C-ish syntax which
would look like:

    
    
        html {
            head { title { My page }
                script(type {text/javascript} src {/static/page.js}) {}}
            body { p { Stuff }}}
    

and again, if you wanted to limit yourself to XML, the parens above could then
say instead `type: "text/javascript" src: "/static/page.js"` but it doesn't
have to be that way.

Some crazy ideas for anyone building a new language to think about.

------
Patient0
"The rewrite in Clojure resulted in much cleaner, faster code, totaling at
only 700 lines of Clojure code (I don’t have a clear comparison with Ruby code
here). "

Could you post the Clojure code somewhere? It would be very interesting to see
clean, fast Clojure code written to solve a real-world problem (as opposed to
some toy example).

~~~
edwinnathaniel
Yes, please. I'd like to see "clean and fast Clojure code".

------
shanemhansen
beanstalkapp guys/girls, thanks for sharing! I'd be really interested to see
some profiling results. Were you bottlenecked on SHA hashing? IO/execve
syscalls? Memory usage?

I think the reason for the performance difference is pretty clear. According
to the article and the grit documentation, all the git api calls were either
done in pure ruby, or by shelling out to `git`. Also, reading between the
lines, it sounds like they required some information/relationships on commits
that was non-trivial to retrieve using a basic git shell command. So by
switching out the runtime and the algorithm, they get a huge performance
increase.

With jgit, they can more easily traverse the graph directly and efficiently.
I'm really curious to see what kind of performance they could get from using
FFI+jruby and raw c calls to libgit/libgit2 (<http://libgit2.github.com/>).

------
will_work4tears
I use Beanstalk every work day (for the last 8-9 months) and when they rolled
out their changes I could tell a difference.

Didn't know what those changes entailed, but this is interesting. Thanks!

------
luminaobscura
clojure has some great features but syntax is just disgusting (not because of
parentheses, i like scheme)

~~~
glogla
And because of what then?

~~~
luminaobscura
other tokens like #{ ^{ #() ~@ #_ #^ #'x ^:x

~~~
ths
Clojure does have more sugar than Scheme, but imho some of it improves
readability; for example, using brackets for grouping instead of overloading
lists like Scheme does can make code easier to scan, because when you see
parens in Clojure there are fewer meanings to choose from (usually only
function application or a list literal). Example:

Scheme

    
    
      (let ((x 2) (y 3))
        (let ((x 7) (z (+ x y)))
          (* z x)))
    

Clojure

    
    
      (let [x 2 y 3]
        (let [x 7 z (+ x y)]
          (* z x)))
    

I think the Clojure version is easier to read without a paren-matching editor,
though Scheme's rigorous minimalism does have its charm.
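
One caveat about the pair above: the two snippets look alike but do not compute the same value. Scheme's `let` evaluates all initializers in the outer scope, so `z` is 2 + 3 and the result is 35. Clojure's `let` binds sequentially (like Scheme's `let*`), so `z` sees the new `x`. A small sketch:

```clojure
;; Clojure binds left to right: z = 7 + 3 = 10, so the result is 10 * 7.
(let [x 2 y 3]
  (let [x 7 z (+ x y)]
    (* z x)))
;; => 70

;; To mirror Scheme's parallel let, compute z before shadowing x:
;; z = 2 + 3 = 5, so the result is 5 * 7.
(let [x 2 y 3]
  (let [z (+ x y) x 7]
    (* z x)))
;; => 35
```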

------
EternalFury
There seems to be a race towards the most exotic or revivalist programming
languages. As if programming languages were magic bullets of some sort.

~~~
balac
What is your objection to their choice here? It seems that they are simply
choosing the best tool for the job; it is not as if they are rewriting their
whole web frontend in Clojure too.

------
trimbo
> I’ve been looking for an excuse to use Clojure in a production environment
> for a while

Except in wanting to mess around with Clojure, you didn't use the best tool
for the job.

The Clojure solution now uses parallel implementations of Git and SVN to solve
the problem, rather than the core code of SVN and Git. And now you also have a
one-off daemon written in Clojure. It doesn't have the same support structure,
ops requirements, or anything, as your Ruby code. Virtually no one uses
clojure, so hiring and training are different, etc. You've incurred a lot of
overhead for something not that great.

The best tool for this job was to improve the Ruby version by way of C
extensions, or write a new C command that does this work for you, linking in
the Git and SVN code directly. This involves few to no new concepts, is
straightforward, and would have given you the best compatibility and
performance.

~~~
dsabanin
Clojure code is 700 lines, and pretty easy to grasp by anyone who has used Lisp
before. We would have to implement a lot of C code to speed up both Git and
Svn bindings. And I seriously doubt that we would be able to debug this in a
reasonable amount of time.

Clojure caching took me like a month max.

~~~
trimbo
Lines of code is the new form of premature optimization. A C solution is
better in performance and compatibility, but you've prematurely optimized on
lines of code as your deciding factor (based on you mentioning it in the
article, and again here).

Obviously, you can choose whatever you like, it's your company. I'm just
explaining what the best solution was for the many students and young
programmers who visit hacker news. Clojure is not it for all the reasons I
mentioned. And if people think C is hard, practice it. Read Zed's book and do
Project Euler problems with it until you feel comfortable with it. It's an
essential tool everyone must know how to reach for.

~~~
spitfire
They're optimizing for their business value and resources (programmers)
available.

It might not impress their programmer friends, but it'll impress their
accountant.

~~~
dsabanin
Well, in my book using Lisp in production is impressive for the best part of
my programmer friends :-)

~~~
spitfire
Two birds, one stone. I think that's what they call in the industry
"Experienced". Keep that up.

