

Goliath: Non-blocking, Ruby 1.9 Web Server - igrigorik
http://www.igvita.com/2011/03/08/goliath-non-blocking-ruby-19-web-server/

======
gregwebs
From the FAQ: <http://postrank-labs.github.com/goliath/doc/index.html>

    How do I deploy Goliath in production?

* We recommend deploying Goliath behind a reverse proxy such as HAProxy, Nginx or equivalent. Using one of the above, you can easily run multiple instances of the same application and load balance between them within the reverse proxy.

I am still wondering about how Goliath fits into both deployment architecture
and application development. Traditionally these 2 have always been separated
out.

* Thread safety. It is explicitly mentioned that middleware used must be thread safe. Doesn't this hold for all code?

* Can Goliath use multiple cores, or does one instance need to be spun up for each core?

* Does it make sense to, say, serve a Sinatra app from Goliath?

~~~
igrigorik
In all cases where we've deployed goliath, it's running with a single thread.
In theory, nothing stops you from creating a thread to do some background task
(or EM.defer), but then you're on your own to make sure you have all the right
synchronization logic.

As far as middleware goes, because you are reusing the same "app chain"
between multiple requests, you just have to make sure that your middleware
does not rely on any instance variables since those will get clobbered by
other requests. Check the wiki page on middleware, we have a few examples
around this.
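
A minimal sketch of the instance-variable problem, assuming a Rack-style `call(env)` interface. The class names, the `request.id` key and the `X-Request-Id` header are made up for illustration, and `Fiber.yield` stands in for a handler that is waiting on IO; this is not Goliath's own middleware API.

```ruby
# Unsafe: per-request state lives on the shared middleware instance.
class UnsafeTagger
  def initialize(app)
    @app = app
  end

  def call(env)
    @request_id = env["request.id"] # overwritten by the next request
    status, headers, body = @app.call(env)
    [status, headers.merge("X-Request-Id" => @request_id), body]
  end
end

# Safe: per-request state lives in a local variable.
class SafeTagger
  def initialize(app)
    @app = app
  end

  def call(env)
    request_id = env["request.id"]
    status, headers, body = @app.call(env)
    [status, headers.merge("X-Request-Id" => request_id), body]
  end
end

# Hypothetical endpoint that pauses mid-request, as if waiting on IO.
SLOW_APP = lambda do |_env|
  Fiber.yield
  [200, {}, ["ok"]]
end

# Starts requests "a" and "b" on the same middleware instance, lets both
# pause inside the app, then finishes them. Returns the id each one got back.
def interleave(middleware)
  results = {}
  fibers = %w[a b].map do |id|
    Fiber.new { results[id] = middleware.call("request.id" => id) }
  end
  fibers.each(&:resume) # both requests start and pause
  fibers.each(&:resume) # both requests finish
  results.map { |id, (_status, headers, _body)| [id, headers["X-Request-Id"]] }.to_h
end

interleave(UnsafeTagger.new(SLOW_APP)) # => {"a"=>"b", "b"=>"b"}
interleave(SafeTagger.new(SLOW_APP))   # => {"a"=>"a", "b"=>"b"}
```

In the unsafe version request "a" comes back tagged with "b", because the second request overwrote the instance variable while the first was paused.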

Last but not least.. the EM reactor runs on a single core, so you're basically
in the same deployment scenario as Thin or node.js. Having said that (always a
caveat! ;)), Goliath _can_ run on JRuby and Rubinius.. so in theory we have
non-GILed environments there, which means we can start multiple reactors and
run across multiple cores from within the same proc. This is not something
I've experimented with in practice yet, but in theory it's possible.

~~~
astrodust
2011 is either going to suck or be awesome depending on how quickly the new
Ruby 1.9 features are taken up by Rubinius and JRuby.

Having to choose between multiple threads and a better language is really not
a decision you should have to make.

~~~
igrigorik
--1.9 mode works really well under JRuby now - that was a big push within 1.6
and they've made a lot of progress. I'd call it "almost there".

Rubinius is a little bit further behind, but I've been able to run our Goliath
stack on it, and that exercises quite a few syntactic changes. So, I think
both are close.

------
mrinterweb
Just wanted to thank Ilya for em-synchrony, em-mysqlplus, em-http-request,
goliath, and other gems that help to make developing evented ruby applications
much easier.

------
vamsee
Pardon me if this is a dumb question: how do I hook up Rails (3.x) with it?

~~~
mnutt
I'd say that generally, you don't hook it directly into rails. I think it may
fill a role similar to node.js, with the added benefits of fibers and the
ability to load your rails app's models and libraries.

For instance, I have a 20-line node.js app that does nothing but serve
websockets for my rails app. I may replace that with Goliath in the interest
of consistency.

------
yatsyk
Very timely release: I'm creating a facade over another HTTP API, something
like the sample in the article. I ran into slow responses from the other
party, so I need to move to async request handling. Would you compare Goliath
to EM HttpServer or other options?

~~~
igrigorik
evma_httpserver is definitely an alternative and one we used early on at PR.
Having said that, we migrated away from it because it didn't provide us with
the flexibility we needed to do keepalive, pipelining, etc - it basically
provides just a very thin layer on top of an EM connection, hence the minimal
API and functionality.

Thin would be an alternative to Goliath as well, although we chose to switch
to a different parser and also to add the Fiber logic/wrappers right into the
framework.

The way I think about it is: if Thin is an app server, then Goliath is more of
a minimal framework which you can use to go from start to finish if you need
to bring up an API endpoint. That includes configuration, routing if you need
it, validation, etc.

------
boundlessdreamz
A noob question: If I have an expensive background job (say pdf generation) is
there a way to make it async and use it with Goliath ?

~~~
igrigorik
That's a great question actually.. In any evented app, blocking your "reactor"
is a big performance problem, since everyone will be waiting for you to
complete that operation before anything else can happen. In general, you want
to turn any CPU intensive work into an IO-bound operation, where the reactor
is "waiting" for the IO notification that the computation is complete.

Now.. How you actually achieve that is a whole different story. You could, in
theory, throw a job into some external work queue and poll that, or if your
runtime permits, spawn some threadpool and periodically check that, or.. spawn
a process and wait on that. In other words, it all depends on the actual
operation.

In the case of PDF generation, if you rely on some external tool, you could
use a mechanism like EM.system('shell cmd') to spawn a process and wait for it
to return you the data.
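
A rough stdlib-only sketch of that idea: hand the work to a child process and treat its pipe as an IO event. It does not use EventMachine at all; `IO.popen` plus `IO.select` stand in for what `EM.system` would give you inside a reactor, and `run_in_child` and `on_tick` are made-up names. It assumes a Unix-like system where `read_nonblock` works on pipes.

```ruby
require "rbconfig"

# Runs cmd in a child process and collects its stdout. Between readiness
# checks the loop is free to do other work, represented here by on_tick.
# Returns [output, Process::Status].
def run_in_child(cmd, on_tick: nil)
  io = IO.popen(cmd)
  output = String.new
  loop do
    ready, = IO.select([io], nil, nil, 0.05) # nil when nothing arrived in time
    if ready
      begin
        output << io.read_nonblock(4096)
      rescue IO::WaitReadable
        next
      rescue EOFError
        break # child closed its end of the pipe
      end
    else
      on_tick&.call # the "reactor" keeps serving other things
    end
  end
  io.close # reaps the child and sets $?
  [output, $?]
end

# Hypothetical slow job: a child Ruby that sleeps, then prints some bytes.
ticks = 0
job = [RbConfig.ruby, "-e", "sleep 0.2; print 'pdf-bytes'"]
output, status = run_in_child(job, on_tick: -> { ticks += 1 })
puts "got #{output.bytesize} bytes, success=#{status.success?}, ticks=#{ticks}"
```

The point is only that the parent never blocks on the computation itself; it just waits for the pipe to become readable.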

~~~
tptacek
If you're going to be forking off processes to do work for you, I think it's a
bad idea to hide the implicit state by using EventMachine to try to manage the
pipe I/O and the process state. The reason for that is, sidecar processes
screw up, backlog, crash, and eventually start consuming resources you want to
track. You end up reinventing the same wheel the Github people did with
Resque. Better to reify all those processes from the start with a real queue.

I only make this pedantic comment because EventMachine makes it _really_ easy
to start down the path of "just event the process management and I/O", and it
seems like you're almost always better off not doing that.

------
otterley
I'm still not convinced that Fiber-related gymnastics are massively superior
to callback-related gymnastics, especially if you ever have to debug the magic
under the hood (which I inevitably end up doing).

Can someone enlighten me?

~~~
gregwebs
With fiber pause/resume, the onus is put on the IO library writer. Someone has
to write a MySQL library that uses threads. The advantage is that this is
completely transparent to the library user. A goliath user need not be
concerned about events or fibers when using MySQL.

In contrast, a node user must specify a callback every time they make a query.
I don't know if the library writer has to do anything special.
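
A toy sketch of that difference, assuming nothing beyond Ruby's built-in Fiber. `ToyReactor`, `async_query` and `sync_query` are invented for illustration and are not the API of any real driver; the "reactor" just runs queued callbacks later, standing in for IO completion events.

```ruby
begin
  require "fiber" # Fiber.current lives here on older Rubies
rescue LoadError
  # Newer Rubies ship Fiber.current in core.
end

# Queued blocks run later, as if an IO event had fired.
class ToyReactor
  def initialize
    @pending = []
  end

  def later(&block)
    @pending << block
  end

  def run
    @pending.shift.call until @pending.empty?
  end
end

# Callback style: the caller must pass a block for every query.
def async_query(reactor, sql, &callback)
  reactor.later { callback.call("rows for: #{sql}") }
end

# Fiber style: the library writer wraps the callback once, pausing the
# calling fiber and resuming it with the result when the callback fires.
def sync_query(reactor, sql)
  fiber = Fiber.current
  async_query(reactor, sql) { |rows| fiber.resume(rows) }
  Fiber.yield # hands control back until the callback resumes us
end

reactor = ToyReactor.new
log = []

# The "user code" reads top to bottom, with no callback in sight.
Fiber.new do
  log << "before query"
  log << sync_query(reactor, "SELECT 1")
end.resume

log << "handler paused, reactor free"
reactor.run
puts log
```

The user-facing code inside the fiber looks blocking, while the pause and resume are confined to `sync_query`, which is the part the library writer owns.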

------
molecule
sounds awesome.

also noteworthy: postrank has been running ruby19 fiber'd webserver in
production "for well over a year"?

cool!

~~~
xal
So has Shopify. Serving terabytes of on-demand resized assets.

------
sshillo
Know of any async postgres drivers out there?

~~~
dj2sincl
There is one built into EventMachine.
(https://github.com/eventmachine/eventmachine/blob/master/lib/em/protocols/postgres3.rb).
Never used it, so I can't say if it's good or not.

