How we’ve made Raptor fast

gnufied · on Nov 11, 2014

Just some glaring inconsistencies that I found:

1.

> It’s less work for the user. You don’t have to setup Nginx. If you’re not familiar with Nginx, then using Raptor means you’ll have one tool less to worry about.

> For example, our builtin HTTP server doesn’t handle static file serving at all, nor gzip compression.

Sounds like I would need nginx (or another frontend server) anyway?

2.

> By default, Raptor uses the multi-process blocking I/O model, just like Unicorn.

> When we said that Raptor’s builtin HTTP server is evented, we were not telling the entire truth. It is actually hybrid multithreaded and evented.

So, which is it? I assume the default is multi-process + events, and a paid version offers multithreaded + events? If so, isn't Unicorn's model of multi-process + blocking I/O pretty good as well, since the OS becomes the load balancer in that case?

Overall it seems they wrote a very fast web server. Kudos to that! But I don't think the web server was ever the problem for Rack/Ruby apps? Still on the fence with this one until more details emerge. :-)

shandogs · on Nov 11, 2014

The first point makes sense for me given that all my static files are served gzipped out of cloudflare/cloudfront on another domain.

wereHamster · on Nov 11, 2014

And the Rails asset pipeline can pre-compress the files, so you don't waste CPU time compressing them for each request individually.
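
A minimal Python sketch of the pre-compression idea, not the Rails asset pipeline itself. The asset name, its contents, and the `precompress` and `serve` helpers are made up for illustration: each asset is gzipped once up front, and every request just returns the stored bytes.

```python
import gzip

def precompress(assets):
    """Gzip every asset once, up front (hypothetical build step)."""
    return {name: gzip.compress(body) for name, body in assets.items()}

# Hypothetical asset contents, for illustration only.
ASSETS = {"app.css": b"body { margin: 0; }\n" * 200}
PRECOMPRESSED = precompress(ASSETS)

def serve(name, accepts_gzip):
    """Return (headers, body) without compressing anything per request."""
    if accepts_gzip:
        return {"Content-Encoding": "gzip"}, PRECOMPRESSED[name]
    return {}, ASSETS[name]
```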

bratsche · on Nov 11, 2014

Does it seem weird to anyone else that they're doing all this marketing for a project that's supposedly going to be open source? Why all the suspense, why not just release it? Or, if it's not ready yet, why not just finish it and then start promoting it?

theossuary · on Nov 11, 2014

Halfway through the document they mention their plan to add paid-for features: "Raptor also optionally allows multithreading in a future paid version."

teddythetwig · on Nov 11, 2014

Yep, that's when I stopped reading and closed the article. Why hide the fact that you are building paid software? Perhaps it was naive to assume that it was open source, but that piece really surprised me.

FooBarWidget · on Nov 11, 2014

What problem do you have with open source software that has a paid premium version? For example, the Sidekiq gem is open source but they also have a paid Sidekiq Pro. Would you avoid Sidekiq too?

lbotos · on Nov 11, 2014

They have a full drip with email capture around this product as they are looking to sell a portion of it. (I think it was a multithreaded variant.) I do agree that I wish the product was ready but I can't fault them for "following the playbook".

bithive123 · on Nov 11, 2014

As someone working in an enterprise environment, I've sort of lost interest in this breed of Rack server now that I've gotten used to having SSO and LDAP authorization available via Apache modules, to name a few features. Apache allows me to accommodate all sorts of requirements like setting up vhosts that require authentication except on the LAN, or vhosts which allow members of certain groups to access an internal app via reverse proxying.

I don't mean to be negative; other posters have that angle covered. But I would comment that this ongoing proliferation in prefork backends is hardly disruptive to organizations who have already made significant commitments to Ruby web apps. Our Apache/Passenger servers aren't going away anytime soon.

rarepostinlurkr · on Nov 11, 2014

This is Phusion Passenger +1. As was pointed out in a thread several months ago, the DNS resolves to the same place. The writing style is similar and the feature set is far too mature for a 1.0 product.

randall · on Nov 11, 2014

How would one use this on Heroku? It doesn't support static file transfers allegedly... and per Heroku they require your app server to serve them by default.

https://github.com/heroku/rails_12factor#rails-4-serve-stati...

Any ideas?

snikch · on Nov 11, 2014

You use a rack middleware to handle static files, which is how Rails handles it by default. So, unless I'm completely mistaken (which I may well be), this should work just fine?
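
To make the middleware idea concrete, here is a rough analog written as Python WSGI middleware rather than Rack (so not what Rails actually ships). `static_middleware`, `hello_app`, and the file table are hypothetical names for illustration. The wrapper answers known static paths itself and passes everything else to the app.

```python
def static_middleware(app, files):
    """Wrap a WSGI app; answer known static paths before the app sees them."""
    def wrapped(environ, start_response):
        body = files.get(environ.get("PATH_INFO", ""))
        if body is None:
            return app(environ, start_response)  # fall through to the app
        start_response("200 OK", [("Content-Length", str(len(body)))])
        return [body]
    return wrapped

def hello_app(environ, start_response):
    """Hypothetical application standing in for a Rails app."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"dynamic"]

# Hypothetical in-memory file table; a real setup would read from disk.
app = static_middleware(hello_app, {"/assets/app.css": b"body{}"})
```

Note that the middleware runs inside the application process, so the app server still receives and forwards every static request.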

randall · on Nov 11, 2014

Wouldn't that sit behind the app server? (Equally mistaken potential. :)

_raul · on Nov 11, 2014

Using a CDN (Cloudfront, Fastly add-on, ...) is a common choice, as it allows delivering assets from a closer datacenter to the client and removes load from the app server.

randall · on Nov 11, 2014

Yeah but getting assets to said cdn is annoyingish without having app server origin, I'd argue.

_raul · on Nov 11, 2014

I find the setup pretty simple and convenient if you configure the CDN to fetch the assets from the app (in which case the app needs to serve the assets only once for each new release).

cmelbye · on Nov 11, 2014

It's trivial with Rails, at least.

fiatmoney · on Nov 11, 2014

"You will need 5000 processes (1 client per process). A reasonably large Rails app can consume 250 MB per process, so you’ll need 1.2 TB of RAM."

Quibble: most multi-process web servers use fork() for child processes, which means they can share identical memory pages.
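
The quoted arithmetic can be spelled out in a few lines of Python, using decimal units (1 TB = 1,000,000 MB). The `shared_fraction` parameter is a hypothetical knob for how much of each process stays in pages shared after fork(); the 50% value below is an assumption for illustration, not a measurement.

```python
def total_ram_tb(processes, mb_per_process, shared_fraction=0.0):
    """Estimate total RAM in decimal TB.

    shared_fraction is an assumed share of each process's memory that
    stays in copy-on-write pages shared with the parent, counted once.
    """
    shared_mb = mb_per_process * shared_fraction
    private_mb = mb_per_process - shared_mb
    return (shared_mb + processes * private_mb) / 1_000_000

no_sharing = total_ram_tb(5000, 250)        # the article's scenario
half_shared = total_ram_tb(5000, 250, 0.5)  # assumed 50% sharing
```

With no sharing this comes to 1.25 TB, which the article rounds to 1.2 TB; any sharing after fork() lowers the total.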

rustyconover · on Nov 11, 2014

Designing application servers without deeply understanding copy-on-write[1] semantics for virtual memory paging seems just foolish.

I'll chalk this one up to the PR/marketing person probably not taking an OS course.

Still it would be nice if they really did go back and read a little W. Richard Stevens[2].

[1] - http://en.wikipedia.org/wiki/Copy-on-write

[2] - http://en.wikipedia.org/wiki/W._Richard_Stevens

rubiquity · on Nov 11, 2014

> I'll chalk this one up to the PR/marketing person

This entire web server has been marketing hype since day one. I imagine they are trying to build a pro product and support company out of this.

It's a web server with event loops and some fancy memory allocation. Shouldn't Node.js have taught us all by now the perils of event loops and insanely tweaked HTTP parsers? Sure, it looks great for "Hello World" benchmarks but falls right on its face as soon as you have an app of significant size spending real time on CPU.

nostrademons · on Nov 11, 2014

I'm wondering just how well these app servers perform with a real-world Rails app behind them? My understanding was that Unicorn deliberately does not try for maximum performance simply because you'll lose most of that as soon as you add Rails, and then the app server is not the bottleneck. Amdahl's Law applies - you can't get more than a 1% speedup by optimizing a component that consumes 1% of the total CPU time.
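
For reference, Amdahl's Law as a small Python function. The 1% share comes from the comment above; the 4x component speedup is only an illustrative value.

```python
def amdahl_speedup(fraction, component_speedup):
    """Overall speedup when `fraction` of the run time gets
    `component_speedup` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / component_speedup)

# A component taking 1% of the time, made 4x faster (illustrative).
overall = amdahl_speedup(0.01, 4.0)
# The same component made infinitely fast: the upper bound.
limit = amdahl_speedup(0.01, float("inf"))
```

Even in the limit, the overall speedup is 1/0.99, or roughly 1%.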

I also wonder how their hybrid evented/threading/process model works in the presence of a GIL (which, last I checked, Ruby still has) and in the presence of blocking socket calls (which, last I checked, both the MySQL and PostgreSQL APIs used).

wmf · on Nov 11, 2014

It sounds like the real benefit is not performance but the simplicity of having the slow-client spoonfeeding built in rather than requiring an external Web server.

nostrademons · on Nov 11, 2014

I agree that would be a benefit, but

a.) they could achieve that a lot simpler by bundling nginx, Unicorn, Rails, and a pre-vetted set of config files and shell scripts to bring the whole thing together and

b.) that's the value proposition of PaaS offerings like Heroku. Heroku is pretty damn simple already - just git push your code - and you'd outgrow it around the same time as you'd outgrow the bundled slow-client spoonfeeding, so what's the value proposition of this?

snowmaker · on Nov 11, 2014

I would really like to hear their answer to this question.

FooBarWidget · on Nov 11, 2014

In my experience -- as the Phusion Passenger author, and as one of the original developers behind the copy-on-write feature in Ruby -- a moderately large Rails app can use 250 MB per process even with copy-on-write.

Rapzid · on Nov 11, 2014

Now that they have fixed the garbage collector. However, how much of that 250MB is built up after the fork, and how much is static? I actually don't have that answer.

rubiquity · on Nov 11, 2014

In the case of Rails, all of your Gems, Rails and application code are loaded pre-fork.

hurrycane · on Nov 11, 2014

I strongly believe that this is the next version of Phusion Passenger.

grandalf · on Nov 11, 2014

The writing style is very similar.

hurrycane · on Nov 11, 2014

Their architecture looks really mature to me - a lot of features are the same as Passenger's.

triskweline · on Nov 10, 2014

Spoiler: Insane amounts of low-level optimization.

film42 · on Nov 11, 2014

Actually, everything was pretty simple: Use libev since it's faster than libevent, use Node's HTTP parser because it's fast. Then allow each thread in the pool to run its own event loop. This pretty much sums up my university's Internet Programming course. There were a few hairy bits about tcmalloc, but they did a great job of explaining how they took advantage of object pooling and region-based memory management. Great post guys, I can't wait to give your source a read :)
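
A minimal sketch of the "one event loop per thread" shape, using Python's asyncio and threading as stand-ins for libev and native threads. This is not Raptor's code; `worker`, `handle_clients`, and `run_pool` are hypothetical names, and the sleep merely simulates waiting on a socket.

```python
import asyncio
import threading

async def handle_clients(worker_id, clients=3):
    """Stand-in for serving several clients concurrently on one loop."""
    async def serve(client_id):
        await asyncio.sleep(0.01)  # simulated non-blocking I/O wait
        return (worker_id, client_id)
    return await asyncio.gather(*(serve(c) for c in range(clients)))

def worker(worker_id, results):
    """Each thread owns a private event loop and runs it to completion."""
    loop = asyncio.new_event_loop()
    try:
        results[worker_id] = loop.run_until_complete(handle_clients(worker_id))
    finally:
        loop.close()

def run_pool(num_threads=4):
    """Start the threads, wait for them, and collect per-thread results."""
    results = {}
    threads = [threading.Thread(target=worker, args=(i, results))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each loop multiplexes its own clients without touching the other threads' loops, so no cross-thread locking is needed for per-connection state.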

yazaddaruvala · on Nov 11, 2014

Any idea why they didn't use libuv instead?

film42 · on Nov 11, 2014

I haven't seen any benchmarks comparing the two, but Node first used libev, and then the team created libuv out of a need to support Windows, because libev is unix only.

Actually, it's all in this about page: http://nikhilm.github.io/uvbook/introduction.html

nteon · on Nov 11, 2014

I would have loved to see some data that showed their bump pointer regions and thread local stuff performing significantly better than tcmalloc or jemalloc, both of which do thread-local caching that avoids locks for the vast majority of allocations. Additionally, what they came up with sounds like talloc[1], which has been in production for years with samba.
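
For readers unfamiliar with the term, a bump-pointer region can be sketched in Python as below. This is a toy illustration of the general technique, not Raptor's or talloc's implementation, and the `Region` class is a hypothetical name.

```python
class Region:
    """A fixed-size memory region with bump-pointer allocation."""

    def __init__(self, size):
        self.buffer = bytearray(size)
        self.offset = 0  # the "bump pointer": index of the next free byte

    def alloc(self, nbytes):
        """Hand out the next nbytes; allocating is just an addition."""
        if self.offset + nbytes > len(self.buffer):
            raise MemoryError("region exhausted")
        start = self.offset
        self.offset += nbytes
        return memoryview(self.buffer)[start:start + nbytes]

    def reset(self):
        """Free everything at once, e.g. at the end of a request."""
        self.offset = 0
```

There is no per-object free: individual allocations are never released, and the whole region is recycled in one step.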

1 - http://talloc.samba.org/talloc/doc/html/index.html

jonaphin · on Nov 11, 2014

Congratulations on Raptor, I'll definitely give it a whirl. Regarding static asset serving, I'm fairly certain serving them through the application server is often not the way to go anyway.

resca79 · on Nov 11, 2014

Raptor seems pretty interesting, but personally I don't like its marketing approach, and I'm not the only one.

On Twitter some Ruby heroes say: "Raptor is 4x faster than existing ruby web servers for hello world applications" :)

Strong proclamations in favour of an open source project are a little bit strange if the code has not been released yet.

However, I hope all the graphs on the home page are real, for the happiness of Ruby programmers.

alvare · on Nov 11, 2014

http://www.yesodweb.com/blog/2012/11/warp-posa

covi · on Nov 11, 2014

The section "Hybrid evented/multithreaded: one event loop per thread" suggests that the whole model is basically SEDA [1]. I'm surprised the article does not directly reference the project/paper.

[1] http://www.eecs.harvard.edu/~mdw/papers/seda-sosp01.pdf

jrk · on Nov 11, 2014

The "hybrid" IO architecture is historically known as AMPED (asynchronous multi-process event driven): https://www.usenix.org/legacy/event/usenix99/full_papers/pai...

simonmales · on Nov 11, 2014

Puma says it runs best when using Rubinius or JRuby, but from my limited understanding, not everything can run on those implementations.

Are there any giveaways in the blog that wouldn't allow Raptor to run on Rubinius or JRuby?

steveklabnik · on Nov 11, 2014

That's basically because Puma uses threads, and threads are way better on Rubinius and JRuby, due to lack of a GIL. It's not due to some kind of incompatibility.

riffraff · on Nov 11, 2014

well if part of it is written in C++ it most likely won't run easily on either?

jcampbell1 · on Nov 11, 2014

Does anyone know if the 60.000 in the chart means 60 or 60,000?

djur · on Nov 11, 2014

It's from Phusion[1], which is headquartered in the Netherlands. Non-English Europe conventionally uses . as a digit separator, so 60.000 would be sixty thousand.

[1]:

    % ping rubyraptor.org
    PING rubyraptor.org (97.107.130.55) 56(84) bytes of data.
    64 bytes from shell.phusion.nl (97.107.130.55): icmp_seq=1 ttl=50 time=98.3 ms
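
A small Python sketch of reading a number under that convention, where "." groups digits in threes. `parse_dutch_int` is a hypothetical helper, and it handles whole numbers only.

```python
def parse_dutch_int(text):
    """Parse an integer written with '.' as the thousands separator."""
    head, *rest = text.split(".")
    # The leading group has 1-3 digits when separators are present.
    if not head.isdecimal() or (rest and len(head) > 3):
        raise ValueError(f"not a grouped integer: {text!r}")
    # Every later group must be exactly three digits.
    if any(len(g) != 3 or not g.isdecimal() for g in rest):
        raise ValueError(f"not a grouped integer: {text!r}")
    return int(head + "".join(rest))

requests_per_sec = parse_dutch_int("60.000")
```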

davedx · on Nov 11, 2014

If it's the Passenger guys then 60.000 is how you write 60,000 in the Netherlands ;)

mrinterweb · on Nov 11, 2014

I was thinking the same thing. 60 reqs/sec is not very fast. I suppose it depends on what is being tested. "Hello World" is pretty useless to base a benchmark on, as DB and other I/O is not involved. Who knows what that number means without the source code of what was being benchmarked.

Ono-Sendai · on Nov 11, 2014

Not bad work. Seems somewhat futile though, since the speed will probably be massively reduced by the actual Ruby application code, database accesses, etc.

derwiki · on Nov 11, 2014

If it means you can squeeze more requests/sec out of the free tier of Heroku, I'm all for it!

mrinterweb · on Nov 11, 2014

I am all for a faster Ruby application server. If Raptor can stand behind its claims on November 25th, that will be the best birthday present I could get.

corford · on Nov 11, 2014

Slightly OT but does uwsgi feature much in the ruby world? /from a curious python guy

bradleyland · on Nov 11, 2014

Ruby uses a specification called Rack.

http://rack.github.io

kevinastone · on Nov 11, 2014

uWSGI is an implementation of the Python WSGI spec that also supports Rack: http://uwsgi-docs.readthedocs.org/en/latest/Ruby.html

djur · on Nov 11, 2014

I haven't heard about it myself. Sounds like it could be helpful if you're trying to fit some Ruby services into a mostly Python shop.

According to that page, it doesn't support the most recent Ruby versions. That might not be accurate, though.

unbit · on Nov 11, 2014

Ruby support in uWSGI is really solid, and it is more widely used than you might think (as an example, the biggest Italian Ruby on Rails site runs on it, and you will find a bunch of interesting blog posts about it combined with Ruby). The main issue here is that we never pushed it in the community like we did with Python and Perl. Frankly, I do not know why; maybe it has been a terrible error (taking into account that whoever tried it with Ruby has been really excited).

yxhuvud · on Nov 11, 2014

What does uWSGI support bring to the table?

grandalf · on Nov 11, 2014

I hope better subdomain support

rurounijones · on Nov 11, 2014

I try to keep up with most things Ruby and I have never heard it discussed in a Ruby context.

swrobel · on Nov 11, 2014

Does it support SPDY?

coned88 · on Nov 11, 2014

What are these for, as opposed to just using a web server like Apache or nginx?

SpikeGronim · on Nov 11, 2014

You generally use Unicorn etc. behind something like nginx. The last time I did that I used nginx to handle thousands of concurrent connections that were forwarded to nCPUs Unicorn instances. Nginx is very good at handling lots of connections, Unicorn is good at handling a Rack app.

tryp · on Nov 11, 2014

To be a bit more explicit, using a separate httpd and application server separates the resource-bound task of handling the request and building the response from the network-bound task of dribbling bytes back to the original requester.

Nginx (and the general class of highly concurrent servers) is good at handling lots of connections largely because it tries to minimize the resources (memory, process scheduler time, etc) required to manage each connection as it slowly feeds the result down the wire.

The application server generally wants an instance per CPU so that it can hurry up and crank through a memory-, cpu-, or database-hungry calculation in as few microseconds as possible, hand the resulting data back to the webserver and proceed to put the memory, DB, and CPU to the task of processing the next request.

This is in contrast to the (simplified here) old-school CGI way, in which, say, ancient Apache would receive a request, then fork off a copy of PHP or Perl for each one, letting the app block while writing to the stdio pipe to Apache, and Apache in turn while writing to the requesting socket, all the while maintaining a full OS process for each request in play.

lbotos · on Nov 11, 2014

This is an "application server". When you are doing anything that isn't PHP you basically need a process behind your web server (Apache/Nginx) that runs your actual ruby/python/java application code and speaks HTTP.

girvo · on Nov 11, 2014

Even in PHP you want that for decent performance: php-fpm is similar in concept to a Ruby/Python application server, albeit slightly different due to PHP being a web language first and foremost, and its interesting execution model. Still, you run a process and connect nginx to it :)

jrochkind1 · on Nov 11, 2014

Even for PHP you do, it's just delivered as an apache module. (Passenger can be run as an apache module too, fwiw).

Although to be fair, the PHP model doesn't require a _persistent_ process between requests (I think?). But most other platforms do.

lucaspiller · on Nov 11, 2014

It isn't required for Ruby either, but loading a Rails app is slow so it's better to persist it between requests.

Rapzid · on Nov 11, 2014

Unless you also need high-density multi-tenancy, in which case you probably want to use mpm-itk for security.

est · on Nov 11, 2014

because you need something between your Ruby code and nginx/apache.

kondro · on Nov 11, 2014

Does this mean that the majority of the performance improvements over Puma actually come from the fact that they are using the considerably less battle-tested HTTP parser of PicoHTTPParser over the Mongrel one?

Of course, this may be mitigated by the fact that any reasonable production environment will have a web server layer over the app server/s anyway for load balancing, fail-over and exploit detection/prevention.

adamnemecek · on Nov 11, 2014

They actually aren't using PicoHTTPParser.

kondro · on Nov 11, 2014

Whoops! My reading skills need some work.