

How Ravelry Scales to 10 Million Requests Using Rails - timf
http://highscalability.com/how-ravelry-scales-10-million-requests-using-rails

======
n8agrin
For a post titled "How Ravelry Scales to 10 Million Requests Using Rails" the
only scaling advice they offer is the technical specs of the site, like:

 _Tokyo Cabinet/Tyrant is used instead of memcached in some places for caching
larger objects. Specifically markdown text that has been converted to HTML._

and this one tip:

 _The database is the problem. Nearly all of the scaling/tuning/performance
related work is database related. For example, MySQL schema changes on large
tables are painful if you don’t want any downtime. One of the arguments for
schemaless databases._

Not much "how" in that.

~~~
callmeed
The article linked in this thread had pretty good details:

<http://news.ycombinator.com/item?id=802889>
(<http://www.tbray.org/ongoing/When/200x/2009/09/02/Ravelry>)

------
timf
" _Casey is the sole engineer for Ravelry and to run it takes only a few
people._ "

Looking over the setup, that's quite a lot for one person to do during nights
and weekends in 4 months. Pretty cool.

~~~
ionfish
I suspect that was prior to the initial launch. The architecture has
presumably evolved to the state described in the HS article since then.

Edit: yes, Casey says in the Tim Bray interview that "As soon as we could, we
got alpha testers in to try it out... 4 months later, we had a site that we
were ready to announce."

------
timf
I'd never heard of HopToad before, that looks interesting.

~~~
steveklabnik
+1 from me. I use it on my site, it's pretty awesome.

------
warfangle
Curious:

10 million server requests per day sounds kind of impressive until you
actually do the math: divided by how much physical iron they're using, that's
a little less than 9 requests per second per server.

It makes me wonder: if they were using something other than rails, would they
need that much iron?
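
For anyone who wants to check that figure, here is a rough sketch of the arithmetic. The server count of 13 is an assumption, picked only because it reproduces the "a little less than 9" number; the comment above doesn't say which count was used. The function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(requests_per_day: float) -> float:
    """Average requests per second, assuming a perfectly flat load."""
    return requests_per_day / SECONDS_PER_DAY


def per_server_rate(requests_per_day: float, servers: int) -> float:
    """Average requests per second per server, assuming an even split."""
    return average_rate(requests_per_day) / servers


if __name__ == "__main__":
    ASSUMED_SERVERS = 13  # assumption, not stated in the comment
    print(f"{average_rate(10_000_000):.1f} req/s overall")
    print(f"{per_server_rate(10_000_000, ASSUMED_SERVERS):.1f} req/s per server")
```

Note that this is an average over a full day, so it says nothing about what the servers see at peak.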

~~~
sophacles
I strongly suspect a flaw in your statistics: I'm willing to put money on this
site having a spiky workload, not a constant workload. There are probably
hours in a row that 6 of those servers sit idle.

Rant: I wish technical sites would stop using req/day as a metric. It leads to
the OP's type of analysis. At the very least, such articles could use a format
of "X req/day peaking at Y/s". Maybe if the NYT were writing it, it would be OK to
use req/day, but a site whose tagline is "High Scalability Building bigger,
faster, more reliable websites." should know better.

~~~
toddh
What should I use in a title that is informative and gets people interested
enough to read an article? A munin graph isn't quite as punchy.

~~~
sophacles
Sorry for the late reply. IMO the title was fine; it did its job well. My
rant, etc, was about the stats section in the article itself. It still uses a
flat time model, on the scale of N things/day instead of a more representative
N things/day (X things/(smaller than day time unit) at peak).
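
To make the suggested format concrete, here is a minimal sketch that turns hourly request counts into an "N/day (avg, peak)" line. The hourly numbers are entirely hypothetical, made up to sum to 10 million; they are not Ravelry's real traffic. `summarize` is an illustrative name, not anything from the article.

```python
def summarize(hourly_counts: list[int]) -> str:
    """Format a day's traffic as total/day plus average and peak-hour rates."""
    total = sum(hourly_counts)
    avg_per_sec = total / (len(hourly_counts) * 3600)
    peak_per_sec = max(hourly_counts) / 3600  # busiest hour, averaged over that hour
    return f"{total:,} req/day (avg {avg_per_sec:.0f}/s, peak hour {peak_per_sec:.0f}/s)"


if __name__ == "__main__":
    # Hypothetical spiky day: 8 quiet hours, 8 moderate hours, 8 busy hours.
    hourly = [100_000] * 8 + [400_000] * 8 + [750_000] * 8
    print(summarize(hourly))
```

Even the peak-hour rate is still an average over 3,600 seconds, so per-second or per-minute buckets would show sharper spikes if the logs have them.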

------
jherdman
Wait... they have Nginx out front passing requests to HAProxy and THEN to
Apache + mod_rails? That just seems like a bit much given that mod_rails can
be installed with Nginx straight up. Why would you want a setup like this?

~~~
brett
Having Nginx in front is a lot more flexible than just having HAProxy listen
on port 80. For example, it can serve static files and do redirects, both of
which don't need to pass through the whole load balanced stack.

They could use Nginx -> HAProxy -> Nginx(w/ passenger), but the Apache version
feels slightly more mature (e.g. it has some config options that the Nginx
version lacks) and it's likely they were already using it before the Nginx
version came out.

~~~
zepolen
Why use HAProxy at all?

~~~
brett
It's better at load balancing. For example, it handles app servers that have
gone down more gracefully. Also it generates an awesome stats page that gives
you way more info about what's going on than you can get from Nginx.

~~~
caseyf
Yep! We used nginx's fair balancing module before switching to haproxy. It
also helps me do rolling restarts/hot deployments in a nice way.

It's really a great piece of software. Kudos to Willy.

PS - you're also correct about the nginx->haproxy->apache. nginx makes a
fabulous front end and I just plugged in Apache/Passenger where Mongrel used
to be. I like that 1) I can easily plug in something else in the future and 2)
Passenger on Apache is very stable. Nginx support is newish and I'm running
stripped down Apaches that only do Passenger, so I'm not too fussed about it.

------
gdp
Purely anecdotal, but does anyone else notice while browsing around even the
(largely static) unauthenticated pages that the generation times are a bit
lousy?

------
idleworx
it's always interesting to find out what's behind the effort. great link.

