
How a 1-Engineer Rails Site Scaled to 10 Million Requests Per Day - wgj
http://www.railsinside.com/deployment/338-how-a-1-engineer-rails-site-scaled-to-10-million-requests-per-day.html
======
antonovka
The summary: A surprisingly large investment in complex platform
development and extensive hardware purchases.

Let's be really, really generous and pack all 10 million requests into a
5-hour peak. That's 2 million requests/hour, 33,333 requests/minute, 555
requests/second.
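
That back-of-the-envelope arithmetic, as a quick sketch (integer division, matching the figures above):

```java
public class PeakLoad {
    // 10 million requests squeezed into a generous 5-hour peak.
    static final long PER_HOUR = 10_000_000L / 5;   // 2,000,000 requests/hour
    static final long PER_MINUTE = PER_HOUR / 60;   // 33,333 requests/minute
    static final long PER_SECOND = PER_MINUTE / 60; // 555 requests/second

    public static void main(String[] args) {
        System.out.println(PER_HOUR + "/hour, " + PER_MINUTE + "/minute, " + PER_SECOND + "/second");
    }
}
```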

You can easily handle 5-10k requests a second with a few ms response time --
including dynamic templating and hitting a backend database -- on a 2.66 GHz 4
core Xeon 5150 with a couple gigabytes of RAM, using a servlet runtime and a
JVM-based language.

Basically, you could take that entire solution stack and compress it down to a
few items with far fewer moving parts, and spend half as much on engineering
and capital expenditures.
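
As an illustration of "far fewer moving parts" (a sketch only, not the commenter's actual stack; the `MinimalJvmService` name, `/users` path, and user names are made up), a JVM HTTP endpoint can be stood up with nothing but the JDK's built-in `com.sun.net.httpserver`:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class MinimalJvmService {
    // Stand-in for dynamic templating backed by a database query.
    static String renderUsers() {
        return "<html><body><ul><li>alice</li><li>bob</li></ul></body></html>";
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/users", exchange -> {
            byte[] body = renderUsers().getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start(); // serves http://localhost:8080/users
    }
}
```

One process, no external dependencies; a real deployment would add a servlet container and a database driver, but the count of moving parts stays small.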

My hope is that efforts like MacRuby bring this level of free performance to
the Ruby world, and that languages like Scala and Clojure make the JVM more
attractive, too.

~~~
modoc
You can support 10,000 requests/second hitting a servlet engine, with dynamic
templates, and backend DB calls, on one quad core box? I'd like to give you a
job.

Maybe I've been working with heavyweight frameworks too long, but that seems
an order of magnitude off at least. Any stats on Servlet JVMs that can do 10k
requests/second on a single quad core?

~~~
antonovka
_You can support 10,000 requests/second hitting a servlet engine, with dynamic
templates, and backend DB calls, on one quad core box?_

5-10k, depending; a local instance of a recent webapp I wrote can run 5.9k
requests/sec on a simple JSP-templated page backed by a database request.
It's running on Tomcat, using servlets with a lightweight REST API, and
PostgreSQL.

The webapp I'm working on right now has the following performance profile for
a page that fetches the user list from the backing database. Not as fast, but
2k req/sec is not bad, and I haven't done any profiling on the new stack we're
using yet:

    
    
      Server Software:        Jetty(6.1.x)
      Server Hostname:        localhost
      Server Port:            8080
    
      Document Path:          /users
      Document Length:        5855 bytes
    
      Concurrency Level:      4
      Time taken for tests:   0.508 seconds
      Complete requests:      1000
      Failed requests:        0
      Write errors:           0
      Total transferred:      6045474 bytes
      HTML transferred:       5855000 bytes
      Requests per second:    1968.41 [#/sec] (mean)
      Time per request:       2.032 [ms] (mean)
      Time per request:       0.508 [ms] (mean, across all concurrent requests)
      Transfer rate:          11621.09 [Kbytes/sec] received
    
      Connection Times (ms)
                    min  mean[+/-sd] median   max
      Connect:        0    0   0.1      0       1
      Processing:     1    2   1.7      2      22
      Waiting:        1    2   1.7      2      22
      Total:          1    2   1.7      2      22
    
      Percentage of the requests served within a certain time (ms)
        50%      2
        66%      2
        75%      2
        80%      2
        90%      3
        95%      3
        98%      4
        99%     12
       100%     22 (longest request)

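Figures in that shape come from ApacheBench; the report above corresponds to a run like `ab -n 1000 -c 4 http://localhost:8080/users`. A rough JVM-side equivalent (a sketch, assuming a hypothetical local endpoint) can be put together with `java.net.http.HttpClient`:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Rough equivalent of `ab -n 1000 -c 4`: fire N requests at a fixed
// concurrency and report mean throughput.
public class MiniBench {
    static double requestsPerSecond(int completed, long elapsedNanos) {
        return completed / (elapsedNanos / 1_000_000_000.0);
    }

    public static void main(String[] args) throws Exception {
        int total = 1000, concurrency = 4;
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest req = HttpRequest
                .newBuilder(URI.create("http://localhost:8080/users")).build();
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        CountDownLatch done = new CountDownLatch(total);
        long start = System.nanoTime();
        for (int i = 0; i < total; i++) {
            pool.submit(() -> {
                try {
                    client.send(req, HttpResponse.BodyHandlers.ofString());
                } catch (Exception ignored) {
                    // a failed request still counts toward completion
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();
        System.out.printf("%.2f req/sec%n",
                requestsPerSecond(total, System.nanoTime() - start));
        pool.shutdown();
    }
}
```

1000 completed requests in 0.508 seconds works out to the ~1968 req/sec reported above.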
_I'd like to give you a job._

Something in your tone tells me you're not serious ...

_Maybe I've been working with heavyweight frameworks too long, but that seems
an order of magnitude off at least._

I couldn't say; I do have local 3rd party spring-based webapps that take 5
minutes just to start up, not to mention 300ms+ to render a single page, so I
wouldn't be surprised.

~~~
look_lookatme
Your methodology doesn't reflect actual usage of the site. I suspect hardly
any (if any) writes are happening when you hit /users, the database is
repeatedly serving a hot query (or you are pulling from a cache that isn't
changing because there are no writes happening), your dataset could be
minimal, etc.

~~~
antonovka
That may be partially true[1], but I'd challenge you to re-implement this
non-optimized case (ie, this is a rough webapp, no caching, etc) in another
runtime on similar hardware and see:

1) Whether it can support anywhere near the same level of concurrent requests
with similar response times.

2) How much complexity (nginx + unicorn + memcached + puppies) is required to
achieve this in comparison to a servlet engine (eg, tomcat) and your webapp.

[1] Most web applications are read-heavy, low on writes, and scaling up write
capability generally requires scaling up the database. You can grow quite a
bit with simple caching and monolithic database scaling before having to
tackle more complex distributed data architectures.
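
A minimal sketch of the "simple caching" that footnote describes: a read-through cache in front of the database, so hot reads never leave memory (illustrative only; the `ReadThroughCache` name and loader are made up):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Read-through cache: only cache misses reach the backing database,
// which is why read-heavy workloads scale so far on one box.
public class ReadThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // e.g. a database query

    public ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        // Loads from the database once; subsequent reads hit memory.
        return cache.computeIfAbsent(key, loader);
    }

    public void invalidate(K key) {
        cache.remove(key); // call this on writes
    }
}
```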

~~~
zepolen
If those 1968.41 [#/sec] were on a quad core then python can match it.

------
zaidf
Why is scaling a Rails site to 10M requests per day _that_ big of a deal?
Scaling anything to 10M has its challenges, but at the end of the day you
rarely read about people making a big deal out of scaling a PHP website of
that size.

~~~
icey
It's only noteworthy because of the early reputation Rails had for being slow
(earned or not).

------
Timothee
The real meat about how it was done is here:
[http://highscalability.com/blog/2009/9/22/how-ravelry-scales...](http://highscalability.com/blog/2009/9/22/how-ravelry-scales-to-10-million-requests-using-rails.html)

and that info was taken from this interview:
<http://www.tbray.org/ongoing/When/200x/2009/09/02/Ravelry>

The linked article is just a summary.

~~~
petercooper
Yes, it was mostly a "if you haven't seen this stuff, you need to check it
out" post for my subscribers. I'm not sure why it's become so popular... I
guess a lot of people still hadn't heard of it! (So mission accomplished, in
a way.)

------
callmeed
This is at least the 4th article on Ravelry's scaling accomplishments here on
HN.

~~~
petercooper
Yeah. I made this post and it was really just a "throw away" piece for any of
my subscribers who hadn't already heard about it - I'm as surprised as Obama
is at winning the Nobel ;-)

I guess that even when a story blows up in the techie world, there are enough
people who didn't hear about it that it can be repeated several times and
still do well.

~~~
wgj
I'm the OP of this one, sorry. And I saw that Casey, above, groaned a little
bit at it surfacing again. :)

But seriously, even with all the repostings, there are a lot of us who somehow
hadn't already seen it. Even though the headline, and the story, emphasize the
scaling challenges and solutions, the larger story is how you can bootstrap
something really successful with minimal resources.

Peter, thanks for running a great blog.

