
Scaling Rails to 125,000 Requests per Minute on Heroku - pritambarhate
https://zeemee.engineering/scaling-rails-to-125-000-requests-per-minute-on-heroku-b4128a10a769
======
iLoch
Wow, $5000/mo for 2000 rps, just for the application servers? That's absurd.

I think we're paying around $2000/mo for our app servers, a database which is
over 2TB in size, and we ingest about 10 megabytes of text data per second, on
top of a couple thousand requests per second to the user facing application.

I don't think we'll be turning to Heroku any time soon. Sorry if this seems
like a pissing contest, just giving some basis for comparison.

~~~
ukd1
This sounds like classic engineer talk; it totally ignores the human cost of
setup, maintenance, and ongoing development, which is usually handled by an
(expensive) ops team. It's not $5k vs $2k, it's $5k vs $2k + ops time.

~~~
jacques_chester
If you run Cloud Foundry (CF), you can get the Heroku experience on your own
setup. It works on vSphere, OpenStack, AWS, Azure and GCP is getting there.
More platforms are coming.

I work at Pivotal, we donate the majority of engineering to CF. We also
dogfood it by running Pivotal Web Services (PWS)[0], which is a Heroku
competitor.

One major difference: PWS (and CF) doesn't have fixed instance sizes. You can
specify, to the megabyte, how much RAM you want your app to have. Other
resources are dished out proportionally.

PWS has thousands of customers, thousands of VMs under management by BOSH, I'm
not sure how many apps and services at this point. Last I checked, it's run by
about 8-10 people in two shifts (SF and Beijing).

[0] [http://run.pivotal.io/](http://run.pivotal.io/)

------
liquidise
I'm not sure how I feel about this. On one hand, squeezing that sort of
performance out of Heroku is a notable achievement. Props. On the other
hand... why Heroku at that scale? 10x F-Large servers will cost 5-10x a
comparable E3 configuration, and E3 root access gives you a great deal more
"knobs" for tuning your performance outside of your code.

I get that many teams don't want the overhead of managing their own instances,
but when you find yourself optimizing for a platform that is engineered around
convenience instead of performance, I think it may be time to make the switch.

~~~
joshcrews
One reason for them to scale to 10x F-Large on Heroku is that their app gets
super-slammed (150x more than average traffic) around college application
deadlines. So they can spend most of their time on cheap Heroku and just
temporarily scale to super-expensive Heroku for a short time.

------
noelwelsh
By way of contrast several years ago I benchmarked a real application hitting
1000 requests/s on a $20 per month Linode box. At this point there was no
noticeable increase in response time. I could have probably gone much higher
but that was enough for my needs so I stopped.

My "secret" is to use a language that is not slow-as-molasses. A decade ago
Rails had a definite productivity advantage over the (mostly Java) web
frameworks of the time. That hasn't been true for a while, and there are many
performant alternatives. If you're culturally a Ruby or Python programmer you
can go pick up Go, or Node, or Elixir and feel at home. If you live in JVM or
FP land you've always had fast alternatives (in my case it was Scala).

You pay a one-time cost to learn new things but you reap the benefits forever.

~~~
nickjj
> If you're culturally a Ruby or Python programmer you can go pick up Go /
> Elixir

And then you're 10 years behind the curve when it comes to community support
which is really the only thing that matters when it comes to being productive.

That's 10 fewer years of blog posts, tutorials, and, most importantly,
third-party gems/packages that have been extracted from many years of
real-world use cases.

~~~
matt4077
I think Phoenix itself is proof that it can be a lot faster the second time
around (I'm one of those heretics who believe it's a pretty clever Rails clone
with an added layer of cool innovation).

The libraries seem to be filling out quite quickly – and libraries may be a
long tail problem. The set of 10 to 15 requirements every project has is
basically there.

~~~
asdf1234
> The set of 10 to 15 requirements every project has is basically there.

Even a lot of the really basic stuff like file uploading and image handling
libraries aren't in a good state yet. For example, there's nothing that comes
close to Carrierwave, Paperclip, or Shrine, and the ImageMagick libraries are
nowhere close to their equivalents in Ruby or Python.

~~~
nickjj
Yep, and this is the problem. Initial library support isn't close to enough.

It needs to be extracted from a real project and then carefully groomed and
enhanced based on real problems.

Otherwise you're just sitting there trying to invent something with no use
case. You can't invent a beautiful API or account for edge cases that you
don't even know exist.

This is why Rails will be around and thriving for another 10 years. Until
something so drastic comes along that it fully disrupts everything we know
about web development, I will still reach for Rails.

Right now all of these alternative solutions in other languages aren't
offering enough benefits to even think about switching.

P.S., I'm not even a Rails fanboy. I use other techs too and tried to get on
the bandwagon with Node back in the early days and did the same thing with Go.
It took a long time to realize how much of an utter waste of time it is to
chase the new shiny thing.

------
iampims
I'm always very suspicious of metrics reported in req/minute to make numbers
look bigger than they actually are.

Assuming linear load – which we all know is never the case – that's 2,000
req/s on average, which is nothing to be ashamed of.
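
For what it's worth, the conversion behind that average is just:

```ruby
# 125,000 requests/minute expressed per second
rpm = 125_000
puts (rpm / 60.0).round  # => 2083 req/s on average
```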

~~~
sp527
> Assuming linear load – which we all know is never the case

This is phrased in a confusing way. If we're talking about sustained burst
periods of traffic, then you almost definitely expect a fairly smooth
distribution of requests over the course of a minute. You would only really
start to expect serious variance at the daily granularity (i.e. time of day).

~~~
sulam
Uniform is entirely the wrong thing to expect at a given time of day.
Typically we model request arrivals with exponential inter-arrival times,
i.e. a Poisson distribution of per-interval counts:

[https://en.wikipedia.org/wiki/Poisson_distribution](https://en.wikipedia.org/wiki/Poisson_distribution)
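
A quick pure-Ruby sketch of the point: drop the article's 125,000 arrivals
independently and uniformly at random into the 60 seconds of one minute (the
independence assumption behind a Poisson arrival model) and look at the
per-second spread. The seed is an arbitrary choice.

```ruby
# Independent arrivals scattered over one minute: per-second counts are
# binomially distributed, which at this size is essentially Poisson.
rng    = Random.new(42)   # fixed seed so the run is reproducible
counts = Array.new(60, 0)
125_000.times { counts[rng.rand(60)] += 1 }

mean = counts.sum / 60.0
puts "mean:  #{mean.round} req/s"          # ~2083
puts "range: #{counts.min}..#{counts.max}" # busiest vs quietest second
```

Even under this idealised model the busiest and quietest seconds differ by a
few percent; real traffic is burstier still.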

~~~
sp527
"Sustained burst periods of traffic"

------
petehuit
Hi everyone, OP here (Pete at ZeeMee).

Thanks for all the discussion around this! The point of the article wasn't
"How to get the most performance per $" – I would never recommend Ruby,
Rails, or Heroku for that.

Rather, it was "If you're using Rails on Heroku, this is what it looks like to
scale, and this is how far we could get it to go." For us, it's great! We can
sit nearly idle (cheap) most of the year, and crank it up to "expensive" for
a few days a year when needed.

If you need more/bigger/better scale than that for most of the year, then this
article is still helpful, because now you know which technology not to use.

At ZeeMee we currently have about 2.5 people working on backend/rails stuff,
and right now it makes more sense for us to work on features at this scale,
rather than work on scaling itself. We were happy with how easy it was to
crank up to this scale when we needed without really optimizing anything else.

Of course we'll switch off of Rails+Heroku when the (cost to switch) < (cost
to continue). Keep in mind that the cost to switch includes engineering costs
and opportunity costs (other things we could be working on).

Love the discussion, thanks!

------
hota_mazi
Per minute? The only reason why one would measure requests per minute is when
the number of requests per second is ridiculously low.

And it is... Seriously, 2,000 requests per second is considered fast in the
Rails community?

The JVM achieves several orders of magnitude more than that before the JIT
even kicks in.

~~~
grobaru
Wow, are you seriously comparing MRI with the JVM, which receives billions of
dollars and significant manpower investment and is fundamentally different
from MRI?

~~~
wonnage
Yeah, this is why if you want to serve more than 200rps on a dedicated box you
pick something other than Ruby in 2016.

~~~
technion
This is a t2.micro instance (regardless of the hostname) running Ruby:

[http://imgur.com/a/EGMt4](http://imgur.com/a/EGMt4)

~~~
ehsanu1
Running what, a bare-bones Rack app returning an empty response and doing no
real work? Not all requests are created equal, which is why these RPS
judgements are pretty silly unless we define the exact request.

There are the TechEmpower benchmarks for that, though. And you can't deny Ruby
is among the worst there.

------
sjtgraham
How to scale your Rails app to 2000 reqs/sec? TL;DR Spend $5k on dynos. :S

Rails and Heroku: two things that don't really seem like good value these
days.

~~~
gleenn
The counter argument is that developer time can cost a lot more than that. If
Rails makes me faster, my company might save a lot of money. Obviously how
much it makes me a faster developer versus how much performance I pay depends
on the developer and the specific application.

~~~
sjtgraham
I did Rails for 10 years. This is not even a trade-off anymore. You can write
code in a productive language and have it be much less resource intensive out
of the box, e.g. Elixir.

Your codebase is only as good as your worst developer, and it's quite easy to
write faulty code in Ruby. The claims of any savings from using Ruby are
highly dubious in my experience – I've made a lot of money being called in as
a contractor to unf*ck many a Ruby codebase.

~~~
claudiug
Of course. Every Rails/Ruby thread _must_ have an Elixir reply. Every time.

Sorry for the bashing, but that is the reality.

~~~
technion
I'm surprised we don't see more "use JRuby" replies. Most of these discussions
come from businesses already using Rails, and JRuby brings immediate, major
performance improvements while many codebases "just work". In my case, I just
had to mess with the Gemfile a bit.
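
"Mess with the Gemfile" usually means swapping C-extension gems for their
JDBC/Java counterparts behind platform guards. A minimal sketch, assuming a
Postgres app (the adapter gem named here is the commonly used JRuby
equivalent):

```ruby
# Gemfile: pick the database driver per Ruby implementation
platforms :ruby do
  gem "pg"  # native C extension, MRI only
end

platforms :jruby do
  gem "activerecord-jdbcpostgresql-adapter"  # JDBC-backed, JRuby only
end
```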

------
pritambarhate
There are some interesting points here on how Heroku internals can affect your
app:

> An important outcome of these tests is that the slow-response situation is
> outside of your control unless you’re running single-tenant (performance)
> dynos.

~~~
lossolo
That's how almost every "cloud" works. These are just VMs with a lot of noise
from your neighbors. That's why I always use bare metal. Control over the
whole CPU and its cache lines can be a game changer if you have CPU-intensive
applications.

~~~
Rapzid
Most AWS instances over a certain size have pinned cores, and you'll see very
little steal time unless something is up. That said, something might be up!
I've seen it, and the big players (Netflix) monitor for this and re-provision
accordingly.

~~~
jon-wood
You may have pinned CPU cores, but there's still contention for other
resources such as disk IO and network unless you've got dedicated interfaces
for them as well, and that's just within your own instance - downstream you've
got shared routers, and the various AWS services you're probably depending on
as well.

------
dansingerman
There's a very important setting they have left out of their data for each
test run: WEB_CONCURRENCY. While this will differ per app, it makes a big
difference to performance. They mention they had 67 processes on 10 p-large
dynos, but why 67? Did they do some separate testing to prove that was
optimal?

In my experience it should be tested as an independent variable (to dyno
number and type) to optimise performance.
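
For context, WEB_CONCURRENCY is just the env var that Heroku's recommended
Puma config reads to set the worker (process) count, so it can be tuned
independently of dyno number and type. A typical config/puma.rb along those
lines (the default values here are illustrative):

```ruby
# config/puma.rb — process and thread counts driven by env vars,
# so WEB_CONCURRENCY can be varied per test run without code changes
workers Integer(ENV["WEB_CONCURRENCY"] || 2)

threads_count = Integer(ENV["RAILS_MAX_THREADS"] || 5)
threads threads_count, threads_count

preload_app!
port ENV["PORT"] || 3000
```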

I will, however, 100% echo that it is never worth scaling on shared-tenancy
dynos. In my experience the combination of noisy neighbours and random routing
means you'll very likely end up with worse performance by adding shared-
tenancy dynos.

------
rcarmo
I know I'm going to get flamed for this, but I nearly stopped reading at "per
minute". Heroku is still nice, though, and I perfectly understand why people
stick with it, but contrasting it with the choice of using Ruby is what made
me read the article.

------
diegorbaquero
My 2GB VPS can handle that ~2k rps easily at a tiny fraction of the cost.

~~~
RexM
2k rps with rails?

With all of the talk about how slow Ruby is, that sounds too intensive for a
2GB VPS to handle, but I'm not a Rails dev so I don't know.

~~~
jon-wood
2k rps alone is fairly meaningless without the context of what exactly is
being done. I could build a Rails application that can serve a cached static
template at 2k rps without breaking a sweat, but once you start hitting the
database and doing some processing of that data it gets more difficult.

------
malyk
Once you get to ~25 dynos, the cost of performance-large dynos makes sense,
and you can run a lot of Puma workers within the 14GB memory quota.

------
jordanthoms
This is great. The multi-tenant dyno issues certainly matches our experience -
we would consistently get request timeouts when under load (even if scaled up)
until we moved to the single-tenant dynos. With the random routing, if even
one of your dynos has a busy neighbour you _will_ be timing out on some of
your requests.

------
fizx
Wow, that's like over 5,000,000,000 requests per month! Please use requests
per second like everyone else.

~~~
firloop
Not really justifying the practice but I bet the OP wrote it in those terms
because that's the default unit that New Relic uses.

Bothers me whenever I look at my New Relic dashboard but I haven't bothered to
see if one can change to seconds rather than minutes.

------
markonen
So, noisy neighbors mean that you can't get predictable performance from
shared dynos; simple math indicates that even one lagging dyno out of 20 will
kill your 95th percentile response times.
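
That simple math, sketched with made-up latencies (50ms healthy, 2000ms
lagging): with exactly a 1/20 share of traffic, the slow dyno owns everything
above the 95th percentile, so p95 sits right on the edge and any routing
unevenness tips it over.

```ruby
fast_ms, slow_ms = 50, 2_000                 # hypothetical response times
samples = Array.new(19_000, fast_ms) +       # 19 healthy dynos
          Array.new(1_000, slow_ms)          # 1 lagging dyno, 1/20 of traffic
sorted  = samples.sort

pct = ->(p) { sorted[(sorted.length * p).ceil - 1] }  # nearest-rank percentile
puts pct.call(0.95)  # => 50   (the very last "fast" request)
puts pct.call(0.96)  # => 2000 (everything above p95 is the slow dyno)
```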

The trouble with the dedicated dynos is that you need at least two for high
availability (to not miss requests on dyno crashes / restarts). That puts the
minimum cost per service at $500/mo. Not necessarily an issue with monolithic
apps, but quickly untenable with microservices.

~~~
collyw
I would imagine that Rails apps are generally more monolithic in nature.

------
morekozhambu
Always wanted to ask: how many hits does HN handle, by the way?

Does anyone have an idea what DB HN runs on and how big the schema is?

~~~
pravula
[https://news.ycombinator.com/item?id=11889575](https://news.ycombinator.com/item?id=11889575)

[https://news.ycombinator.com/item?id=9222006](https://news.ycombinator.com/item?id=9222006)

------
cocotino
>POJA (plain old JSON API)

Oh, god.

