Aside from that, I am extremely sympathetic to Heroku's engineering point here --- it's obviously hard for HN to extract the engineering from the drama in this case! Randomized dispatch seems like an eminently sound engineering solution to the request routing problem, and the problems actually implementing it in production seem traceable almost entirely to††† the ways Rails managed to set back scalable web request dispatch by roughly a decade††††.
††† IT IS ALL LOVE WITH ME AND THIS POINT COMING UP HERE...
†††† ...it was probably worth it!
The solution is to combine the two approaches. You split the 100 nodes into 10 groups of 10, you route randomly to one of the groups, and then within a group you route intelligently. This works really well. The probability of one of the request queues filling up is astronomically small, because for a request queue to fill up, all 10 request queues in a group have to fill up simultaneously (and as we know from math, if an event occurs independently with probability p at each place, the chance that it occurs at n places simultaneously is p^n, which is exponentially small in n). Even if you route randomly to 50 groups of 2, that works a lot better than routing randomly to 100 groups of 1 (though obviously not as well as 10 groups of 10). There is a paper about this: http://www.eecs.harvard.edu/~michaelm/postscripts/handbook20...
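To make the grouping idea concrete, here is a rough toy simulation, not a model of Heroku's actual router. Everything in it is an assumption made for illustration: the function name simulate, the discrete-time model, the 80 arrivals per tick, and the 0.9 chance that a busy server finishes a request each tick. A group size of 1 is pure random routing; a group size of 10 is random routing to a group, then shortest-queue routing inside it.

```python
import random

def simulate(num_servers=100, group_size=10, arrivals_per_tick=80,
             finish_prob=0.9, ticks=1000, seed=0):
    """Toy model: returns (peak queue length, mean queue length per server)."""
    rng = random.Random(seed)
    queues = [0] * num_servers          # pending requests per server
    num_groups = num_servers // group_size
    peak = 0
    total = 0
    for _ in range(ticks):
        for _ in range(arrivals_per_tick):
            # Random choice of group, then shortest queue within the group.
            start = rng.randrange(num_groups) * group_size
            members = range(start, start + group_size)
            target = min(members, key=queues.__getitem__)
            queues[target] += 1
        peak = max(peak, max(queues))
        # Each busy server finishes one request with probability finish_prob.
        for i in range(num_servers):
            if queues[i] and rng.random() < finish_prob:
                queues[i] -= 1
        total += sum(queues)
    return peak, total / (ticks * num_servers)

if __name__ == "__main__":
    for size in (1, 2, 10):
        peak, mean = simulate(group_size=size)
        print(f"group size {size:2d}: peak queue {peak}, mean queue {mean:.2f}")
```

Under these assumptions the per-server load is identical in every run; only the routing changes, so any difference in queue length comes from the small amount of intelligence inside each group.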
This is essentially what they are suggesting: run multiple concurrent processes on one dyno. Then the requests are routed randomly to a dyno, but within a dyno the requests are routed intelligently to the concurrent processes running on that dyno. There are two problems with this: (1) dynos have ridiculously low memory, so you may not be able to run many (if any) concurrent processes on a single dyno, and (2) if you have contention for a shared resource on a dyno (e.g. the hard disk), you're back to the old situation. They are partially addressing point (1) by providing dynos with 2x the memory of a normal dyno, which given a Rails app's memory requirements is still very low (you probably have to look hard to find a dedicated server that doesn't have at least 20x as much memory).
They could be providing intelligent routing within groups of dynos (say groups of 10) and random routing to each group, but apparently they have decided that this is not worth the effort. Another thing is that apparently their routing is centralized for all their customers. Rapgenius did have what, 150 requests per second? Surely that can even be handled by a single intelligent router if they had a dedicated router per customer that's above a certain size (of course you still have to go to the groups of dynos model once a single customer grows beyond the size that a single intelligent router can handle).
There's a tradeoff between:
* a well-engineered request handler (a solved problem more than a decade ago) and
* an efficient development environment (arguably a nearly-unsolved problem before the Rails era)
And I feel like mostly the Heroku drama is a result of Rails developers not grokking which end of that tradeoff they've selected.
Of course Heroku is under no obligation to do anything, but its customers have to justify its cost and low performance relative to a dedicated server. Most applications run just fine on a single dedicated server, or at most a couple, which means you don't have routing problems at all, whereas to get reasonable throughput on Heroku you have to get many Dynos, plus a database server. A database server with 64GB RAM costs $6400 per month. You can get a dedicated server with that much RAM for $100 per month. Heroku is supposed to be worth that premium because it is convenient to deploy on and scale. But these routing problems may require a lot of engineering effort in your application (e.g. making it use less memory so that you can run many concurrent request handlers on a single Dyno), so it's not even clear that Heroku is more convenient.
I'm not sure there are such providers, and if there aren't, I think it's safe to point the finger towards Rails.
As a system for efficiently handling database-backed web requests, Rails is archaic. Not just because of its memory use requirements! It is simultaneously difficult to thread and difficult to run as asynchronous deferrable state machines.
These are problems that Schmidt and the ACE team wrote textbooks about more than 10 years ago.
(Again, Rails has a lot of compensating virtues; I like Rails.)
> I'm not sure there are such providers, and if there aren't, I think it's safe to point the finger towards Rails.
This is not sound logic. I described above two methods for solving the problem: (1) increase the memory per Dyno (see below: they're doing this, going from 512MB to 1GB per Dyno IIRC, which although still low will be a great improvement if that means that your app can now run 2 concurrent processes per Dyno instead of 1), or (2) do intelligent routing for small groups of Dynos. Do you understand the problem with random routing, and why either of these two would solve it? If not you might find the paper I linked to previously very interesting:
"To motivate this survey, we begin with a simple problem that demonstrates a powerful fundamental idea. Suppose that n balls are thrown into n bins, with each ball choosing a bin independently and uniformly at random. Then the maximum load, or the largest number of balls in any bin, is approximately log n / log log n with high probability. Now suppose instead that the balls are placed sequentially, and each ball is placed in the least loaded of d >= 2 bins chosen independently and uniformly at random. Azar, Broder, Karlin, and Upfal showed that in this case, the maximum load is log log n / log d + Θ(1) with high probability [ABKU99].
The important implication of this result is that even a small amount of choice can lead to drastically different results in load balancing. Indeed, having just two random choices (i.e., d = 2) yields a large reduction in the maximum load over having one choice, while each additional choice beyond two decreases the maximum load by just a constant factor."
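The quoted result is easy to check empirically. Below is a minimal sketch of the balls-and-bins experiment as the quote describes it; the function name max_load and the choice of n = 100,000 are mine, not the paper's. With d = 1 each ball picks one bin at random; with d >= 2 it goes to the least loaded of d bins chosen independently and uniformly at random.

```python
import random

def max_load(n, d, seed=0):
    """Throw n balls into n bins; each ball goes to the least loaded of
    d bins sampled independently and uniformly at random."""
    rng = random.Random(seed)
    bins = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(d)]
        best = min(candidates, key=bins.__getitem__)
        bins[best] += 1
    return max(bins)

if __name__ == "__main__":
    n = 100_000
    for d in (1, 2, 3):
        print(f"d = {d}: maximum load {max_load(n, d)}")
```

On a typical run the drop from d = 1 to d = 2 is large, while going from d = 2 to d = 3 changes the maximum load only slightly, which is the point the quote makes.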
Most things are inferior to other substitutable things! :)
And here we've re-invented the airport passport checking queue - everybody hops onto the end of a big long single queue, then near the front you get to choose the shortest of the dozen or two individual counter queues.
I wonder what the hybrid intelligent/random queue analogues of the in-queue intelligence gathering and decision making you can do at the airport might be? "Hmmm, a family with small children, I'll avoid their counter queue even if it's shortest", "a group of experienced-looking business travellers, they'll probably blow through the paperwork quickly, I'll queue behind them". I wonder if it's possible/profitable to characterize requests in the queue in those kinds of ways?
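One possible reading of that idea, purely as a sketch: instead of joining the queue with the fewest requests, join the one with the least estimated work. The function pick_queue, the request labels, and the cost numbers are all hypothetical; a real router would need some way to estimate per-request cost, which this thread does not establish.

```python
def pick_queue(queues, estimate):
    """Return the index of the queue with the least estimated pending work.

    queues: list of lists of pending requests.
    estimate: function mapping a request to its guessed cost.
    """
    return min(range(len(queues)),
               key=lambda i: sum(estimate(r) for r in queues[i]))

# Hypothetical cost guesses: a "report" is the family with small children.
COSTS = {"page": 1, "report": 10}

if __name__ == "__main__":
    queues = [["report"], ["page", "page"]]
    shortest = min(range(len(queues)), key=lambda i: len(queues[i]))
    cheapest = pick_queue(queues, COSTS.__getitem__)
    print("shortest queue:", shortest, "least estimated work:", cheapest)
```

Here the shortest queue holds one slow request, so the estimate-based choice prefers the longer queue of two cheap ones.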