I'd like to start by acknowledging that I'm one of the "non-customers who are watching from the sidelines". I think Adam's right that this is an important distinction.
Adam, there's something that confuses me about this. I'm no expert in routing theory, nor have I done the experiments, so forgive me if my reasoning misses something.
I understand why RapGenius took you up on your original promises of "intelligent routing", and I think I understand what you're saying about scaling, and how scaling "intelligent routing" is so far unsolvable, and the motivation for your transition from Bamboo to Cedar, especially in the context of concurrent clients. What I don't understand is this:
It seems to me that if you split into two (or more) tiers, and random-load-balance in the front tier (hit first by the customer), and then at the second tier only send requests to unloaded clients, that you eliminate RapGenius's problem for customers who followed your specific recommendations for good performance on Bamboo (to go single-threaded and trust the router).
Do you have reason to believe that this doesn't one-shot RapGenius's problem? Do you have strategic/architectural reasons for rejecting this even though it would work? Did you try it and it failed? What's the story there?
Maybe I'll write a simulator to (dis)prove my naive theory. :P
> It seems to me that if you split into two (or more) tiers, and random-load-balance in the front tier (hit first by the customer), and then at the second tier only send requests to unloaded clients [...]
I'm unclear how you'd think introducing a second tier changes things. That tier would need to track dyno availability and then you're right back to the same distributed state problem.
Perhaps you mean if the second tier was smaller, or even a single node? In that case, yes, we did try a variation of that. It had some benefits but also some downsides, one being that the extra network hop added latency overhead. We're continuing to explore this and variations of it, but so far we have no evidence that it would provide a major short-term benefit for RG or anyone else.
> Do you have reason to believe that this doesn't one-shot RapGenius's problem?
As a rule of thumb, I find it's best to avoid one-shots (or "specials"). It's appealing in the short term, but in the medium and long term it creates huge technical debt and almost always results in an upset customer. Products made for, and used by, many people have a level of polish and reliability that will never be matched by one-offs.
So if we're going to invest a bunch of energy into trying to solve one (or a handful) of customer's problems, a better investment is to get those customers onto the most recent product, and using all the best practices (e.g. concurrent backend, CDN, asset compilation at build time). That's a more sustainable and long-term solution.
Sorry, yes, I'm supposing that the second tier serves fewer dynos; sufficiently few that your solutions from 2009 (that motivated you to advertise intelligent routing in the first place) are still usable.
> As a rule of thumb, I find it's best to avoid one-shots (or "specials").
Absolutely, and I would never suggest that. However, it's not just RG that has this problem, right? If I understand correctly, isn't it every single customer who believed your advertising and followed your suggested strategy to use single-threaded Rails, and doesn't want to switch?
So it's not about short or medium term; it's about letting customers take the latency hit (as you note), in order to get the scaling properties that they already paid for.