Keep in mind that with more Unicorn workers per box you need fewer dynos overall, so it doesn't necessarily mean paying double in the long run. On 1x dynos we could only run 2 workers, but on 2x we can squeeze in five, so we use half the dynos we did before.
Same here. We were struggling to fit 2 unicorn workers on a 1x dyno but can easily get 4 on a 2x. We're paying the same amount for twice the concurrency. The 2x dynos with unicorn have pretty much solved the queuing issues for us.
Not so fast, though. This might just be a quick fix: high-traffic sites might still see the same results, since the routing algorithm is still random. Maybe the guy who wrote the routing simulation can re-run his tests to see what kind of improvement 2X dynos actually make.
Yeah, I had the same thought when the discussion came around last time. I actually took the source to the simulations (thanks for sharing, rapgenius!) and did some experiments on how random routing performs if it's routing to backends that can handle various #s of concurrent requests.
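For anyone curious, here's a minimal sketch of that kind of experiment (my own toy model, not rapgenius's actual simulation code): requests arrive randomly, get routed to a uniformly random dyno, and each dyno is a FIFO queue served by some number of concurrent worker slots. Service times, arrival rate, and the 5% "slow request" mix are all assumptions I picked for illustration.

```python
import random

def simulate(num_dynos, slots_per_dyno, num_requests,
             p_slow=0.05, fast=1.0, slow=20.0, seed=42):
    """Randomly route requests to dynos; each dyno is a FIFO queue
    served by `slots_per_dyno` concurrent workers.  Returns the
    fraction of requests that had to wait in a per-dyno queue."""
    rng = random.Random(seed)
    mean_service = (1 - p_slow) * fast + p_slow * slow
    # Arrival rate chosen for ~70% utilization of total capacity.
    lam = 0.7 * num_dynos * slots_per_dyno / mean_service
    # free_at[d][s] = time at which slot s of dyno d next becomes free
    free_at = [[0.0] * slots_per_dyno for _ in range(num_dynos)]
    t, waited = 0.0, 0
    for _ in range(num_requests):
        t += rng.expovariate(lam)               # Poisson arrivals
        d = rng.randrange(num_dynos)            # random routing
        slot = min(range(slots_per_dyno), key=lambda s: free_at[d][s])
        service = slow if rng.random() < p_slow else fast
        if free_at[d][slot] > t:
            waited += 1                         # all slots busy: queued
        free_at[d][slot] = max(t, free_at[d][slot]) + service
    return waited / num_requests

# Same total concurrency (40 workers), sliced two ways:
few_big = simulate(num_dynos=10, slots_per_dyno=4, num_requests=50_000)
many_small = simulate(num_dynos=20, slots_per_dyno=2, num_requests=50_000)
```

In runs like this, fewer dynos with more slots each queue noticeably less than many 2-slot dynos at the same total concurrency, which is the intuition behind the 2X results above.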
> and let the operating system do all the complex scheduling.
So Heroku could spawn workers with a $FD environment variable instead of $PORT, and the "complex scheduling" done by the OS _is_ the second routing layer.
But really, they could still do second-level routing even outside a single OS: with bigger dynos the number of backends to distribute over is much smaller, so having the routing mesh be aware of worker availability seems feasible again.
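What "aware of worker availability" could mean is just least-connections dispatch. Here's a toy sketch (all names illustrative, nothing to do with Heroku's actual mesh): the router tracks in-flight requests per dyno and always picks the least busy one instead of a random one.

```python
class Router:
    """Toy routing mesh that tracks in-flight requests per dyno and
    dispatches to the least-busy one (least-connections routing)."""

    def __init__(self, num_dynos):
        self.in_flight = [0] * num_dynos

    def dispatch(self):
        # Pick the dyno with the fewest requests in flight.
        d = min(range(len(self.in_flight)), key=self.in_flight.__getitem__)
        self.in_flight[d] += 1
        return d

    def finished(self, d):
        self.in_flight[d] -= 1

router = Router(4)
picks = [router.dispatch() for _ in range(4)]  # spreads across all 4 dynos
router.finished(2)                             # dyno 2 frees up...
next_pick = router.dispatch()                  # ...and gets the next request
```

The catch, and presumably why Heroku dropped this, is keeping those counters consistent across a distributed fleet of routers; with far fewer, fatter dynos that bookkeeping gets cheaper.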
It should save you even more than you describe, since you shouldn't need to over-provision as much. The odds of two 95th-percentile requests stalling your 1x dyno are pretty good; the odds of getting 5 such requests at once are much lower.
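The back-of-envelope version of that argument, assuming slow requests land independently and a dyno only stalls when every worker slot is occupied by one:

```python
# 5% of requests are 95th-percentile slow; a dyno stalls when all of
# its concurrent worker slots hold one (independence assumed).
p_slow = 0.05
p_stall_2_slots = p_slow ** 2   # 1x dyno, 2 workers: 0.0025 (1 in 400)
p_stall_5_slots = p_slow ** 5   # 2x dyno, 5 workers: ~3.1e-7 (1 in ~3.2M)
```

Independence is generous (slow requests often cluster), but the orders-of-magnitude gap is the point: wider dynos need far less headroom for tail requests.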