My understanding is that ELBs are HAProxy, and they may be set to use the leastconn algorithm (a global request queue that is friendly to concurrent backends). However, once you get any amount of traffic they start to scale out the nodes in the ELB, which produces essentially the same results as the degradation of the Bamboo router that we've documented.
The difference, of course, is that ELBs are single-tenant. So a big app might only end up with half a dozen nodes, instead of the much larger number in Heroku's router fleet.
Offering some kind of single-tenant router is one possibility we've considered. Partitioning the router fleet, homing... all are ideas we've experimented with and continue to explore. If one of these produces conclusive evidence that it provides a better product for our customers and in keeping with the Heroku approach to product design, obviously we'll do it.
I hope you'll be able to share your findings with us, even if they're negative. As someone who has no stake in Heroku, i have the luxury of finding this problem simply interesting!
My hypothesis is that tenant-specific intelligent load balancers would be plausible; i would guess that you would never need more than a handful of HAProxy or nginx-type balancers to front even a large application. Your main challenge would then be routing requests to the right load balancer cluster. If you had your own hardware, LVS could handle that (i believe that Wikipedia in 2010 ran all page text requests through a single LVS in each datacentre), but i'm not sure what you do on EC2.
However, "hypothesis" is just a fancy way of saying "guess", which is why your findings from actual experiments would be so interesting.