re (2): The router could just interpret a specific response code, or an early close of the socket, as a directive to try another dyno. (I think the router already treats a failure to connect to the listening socket this way.) It would then be up to the client code to send such a refusal when appropriate. That doesn't require any new Heroku component to decide when to send rejections... only router support for respecting them when received.
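
To make the idea concrete, here is a minimal sketch in Python of the retry loop such a router might run. Everything in it is an assumption for illustration: the `REFUSE` status value, the `route` function, and the stand-in dynos modelled as plain callables are hypothetical, not anything Heroku's router actually exposes.

```python
REFUSE = 503  # hypothetical "try another dyno" status, chosen only for this sketch


def route(request, dynos, max_attempts=3):
    """Try dynos in order, moving on when one refuses or hangs up early."""
    for dyno in dynos[:max_attempts]:
        try:
            status, body = dyno(request)
        except ConnectionError:
            # Failed connect or early close of the socket: try the next dyno.
            continue
        if status == REFUSE:
            # Explicit refusal sent by the app itself: try the next dyno.
            continue
        return status, body
    return 502, "no dyno accepted the request"


# Stand-in dynos for the sketch: each is a callable taking the request.
def busy_dyno(request):
    return REFUSE, ""


def dead_dyno(request):
    raise ConnectionResetError("socket closed early")


def idle_dyno(request):
    return 200, "hello " + request


result = route("x", [busy_dyno, dead_dyno, idle_dyno])
```

The cap on attempts matters: without it, a fleet where every dyno refuses would keep the request bouncing around instead of failing fast.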
A simple solution might be to just have the request routers connect in parallel to multiple dynos, and use the first one that connects successfully. You'll increase overall load (which can be mitigated to an extent by waiting briefly before opening each additional socket), but a single slow dyno won't hold you up either.
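
A rough sketch of that staggered approach, using Python's `asyncio`. The `connect_first` helper, the `stagger` parameter, and the simulated dynos are all made up for illustration, and the attempts are zero-argument coroutine functions standing in for real socket connects.

```python
import asyncio


async def connect_first(attempts, stagger=0.05):
    """Start attempts `stagger` seconds apart and return the first success.

    A failed attempt triggers the next one immediately. Attempts still
    running when a winner is found are cancelled.
    """
    remaining = list(attempts)
    pending = set()
    last_error = None
    try:
        while remaining or pending:
            if remaining:
                pending.add(asyncio.create_task(remaining.pop(0)()))
            # Only time out while there are more dynos left to start.
            timeout = stagger if remaining else None
            done, pending = await asyncio.wait(
                pending, timeout=timeout, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                if task.exception() is None:
                    return task.result()
                last_error = task.exception()
        if last_error is not None:
            raise last_error
        raise ConnectionError("no dynos to try")
    finally:
        for task in pending:
            task.cancel()


# Simulated dynos: sleeping stands in for the time taken to accept a connection.
async def slow_dyno():
    await asyncio.sleep(1.0)
    return "slow"


async def fast_dyno():
    await asyncio.sleep(0.01)
    return "fast"


async def refusing_dyno():
    raise ConnectionRefusedError("dyno not listening")


winner = asyncio.run(connect_first([slow_dyno, fast_dyno]))
```

With a stagger well above the typical connect time, most requests only ever open one socket, so the extra load shows up mainly when a dyno is actually slow.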