How is this different from using Layer 4 routing/load-balancing? If you are already scaling to multiple backends, then you probably have access to more than one or two public-facing IPs if you have to ensure everything inbound is on port 80. Just use subdomains with different DNS A records. You listen on port 80 and then route to your backend application farm. The load balancers already have various means of monitoring health on the backend, and you can already scale to 100 Gb/s if you have the budget.
Check out LVS if you want open source. HAProxy can handle this in a limited fashion. And if you have already gone big, you can probably afford a Brocade ServerIron, Citrix NetScaler, or F5 BIG-IP.
That stuff is 1:1. I'm shooting for N:N. For example, that chat demo takes an individual request from a browser, passes it to a backend, and then that backend can reply to N other browsers connected to the Mongrel2 server. It's all async, and would even work with HTTP.
That's damn near impossible with all of the tech you listed.
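To make the 1:1 versus N:N distinction concrete, here is a minimal in-memory sketch of the chat case: one inbound request produces one backend reply that is addressed to several browser connections. It is not Mongrel2 code. The names FrontEnd, chat_backend, and the connection ids are hypothetical, and plain method calls stand in for the messaging layer.

```python
from dataclasses import dataclass, field


@dataclass
class FrontEnd:
    """Toy stand-in for the front-end server (hypothetical, not Mongrel2's API)."""
    # Maps each open browser connection id to the messages delivered to it.
    outboxes: dict = field(default_factory=dict)

    def connect(self, conn_id):
        self.outboxes[conn_id] = []

    def deliver(self, targets, body):
        # One backend reply is copied to every target connection that is still open.
        for conn_id in targets:
            if conn_id in self.outboxes:
                self.outboxes[conn_id].append(body)


def chat_backend(sender_id, text, connected_ids):
    """Hypothetical handler: one request in, one reply addressed to N connections."""
    targets = [c for c in connected_ids if c != sender_id]
    return targets, f"{sender_id}: {text}"


if __name__ == "__main__":
    server = FrontEnd()
    for cid in ("alice", "bob", "carol"):
        server.connect(cid)
    targets, body = chat_backend("alice", "hello", list(server.outboxes))
    server.deliver(targets, body)
    print(server.outboxes)
```

A Layer 4 balancer pairs each inbound connection with exactly one backend connection, so it has no place for the list of targets that chat_backend returns here.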
Not quite. I wasn't talking about how the backend communicates. I'm a big fan of zeromq too, but there are many types of messaging, and messaging is a must once you have more than a few servers. As I see others have mentioned, you're going to have problems with parsing speed at some point. What happens when you need to scale out to handle all the requests on the front end?
How is HA going to be handled? Will it be easy enough to run the Mongrel2 server Active/Active? How do you get away from a single point of failure?
First off, speed is not scalability. I think we need to start talking about "scalability" in terms of "speed" / "cost". I can get billions of req/sec if I have billions of dollars. What scalability should mean is getting as much performance and stability as you can for the least amount of money.
That said, the plan is that since all backends and frontends can talk using arbitrary types of 0MQ sockets, and 0MQ allows for any kind of network layout, you can make it as "no single point of failure" as you have money to burn on. In my tests so far it's incredibly trivial to have 10 Mongrel2 servers submitting requests using a subscribed UUID, and then have 20 backends servicing those requests and replying as needed. Since both the frontends and backends are just using basic messaging with subscribed queues, it's fairly easy to get them to talk without any one of them being the point of failure.
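As a rough illustration of that layout, here is an in-memory sketch of several frontends and several backends where every reply is tagged with the originating frontend's UUID and each frontend keeps only the replies matching its own subscription. It does not use 0MQ at all: the round-robin iterator is an assumption standing in for request distribution, the loop over all frontends stands in for a publish, and Frontend, backend, and run are hypothetical names.

```python
import itertools
import uuid


class Frontend:
    """Toy frontend: sends requests tagged with its UUID, filters replies by it."""

    def __init__(self):
        self.uuid = str(uuid.uuid4())  # identity this frontend "subscribes" to
        self.received = []

    def request(self, conn_id, body):
        return {"sender": self.uuid, "conn": conn_id, "body": body}

    def on_publish(self, reply):
        # Subscription filter: drop replies addressed to some other frontend.
        if reply["sender"] == self.uuid:
            self.received.append((reply["conn"], reply["body"]))


def backend(name, req):
    """Any backend can service any request; the reply keeps the sender UUID."""
    return {"sender": req["sender"], "conn": req["conn"],
            "body": f"{name} handled {req['body']}"}


def run(frontends, backend_names, requests_per_frontend):
    workers = itertools.cycle(backend_names)  # assumed round-robin distribution
    for fe in frontends:
        for i in range(requests_per_frontend):
            reply = backend(next(workers), fe.request(i, f"req-{i}"))
            for subscriber in frontends:  # "publish" to all; each one filters
                subscriber.on_publish(reply)


if __name__ == "__main__":
    fes = [Frontend() for _ in range(10)]
    run(fes, [f"backend-{n}" for n in range(20)], 5)
    print([len(fe.received) for fe in fes])
```

Removing any one frontend or backend from the lists leaves the rest working, which is the property being claimed. The real failure behavior would depend on the socket types and topology actually chosen.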
Of course, this comes with a configuration penalty, which I still have to figure out. Either the backends have to know where all the frontends are (probably easiest) or the frontends have to know about the backends. I still haven't got that worked out, but again, it's fairly trivial to create a "registration" service where newly minted backends/frontends announce themselves.
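One way such a "registration" service could look, purely as a sketch: peers announce a role and an address, and anyone can look up the current peers for a role. The Registry class, its method names, and the addresses are all hypothetical, and a real version would sit behind a network socket and need some form of expiry for dead peers.

```python
class Registry:
    """Hypothetical in-memory registration service for frontends and backends."""

    def __init__(self):
        # role -> {peer name -> address}
        self.peers = {"frontend": {}, "backend": {}}

    def announce(self, role, name, address):
        # A newly started peer registers itself; re-announcing updates the address.
        self.peers[role][name] = address

    def withdraw(self, role, name):
        # Removing an unknown peer is a no-op.
        self.peers[role].pop(name, None)

    def lookup(self, role):
        # Sorted so callers see a stable order.
        return sorted(self.peers[role].values())


if __name__ == "__main__":
    reg = Registry()
    reg.announce("frontend", "fe-1", "tcp://10.0.0.1:9997")  # made-up addresses
    reg.announce("frontend", "fe-2", "tcp://10.0.0.2:9997")
    reg.announce("backend", "be-1", "tcp://10.0.1.1:9996")
    print(reg.lookup("frontend"))
```

With this shape, a new backend would call lookup("frontend") at startup and connect to each address, which matches the "backends know where the frontends are" option.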
Finally, all of these problems have already been solved. The easiest answer is I'll just figure out how other people have done it and copy them. It's not like this is new territory or anything. The question itself is kind of stupid because you're saying, for some reason, that 0MQ precludes me from using any existing best-practice architecture, when really they're orthogonal and I'm not just using 0MQ.