

Large-scale HTTP and websocket routing with Hipache - shykes
http://blog.dotcloud.com/under-the-hood-dotcloud-http-routing-layer

======
darkarmani
I didn't see a lot about websocket routing.

I'm in a situation where I want to route secure websockets to lots of backend
VMs. In fact, I want each user to get a fresh VM running a websockets server.

Can I reverse-proxy the secure websocket, terminating the SSL, and dynamically
forward it along to various backends (VMs)? I would like to multiplex the
secure websockets without putting the SSL cert inside each VM.

~~~
themgt
Yeah, node-http-proxy supports this out of the box. You terminate the SSL at
the proxy and then forward the request to whatever backends you want.
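
For illustration, a tiny sketch of the dynamic-forwarding half of that (the
vhost table, hostnames and backend URLs below are all made up; in a real
setup node-http-proxy would terminate the TLS and handle the websocket
upgrade, calling into a lookup like this one):

```javascript
// vhost -> pool of plain-websocket backends (one fresh VM per user,
// as in the scenario above). Hypothetical addresses.
var backends = {
  'alice.example.com': ['ws://10.0.0.11:8080'],
  'bob.example.com':   ['ws://10.0.0.12:8080']
};

// Naive per-vhost round-robin cursor.
var cursors = {};

// Pick the next backend for a given Host header, or null if unknown.
function pickBackend(host) {
  var pool = backends[host];
  if (!pool || pool.length === 0) return null;
  var i = (cursors[host] || 0) % pool.length;
  cursors[host] = i + 1;
  return pool[i];
}

// With node-http-proxy this would plug in roughly as:
//   server.on('upgrade', function (req, socket, head) {
//     proxy.ws(req, socket, head, { target: pickBackend(req.headers.host) });
//   });
console.log(pickBackend('alice.example.com'));
```

Because the table is just data, backends can be added or removed at runtime
without touching any proxy configuration.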

~~~
darkarmani
And you can dynamically add backends based on URLs or subdomains?

That sounds great if it is true. Do you know how well it scales?

~~~
shykes
Yes, that's what Hipache does. From the project homepage [1]: "Hipache is a
distributed proxy designed to route high volumes of http and websocket traffic
to unusually large numbers of virtual hosts, _in a highly dynamic topology
where backends are added and removed several times per second_."

It also addresses your question on scaling: "It currently serves production
traffic for tens of thousands of applications hosted on dotCloud. Hipache is
based on the node-http-proxy library."

[1] <http://github.com/dotcloud/hipache>
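
To make that concrete, here's a standalone sketch of Hipache's Redis-backed
vhost table, simulated with a plain object so it runs by itself. As I read
the README, each `frontend:<host>` key is a Redis list whose first element is
an identifier and whose remaining elements are backend URLs, so registering a
backend is a single RPUSH with no config reload (the host and addresses below
are made up):

```javascript
// In-memory stand-in for the Redis keyspace.
var redis = {
  'frontend:www.example.com': ['mywebsite', 'http://192.168.0.42:80']
};

// RPUSH frontend:<host> <backend-url>  -- register a backend instantly.
function addBackend(host, url) {
  var key = 'frontend:' + host;
  if (!redis[key]) redis[key] = [host]; // first element is the identifier
  redis[key].push(url);
}

// LRANGE frontend:<host> 1 -1  -- the backends consulted on each request.
function getBackends(host) {
  var entry = redis['frontend:' + host];
  return entry ? entry.slice(1) : [];
}

addBackend('www.example.com', 'http://192.168.0.43:80');
console.log(getBackends('www.example.com'));
```

Since every proxy worker reads the same Redis keys, a topology change made by
one writer is visible to the whole cluster immediately.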

------
jaytaylor
It's amusing to see that they've reinvented the "active health checks" wheel
for their system, because HAProxy [1] does this out of the box.

Side-note/rant: I am always surprised that HAProxy doesn't get more attention
on Hacker News, because it is an incredibly powerful tool. IMO, it is
definitely one of the biggest technical unsung heroes I'm aware of.

[1] <http://haproxy.1wt.eu>

~~~
shykes
(dotCloud employee here)

We are very much aware of HAProxy, love it, and would have been very happy to
use it. For the majority of use cases we would probably recommend it over
Hipache, simply because it's more proven.

However, our particular use case involves a) a very large number of vhosts,
and b) a highly dynamic topology where backends are added and removed very
frequently. In this setup, instrumenting a cluster of HAProxy (or nginx) boxes
basically means re-generating and reloading multi-megabyte configuration files
across dozens of machines, several times per second. It's what we did before
Hipache, and it was painful. Compared to that, implementing active health
checks was a small price to pay.
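
A minimal sketch of that active-health-check idea (a hypothetical shape, not
Hipache's actual code): a prober flips each backend's alive flag based on the
result of a periodic HTTP probe, and the router only ever picks from the live
set, so a dead VM stops receiving traffic without any config reload:

```javascript
// Hypothetical backend pool.
var backends = [
  { url: 'http://10.0.0.11:8080', alive: true },
  { url: 'http://10.0.0.12:8080', alive: true }
];

// Called by a timer with the result of a probe (e.g. GET /ping).
function markHealth(backend, probeSucceeded) {
  backend.alive = probeSucceeded;
}

// The router load-balances only across live backends.
function liveBackends(list) {
  return list.filter(function (b) { return b.alive; });
}

markHealth(backends[1], false); // probe to the second VM failed
console.log(liveBackends(backends).map(function (b) { return b.url; }));
```

The same state can live in Redis instead of a local array, which is how a
shared table stays consistent across many proxy workers.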

~~~
jaytaylor
Thank you for clarifying.

------
druiid
Interesting article. I'm wondering: do all apps route through a single
collection of load balancers, or are they per-app? One way to cut down on
config turnover while still using a stock nginx config would be to give each
app its own LB cluster, so that changes (when actually needed) would be
minimal. You'd incur additional memory/CPU overhead for each, I suppose,
which is a trade-off, and perhaps there's something else I'm missing about
the environment that would keep this from working.

~~~
shykes
There's both. We use a primary cluster which all apps share by default, and
for customers who need more customization or resource isolation, we offer
dedicated instances.

We don't deploy per-app instances for all our applications because at our
scale - dotCloud has deployed hundreds of thousands of vhosts - the overhead
would be huge compared to a single large cluster, and having to deploy and
instrument so many nginx instances would make our system much more complex.

The majority of web apps are actually better off on a shared cluster: they
get better reliability, better support, better peak performance and a lower
price, in exchange for less customization and fewer guarantees of worst-case
performance.

------
ARothfusz
If you've got any questions about the tech, we'll be happy to answer them
here.

~~~
beck5
Does this mean Hipache is likely to be dead in a couple of years?

~~~
shykes
What are you referring to by "this"?

~~~
beck5
What looks to be the next step, which is nginx- and Lua-based.

~~~
ARothfusz
Pretty cool, huh? :-) You can follow (and contribute to!) that development in
the open source project, just like you can with the Node version of hipache.

~~~
shykes
I would agree with ARothfusz: think of it as a hint at a possible future
version of Hipache. It might use nginx/Lua as a runtime instead of Node.js,
but the Redis-based architecture will be the same and there will be an
upgrade path. It will still be Hipache.

