

Ask HN: Zero-downtime deployment of multi-container app - jlu

Could someone shed some light on this? Thanks!
======
brianwawok
Do you want specific tools, or what? Are you using Docker or something else? A
30-second summary of your architecture would help a lot.

Zero downtime is usually tricky and will greatly depend on your architecture.
I can think of two overall paths to success, but there are more (and these
may or may not work for your use case).

1) Prod A / Prod B flip. Your current code runs in prod; we will call it Prod A.
Bring up an entire copy of your stack with the new code version as Prod B.
Once it is up and stable, you switch traffic over to it. Once all traffic is
switched, you kill the old servers. The tricky part is state: do you need to
worry about state loss? If you keep a single shared database that both Prod A
and Prod B hit, you can do this without too much trouble. Make sure you keep no
state on your individual app servers (session cache etc).
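
A minimal sketch of the flip in option 1, assuming a load balancer that sends
all new traffic to a single "active" stack. The `Stack` and `Router` classes,
the `in_flight` counter, and the health flag are hypothetical stand-ins for
illustration, not any real tool's API:

```python
from dataclasses import dataclass


@dataclass
class Stack:
    """One full copy of the app servers (hypothetical model)."""
    name: str
    version: str
    healthy: bool = False
    in_flight: int = 0  # requests currently being served

    def health_check(self) -> bool:
        return self.healthy


@dataclass
class Router:
    """Stands in for the load balancer; new traffic goes to `active`."""
    active: Stack

    def flip(self, candidate: Stack) -> Stack:
        # Refuse to switch until the new stack reports healthy.
        if not candidate.health_check():
            raise RuntimeError(
                f"{candidate.name} is not healthy; keeping {self.active.name}"
            )
        old, self.active = self.active, candidate
        return old  # the caller drains this stack, then kills it


def can_retire(stack: Stack) -> bool:
    # Only kill the old stack once its in-flight requests have finished.
    return stack.in_flight == 0


prod_a = Stack("prod-a", "v1", healthy=True, in_flight=3)
prod_b = Stack("prod-b", "v2", healthy=True)
router = Router(active=prod_a)

old = router.flip(prod_b)      # traffic now goes to prod-b
print(router.active.name)      # prod-b
print(can_retire(old))         # False: prod-a still has 3 requests in flight
old.in_flight = 0              # pretend the old requests have drained
print(can_retire(old))         # True
```

Because both stacks are assumed to share one database and hold no local
session state, the flip itself is just a pointer swap; rolling back is the
same swap in the other direction, as long as Prod A has not been killed yet.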

2) Slow roll. Say you have 10 web servers that are behind a load balancer. You
take 1 down - upgrade it - then add it back. Repeat until all 10 are upgraded.
The trick here is what happens if a user hits code version A, then B, then
back to A. If it doesn't matter, easy. If it matters, you may need to lock
clients to machines at the load balancer, so that no one who has seen the new
B will ever switch to an old server still on A.
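
A rough sketch of option 2, assuming the load balancer can remember which
clients have already been served by the new version. The server names, the
`seen_new` set, and both functions are made up for illustration; a real load
balancer would do this with its own stickiness features:

```python
import random
from typing import Dict, Set

# name -> code version currently running on that server
servers: Dict[str, str] = {f"web{i}": "A" for i in range(1, 11)}
in_rotation: Set[str] = set(servers)
seen_new: Set[str] = set()  # clients that have already been served by B


def upgrade_one(name: str) -> None:
    in_rotation.discard(name)  # take it out of the load balancer
    servers[name] = "B"        # stand-in for the real upgrade step
    in_rotation.add(name)      # add it back


def pick_server(client: str) -> str:
    candidates = sorted(in_rotation)
    if client in seen_new:
        # Never send this client back to a server still on A.
        candidates = [s for s in candidates if servers[s] == "B"]
    choice = random.choice(candidates)
    if servers[choice] == "B":
        seen_new.add(client)
    return choice


upgrade_one("web1")
for _ in range(5):
    server = pick_server("alice")
    print("alice ->", server, "running", servers[server])
```

Once a client lands on a B server it stays on B servers for the rest of the
rollout, so the A, B, A sequence cannot happen. The cost is uneven load early
on, when few servers are on B.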

~~~
jlu
Thanks brianwawok, I'm currently experimenting with #1, switching between two
batches of Docker containers with a bunch of scripts, but I'm curious whether
there are more robust approaches to this. What are people using professionally
in the real world?

~~~
brianwawok
The real world is a huge range. In my experience, something like 75% of the
world does:

1) Push out new code

2) Shut down all prod servers at once

3) Restart them

Maybe during a weekly maintenance window, maybe at 2am, maybe at noon,
depending on the company and clients. The basic assumption is "Meh, people
will reload if the page is down for a few minutes."

Even though 0 downtime is the "right" way to do stuff, it seems like the ops
maturity at many places is not that high.

The fact that you have scripts and are making some kind of attempt for less
downtime puts you in the top 25% of the internet.

------
theod
Docker's native CNM (Container Network Model) provides an out-of-the-box
solution to support this use-case. Please refer to
[https://github.com/docker/libnetwork/blob/master/docs/design...](https://github.com/docker/libnetwork/blob/master/docs/design.md)
for more info on Docker CNM. This eliminates the need for any custom scripts
to achieve the swappable use-case that you have in mind.

The CNM is realized using the newly introduced experimental networking
features, which include: Network & Service UI, pluggable service discovery,
and native multi-host cross-container connectivity. More information on trying
out these experimental features:
[https://github.com/docker/docker/blob/master/experimental/ne...](https://github.com/docker/docker/blob/master/experimental/networking.md)
[https://github.com/docker/docker/blob/master/experimental/RE...](https://github.com/docker/docker/blob/master/experimental/README.md)

A service (aka endpoint) owns the networking configs (such as IP address, MAC
address, etc.) and the container that backs the service can be swapped while
retaining the same networking and service configs. Hence swapping a container
from an older to a newer version of the app server is just a matter of
detaching a service from the older container and attaching the same service to
the newer container. Also, please note that a container can belong to multiple
networks, and each container can publish different services in different
networks.
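
To make the detach/attach idea concrete, here is a toy in-memory model. It is
not Docker's or libnetwork's API; the `Service` class, its methods, and the
addresses are hypothetical and only illustrate that the network identity stays
with the service while the backing container changes:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Service:
    """Owns the network identity; the backing container is swappable."""
    name: str
    ip_address: str
    mac_address: str
    container: Optional[str] = None  # id of the container currently attached

    def detach(self) -> Optional[str]:
        old, self.container = self.container, None
        return old

    def attach(self, container_id: str) -> None:
        if self.container is not None:
            raise RuntimeError("detach the current container first")
        self.container = container_id


# Hypothetical values, for illustration only.
app = Service("app", "10.0.0.5", "02:42:0a:00:00:05", container="app-v1")
app.detach()           # the old app server no longer backs the service
app.attach("app-v2")   # the new app server takes over the same identity
print(app.ip_address, app.container)  # 10.0.0.5 app-v2
```

Anything that talks to the service by its address keeps using the same
address before and after the swap.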

With this simple and composable CNM design, your use-case can be mapped to
the CNM model. A quick diagram explaining the concept:
[https://docs.google.com/drawings/d/1LvD94UwfinQelpEqT9BaRYmi...](https://docs.google.com/drawings/d/1LvD94UwfinQelpEqT9BaRYmiKMcRpRxgX1CIpS-2oYI)

We can add more detailed documentation for this specific use-case. Please join
us in
[https://github.com/docker/libnetwork](https://github.com/docker/libnetwork)
and IRC@freenode #docker-network channel to discuss this in more detail.

~~~
jlu
Thanks @theod, will definitely check that out!

------
grhmc
Do it the same way you would do it without a container system.

Load balancers, service discovery, IP swapping, etc. all help accomplish this.

------
jlu
Thanks for the feedback; more details here.

In a multi-container app based on docker with cross-container communication,
as structured below:

load balancer → app servers * n → db * 2

How does one deploy new code to the app servers without any downtime?

I'm looking for any docker-based solutions, toolkits or simply tips and
tricks, thanks!

