

Zero Downtime with HAProxy - weitzj
https://medium.com/@Drew_Stokes/actual-zero-downtime-with-haproxy-18318578fde6

======
zenlikethat
I think this is why some techniques for doing rolling deploys set up a health
check for the node which is configurable by the deploy process. That way,
otherwise healthy nodes report their status as "down" while still accepting
and finishing requests, but then get taken out of the HAProxy rotation without
having to re-load the configuration. The new code is rolled out on that node,
and the toggle is set to report "healthy" again, so HAProxy brings the node
back into rotation.
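
A minimal sketch of that pattern in HAProxy config (backend name, addresses, and timings are illustrative, not from the comment): with "option httpchk" plus "http-check disable-on-404", a health endpoint that flips to returning 404 puts the server into soft-stop, so it finishes in-flight requests but receives no new ones, without any HAProxy reload.

```
backend app
    option httpchk GET /health
    http-check disable-on-404
    server web1 10.0.0.1:8080 check inter 2s fall 3 rise 2
    server web2 10.0.0.2:8080 check inter 2s fall 3 rise 2
```

The deploy process then only has to change what /health returns (for example by touching a flag file the app checks) before and after rolling out code on each node.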

~~~
nulltype
This is what I've done in the past, not sure why we need to "hack TCP" to get
zero downtime.

Additionally, I do rolling deploys where I create an entirely new VM and add
it before removing the old one. That means I don't have to recycle a node
that might have some state on it.

~~~
tangled
The article is about restarting HAProxy without any downtime. HAProxy restarts
are needed when adding new service instances or adjusting configuration
options. This is a different and much harder problem than gracefully
restarting load balanced service instances.

~~~
jsprogrammer
Why shouldn't HAProxy have an API for updating its configuration while
running?

~~~
vidarh
It should, and it does let you make some changes over a socket, but supporting
complete updates with the current HAProxy architecture is a lot of work and
isn't there yet.

------
trump
Relevant blog post: [http://engineeringblog.yelp.com/2015/04/true-zero-
downtime-h...](http://engineeringblog.yelp.com/2015/04/true-zero-downtime-
haproxy-reloads.html)

------
nodesocket
The downtime "gap" between step 1 and step 2 is probably way less than 500ms,
right? If being down for 500ms is unacceptable, you should probably have
multiple HAProxy instances running and just do a rolling deploy. Seems like a
bit of over-optimization.

~~~
strebler
Exactly, with these "high uptime" requirements, isn't it usual to have 2+
HAProxy instances running on different servers bound to the same IP? At least
that's how I do it. When you update the config, first update the backup
HAProxy, then update the main one. Any requests that take place during the
main proxy restart would go to the backup.

~~~
vbezhenar
Could you please describe how I can bind 2 different servers to the same IP
with automatic request balancing (so when one server is down, the other one
will serve all requests)? I understand how to do it with another server
running HAProxy, but your solution seems to be without HAProxy.

~~~
strebler
Apologies for the delay. We do use HAProxy.

To have 2 servers with the same IP, we use VRRP (Virtual Router Redundancy
Protocol) and keepalived. HAProxy is set up on 2 separate servers/instances,
and using VRRP/keepalived they both share an IP address (which HAProxy binds
to). The servers also have their own unique IP address(es) (on top of the
shared one), so the shared address doesn't really "belong" to any machine.

If one server goes down, VRRP gives the IP to the other server and that
HAProxy takes over.
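
A minimal keepalived sketch of that setup (interface name, priorities, and addresses are illustrative):

```
# /etc/keepalived/keepalived.conf on the primary HAProxy box
vrrp_instance VI_1 {
    state MASTER              # the standby box uses state BACKUP
    interface eth0
    virtual_router_id 51
    priority 101              # standby gets a lower priority, e.g. 100
    virtual_ipaddress {
        192.0.2.10            # the shared IP that HAProxy binds to
    }
}
```

If the MASTER stops sending VRRP advertisements, the BACKUP claims 192.0.2.10 and its HAProxy starts receiving the traffic.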

------
WestCoastJustin
Played around with rolling deployments a little using Ansible and tagged
GitHub releases. Here's a screencast with some pics for anyone who's interested @
[https://sysadmincasts.com/episodes/47-zero-downtime-
deployme...](https://sysadmincasts.com/episodes/47-zero-downtime-deployments-
with-ansible-part-4-4)

The workflow is to mark the node down for maintenance via socat, upgrade it,
then mark it back online.

    
    
      # mark node off-line
      echo "disable server episode46/web1" | socat stdio /var/lib/haproxy/stats
    
      # update app code
      git pull tagged release version
    
      # mark node on-line
      echo "enable server episode46/web1" | socat stdio /var/lib/haproxy/stats

~~~
vidarh
That's the easy part - the tricky part is when you need to reload the haproxy
config to do any of the many things you can't do over the socket.

~~~
brianwawok
If you have a few HAProxy nodes with DNS load balancing, wouldn't the client
automatically move on to another HAProxy? There's a tiny lag, but it
shouldn't be an error.

~~~
vidarh
That might have been the case if you could expect clients to be well behaved,
but you can't. In my experience, hardly any network clients handle that kind
of thing properly.

------
andreyf
This way is even better, as it queues up the SYN packets until HAProxy is
ready, leading to faster responses than waiting ~1s for the client to retry
with another SYN: [http://engineeringblog.yelp.com/2015/04/true-zero-
downtime-h...](http://engineeringblog.yelp.com/2015/04/true-zero-downtime-
haproxy-reloads.html)

~~~
vidarh
It's also a lot more complex. It mainly makes a difference if you're reloading
haproxy configs frequently, which most of us don't do.

~~~
jolynch
I've been a little confused by the claims of how complex the SYN delay
solution is. Is it actually all that more complex?

The original blog post wraps a restart command in two iptables invocations
(and relies on a hacky sleep interval which may or may not work sometimes).
The SYN delay method wraps a restart command in two tc invocations. The
concepts are more or less identical in complexity as one is telling the kernel
"drop SYNs now please" and the other is saying "delay SYNs for a bit please".

All the complexity in the qdisc solution is in the one-time setup of the
queuing disciplines. I think the largest drawback of the delaying-SYN solution
is not complexity but that getting it to work with external load balancers is
trickier than with internal load balancers. Honestly,
you're right that if an org doesn't have to restart HAProxy a ton, then it
doesn't make a lot of sense to invest in solving this problem; although if it
were me I'd just make sure I was on the latest Linux kernel so that the period
during which HAProxy can cause RSTs is as small as possible and not bother
with either the iptables or tc solutions.
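
A sketch of the two wrappings for comparison. The first follows the original article's iptables approach; the second follows the Yelp post's plug-qdisc approach and assumes the one-time qdisc setup described there (the port, paths, and qdisc handles are illustrative, and these need root):

```
# Approach 1: drop inbound SYNs; clients retransmit ~1s later, hiding the
# window in which the old haproxy has closed its listening socket
iptables -I INPUT -p tcp --dport 80 --syn -j DROP
sleep 1
service haproxy reload
iptables -D INPUT -p tcp --dport 80 --syn -j DROP

# Approach 2: buffer SYNs in a plug qdisc, reload, then release them;
# queued SYNs are delivered immediately instead of waiting for a retransmit
nl-qdisc-add --dev=lo --parent=1:4 --id=40: --update plug --buffer
service haproxy reload
nl-qdisc-add --dev=lo --parent=1:4 --id=40: --update plug --release-indefinite
```

The restart command is wrapped in two invocations either way; only what you ask the kernel to do with the SYNs differs.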

------
kul_
Does nginx have the same problem of "small time window" downtime? My guess
was that reloads are handled more gracefully by having a single master
process bound to the http/s ports and child processes that contact the
upstream servers. That way there is no "small time window" while reloading.

~~~
jolynch
No, nginx should not have this problem, as it uses fd passing to gracefully
hand off connections.

HAProxy only has this issue on Linux because Linux's SO_REUSEPORT
implementation unfortunately introduces a race condition between accept and
close. While I haven't personally tested it, HAProxy on one of the BSDs should
not have this small window of downtime.

------
ericfrederich
I was, and am, of the opinion that any solution involving sleep isn't a real
solution.

~~~
kentt
Why is that?

~~~
thraxil
Without being dogmatic about it, I feel similarly about fixing things by
inserting sleeps.

At a high level, you often see programmers sprinkle sleeps into their code to
"fix" race conditions or deadlocks. That doesn't really fix the problem; it
just moves it around, and it's usually done because they don't know how to
reason about the underlying problem and fix it properly.

You need to sleep long enough that whatever you're waiting for will definitely
have finished. Most of the time you have no exact guarantee of that, so you
have to pick some N that is relatively large. Inevitably, no matter what N you
pick, sooner or later, the thing you're waiting for will take N + 1 and things
break. To make it worse, the N + 1 situation often happens because you're
getting an unusually large amount of traffic or because something else in the
system is already in a failure state. So the breakage tends to come at the
worst possible time and exacerbate things.

Meanwhile, if you sleep for N ms somewhere, one thing you can guarantee is
that whatever you're doing will take at least N ms. There's no way to make it
faster, even if it may have been unnecessary to wait that long. Often not a
big deal, but the more developers sprinkle sleeps into their code, the more
often you run into bizarre performance bottlenecks as a result.

Network timeouts and similar are kind of a fact of life, so there's no perfect
solution. But if you find yourself trying to solve a problem by sleeping for
some arbitrary period of time, a little alarm should go off in your head
telling you that there's probably a better solution.

~~~
danieltillett
Sometimes sleep is the only viable solution when dealing with an external
system. I have such a problem with one of my programs that has to access a
hardware device. The problem is the load the hardware can handle varies. To
get maximum throughput I have to pound the hardware as fast as possible and
when it fails sleep for a random but increasing time. If anyone knows of a
more elegant solution please post.

~~~
thraxil
Yeah, that's why I'm not dogmatic about it. With network timeouts, external
systems that you can't control, and low level hardware access, sometimes it's
the best you can do. The better solution would be for the hardware/system you
are interacting with to publish an event or otherwise signal when it is or
isn't able to handle more load. If it wasn't designed with back pressure in
mind though, you do the best you can, and in your case, exponential backoff is
probably it.

Adding jitter to avoid dogpiling is another case where sleeping is perfectly
reasonable.

What I get wary about is the common pattern of: Make a call to some external
service. Sleep for some amount of time (to "let it finish"). Then continue
under the assumption that it has completed.
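
For the hardware case above, the retry-with-backoff-and-jitter pattern might be sketched like this (the helper name and delay constants are illustrative, not from the thread; bash):

```shell
# retry_with_backoff CMD [ARGS...]: run CMD until it succeeds, sleeping an
# exponentially growing, jittered interval between failed attempts.
retry_with_backoff() {
    local max_attempts=5 attempt=1 backoff jitter_ms
    while ! "$@"; do
        (( attempt >= max_attempts )) && return 1        # give up
        backoff=$(( 1 << (attempt - 1) ))                # 1, 2, 4, ... seconds
        jitter_ms=$(( RANDOM % 1000 ))                   # spread out retries
        sleep "${backoff}.$(printf '%03d' "$jitter_ms")" # e.g. sleep 2.374
        (( attempt++ ))
    done
}
```

This matches the behaviour described: hammer the device at full speed, and on each consecutive failure back off for a longer, randomized interval.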

~~~
danieltillett
I think we are in massive agreement here. Sleep can be used as a crutch
inappropriately, but when you have a broken leg a crutch is exactly what you
need.

~~~
ericfrederich
Agree that we're in agreement ;-)

------
greglindahl
Not only is this from 2014 and not labeled as such, but the shortened title is
linkbait. No system gets 100% uptime. This article is about eliminating one
small source of downtime for HAProxy systems. Nice, but not exactly a
revolution.

------
logician76
Why does HAProxy need a reload on a config change? Can't it load the new
config, do a diff, and apply only the difference? Or are there alternative
approaches, like a web frontend for HAProxy that makes changes to the config
piecemeal without restarting?

~~~
bbrazil
There are a few small things you can do over the unix socket, but in general
you need to restart HAProxy to change its config.

------
weitzj
Also interesting: [http://inside.unbounce.com/product-dev/haproxy-
reloads/](http://inside.unbounce.com/product-dev/haproxy-reloads/)

------
thejosh
[2014], posted here before.

