
HAProxy vs. Nginx for load balancing (2016) - deathtrader666
https://thehftguy.com/2016/10/03/haproxy-vs-nginx-why-you-should-never-use-nginx-for-load-balancing/
======
moreentropy
I'm pretty much in love with Nginx's simplicity and capabilities as a Swiss
Army knife for all kinds of HTTP magic.

I had some doubts about Nginx's direction and feature development, but most of
the really great features (like stream proxy with SNI support) make their way
into the open source release. Built-in monitoring sucks, though. There are
options for better monitoring of your requests with Prometheus.

This project implements prometheus metrics using Lua scripting inside Nginx:

[https://github.com/knyar/nginx-lua-prometheus](https://github.com/knyar/nginx-lua-prometheus)

This project is intended as a sidecar service to nginx and receives access
logs and timings via the syslog protocol (Nginx has native syslog-over-UDP
capability), so no scripting inside Nginx is necessary:

[https://github.com/markuslindenberg/nginx_request_exporter](https://github.com/markuslindenberg/nginx_request_exporter)

(I'm the author of this exporter)
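
As a rough sketch of that syslog approach (the listen address, port, and
key=value format here are assumptions -- the exporter's README documents the
exact format it expects):

```
# In the http{} block: define a key=value log format for per-request
# timings, then ship access logs over UDP to the exporter's syslog port.
log_format exporter 'time=$request_time status=$status upstream_time=$upstream_response_time';

server {
    listen 80;
    access_log syslog:server=127.0.0.1:9514 exporter;
}
```

The exporter then turns those key=value pairs into Prometheus metrics without
any Lua running inside Nginx itself.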

~~~
gshulegaard
Just curious but have you tried Amplify?

Disclaimer: I am on the Amplify team at NGINX.

~~~
dominotw
>NGINX Amplify is a SaaS application, currently hosted in AWS.

Is there a self hosted version?

~~~
gshulegaard
At the moment we do not have an on-prem or self-hosted solution.

Currently, we are working on the NGINX Controller, which will have an embedded
Amplify within it to drive its telemetry needs (and which will also be
available on its own):

[https://www.nginx.com/products/nginx-controller/](https://www.nginx.com/products/nginx-controller/)

If you are interested in an on-prem version, I would love it if you could give
it a try. If it's something you think you would find useful in an on-prem
scenario, then please leave us feedback in Intercom!

The more people that ask for it the more likely our product owners will
prioritize it :)

------
manigandham
Better options now:

 _Traefik_ : reverse proxy built in Go with dynamic backends and modern
integrations, good replacement for most situations.
[https://traefik.io/](https://traefik.io/)

 _Envoy_ : fast C++ L3/L7 proxy with some great features, http/2 support both
ways, websockets, grpc, advanced health checking and load balancing, lots of
metrics and integrations, my default recommendation now.
[https://envoyproxy.github.io/](https://envoyproxy.github.io/), plus a good
comparison page:
[https://envoyproxy.github.io/envoy/intro/comparison.html](https://envoyproxy.github.io/envoy/intro/comparison.html)

~~~
devmunchies
I can't take Traefik seriously with that silly gopher on their page. Seems as
if they are limiting it to the Go community with that move.

~~~
takeda
It just screams: "we wrote it in Go because we wanted to write something in
Go."

You're supposed to pick the right tools for a project, not a project for the
tools.

~~~
mdellabitta
To be fair, though, they're not the only project coming up in this space, and
Go is a legit choice for systems programming.

------
trjordan
I'm glad this has 2016 in the title! It's now 2017, and we have better
options.

Nginx has been the best routing and traffic-serving system for a while now,
but it's showing its age: HTTP/2 support is difficult, there's no clear path
to gRPC, and (as mentioned in TFA) there is no way to get metrics out of it.

Envoy (envoyproxy.github.io) seems like the next thing. Nginx won because of
the predominant patterns at the time, and it's still totally fine for serving
lots of HTTP traffic. The problem is that if you're running microservices,
rely on metrics to do your job, or have elastic infrastructure, Envoy's
feature set is the only real way to handle traffic routing in that world.

Disclaimer: I work for turbinelabs.io, and we're moving from nginx to Envoy to
take advantage of a lot of these features to power our product. We've put a
lot of effort into that decision, and believe it's the right one.

~~~
nailer
How can I manually switch an environment over from green to blue? I'm reading
[https://envoyproxy.github.io/envoy/operations/admin.html](https://envoyproxy.github.io/envoy/operations/admin.html)
but I can't see it.

~~~
trjordan
You don't call Envoy to make changes. You either update the config file, or
you implement an API server with a specific set of endpoints, then have it
serve the updated values back. It's a bit tricky, but entirely doable.
[https://envoyproxy.github.io/envoy/intro/arch_overview/dynamic_configuration.html](https://envoyproxy.github.io/envoy/intro/arch_overview/dynamic_configuration.html)

(You can do this in nginx, too, and it's similarly annoying. This is a big
part of what we do at Turbine Labs -- a UI with a slider between blue and
green, and a place to host all those stats, so you don't have to manage it!)
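
As a hypothetical sketch of the config-driven approach, a blue/green split is
expressed as a route with weighted clusters -- the cluster names and weights
below are made up; see Envoy's route configuration docs for the full schema:

```
{
  "routes": [
    {
      "prefix": "/",
      "weighted_clusters": {
        "clusters": [
          { "name": "blue",  "weight": 90 },
          { "name": "green", "weight": 10 }
        ]
      }
    }
  ]
}
```

Shifting traffic then means serving updated weights back to Envoy (weights
must sum to 100), rather than calling Envoy directly.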

~~~
nailer
I'd use Houston in a heartbeat over nginx (currently flying blind, but I need
it for HTTP/2), but it's hard to compete with free. It's hard to tell what
your licensing/support model is like without signing up. Have you considered a
Red Hat-like support model?

------
jimjag
Could I also recommend Apache httpd? Dynamic config, failover, stats, RFC
compliance, etc.:
[https://httpd.apache.org/docs/2.4/howto/reverse_proxy.html](https://httpd.apache.org/docs/2.4/howto/reverse_proxy.html)

------
c0l0
I must admit I'm not that huge of a fan of nginx myself, either. The "easy to
read, easy to reason about, declarative configuration" premise breaks down
pretty hard when your application setup gets moderately complex. There's just
too much obscure magic woven into innocent-looking configuration directives,
and as a result more than half of the world gets things wrong (by (ab)using
"if" and the like). On top of that, as the article states, introspection into
what is going on is laughable in the FOSS release.

Still, we use nginx to terminate https at work, and it's doing a very fine job
at that - much better than pound, which we used before. If it weren't for
Varnish directly behind it, however, it would very often be supremely
difficult to debug what's actually going on between our HTTP backends and UAs
when the need arises.

~~~
bazzargh
I hit this problem too - we had >100k lines of nginx config. My answer was to
write a tool to simulate what happened to a request, logging the decisions
made at each step. It was limited to the config I'd programmed it to
understand, but it did most of what we needed (rewrite rules, access control,
proxy_pass, ...). I also intended it to be used for regression testing - you
could set up a YAML file with requests and specify, e.g., "return this proxied
response if the proxied request headers match X".

Other efforts I'd seen at automated testing of nginx config relied on testing
in a VM, but of necessity used different listen addresses etc. They looked
complex to set up and weren't even testing the config as shipped...that's what
I was aiming for.

Since then we've moved away from that config, and I'm in a different
role...keep meaning to see if we can open source that code. However, nginx
being such a moving target, with lua config and such, I was never sure if it
would find an audience.

~~~
patmcguire
Docker with docker compose is good for this. Nginx can listen on ports 80 and
443, other containers on the same network can reach it on those ports. I have
a setup which takes env vars, so it's not quite the same as shipped, but I can
test the whole process of getting a cert from LetsEncrypt, for example.

~~~
bazzargh
(late reply but...) we did actually look at that; someone had a blog post
about using docker for testing nginx when we did this (~3yr ago). You hit a
number of problems. Our nginx listened on multiple addresses and ports, which
weren't equivalent - external https/http arrived on one address, internal
traffic from across our clusters arrived on another, with different handling.
So we would have had to have docker listen on multiple addresses too, or run
multiple instances. Then there were the outgoing requests, proxied to various
haproxy and varnish instances, themselves on various ports. You can get around
this by messing with dnsmasq and routing on your docker host... or avoid that
by reorganizing all of the outgoing config so that it talks to a stub server
with canned test responses. But we had thousands of lines of handwritten nginx
config, as well as tens of thousands of lines of generated config... this was
going to be a major effort.

And even with all that effort, it didn't give us enough information. You'd try
a request on some config and scratch your head to figure out why it didn't end
up being processed in the right place. Run it in the simulator though, and
you'd find the exact line that had rewritten your request so it didn't reach
the line of config you thought was going to kick in...

Probably the _best_ approach would have been to patch nginx itself to add the
logging we wanted, and have it handle all of those external addresses
differently. The challenge there would have been getting the same depth of
traceability with nginx doing things like converting rewrite rules to
bytecode. But, you could probably get 80% of the way there and avoid bugs from
our system not accurately re-implementing their algorithms.

------
SkyRocknRoll
We moved away from nginx to Envoy
[https://envoyproxy.github.io/](https://envoyproxy.github.io/). With Envoy you
get the best of both worlds, plus hot reload. Envoy excels in metrics as well
as remote configuration.

~~~
pknopf
The website isn't clear - is Windows supported?

~~~
Sevii
I don't think even macOS is supported. Linux only for compilation.

~~~
trjordan
We at Turbine Labs have been contributing to Envoy recently, and it now builds
on macOS!

[https://github.com/envoyproxy/envoy/issues/128](https://github.com/envoyproxy/envoy/issues/128)

------
a012
This is just a comparison of the metrics/monitoring capabilities of HAProxy
and nginx. Nothing related to load balancing capability.

~~~
user5994461
Yes, they mostly have the same features.

The difference is that HAProxy has a monitoring page to see what is configured
and where connections are going. Nginx is a black box unless you pay them ~
$2000 for the pro edition.

Monitoring is feature #1 of a load balancer.

~~~
vacri
I would have thought that effective load balancing would be feature #1 of a
load balancer...

~~~
fredsted
Considering "it works" as a feature sets the bar really low.

~~~
blowski
Given two load balancers: one can handle 20K requests per second and has good
monitoring; the other can handle 500K requests per second but has bad
monitoring. Which is the better load balancer?

Note that I'm not implying these figures are true of either HAProxy or nginx,
just pointing out that the OP is probably correct that the number one feature
of a load balancer is its ability to load balance. Good monitoring makes it
easier to use, though.

~~~
StreamBright
You have two airplanes, one can carry 100 passengers and you can see where you
are flying; the other can carry 1000 but you are flying blind.

Monitoring is essential to any infrastructure, regardless of its scale.

~~~
vacri
> the other can carry 1000 but you are flying blind.

This is a bad analogy, because 'flying blind' in a plane means you can't use
it effectively. Being able to see out of a plane isn't a 'bonus'; the plane
_can't_ be used without this feature. Whereas plenty of people _can_ and _do_
use a load balancer without heavy monitoring. (Not to mention that there are
strong moves _towards_ self-driving transportation at the moment.)

The analogy also doesn't work because a load balancer that can handle 1000
units is going to have far fewer problems in need of diagnosis than one whose
limit is 100 units, when the traffic level is between 100 and 1000. When the
traffic is at 100, then yes, you're going to want monitoring on the 100-unit
LB, because you're trying to squeeze extra performance out of it, whereas the
1000-unit one isn't even sweating at that point.

~~~
user5994461
A status page is by no means "heavy monitoring".

------
rndmio
While the OP is about metrics: when it comes to load balancing, IPVS from the
LVS project
([http://www.linuxvirtualserver.org/software/ipvs.html](http://www.linuxvirtualserver.org/software/ipvs.html))
is often overlooked. If you're trying to balance a lot of connections/traffic
(and aren't/don't want to buy a hardware load balancer) you'll get much better
performance with IPVS than with HAProxy or nginx.

~~~
linsomniac
I've used IPVS a lot in the past and it worked well in a lot of ways. We had a
number of problems with ldirectord getting wedged after months of uptime, and
the tools are all a little primitive, but IPVS itself works quite well. The
amount of traffic it can handle, especially in the "direct routing"
configuration, is stunning.

But, I can't imagine going back to it from haproxy. Haproxy allows so many
other things including nice status page, agent and service checks, SSL
termination, great monitoring (we use Influxdb/telegraf/grafana and Icinga2),
and really advanced routing of requests (SNI, path, pretty much anything in
the requests).
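
To illustrate that last point, request routing in haproxy is a few ACL lines -
a minimal sketch, where the hostnames, paths, and backend names are made up:

```
frontend https-in
    bind :443 ssl crt /etc/ssl/example.pem
    # route on the Host header and URL path after TLS termination
    acl host_api hdr(host) -i api.example.com
    acl path_img path_beg /images
    use_backend be_api    if host_api
    use_backend be_images if path_img
    default_backend be_web
```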

~~~
rndmio
I guess it depends what you're load balancing; you do get pool checks from
ipvs, and node_exporter has an ipvs collector if you're using prometheus (we
are). As you say, the amount of traffic it can handle in comparison is
stunning, and especially if you're load balancing over a large number of ports
it's less to configure than HAProxy. But it works at layer 4, so if you need
HTTP stuff it's not much help.

------
mindcrash
Recently discovered Traefik ([https://traefik.io/](https://traefik.io/)),
which is a highly performant reverse proxy written in Go. It supports just
about every popular backend in existence (Docker, Kubernetes, Eureka, Consul
and whatnot) and supports metrics through statsd, Prometheus, Datadog, or
roll-your-own via REST. It has a pretty good admin UI, but also comes with a
REST API to integrate it into your own.

Also, it is entirely free and open source. Yes, you read that correctly:
within your edge environment you are able to get pretty much all the features
of a $2500+ per instance NGINX Plus deployment, with equal performance, for
nothing but a little investment in time (unless you need/want commercial
support, of course).

~~~
trjordan
Do you know if it's been deployed at scale anywhere?

~~~
mindcrash
Not widely.

Capgemini UK is apparently using it within projects [1], GitLab is using it
within their infrastructure [2], and Reevoo is using it as well [3].

Katacoda [4] and Play-with-Docker [5] both have sandbox environments available
which allow you to play with it.

[1] [https://hackernoon.com/kubernetes-ingress-controllers-and-traefik-a32648a4ae95](https://hackernoon.com/kubernetes-ingress-controllers-and-traefik-a32648a4ae95)

[2] [https://about.gitlab.com/2017/07/11/dockerizing-review-apps/](https://about.gitlab.com/2017/07/11/dockerizing-review-apps/)

[3] [https://www.youtube.com/watch?v=aFtpIShV60I](https://www.youtube.com/watch?v=aFtpIShV60I)

[4] [https://www.katacoda.com/courses/traefik/deploy-load-balancer](https://www.katacoda.com/courses/traefik/deploy-load-balancer)

[5] [http://training.play-with-docker.com/traefik-load-balancing](http://training.play-with-docker.com/traefik-load-balancing)

~~~
sytse
[2] was a blog post by a guest author; as far as I know we don't use Traefik.

------
justaaron
haproxy is the real deal:

- an actual proxy server, not a 'freemium' webserver ($2500 for an nginx
commercial license)
- massive capacity (300k/s without breaking a sweat on one process)
- can be configured for 'high availability' (hence the name) with something
like keepalived and a 'floating' IP assignment
- a single config file for all of it: master its simple directive-based syntax
and you master the world...
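
The keepalived half of that HA setup is short in its own right - a minimal
sketch, where the interface, router id, and floating address are all made up:

```
vrrp_instance VI_1 {
    state MASTER          # BACKUP on the standby box
    interface eth0
    virtual_router_id 51
    priority 101          # use a lower priority on the standby box
    virtual_ipaddress {
        192.0.2.10        # the floating IP your haproxy binds to
    }
}
```

If the MASTER box dies, the standby wins the VRRP election and takes over the
floating IP, so clients keep hitting the same address.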

~~~
duozerk
It's also amazingly simple to configure (though to be fair that applies to
nginx as well, especially coming from apache) and well documented; its
performance is great too.

------
mosselman
This is a fair point. The title could have been 'I recommend HAProxy over
nginx because nginx monitoring costs $2000/year'

~~~
bfred_it
And The Great Gatsby should have been called “He dies at the end”

Conclusions don’t have to be in the title

~~~
aidos
Nooooooooo! That was totally on my to-read list... :-(

~~~
StavrosK
(He doesn't actually die, the GP was making a point)

~~~
aidos
Nooooo! I said it non-obvious-half-tongue-in-cheek. (Though it is on my to-
read list)

~~~
StavrosK
Well I don't know, I haven't read the book, I was just trying to put the
spoiler back in the bottle for you! :(

------
merb
What's sad about haproxy is that it still does not support H2 (HTTP/2). I
mean, a load balancer should at least support it on the front end.

Well, haproxy is still great and one of the best load balancers out there. But
it's still sad that it misses h2 on the frontend. (Backend support would be
cool too, but there aren't many load balancers that can do that anyway.) I'm
pretty sure haproxy 1.8 will at least give experimental support, which would
be awesome.

~~~
_joel
Are you sure?
[https://cbonte.github.io/haproxy-dconv/1.7/intro.html](https://cbonte.github.io/haproxy-dconv/1.7/intro.html)

" \- TLS NPN and ALPN extensions make it possible to reliably offload
SPDY/HTTP2 connections and pass them in clear text to backend servers"

~~~
merb
It supports the TLS ALPN extension, which only means tcp-mode passthrough:
(haproxy) <-> backend server, or nginx <-> backend server, over h2 - haproxy
itself still doesn't speak h2.
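
Concretely, the offload the docs describe looks roughly like this in haproxy
1.7 - the cert path, backend names, and addresses below are made up, and note
haproxy only shuttles the decrypted bytes, it doesn't parse the h2 frames:

```
frontend https-in
    mode tcp
    bind :443 ssl crt /etc/ssl/example.pem alpn h2,http/1.1
    use_backend be_h2 if { ssl_fc_alpn -i h2 }
    default_backend be_http1

backend be_h2
    mode tcp
    server app1 10.0.0.10:8080   # backend must speak h2 in clear text

backend be_http1
    mode tcp
    server app2 10.0.0.11:8080
```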

------
AtticusTheGreat
I only skimmed the article, but I agree with the conclusion. I evaluated both
HAProxy and Nginx for a high-volume/low-latency load balancer cluster (500k
requests/s) and HAProxy simply beat the pants off of Nginx. It has an
extensive and well documented set of configuration options, tooling, and
reporting, and it performed flawlessly in production (after much toiling). I
couldn't ever quite get Nginx to handle the same load without falling apart at
the seams.

------
bobfromhuddle
We use nginx-vts
([https://github.com/vozlt/nginx-module-vts](https://github.com/vozlt/nginx-module-vts))
to get metrics out of our nginx servers, and then we send them over to our
Riemann stack with a Collectd module:
[https://github.com/bobthemighty/collectd-vts](https://github.com/bobthemighty/collectd-vts)

The vts module has been updated in recent months to support more granular
reporting of request times, but it's had everything we need for a while.

It exposes metrics either as an HTML page or a JSON API.

~~~
Plugawy
This is perfect - thanks for sharing. I looked at nginx alternatives like
traefik, envoy and haproxy, and none of them has an equivalent of the auth
request module:
[http://nginx.org/en/docs/http/ngx_http_auth_request_module.html](http://nginx.org/en/docs/http/ngx_http_auth_request_module.html)

So with vts I can finally replace the last instance of haproxy, which serves
as a core LB between all services (because it has stats).

------
blowski
Whenever I see an article that says “you should NEVER do x” I typically treat
it with a large pinch of salt.

Plenty of people, myself included, use nginx as a load balancer and it’s
absolutely fine. I’m sure there are lots of use cases where HAProxy would be a
better choice, but there are also many use cases where nginx is at least as
good a choice.

------
bowersbros
HAProxy supporting `.map` functionality is the biggest selling point for me.

It allows for SSL-protected custom domains for users of a SaaS app in a really
easy way.

We use it at Shopblocks extensively, and I've written a (brief) outline of
roughly how we do it.

[https://zando.io/post/haproxy-saas-custom-domains/](https://zando.io/post/haproxy-saas-custom-domains/)
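
The core of that pattern is a single line - a sketch where the map path and
backend names are made up:

```
frontend https-in
    bind :443 ssl crt /etc/haproxy/certs/
    # pick the backend by looking the Host header up in a map file,
    # falling back to be_default when the domain isn't listed
    use_backend %[req.hdr(host),lower,map(/etc/haproxy/domains.map,be_default)]

# /etc/haproxy/domains.map -- one "<domain> <backend>" pair per line:
#   shop-a.example.com be_shop_a
#   shop-b.example.com be_shop_b
```

Adding a customer's custom domain then means appending a line to the map file
(and provisioning a cert) rather than editing routing rules.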

------
rfraile
Thanks to Willy Tarreau and all the developers who make this excellent piece
of software and share it for free.

------
folex
Does anyone know if HAProxy can terminate Secure WebSockets? I tried it about
a year ago without any luck; maybe something has changed?

~~~
dpatriarche
Secure websockets (wss://...) works fine for my company's service that runs
behind haproxy (version 1.6).

------
sandGorgon
haproxy is able to preserve source ip by injecting proxy protocol. nginx is
capable of reading proxy protocol but not injecting it.
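
Roughly, that handoff looks like this on each side (the addresses below are
made up):

```
# haproxy side: inject the PROXY protocol header toward the backend
backend web
    server nginx1 10.0.0.20:80 send-proxy

# nginx side: accept it and recover the real client IP
#   server {
#       listen 80 proxy_protocol;
#       set_real_ip_from 10.0.0.0/8;      # trust only the LB's network
#       real_ip_header proxy_protocol;
#   }
```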

which is why haproxy is perfect as the ingress for your web application.
Especially if you are using something like kubernetes that will use an overlay
network.

unfortunately, the momentum in kubernetes is to leverage nginx as an ingress.

~~~
mbubb
"unfortunately, the momentum in kubernetes is to leverage nginx as an
ingress."

More so than the GCP load balancing or the relatively new AWS transit load
balancers?

~~~
sandGorgon
That is an even bigger problem for those who do bare-metal deployment. You are
right - k8s is largely built out as a post-cloud load balancer system.

However, in the limited work happening on LB ingresses, it's nginx rather than
haproxy.

------
INTPenis
As with all things, it's not black & white.

If I'm load balancing http or terminating https I'm not going to use HAproxy.
HAproxy is very powerful but overkill.

~~~
jsjohnst
Why wouldn’t you for those two examples? Those are both great use cases for
HAProxy and only require like a twenty line config to be up and running.

~~~
INTPenis
* I'm not as experienced in HAproxy as I am with nginx

* I have a perception of nginx being slimmer than HAproxy and more focused on http/https than HAproxy

* Related to the aforementioned point; I have a perception of HAproxy being able to handle many OSI layers compared to nginx which only handles one.

~~~
orthecreedence
> I have a perception of nginx being slimmer than HAproxy and more focused on
> http/https than HAproxy

Oddly enough, I have the exact opposite perception. HAProxy doesn't even
support logging to files because that would "block" the event loop, so it
delegates to syslog. That seems fairly slim to me. The entire project is
written with no compromises to support extreme numbers of connections
efficiently. Nginx can do this as well, but has to be tuned a lot more to get
to the same place, and I'm convinced if both were properly tuned and put in
the same environment, HAProxy would come out ahead.

That said, if you're at a place where you already need/use Nginx, it might not
make sense to use HAProxy if you can re-use your existing Nginx instances.
However if you're in the market for a load balancer and Nginx isn't being used
already, definitely check out HAProxy. It's pretty simple to configure.

EDIT: Another point for HAProxy: Recently, the job system we run (beanstalkd)
was slowing to a crawl. I needed to find out where the slowdown was. The
easiest way was to route the worker connections through HAProxy and enable
logging and the HTTP stats page. I was able to determine the problem was in
beanstalkd itself, not our workers (or a network issue). Sure there are a
number of debugging proxies out there, but HAProxy was already installed and
the logs it emits are the perfect balance of information so you can pick
things apart without being overwhelmed. It really helped. Nginx wouldn't have
been as useful, and would have taken a much longer time to configure.
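
That kind of ad-hoc debugging proxy is only a few lines of haproxy config - a
sketch where the ports and addresses are made up (point the workers at :11301
and watch the stats page on :8404):

```
listen beanstalkd
    mode tcp
    bind :11301
    log global
    option tcplog                      # log connect/queue/session timings
    server bs1 127.0.0.1:11300 check   # the real beanstalkd

listen stats
    bind :8404
    mode http
    stats enable
    stats uri /
```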

~~~
jsjohnst
> Oddly enough, I have the exact opposite perception.

Ditto, especially if you consider the OpenResty "variant" of Nginx.

Also, I realize Nginx does a good job of being a load balancer / reverse
proxy, but to me it will always be a "web server" first. HAProxy on the other
hand is built and optimized specifically to be a "load balancer / reverse
proxy" first.

------
arca_vorago
I feel like a reverse proxy can often do the majority of what a load balancer
does just fine. I have been using the Hiawatha webserver (GPLv2, like
HAProxy), which has some great security features and works very well as a
reverse proxy.

If you haven't, check it out. I really like Hugo's work: security first,
good-looking code.

[https://github.com/hsleisink/hiawatha/blob/master/src/rproxy.c](https://github.com/hsleisink/hiawatha/blob/master/src/rproxy.c)

------
mschuster91
nginx, while being a black box, can at least do logging. haproxy in Docker
hasn't been able to log to stdout for over three years without a workaround
involving a linked rsyslogd
([https://github.com/dockerfile/haproxy/issues/3](https://github.com/dockerfile/haproxy/issues/3)).

On the other hand, I like configuring haproxy way, way more than configuring
an nginx server. It's easier and more concise to read.

~~~
user5994461
They both can do logging.

nginx has issues with formatting numbers/strings and undefined values, which
makes it harder to extract meaningful data from the logs. haproxy is easier to
deal with.

Docker abstracts away the logging facilities and the filesystem. It's a
general issue with Docker, not the software. See the 4th answer in your issue
for a solution.

~~~
mschuster91
> Docker abstract away the logging facilities and the filesystem.

No, it does not; it expects stdout and stderr to be used and stores the output
in a logfile to be replayed with the "docker logs" command.

First solution in the issue: mount /dev/log. Does not work when your host
system is OS X or Windows, as you don't have access to the VM that Docker
internally uses on non-Linuxes. Also, on Linux it pollutes the syslog instead
of the docker log.

Second (the one that refers to a gmane post from 2010) is essentially the same
as #1.

Third, rsyslog on host (or in a docker image): does not use docker log
facility, and docker link is weird.

Fourth, use the alpine image which has a syslogd: works, maybe, but is subject
to changes in the Dockerfile (e.g. when the inner path of haproxy changes).
Also I don't touch Alpine images because I can't use the standard
Debian/Ubuntu tooling in case there are network/firewall issues that can best
be debugged from inside the container by docker exec -it haproxy bash.

~~~
user5994461
Yes, it does. Docker abstracts away the network and the filesystem.

There are many ways to write logs: to files, to syslog, to unix sockets, to
stdout, to UDP sockets. Docker screws with most of them.

~~~
mschuster91
> Docker abstracts away the network and the filesystem.

Not necessarily the network, if you're using --net=host.

Docker does not screw with writing logs to files, to stdout, or via UDP
networking. When it comes to unix sockets or syslog, it depends on whether the
target is inside the container (which works fine AFAIK) or outside (e.g. via
bind-mounting /dev/log or such), but I have not tried the latter.

------
rdsubhas
A load balancing comparison article that says nothing about load balancing.

Weird.

------
herf
RAM per connection is quite big with HAProxy, so if you're simply
load-balancing lots of connections, something like nginx has a smaller
footprint.

------
felipelemos
Stats on HAProxy are still problematic when nbproc is greater than 1
(multiprocessing).

------
ryanqian
None of them has HTTP/2 multiplexed load balancing support yet! None of them.

~~~
manigandham
[https://envoyproxy.github.io/](https://envoyproxy.github.io/)

------
meche123
TL;DR: this whole article is a joke. The guy is not capable of extracting
stats from his nginx instance, so "Conclusion: Avoid nginx at all costs". End
of article.

