
Ask HN: What load balancing software do you use ? - whyleyc
I'm looking into software load balancers and so far the 4 that seem to stand out are:

- Nginx (http://nginx.net/)
- Pound (http://www.apsis.ch/pound/)
- HAProxy (http://haproxy.1wt.eu/)
- Perlbal (http://www.danga.com/perlbal/)

Does anybody have experience using any of these in live production environments (under reasonably heavy usage), and if so what pros and cons do you see with them?
======
anotherjesse
I'm a huge fan of both Nginx and HAProxy - used together.

[internet] <-> [Nginx] <-> [HAProxy] <-> [app servers]

Nginx is a great webserver, but isn't a good load balancer. You can install a
patch that improves the balancer -
[http://brainspl.at/articles/2007/11/09/a-fair-proxy-
balancer...](http://brainspl.at/articles/2007/11/09/a-fair-proxy-balancer-for-
nginx-and-mongrel) \- but it still isn't as nice as HAProxy

With HAProxy the status of the system is visible. For the largest site I use
HAProxy on I keep my status page public - <http://userscripts.org/haproxy> \-
but it isn't required.

HAProxy is particularly good for rails since you can say each app server can
only have 1 request at any time. This makes requests queue at the HAProxy
layer, so if an app server has a request that takes an extra long time you
don't have requests waiting for that app server to finish - instead you wait
for the next available app server in a FIFO queue.
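
To make the queueing point concrete, here is a rough Python simulation. It is a toy model, not HAProxy itself: all requests are assumed to arrive at time 0, the durations are made up, and the function names are hypothetical. It compares one shared FIFO queue (each server handles 1 request at a time) against round-robin assignment where requests queue on the app server they were sent to.

```python
import heapq


def simulate_shared_queue(durations, n_servers):
    """One FIFO queue at the balancer; each server handles one request at a time.

    Returns the completion time of each request, in arrival order.
    """
    # Min-heap of the times at which each server next becomes free.
    free_at = [0.0] * n_servers
    heapq.heapify(free_at)
    finish = []
    for d in durations:
        start = heapq.heappop(free_at)  # earliest available server takes the request
        end = start + d
        finish.append(end)
        heapq.heappush(free_at, end)
    return finish


def simulate_per_server_queues(durations, n_servers):
    """Round-robin assignment; a request waits behind whatever is queued on its server."""
    free_at = [0.0] * n_servers
    finish = []
    for i, d in enumerate(durations):
        s = i % n_servers
        end = free_at[s] + d
        free_at[s] = end
        finish.append(end)
    return finish


# Assumed workload: one slow request followed by five fast ones, two app servers.
durations = [10.0, 1.0, 1.0, 1.0, 1.0, 1.0]
shared = simulate_shared_queue(durations, 2)
per_server = simulate_per_server_queues(durations, 2)

print("shared FIFO queue: ", shared)
print("per-server queues: ", per_server)
```

In the per-server case, every request that lands behind the slow one waits for it to finish; with the shared queue the fast requests drain through the other server.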

Combining HAProxy with munin gives great stats for tuning your system -
whereas just nginx with the patch had no visibility into where bottlenecks
might be.


~~~
whyleyc
Thanks - I got a basic Nginx config up and running on EC2 within a couple of
hours. I'm intrigued by the idea of using it in combination with HAProxy
though, so I wondered if you'd mind a few follow-up questions?
Essentially:

\- Do you run the two products on the same physical machine ?

\- What does Nginx do that HAProxy doesn't ? (i.e. why not just stick with
HAProxy ?)

\- I noticed that Nginx has some weighting options for load balancing (see
<http://wiki.codemongers.com/NginxLoadBalanceExample>). Are these just not
sophisticated enough for your needs ?
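
For reference, the weighting idea can be sketched roughly as below. This is a toy Python illustration of weighted round-robin in general, not nginx's actual implementation; the backend names and weights are made up.

```python
import itertools


def weighted_round_robin(weights):
    """Return an endless iterator of backend names, each appearing in proportion to its weight."""
    # Repeat each name `weight` times, then cycle through the expanded list forever.
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


# Hypothetical backends: app1 should get three requests for every one sent to app2.
backends = {"app1": 3, "app2": 1}
picker = weighted_round_robin(backends)
first_eight = [next(picker) for _ in range(8)]
print(first_eight)
```

Note that weights like these only fix each backend's share of requests; they do not cap how many requests are in flight on a backend at once.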

~~~
anotherjesse
I do run them on the same machine for userscripts.org - although I have used
them on separate machines on other projects - nothing public though :)

I use Nginx for things like serving from memcached (someone else commented on
that), rewrite rules, redirects, logging, ...

Also Nginx is good because I have 5 different virtual hosts on the box, so I
can proxy to different app servers based on hostname/location.

Here is my commented nginx config for userscripts.org:

    
    
      http://pastie.textmate.org/private/rmp4b9xpwevxnckumxmq0a
    

Here is my haproxy config for userscripts.org:

    
    
      http://pastie.textmate.org/private/rbxzpeu7e0gjkkc6r04ta
    

Nginx does have weighting options, but I was unable to get the effect of
having only 1 request hitting an app server at any time (it could be possible,
but switching to HAProxy was easy, and in a month I've not had a second of
down-time, and I restart all the app servers every hour via log rotate). Not
having requests queue at the app server level, combined with HAProxy
pinging for health between requests, has led to faster requests, as queueing
at the app server level leads to delays if you get behind a slow request.

------
photomatt
Recently switched all of the WordPress.com load balancers to be nginx. They push
a little over a gigabit of traffic right now and about half a billion requests
per day, no sweat. We use Spread + Wackamole for failover, there's more info
on Barry's blog:

<http://barry.wordpress.com/2008/04/28/load-balancer-update/>

I wouldn't recommend DNS round robin for load balancing. (We did it for a
while, many problems and flaws in the approach.)

~~~
whatusername
I love how you casually mention "Half a billion requests per day"..

------
samueladam
[http://blog.emmettshear.com/post/2008/03/03/Dont-use-
Pound-f...](http://blog.emmettshear.com/post/2008/03/03/Dont-use-Pound-for-
load-balancing)

[http://www.igvita.com/2008/02/11/nginx-and-
memcached-a-400-b...](http://www.igvita.com/2008/02/11/nginx-and-
memcached-a-400-boost/)

------
thingsilearned
We use HAProxy because we're very session based (each user sees an entirely
different thing) and HAproxy was a good choice for that. I wrote a post on
setting it up a few weeks ago.

[http://leavingcorporate.com/2008/03/03/session-based-load-
ba...](http://leavingcorporate.com/2008/03/03/session-based-load-balancing-
with-haproxy/)

------
brianr
I'm using nginx in a couple different setups:

    
    
      nginx -> paste (pylons)

nginx on one machine, three other machines with 8 instances of paste each.
Nginx proxies directly to the paste port (which incidentally is also itself a
threaded server, but I've gotten best results by running several instances per
box). Volume has been as high as ~8mm dynamic requests/day.

    
    
      nginx -> lighttpd -> php-fcgi

nginx on its own box proxying to 8 app servers each running 160 php-fcgi
instances. Volume here is ~16mm dynamic requests/day.

Both have worked pretty well so far. As anotherjesse said, there's not a lot
of feedback, but it's done everything I need so far.

------
DenisM
Consider also using multiple DNS A records - the selection would be random
thus balancing the load. For example, do "nslookup google.com"

~~~
crescendo
Don't you forfeit too much control this way? For example, the load on your
servers would be determined by the caching behaviors of all the various DNS
servers and clients out there. I think this scheme should only be used as a
front line that leads to another layer of load balancers.

~~~
DenisM
The idea is that there is enough chaos in the Internet to produce a practically
random result with a sufficiently large number of users. Random is pretty good,
unless you must proactively shift load to least-loaded systems (in which case
you have to be super careful to avoid oscillations).

One drawback is that you can't take a system out of rotation as quickly, but
this can be somewhat mitigated by setting a very low TTL (60 seconds).
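
A quick way to see the "random is pretty good" argument is a small simulation. This is only a sketch under a strong assumption: every client independently picks one A record uniformly at random, which ignores resolver and client caching entirely. The addresses are from the documentation range and the function name is hypothetical.

```python
import random
from collections import Counter


def simulate_dns_random(addresses, n_clients, seed=0):
    """Count how many clients land on each address when each picks one at random."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return Counter(rng.choice(addresses) for _ in range(n_clients))


# Assumed setup: three A records and 30,000 independent clients.
addresses = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
n_clients = 30000
counts = simulate_dns_random(addresses, n_clients)
shares = {a: counts[a] / n_clients for a in addresses}

for address, share in shares.items():
    print(f"{address}: {share:.3f}")
```

With enough clients each address ends up close to an equal share. How well that holds in practice depends on how much real-world caching concentrates traffic, which this model deliberately leaves out.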

------
SwellJoe
I've used pen, Squid, and LVS. All useful for different situations, though LVS
is just not practical for the vast majority of situations.

PerlBal looks really cool, as being written in Perl means it has some of the
same kinds of flexibility that Squid has (a good reason for Squid is that you
can write your own balancing algorithm in any language you like in a
redirector script--I always used Perl, or Python when I was working with the
Zope guys--so, you can actually do crazy stuff like choose the right server
based on keeping them "primed" for the content users are asking for based on
URL, or you can use destination URL hashing and achieve the same effect even
if you have millions of URLs). Squid also has experimental support for ESI
(Edge Side Includes) which is pretty awesome...build a page from disparate and
wholly unrelated servers using a simple templating system, and caching them. I
don't think any other Open Source product out there has ESI (experimental or
otherwise).
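
As a rough illustration of destination URL hashing (a Python sketch, not a Squid redirector; the backend names and the function name are made up), the same URL always maps to the same server, so each server stays "primed" for its slice of the URL space without keeping any per-URL table:

```python
import hashlib


def pick_backend(url, backends):
    """Map a URL to one backend deterministically by hashing the URL."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce modulo the pool size.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Hypothetical backend pool and URLs.
backends = ["cache1", "cache2", "cache3"]
urls = ["/index.html", "/images/logo.png", "/articles/42"]
for url in urls:
    print(url, "->", pick_backend(url, backends))
```

One caveat with plain modulo hashing is that changing the number of backends remaps most URLs; consistent hashing is the usual way to reduce that.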

~~~
drusenko
How is LVS not practical? LVS is awesome... it's kernel-level and doesn't use
any resources, plus is very simple but flexible. Forget the ultramonkey
configurations and go with keepalived to handle the monitoring, failover, etc.

~~~
SwellJoe
"for the vast majority of situations"

In my experience, balancing works better with smarts, thus my big long tirade
about Squid (which is historically about the smartest option...though not
necessarily the fastest...but fast enough).

LVS has its place, and I've used it when the situation is appropriate, it just
isn't all that common that you really need to be able to saturate a Gb pipe,
and the installation, maintenance, and configuration burden is much higher for
a kernel level tool.

So, perhaps I should have been careful not to sound so dismissive of a tool
that I like (and I'll note that you can find posts by me on the LVS mailing
list from the first few months of the project's existence). My previous company
even sponsored development of several related management tools.

------
rcoder
I've been using Apache 2.2 with mod_proxy_balancer to do load balancing for
PHP and Rails apps for over a year now, and had pretty good experiences with
it. Since I support a lot of existing Apache servers, the configuration is
easy for me to work with, and the ability to do authn/authz and SSL
termination at the load balancer lets me keep the load down on my application
servers.

It's probably not as fast as Nginx, but I haven't found our load balancers to
be a bottleneck. In fact, we've been doing load balancing on a pair of really,
really wimpy servers (Celeron CPU and 512MB of RAM) running Apache on OpenBSD
for about a half-dozen different apps for the last year, and never seen the
average load climb up over about 0.5, even while handling upwards of 300
requests per second.

------
blader
Nginx: [http://highscalability.com/friends-sale-
architecture-300-mil...](http://highscalability.com/friends-sale-
architecture-300-million-page-view-month-facebook-ror-app)

------
swombat
We're hosted at EngineYard, which uses nginx with a fair load balancer plugin
that ensures that new requests are assigned to free mongrel instances (yeah,
it's RoR).
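
The general idea of sending new requests to free instances can be sketched like this. It is a simplified Python model of "pick the least busy backend", not the actual logic of the nginx fair balancer plugin; the class and the mongrel names are hypothetical.

```python
class FairBalancer:
    """Track in-flight requests per backend and hand new requests to the least busy one."""

    def __init__(self, backends):
        self.in_flight = {name: 0 for name in backends}

    def acquire(self):
        # Choose the backend with the fewest in-flight requests (first one wins ties).
        name = min(self.in_flight, key=self.in_flight.get)
        self.in_flight[name] += 1
        return name

    def release(self, name):
        # Called when the backend finishes a request.
        self.in_flight[name] -= 1


balancer = FairBalancer(["mongrel1", "mongrel2", "mongrel3"])
a = balancer.acquire()
b = balancer.acquire()
c = balancer.acquire()
balancer.release(b)     # the second instance finishes first
d = balancer.acquire()  # so the next request goes to the instance that is free
print(a, b, c, d)
```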

Works great.

Daniel

~~~
anotherjesse
EngineYard uses nginx at the slice level and LVS at the site level (to balance
between your slices) - this is for my startup (not userscripts.org which is at
serverbeach)

------
holygoat
I like Pound very much: it's simple and robust.

However, I recently noticed a memory leak. We use healthchecks on our
production machines, which means a consistent rate of hits every few seconds,
24/7. After about 3 months, Pound had chewed up 1.7GB of RAM, which caused
memory usage alerts in our monitors.

Not a big deal -- you can always restart the process -- but I'm still
evaluating alternatives.

------
jrockway
I use perlbal, which is nice-n-simple. Add a few lines describing where your
backend servers are to the config file, save it, and start perlbal.

For our $WORK applications, we just have an apache that proxies to the backend
FastCGI apps. We don't have a ton of load, so that works fine. (We might be
switching to nginx, which is much simpler than Apache for this use case.)

------
subwindow
I'm using Pound right now in one large installation, and nginx in another
smaller installation, but it still has decent traffic.

I definitely prefer Nginx. It seems much faster, and has definitely given
fewer headaches. It seems like an issue with Pound crops up every few weeks.
My Pound config file is about 600 lines long now, and it is starting to get
unmaintainable.

------
mattculbreth
I've used nginx in Rails and Pylons environments with good success. These apps
don't get much traffic so that wasn't a consideration, but the ease and
simplicity of nginx is fantastic. You never have to do anything with it once
it's initially configured.

------
azsromej
I've used nginx and have been able to adapt old htaccess rules to get
everything I used to get with apache. It's good on memory and I've never had a
problem.

------
andy
Why use software? I have a hardware load balancer at Softlayer and for 99
bucks a month it's totally worth it.

~~~
modoc
Because Softlayer (as much as they rock) don't have failover for those load
balancers. You may not need it, but after we had our site downed due to
hardware failure of one of their load balancers, we moved to a redundant
software based setup.

We used HAProxy and Heartbeat2 to provide fully redundant load balancers. On
our smaller sites, we actually run the HAProxy and Heartbeat2 on two of the
web servers for that cluster, so you don't need dedicated hardware if you
don't want it. If you do this, and you're on softlayer, I'd recommend sending
the back-channel traffic along the internal network to avoid using 2X your
real bandwidth.

You can read about how to set it up here:

[http://www.digitalsanctuary.com/tech-blog/debian/13-steps-
to...](http://www.digitalsanctuary.com/tech-blog/debian/13-steps-to-peace-of-
mind.html)

------
merrick33
ultramonkey in front of apache / php / postgresql

It was very smooth to set up with debian, but my first experience setting it up
was with redhat and that was tortuous

~~~
jdavid
what is ultramonkey like? i saw that a while back, but it looked like the
project was no longer being updated.

~~~
merrick33
I combined it with heartbeat and it works as advertised

------
jawngee
We use hardware load balancers, Cisco CSS 11501's.

------
ubudesign
what is your app server? or is this only http?

~~~
whyleyc
http only - we're using Apache at the backend.

------
wensing
nginx

