
Let's Create a Simple Load Balancer with Go - UkiahSmith
https://kasvith.github.io/posts/lets-create-a-simple-lb-go/
======
mholt
Fun stuff!

If you like this kind of thing, we are developing a very powerful and flexible
reverse proxy with load balancing into Caddy 2:
[https://github.com/caddyserver/caddy/wiki/v2:-Documentation#...](https://github.com/caddyserver/caddy/wiki/v2:-Documentation#httphandlersreverse_proxy)

It's mostly "done" actually. It's already looking really promising, especially
considering that it can do things that other servers keep proprietary, if they
do it at all (for example, NTLM proxying, or coordinated automation of TLS
certs in a cluster).

If you want to get involved, now's a great time while we're still in beta!
It's a fun project that the community is really coming together to help build.

~~~
Twirrim
Any thought or plans for some kind of back-pressure? Health checks and
response times are useful, to a degree, but there are a number of workloads
where they don't actually capture the cost of the work involved, and they can
also really trip you up something nasty under certain failure conditions :D

edit: by way of example, I used to work for a service that customers would
upload files to, that's all that the traffic was. There was wild variability
in the size, processing cost, and upload speed of each request. None of the
standard load-balancing approaches really balance "load" from a service
perspective. While things worked, it was rarely optimal.
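
One possible shape for this kind of back-pressure, as a minimal sketch only: assume the cost of a request can be estimated up front (for an upload service, say, the declared size), and give each backend a budget of in-flight cost rather than counting requests. The `budget` type and its methods are hypothetical names made up for illustration, not part of any load balancer mentioned here.

```go
package main

import (
	"fmt"
	"sync"
)

// budget tracks the estimated cost of in-flight work for one backend.
// It is a hypothetical helper for illustration, not a library type.
type budget struct {
	mu    sync.Mutex
	used  int64
	limit int64
}

// tryAcquire reserves cost units if they still fit under the limit.
// A false result is the back-pressure signal: pick another backend or reject.
func (b *budget) tryAcquire(cost int64) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.used+cost > b.limit {
		return false
	}
	b.used += cost
	return true
}

// release returns cost units once the request has finished.
func (b *budget) release(cost int64) {
	b.mu.Lock()
	b.used -= cost
	b.mu.Unlock()
}

func main() {
	b := &budget{limit: 100} // assumed unit: MB of uploads in flight

	fmt.Println(b.tryAcquire(80)) // one large upload fits
	fmt.Println(b.tryAcquire(30)) // a second one would exceed the budget
	b.release(80)
	fmt.Println(b.tryAcquire(30)) // fits again after the first finishes
}
```

A request counter treats an 80 MB upload and a 1 KB upload the same; a cost budget at least lets the balancer see the difference, provided the estimate is roughly right.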

~~~
mrkurt
HAProxy has a built-in way to do some of this -- you can set up an agent check
that lets you dynamically adjust weights:
[https://cbonte.github.io/haproxy-dconv/2.0/configuration.htm...](https://cbonte.github.io/haproxy-dconv/2.0/configuration.html#5.2-agent-check)

I think most proxies are planning to move logic like this to the control
plane. Envoy's gRPC stuff has some ways to dynamically throttle traffic to
backends.

Load balancers really need to become programming runtimes, imo. Config
languages aren't very expressive, and almost everyone needs their own logic at
the LB level.

I _just_ put together a demo of latency based load balancing using HAProxy +
awk and it's neat, but still very rudimentary compared to what I could express
in, say, JavaScript:
[https://github.com/superfly/multi-cloud-haproxy](https://github.com/superfly/multi-cloud-haproxy)

~~~
mholt
Caddy 2 has an embedded scripting language that allows this. We have to flesh
it out some more but it's looking really good.

In some of our early testing on basic workloads, we found that it's up to 2x
faster than NGINX+Lua, largely because it does not require a VM. (This is a
broad generalization, and we need to specifically optimize for these cases --
but this approach holds promise.)

~~~
mrkurt
Oh neat! What language did you all settle on?

------
manigandham
Seems like most of the work is done by the `ReverseProxy` package and this
code is more about health checking.

Nice to see how simple it is now though. Go is definitely a great choice for
low-level networking, and .NET Core has recently become a great option as
well.

~~~
4gotunameagain
I get what you mean and I agree, but this is far from low level networking

~~~
snypox
I’m also a bit confused about where .NET Core comes into the picture. It’s
becoming faster and faster but it’s definitely not the best tool for load
balancers.

~~~
manigandham
Why not? The latest techempower results show ASP.NET Core saturating a 10GbE
network card with 7M+ req/sec while being a full-featured framework, compared
to custom C++ web servers. [1]

The memory management, type safety, and high-level productivity make it great
for building load balancers and other infrastructure components.

1\. [https://www.ageofascent.com/2019/02/04/asp-net-core-saturati...](https://www.ageofascent.com/2019/02/04/asp-net-core-saturating-10gbe-at-7-million-requests-per-second/)

~~~
tomohawk
You'd have to know how much memory and cpu was being used vs the other
solutions. They're all very close in terms of the key metric, but if Rust is
using 50% of the memory, then it's not much of a competition. Also, what sort
of vm tuning, etc.

~~~
manigandham
True but you can dive into the benchmarks more if you visit the techempower
site. They run it on the same standardized hardware and in this .NET is very
competitive on cpu and ram.

------
sciurus
It's nice to see a walkthrough of what goes into a load balancer and how
simple it is to build one in Go.

One nitpick is that the author reversed the meaning of active and passive
health checks. Active generates new traffic to the backends just to determine
their healthiness, passive judges this based on the responses to normal
traffic.
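
To make the distinction concrete, here is a rough sketch of both styles. It is not the article's code: `activeCheck`, `passiveTracker`, and the failure threshold are hypothetical names and choices, and a real balancer would likely combine the two.

```go
package main

import (
	"fmt"
	"net"
	"sync/atomic"
	"time"
)

// activeCheck generates new traffic: it dials the backend purely to see
// whether it accepts a TCP connection within the timeout.
func activeCheck(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// passiveTracker sends nothing itself. It judges health from the outcomes
// of normal requests that the proxy reports to it.
type passiveTracker struct {
	failures  int64 // consecutive failed requests
	threshold int64 // assumed cutoff before the backend counts as unhealthy
}

// observe records the result of one real request.
func (p *passiveTracker) observe(ok bool) {
	if ok {
		atomic.StoreInt64(&p.failures, 0)
		return
	}
	atomic.AddInt64(&p.failures, 1)
}

// healthy reports whether consecutive failures are still under the threshold.
func (p *passiveTracker) healthy() bool {
	return atomic.LoadInt64(&p.failures) < p.threshold
}

func main() {
	// Start a throwaway listener so the active check has something to hit.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	fmt.Println("active check:", activeCheck(ln.Addr().String(), time.Second))

	p := &passiveTracker{threshold: 2}
	p.observe(false)
	p.observe(false)
	fmt.Println("passive check:", p.healthy())
}
```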

------
arberavdullahu
I found it more convenient to keep the field unnamed when using a mutex in a
struct. So in the example that would be

    
    
      type Backend struct {
        URL          *url.URL
        Alive        bool
        sync.RWMutex
        ReverseProxy *httputil.ReverseProxy
      }
    

~~~
mutatio
The problem with that is that the methods for `sync.RWMutex` spill into
`Backend`.

~~~
thegeekpirate
That isn't a problem—in fact, that's the entire point of arberavdullahu's
suggestion.

~~~
fiveturns
Still seems like a problem to me. It breaks encapsulation.

The mutex is only used inside of SetAlive() and isAlive(); they're the only
things that need to handle locking and unlocking. You don't want anything
external to that calling the methods on RWMutex.
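
For comparison, a minimal sketch of the non-embedded variant, assuming a named, unexported mutex field. The field names `mux` and `alive` and the trimmed-down struct are assumptions for illustration, not a copy of the article's code.

```go
package main

import (
	"fmt"
	"net/url"
	"sync"
)

// Backend keeps its mutex in a named, unexported field, so Lock and Unlock
// do not become part of Backend's own method set.
type Backend struct {
	URL   *url.URL
	mux   sync.RWMutex // guards alive
	alive bool
}

// SetAlive updates the flag under the write lock.
func (b *Backend) SetAlive(alive bool) {
	b.mux.Lock()
	b.alive = alive
	b.mux.Unlock()
}

// IsAlive reads the flag under the read lock.
func (b *Backend) IsAlive() bool {
	b.mux.RLock()
	defer b.mux.RUnlock()
	return b.alive
}

func main() {
	u, err := url.Parse("http://localhost:8081") // hypothetical backend address
	if err != nil {
		fmt.Println("bad URL:", err)
		return
	}
	b := &Backend{URL: u}
	b.SetAlive(true)
	fmt.Println(b.URL, "alive:", b.IsAlive())
}
```

With an embedded `sync.RWMutex`, code in another package could call `b.Lock()` directly; with the named field, locking is only reachable through `SetAlive` and `IsAlive`.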

~~~
thegeekpirate
Oh of course, if you're not using it then don't expose it.

I haven't read the code so I can't verify if that's the case here, but I read
OP's post as being worried about method clobbering (which is really a
non-issue, if the popularity of embedding mutexes shows us anything).

------
knicknic
With all the talk of proxies and Go, I am surprised gobetween hasn’t been mentioned yet
[https://github.com/yyyar/gobetween](https://github.com/yyyar/gobetween) .
Last time I looked at the source it was very approachable.

------
hinkley
Load balancers seem like one of those problems that engineers should be cutting
their teeth on.

And yet we have only a handful and one of the most popular charges money for
cool features and does not appear to have an ABI for addons.

~~~
sciurus
> And yet we have only a handful

Apache, nginx, haproxy, envoy, traefik, and probably dozens more that aren't
on the top of my head.

~~~
nkozyra
You literally listed a handful ;)

~~~
SteveNuts
Fabio, IIS, Varnish, Kong, Squid, lighttpd, emacs (probably).

~~~
lugg
[https://stackshare.io/stackups/emacs-vs-haproxy](https://stackshare.io/stackups/emacs-vs-haproxy)

------
kitd
This is pretty cool. But I think an implementation that avoids the mutexes
(mutices?) when allocating the backends and uses channels instead would
probably perform better.

2 channels needed, 1 for available backends and 1 for broken ones.

On incoming request, the front end selects an available backend from channel
1. On completion, the backend itself puts itself either back onto channel 1 on
success, or channel 2 on error.

Channel 2 is periodically drained to test the previously failed backends to
see if they're ready to go back onto channel 1.
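
A rough sketch of that two-channel idea, to show the moving parts. Every name here (`backend`, `balancer`, `proxy`, `recheck`) is hypothetical, `proxy` stands in for actually forwarding a request, and the demo runs on a single goroutine, so it says nothing about performance.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoBackend = errors.New("no backend available")

// backend is a stand-in for a real upstream server.
type backend struct {
	name    string
	healthy bool // demo-only flag; a real check would probe the server
}

// proxy is a placeholder for forwarding one request to b.
func proxy(b *backend) error {
	if !b.healthy {
		return errors.New(b.name + " failed")
	}
	return nil
}

type balancer struct {
	available chan *backend // channel 1: backends ready for work
	broken    chan *backend // channel 2: backends that returned an error
}

func newBalancer(backends []*backend) *balancer {
	lb := &balancer{
		available: make(chan *backend, len(backends)),
		broken:    make(chan *backend, len(backends)),
	}
	for _, b := range backends {
		lb.available <- b
	}
	return lb
}

// serve takes a backend from channel 1 and puts it back on channel 1 on
// success or on channel 2 on error. It returns the backend's name.
func (lb *balancer) serve() (string, error) {
	select {
	case b := <-lb.available:
		if err := proxy(b); err != nil {
			lb.broken <- b
			return b.name, err
		}
		lb.available <- b
		return b.name, nil
	default:
		return "", errNoBackend
	}
}

// recheck drains what is currently on channel 2 and re-files each backend
// according to probe. It is meant to be called from one goroutine on a timer.
func (lb *balancer) recheck(probe func(*backend) bool) {
	for n := len(lb.broken); n > 0; n-- {
		b := <-lb.broken
		if probe(b) {
			lb.available <- b
		} else {
			lb.broken <- b
		}
	}
}

func main() {
	a := &backend{name: "a", healthy: true}
	b := &backend{name: "b", healthy: false}
	lb := newBalancer([]*backend{a, b})

	for i := 0; i < 3; i++ {
		name, err := lb.serve()
		fmt.Println(name, err)
	}

	b.healthy = true // pretend the backend recovered
	lb.recheck(func(x *backend) bool { return x.healthy })
	fmt.Println("available:", len(lb.available), "broken:", len(lb.broken))
}
```

One consequence of this layout is that a backend is off channel 1 while it is handling a request, so each backend serves one request at a time unless it is queued more than once.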

~~~
biggestdecision
Channels do not avoid mutexes, they are implemented with them under the hood.

~~~
kitd
Sure, but with the above, there's 1 mutex to read the channel, not 1 per
backend. And a single thread reading the channel.

Plus you know that if the backend is on the channel, you know it's ready to
accept, and you don't need the alive flag with synchronized access.

~~~
biggestdecision
> there's 1 mutex to read the channel, not 1 per backend

Fewer mutexes isn't necessarily an advantage, there will be far higher
contention on that single mutex. Channel writes also require a lock.

~~~
kitd
One to read, and a single reader thread. That's no contention.

One to write, that's used for writing only, not both read/write as in the
published design.

Edit: it's a variation on the following, but for ReverseProxies, not worker
goroutines
[http://marcio.io/2015/07/handling-1-million-requests-per-min...](http://marcio.io/2015/07/handling-1-million-requests-per-minute-with-golang/)

------
westoque
> Multiple clients will connect to the load balancer and when each of them
> requests a next peer to pass the traffic on race conditions could occur.

I don't quite understand what this means. What race conditions? Can anybody
explain? Thanks.

~~~
therealdrag0
I haven't really read it but I'll take a stab (with pseudocode). I think
NextIndex() is incrementing s.current and modding it with servers.length.

To do this, there are three operations: SET s.current to its value +1, GET
s.current, and MOD the result with servers.length.

If that SET and GET are not coordinated between threads, then a race could
sour the index. For example, if two threads call this method at the same time,
they could both SET the +1 before either GETs the value, then they will both
get the same value +2 from where it started instead of +1 for each caller.
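
A small sketch of the coordinated version, assuming the counter is a `uint64` that is bumped with `sync/atomic`. The `pool` type, its fields, and `countPicks` are hypothetical names for illustration. `atomic.AddUint64` does the increment and returns the new value as one step, so two callers can never see the same result.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pool is a hypothetical stand-in for the article's server pool.
type pool struct {
	current uint64 // round-robin counter, only touched via sync/atomic
	size    uint64 // number of backends
}

// nextIndex increments the counter and reads the new value as a single
// atomic step, then reduces it modulo the pool size.
func (p *pool) nextIndex() int {
	return int(atomic.AddUint64(&p.current, 1) % p.size)
}

// countPicks calls nextIndex from many goroutines at once and tallies how
// often each index was handed out.
func countPicks(p *pool, callers int) []int {
	picks := make([]int, p.size)
	var mu sync.Mutex // guards picks, not the counter
	var wg sync.WaitGroup
	for i := 0; i < callers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			idx := p.nextIndex()
			mu.Lock()
			picks[idx]++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return picks
}

func main() {
	p := &pool{size: 3}
	fmt.Println(countPicks(p, 300)) // prints [100 100 100]
}
```

With a plain `p.current++` followed by a separate read, the tally could come out uneven, which is the "soured" index described above.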

~~~
westoque
Thanks for the explanation. Forgive my lack of knowledge, but can it not be
solved by using parallelism instead of concurrency/threads? I thought Go has
first class primitives to be able to do this?

~~~
therealdrag0
> parallelism instead of concurrency/threads.

What do you mean? These seem synonymous to me.

> I thought Go has first class primitives to be able to do this?

TBH I'm only 1 week into learning Go myself (coming from Java/Scala/JS/etc).
But it looks like the article used what Go offers. They used the atomic
package which says, "provides low-level atomic memory primitives useful for
implementing synchronization algorithms."

[https://golang.org/pkg/sync/atomic/](https://golang.org/pkg/sync/atomic/)

~~~
westoque
> What do you mean? These seem synonymous to me.

What I meant was similar to this talk:
[https://blog.golang.org/concurrency-is-not-parallelism](https://blog.golang.org/concurrency-is-not-parallelism)
where there is also a load balancer example around 20:20 that uses Go channels
(the primitive I’m referring to)

~~~
biggestdecision
Channels aren't a way to avoid mutexes/locking, they are just meant to be
simpler to write & reason about. Channels are implemented using mutexes under
the hood.

------
noah-kun
Great tool here:
[https://github.com/superfly/wormhole](https://github.com/superfly/wormhole)

------
rndmio
If you don't need http smarts on the load balancer LVS is a great option that
will give you fantastic performance.

------
kasvith
Thanks for sharing the article and thanks all for the feedback :)

------
AdieuToLogic
The author states:

> After playing with professional Load Balancers like NGINX I tried creating a
> simple Load Balancer for fun.

And while nginx[0] certainly can perform in this role, another production
quality load balancer is HAProxy[1]. Both can do more than this, of course.

Reinventing solutions "for fun" certainly can be educational and help others
learn key concepts, but the author should clearly state what they are doing is
not meant to replace production quality solutions.

0 - [https://www.nginx.com/](https://www.nginx.com/)

1 - [https://www.haproxy.com/solutions/load-balancing/](https://www.haproxy.com/solutions/load-balancing/)

~~~
lordleft
Doesn't 'for fun' clearly indicate that this isn't designed to replace a
professional-grade solution though?

~~~
AdieuToLogic
> Doesn't 'for fun' clearly indicate that this isn't designed to replace a
> professional-grade solution though?

Depends on the person and their understanding of English colloquialisms I
suppose.

What might be less misunderstood is:

    
    
      This is an educational project and is not
      meant for use in production systems.
    

Granted, native English speakers likely will infer this project is
research/educational in nature and benefit.

But why leave it to chance?

~~~
DuskStar
Well the repo description is "World's most dumbest Load Balancer", which might
count for something

~~~
AdieuToLogic
> Well the repo description is "World's most dumbest Load Balancer", which
> might count for something

Amongst the synonyms for "dumb"[0] is "simple" and is used commonly as such.

A definition of "simple" is[1]:

    
    
      easy to understand, deal with, use, etc.:
      a simple matter; simple tools.
    

The GitHub repo[2] has as its project URI the leaf segment "simplelb"[2].

The first non-title line of the GitHub repo[2] reads thusly:

    
    
      Simple LB is the simplest Load Balancer
      ever created.
    

Nowhere in the README.md are the words "educational", "production",
"research", or "fun."

All of this is to say, what "might count for something" may not be what the
author intends.

0 - [https://www.merriam-webster.com/thesaurus/dumb](https://www.merriam-webster.com/thesaurus/dumb)

1 -
[https://www.dictionary.com/browse/simple](https://www.dictionary.com/browse/simple)

2 -
[https://github.com/kasvith/simplelb/](https://github.com/kasvith/simplelb/)

~~~
hombre_fatal
Some bizarre hair-splitting here.

Besides, even if a beginner somehow wanders into the project, mis-IDs the
project, and uses it on their beginner server, then it sounds like a good
learning experience for them even though the README didn't give them
permission to learn by neglecting to include the word "educational", god
forbid.

Is this the catastrophic scenario you have in mind?

~~~
AdieuToLogic
> Some bizarre hair-splitting here.

I was trying to be explicit as to my reasoning. If that came across as "hair-
splitting" then I suppose I failed to adequately do so.

The whole point I was trying to make is that people find code in all sorts of
ways. And my opinion is that if a public repo, such as GitHub, has a project
which could easily be both desired (due to need) and misused (due to intent),
then it might be a good idea to put a simple declaration in the project's
README.

Given all the blow-back this concept has incurred, I would think the concept
is either wholly immaterial or now proven as needed.

~~~
grepthisab
Relevant username?

This seems to be a strange hill to die on. For all intents and purposes there
is a declaration in the readme. And anyone who knows enough to want to operate
with this will see that it's fairly basic. Others will likely just reach for a
more generic or battle tested solution.

In any case, people should be able to do what they want with their repos and
code, assuming legality of course.

