Let's Create a Simple Load Balancer with Go (kasvith.github.io)
366 points by UkiahSmith on Nov 9, 2019 | 78 comments



Fun stuff!

If you like this kind of thing, we are developing a very powerful and flexible reverse proxy with load balancing into Caddy 2: https://github.com/caddyserver/caddy/wiki/v2:-Documentation#...

It's mostly "done" actually. It's already looking really promising, especially considering that it can do things that other servers keep proprietary, if they do it at all (for example, NTLM proxying, or coordinated automation of TLS certs in a cluster).

If you want to get involved, now's a great time while we're still in beta! It's a fun project that the community is really coming together to help build.


Any thought or plans for some kind of back-pressure? Health checks and response times are useful, to a degree, but there are a number of workloads where they don't actually capture the cost of the work involved, and they can also really trip you up something nasty under certain failure conditions :D

edit: by way of example, I used to work for a service that customers would upload files to, that's all that the traffic was. There was wild variability in the size, processing cost, and upload speed of each request. None of the standard load-balancing approaches really balance "load" from a service perspective. While things worked, it was rarely optimal.


HAProxy has a built-in way to do some of this -- you can set up an agent check that lets you dynamically adjust weights: https://cbonte.github.io/haproxy-dconv/2.0/configuration.htm...

I think most proxies are planning to move logic like this to the control plane. Envoy's gRPC stuff has some ways to dynamically throttle traffic to backends.

Load balancers really need to become programming runtimes, imo. Config languages aren't very expressive, and almost everyone needs their own logic at the LB level.

I _just_ put together a demo of latency based load balancing using HAProxy + awk and it's neat, but still very rudimentary compared to what I could express in, say, JavaScript: https://github.com/superfly/multi-cloud-haproxy


Caddy 2 has an embedded scripting language that allows this. We have to flesh it out some more but it's looking really good.

In some of our early testing on basic workloads, we found that it's up to 2x faster than NGINX+Lua, largely because it does not require a VM. (This is a broad generalization, and we need to specifically optimize for these cases -- but this approach holds promise.)


Oh neat! What language did you all settle on?


F5 BIG-IP can be programmed in Tcl. While it is a programming language, I have only seen it programmed by non-developers. With copy paste code, repeated string constants all over the place and no unit tests.

I agree that load balancers do need the expressiveness of programming languages, but ideally with some typing and the ability to easily unit test.


Yes! And a good API surface. OpenResty (nginx + Lua) is reasonably powerful, but you're really limited by the events they give you.

I'm really hopeful about deno (https://github.com/denoland/deno) for this. TypeScript is nice, the deno TCP perf is good, all it needs is some good proxy libraries.


Yes, aside from multiple load balancing policies, Caddy 2 has a circuit breaker that will automatically adjust the load balancing before latency to a particular backend begins to grow out of tolerances.

Both the load balancing policies and the circuit breakers are extensible, meaning it is easy to change their behavior and add new ones as needed.

I could also imagine a specific load balancing policy that adds up a cost for each request using headers such as transfer size; i.e. dynamic weights. This would be a great contribution to the project if you are interested!

Caddy 2 also has an embedded scripting language that can make this kind of logic scriptable and dynamic, but that's still a WIP.


Seems like most of the work is done by the `ReverseProxy` package and this code is more about health checking.

Nice to see how simple it is now though. Go is definitely a great choice for low-level networking, and .NET Core has recently become a great option as well.


I get what you mean and I agree, but this is far from low-level networking.


I’m also a bit confused about where .NET Core comes into the picture. It’s getting faster and faster, but it’s definitely not the best tool for load balancers.


Why not? The latest techempower results show ASP.NET Core saturating a 10GbE network card with 7M+ req/sec while being a full-featured framework, compared to custom C++ web servers. [1]

The memory management, type safety, and high-level productivity make it great for building load balancers and other infrastructure components.

1. https://www.ageofascent.com/2019/02/04/asp-net-core-saturati...


You'd have to know how much memory and CPU was being used vs the other solutions. They're all very close in terms of the key metric, but if Rust is using 50% of the memory, then it's not much of a competition. Also, what sort of VM tuning, etc.


True, but you can dive into the benchmarks more if you visit the techempower site. They run everything on the same standardized hardware, and there .NET is very competitive on CPU and RAM.


It's nice to see a walkthrough of what goes into a load balancer and how simple it is to build on in Go.

One nitpick is that the author reversed the meaning of active and passive health checks. Active generates new traffic to the backends just to determine their healthiness; passive judges this based on the responses to normal traffic.


I found it more convenient to keep the field unnamed when using a mutex in a struct. So in the example that would be

  type Backend struct {
      URL          *url.URL
      Alive        bool
      sync.RWMutex
      ReverseProxy *httputil.ReverseProxy
  }


The problem with that is that the methods for `sync.RWMutex` spill into `Backend`.


That isn't a problem—in fact, that's the entire point of arberavdullahu's suggestion.


Still seems like a problem to me. It breaks encapsulation.

The mutex is only used inside of SetAlive() and IsAlive(); they're the only things that need to handle locking and unlocking. You don't want anything external to that calling the methods on RWMutex.


Oh of course, if you're not using it then don't expose it.

I haven't read the code so I can't verify if that's the case here, but I read OP's post as being worried about method clobbering (which is really a non-issue, if the popularity of embedding mutexes shows us anything).


With all the talk of proxies and Go, I am surprised gobetween hasn’t been mentioned yet: https://github.com/yyyar/gobetween . Last time I looked at the source it was very approachable.


Load balancers seem like one of those problems that engineers should be cutting their teeth on.

And yet we have only a handful and one of the most popular charges money for cool features and does not appear to have an ABI for addons.


> And yet we have only a handful

Apache, nginx, haproxy, envoy, traefik, and probably dozens more that aren't on the top of my head.


You literally listed a handful ;)


Fabio, IIS, Varnish, Kong, Squid, lighttpd, emacs (probably).



Actually Emacs as a dev proxy like Charles Proxy could be handy... ;)

Also there exists an http module for emacs: https://www.emacswiki.org/emacs/HttpServer


> > > And yet we have only a handful

> > Apache, nginx, haproxy, envoy, traefik, and probably dozens more that aren't on the top of my head.

> You literally listed a handful ;)

When a problem is well-defined, how many solutions to it are needed before there is no longer a need to reach for more?


There are probably thousands on GitHub made for teeth-cutting like in the blog post. I've made one. You just lose interest once you've implemented the easy/naive stuff and need to actually use a real solution in production.


[flagged]


Apache's httpd is perfectly good as a web server, and passable as a proxy. I can't say that it'd be my first choice, but I'm not aware that there's any special reason not to use it.


The Byzantine config file syntax was a common complaint shared by the early defectors to nginx. If I never have to edit that conf file again it will be too soon.



Ok. Three of those are forks of the others.

Several of those are ingress routers, and I’ve seen no claims of those being able to run standalone (i.e. without Kubernetes specifically), so while they are technically load balancers, they can’t operate on their own. Say I wanted to run one per server instead of NodeJS cluster mode. I could not substitute those, right?

One is F5 which nobody says anything nice about anymore, and was positioned as a hardware solution for most of its good years.

I’d also note that a number of these are pretty young, indicating that there was a power vacuum that is now filling in.


That is a really neat page. You see "funding: $250k" right next to "market cap: $1.1T". It's almost surreal.


Thanks. Don't miss the landscape view: https://landscape.cncf.io/

And we've started creating them for other fields, like visual effects: https://landscape.aswf.io/


I'm thinking of a "load balancer" that charges for basic features like querying for which downstream servers are in service.

HAProxy on the other hand has been doing some really fantastic stuff lately that is in the open source. It makes me a bit sad that the former is the "go to" and HAProxy doesn't get the usage it deserves.

To your specific point, though, I think it's just super tricky to get all the features people expect at the performance they expect as well. You essentially need to implement a very efficient HTTP server -- also apparently not trivial -- in order to get expected Layer 7 features. Embedded scripting language support is quickly becoming the de facto standard. Simple traffic proxying/forwarding is the easy part, but getting something competitive together feature- and performance-wise feels way beyond teeth-cutting?

EDIT: Using Golang as an example. You would need to either piggy-back on the best HTTP server available for your language platform or write your own. The stdlib Golang HTTP server is not competitive performance-wise. The reasons for this are pretty well known and seem to be largely accepted by the core team (AFAICT). This won't impact most app and service developers too much, but for an LB service the number of cycles it leaves on the table may not be acceptable at all.


I’m dealing with some people that have some sort of existential dread of haproxy and I can’t get them to admit why they won’t use it. I’d send them to a therapist but I’m not their boss.


You can use fasthttp in Go to get performance within about 10% of the best in other languages. See the latest tech empower benchmarks.


Yup. It's also famously incompatible with the wider ecosystem because it's not a drop in replacement for net/http. There are other criticisms I haven't really looked into as well.

Not meant to be my own criticism, just an observation. I believe the net/http interface ossified its own design decisions, making it hard/impossible for fasthttp to be compatible with?


The net/http interface requires too many allocations, which prevents it from running at the same speed.

That's why fasthttp uses a different design.


Implementing a simple load balancer in Go was actually a take home interview problem for me once and I have to say I thoroughly enjoyed making it.


This is pretty cool. But I think an implementation that avoids the mutexes (mutices?) when allocating the backends and uses channels instead would probably perform better.

2 channels needed, 1 for available backends and 1 for broken ones.

On incoming request, the front end selects an available backend from channel 1. On completion, the backend itself puts itself either back onto channel 1 on success, or channel 2 on error.

Channel 2 is periodically drained to test the previously failed backends to see if they're ready to go back onto channel 1.


Channels do not avoid mutexes, they are implemented with them under the hood.


Sure, but with the above, there's 1 mutex to read the channel, not 1 per backend. And a single thread reading the channel.

Plus you know that if the backend is on the channel, you know it's ready to accept, and you don't need the alive flag with synchronized access.


> there's 1 mutex to read the channel, not 1 per backend

Fewer mutexes isn't necessarily an advantage, there will be far higher contention on that single mutex. Channel writes also require a lock.


One to read, and a single reader thread. That's no contention.

One to write, that's used for writing only, not both read/write as in the published design.

Edit: it's a variation on the following, but for ReverseProxies, not worker goroutines http://marcio.io/2015/07/handling-1-million-requests-per-min...


Mutex is an abbreviation of "mutual exclusion", so AFAIK you can't inflect the ending to "-ices", since it's an abbreviation rather than a Latin noun. Mutexes.


Probably the confusion comes from the Latin word "index", whose plural is indeed "indices", although nowadays "indexes" is extremely common.


That could be a win if it makes the code clearer, but it's certainly more resource intensive.

Channels appear to be magical, since they're baked into the language. But package sync APIs are generally faster.


I've actually often read the opposite: raw mutexes are usually faster than channels (which are built on mutexes).


True, but mutex/channel is a false dichotomy: mutexes serialize, channels orchestrate.


> Multiple clients will connect to the load balancer and when each of them requests a next peer to pass the traffic on race conditions could occur.

I don't quite understand what this means. What race conditions? Can anybody explain? Thanks.


I haven't really read it but I'll take a stab (with pseudocode). I think NextIndex() is incrementing s.current and modding it with servers.length.

To do this, there are three operations, SET s.current to +1, and GET s.current so that it can be MODDED with servers.length.

If that SET and GET are not coordinated between threads, then a race could sour the index. For example, if two threads call this method at the same time, they could both SET the +1 before either GETs the value, then they will both get the same value +2 from where it started instead of +1 for each caller.


Thanks for the explanation. Forgive my lack of knowledge, but can it not be solved by using parallelism instead of concurrency/threads? I thought Go has first class primitives to be able to do this?


> parallelism instead of concurrency/threads.

What do you mean? These seem synonymous to me.

> I thought Go has first class primitives to be able to do this?

TBH I'm only 1 week into learning Go myself (coming from Java/Scala/JS/etc). But it looks like the article used what Go offers. They used the atomic package which says, "provides low-level atomic memory primitives useful for implementing synchronization algorithms."

https://golang.org/pkg/sync/atomic/


> What do you mean? These seem synonymous to me.

What I meant was similar to this talk: https://blog.golang.org/concurrency-is-not-parallelism where there is also a load balancer example around 20:20 using Go channels (the primitive I'm referring to).


Channels aren't a way to avoid mutexes/locking, they are just meant to be simpler to write & reason about. Channels are implemented using mutexes under the hood.


An increment like s.current++ isn't atomic either. If two threads increment simultaneously the value may only increase by 1.


When you have 2 threads running in parallel and the outcome of the computation depends on the order in which they do their job first, that's a race condition.



If you don't need http smarts on the load balancer LVS is a great option that will give you fantastic performance.


Thanks for sharing the article and thanks all for the feedback :)


The author states:

> After playing with professional Load Balancers like NGINX I tried creating a simple Load Balancer for fun.

And while nginx[0] certainly can perform in this role, another production quality load balancer is HAProxy[1]. Both can do more than this, of course.

Reinventing solutions "for fun" certainly can be educational and help others learn key concepts, but the author should clearly state what they are doing is not meant to replace production quality solutions.

0 - https://www.nginx.com/

1 - https://www.haproxy.com/solutions/load-balancing/


Doesn't 'for fun' clearly indicate that this isn't designed to replace a professional-grade solution though?


> Doesn't 'for fun' clearly indicate that this isn't designed to replace a professional-grade solution though?

Depends on the person and their understanding of English colloquialisms I suppose.

What might be less misunderstood is:

  This is an educational project and is not
  meant for use in production systems.
Granted, native English speakers likely will infer this project is research/educational in nature and benefit.

But why leave it to chance?


Well the repo description is "World's most dumbest Load Balancer", which might count for something


> Well the repo description is "World's most dumbest Load Balancer", which might count for something

Amongst the synonyms for "dumb"[0] is "simple", and it is commonly used as such.

A definition of "simple" is[1]:

  easy to understand, deal with, use, etc.:
  a simple matter; simple tools.
The GitHub repo[2] has as its project URI the leaf segment "simplelb"[2].

The first non-title line of the GitHub repo[2] reads thusly:

  Simple LB is the simplest Load Balancer
  ever created.
Nowhere in the README.md are the words "educational", "production", "research", or "fun."

All of this is to say, what "might count for something" may not be what the author intends.

0 - https://www.merriam-webster.com/thesaurus/dumb

1 - https://www.dictionary.com/browse/simple

2 - https://github.com/kasvith/simplelb/


Some bizarre hair-splitting here.

Besides, even if a beginner somehow wanders into the project, mis-IDs it, and uses it on their beginner server, then it sounds like a good learning experience for them, even though the README didn't give them permission to learn by neglecting to include the word "educational", god forbid.

Is this the catastrophic scenario you have in mind?


> Some bizarre hair-splitting here.

I was trying to be explicit as to my reasoning. If that came across as "hair-splitting" then I suppose I failed to adequately do so.

The whole point I was trying to make is that people find code in all sorts of ways. And my opinion is that if a public repo, such as GitHub, has a project which could easily be both desired (due to need) and misused (due to intent), then it might be a good idea to put a simple declaration in the project's README.

Given all the blow-back this concept has incurred, I would think the concept is either wholly immaterial or now proven as needed.


Relevant username?

This seems to be a strange hill to die on. For all intents and purposes there is a declaration in the readme. And anyone who knows enough to want to operate with this will see that it's fairly basic. Others will likely just reach for a more generic or battle tested solution.

In any case, people should be able to do what they want with their repos and code, assuming legality of course.


If you race through chains of synonyms, changing between definitions along the way, you can get almost anywhere. Why does this matter?

And here's the full list of synonyms listed there:

> airheaded, birdbrained, bonehead, boneheaded, brain-dead, brainless, bubbleheaded, chuckleheaded, dense, dim, dim-witted, doltish, dopey (also dopy), dorky [slang], dull, dunderheaded, empty-headed, fatuous, gormless [chiefly British], half-witted, knuckleheaded, lamebrain (or lamebrained), lunkheaded, mindless, oafish, obtuse, opaque, pinheaded, senseless, simple, slow, slow-witted, soft, softheaded, stupid, thick, thick-witted, thickheaded, unintelligent, unsmart, vacuous, weak-minded, witless


> If you race through chains of synonyms, changing between definitions along the way, you can get almost anywhere. Why does this matter?

I am not racing through chains of synonyms.

What I was trying to elucidate was that the use of "dumbest", when the repo is ".../simplelb", and the very first line of the README says the project "is the simplest Load Balancer ever created" might, just might, lead people to think that "dumbest" is being used in this context as a synonym for "simplest".

Which might, just might, cause people to consider it for uses this project is not intended to satisfy.


I would really love to meet the person that knows the English language well enough to understand that dumb can sometimes mean simple but also would interpret that sentence to mean “world simplest Load balancer”.

Very strange axe you have to grind for a very unlikely hypothetical. Just because something is possible doesn’t mean it is probable.


"Very strange axe to grind" is right. The repo has 25 commits from start to finish. Who is in danger of thinking this is production ready?


> but the author should clearly state what they are doing is not meant to replace production quality solutions.

If you know you need a load balancer you should already know this.


> > but the author should clearly state what they are doing is not meant to replace production quality solutions.

> If you know you need a load balancer you should already know this.

If you know you need a load balancer, you go and look for one. If you happen upon one in a programming language you use, then it may be more appealing.

Hence the need for an explicit disclaimer.


But if you don't do any research to find out if it is production ready, you aren't a production ready dev anyway. A disclaimer isn't going to save them.


[flagged]


Yes, if you will use anything that doesn’t say “NEVER USE THIS IN PRODUCTION. SERIOUSLY THIS IS FOR EDUCATIONAL PURPOSES ONLY” then you are going to get burned by a lot more than a LB.



