
Million WebSockets and Go (2017) - riobard
https://gbws.io/articles/million-websocket-and-go/
======
maxpert
So some time ago, when I was playing around with my toy project (RaspChat), I
noticed that creating 2 channels and a goroutine for every incoming WebSocket
connection is not the answer. I was designing RaspChat to work on a 512MB
Raspberry Pi, and I was bottlenecked by GC and memory consumption at around
3-4K connections. After loads of optimizations I got it to around 5K. Digging
deeper, I found that I would have to maintain a pool of goroutines (like a
thread pool) and write an event loop. I was instantly pulling my hair out. I
was sacrificing so much of the simplicity and flexibility of Node.js just
because I was trying to avoid an event loop and wanted to use channels (I had
done too much Erlang in the months before starting the project and couldn't
think of anything other than processes and messages). I got a backlash on my
release
([https://github.com/maxpert/raspchat/releases/tag/v1.0.0-alph...](https://github.com/maxpert/raspchat/releases/tag/v1.0.0-alpha))
from the Go community telling me how I was using deserializers/leaving
loopholes in file upload and that I didn't know shit about the language.

At that time I found uws
([https://github.com/uNetworking/uWebSockets.js](https://github.com/uNetworking/uWebSockets.js)),
which easily got me to 10K, and I was like "I would rather bet on a
community investing in an efficient websocket event loop than write
my own sh*t". Don't get me wrong; I love Golang! Seriously, I love it so much I
have been pushing my company to use Golang. I just don't want to glorify the
language as a silver bullet (which its fanboys usually do). I would
never implement complicated business logic that involves many moving pieces in
it. When my business requires dealing with the shape of an object and mixing
and matching things to pass data around, I would rather choose a language that
lets me deal with the shapes of objects. Go has its specific use cases and
strengths; people advertising it as "move it to Go and it will be faster than
Java/C#/Node.js etc." have either not done it or have not dealt with the
complexity of maintaining it.

~~~
d33
Pardon the obligatory throwing in of Rust, but it sounds like you were okay
switching languages anyway - have you considered Rust as an option? It doesn't
have GC and has a very healthy ecosystem (recently with async primitives
officially supported by the syntax). It also has the pattern matching you seem
to mean. Perhaps it would help you solve your optimization needs? Otherwise,
I'd love to hear why it's not a good use case for it since I'm still exploring
the language myself.

~~~
T_A_3423
This kind of promotion creates the tense atmosphere around Rust in the
community.

I wonder if anyone has read the linked article?

The overhead of goroutines is well known. The article describes the problem
and a solution.

Now someone who got bitten by the overhead of goroutines complains in an
(understandably) slightly bitter tone. He has a good explanation for the issue
and for why he chose Node rather than Rust.

Citation:

>> I started exploring various options ranging from Rust, Elixir, Crystal, and
Node.js. Rust was my second choice, but it doesn't have a good, stable,
production ready WebSocket server library yet. Crystal was dropped due to
conservative nature of Boehm GC, and Elixir also used more memory than I
expected. Node.js surprisingly gave me a nice balance of memory usage and
speed.

Then someone who doesn't seem to have read any of that comes along and smartly
calls out "Use the awesome Rust".

Even as a Rust user myself I get annoyed.

~~~
Buge
Where is that citation from? Are you quoting from somewhere? I can't find it
in the article.

------
andrewmatte
This is still super interesting two years later, but does anyone have an
update?

Susheel Aroskar, a Netflix engineer, did a talk about push notifications:
[https://www.infoq.com/presentations/neflix-push-messaging-sc...](https://www.infoq.com/presentations/neflix-push-messaging-scale/)
(2018)

~~~
andrewmatte
[https://lwn.net/Articles/775238/](https://lwn.net/Articles/775238/)

Dave Doyle and Dylan O'Mahony also did something pretty amazing with
WebSockets for Bose.

------
phoboslab
It's surprising to me that you apparently have to fight for memory usage for
these cases when using Go.

A while ago I ran a (quite naively written) nodejs application that maxed out
at ~700k WebSocket connections per server - using only 4GB of RAM. Here CPU
became the bottleneck.

~~~
sansnomme
Go's concurrency design trades off memory usage for productivity; instead of
red-blue functions where you have to explicitly design for function
interrupt/yield points with the async keyword, you can just write sequential
code and the runtime will handle the rest. The downside to this approach is
that often the stack will have to be copied during the switching process vs
the stackless approach preferred by Node.js, Rust, C# etc.

See the excellent Fibers Under a Magnifying Glass paper by Microsoft Research:
[http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2018/p136...](http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2018/p1364r0.pdf)

~~~
Thaxll
This is not how Go "works" overall; you're talking about the size of the
goroutine stack, which is 4KB by default, so in a scenario with a lot of
connections, yes, it's going to add up if you use 1:1 connection / goroutine,
but outside of that Go uses less memory than Node / C# / Java / Python etc.

So I wouldn't say "Go trades off memory usage for productivity", since Go is
widely used for its low memory footprint.

Same reason why Go makes sense in services like Kubernetes, where each pod
is in the range of 2-digit MB; it wouldn't be possible with the languages
mentioned above.

Edit: In your edit context it makes more sense :)

~~~
apta
> where each pod is in the range of 2-digit MB; it wouldn't be possible
> with the languages mentioned above.

Not sure about NodeJS and Python, but it's certainly doable with Java and C#.
It's just that people don't take the time to configure the JVM/CLR correctly.

There's nothing magical about golang that you can't do in C# (and soon enough,
in Java with the addition of value types). Arguably, C# and Java's value type
implementations are superior anyway.

~~~
Thaxll
Kubernetes was released in 2015; at that time it wasn't possible to run
Java / C# servers with settings below 256MB (-Xms), and I'm pretty sure it's
still the case with Java as of today. Try to run some service with -Xmx -Xms
128MB and tell us how it goes.

~~~
apta
> Try to run some service with -Xmx -Xms 128MB and tell us how it goes.

What does the service do? An API call that returns the current time? A batch
processor? A payment portal? The memory usage depends on the type of work
performed obviously.

Furthermore, there are already offerings like
[https://quarkus.io/](https://quarkus.io/), micronaut, and others that make
use of native image compilation for even smaller footprints.

------
jbmsf
There's a brief mention of the load-balancer (nginx) in front of the Go
servers; I'm curious if there's anything interesting happening there. I'd
imagine that if you lose a server, all of the clients will try to reconnect
and traffic will be spread across the existing servers. That's all fine and
good, but presumably when you bring up a new server to replace the failed one,
it'll be seriously underutilized. Is there some easy solution here in
nginx-land?

~~~
toredash
For websocket? Yes
([https://github.com/SocketCluster/loadbalancer](https://github.com/SocketCluster/loadbalancer)),
but you would have to introduce another layer (AFAIK) that would detect
failure and reconnect to a healthy target without informing the client.

------
fasteo
For mail.ru, I was expecting [1] you would use tarantool for this task

[1] [https://hackernoon.com/tarantool-when-it-takes-500-lines-of-...](https://hackernoon.com/tarantool-when-it-takes-500-lines-of-code-to-notify-a-million-users-11d340523493)

------
maurodelazeri
nothing beats this
[https://github.com/uNetworking](https://github.com/uNetworking)

