Having said that, I will add that I think it is good to have Elixir.
I use long lived processes and had to come up with some magic to work around the supervisor behavior with high child counts, etc.
Roughly: workers are randomly assigned to a node in the cluster if they have not yet been assigned (there is some logic tracking the total nodes in the cluster and a max/target count that influences the decision). I verify whether the remote (or local) worker is alive or migrating by checking a fragmented process Registry and a unique identifier via :rpc, because I do some recovery logic if it's offline and let the caller specify whether messages should force a spawn or can be ignored if the process is offline. I then pad the call with some meta information for rerouting, so the receiver can confirm it is the same process the message was initially sent to (processes cycle so frequently that the initial process may have died and a new one may have spawned in the time it took to forward the message).
If the process has changed mid-transit, or the worker has been flagged for migration, the message gets rerouted to the new destination and process. If a process does not yet exist, a hash based on the worker type + id and the available worker pools is used to select which of the auto-generated but named and registered worker supervisors (e.g. WorkType.PoolSupervisor_123) spawns the child.
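A minimal sketch of that kind of hash-based pool selection (the module names, pool count, and `DynamicSupervisor` usage here are my assumptions for illustration, not the poster's actual code):

```elixir
defmodule PoolRouter do
  @pool_count 16

  # Pick a pool supervisor deterministically from the worker's type and id.
  # :erlang.phash2/2 returns a stable hash in 0..(@pool_count - 1), so the
  # same {type, id} always routes to the same registered supervisor name.
  def pool_name(worker_type, worker_id) do
    index = :erlang.phash2({worker_type, worker_id}, @pool_count)
    Module.concat(worker_type, :"PoolSupervisor_#{index}")
  end

  # Spawn the child under the selected (already running) pool supervisor.
  def start_worker(worker_type, worker_id, child_spec) do
    DynamicSupervisor.start_child(pool_name(worker_type, worker_id), child_spec)
  end
end
```

Because the hash is deterministic, any node can compute where a not-yet-running worker should live without coordinating with the others.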
It's a trip, and needs to be heavily documented. Starting from scratch I'd probably change some things, and it probably needs some refinement later this year before the next batch of 250-500k devices gets added to the network, but the cost per reporting device is fantastic, with plenty of low-hanging fruit for improving the COGS further, so I'm happy.
Concretely: is it the case that, for an application where Elixir/Erlang/BEAM is a great choice but another language would also be fine, the equivalent Elixir application results in less downtime and fewer pages than the alternative? Anything from the perfect app to something with a ton of races/leaks.
Is this a fair question (maybe I'm presuming too much of the BEAM/supervisor pattern; I have zero experience with it)?
I have zero on-call for Go.
I had very few for Elixir, but the bugs were in logic code.
Same with Ruby.
But it's a disaster with Node. We used TypeScript, so it catches a lot of type issues. However, the Node runtime is weird. We ran into DNS issues (you have to bump the libuv thread pool size, cache DNS lookups), JSON parsing that blocks the event loop, max memory limits, etc.
Don't have any hard data to compare, but having been involved in debugging running Erlang systems: it's very nice having the ability to restart separate supervisors while the rest of the processes handle requests. Being able to do hot code loading to, say, fix bugs or add extra logging. And my all-time favorite -- live tracing after connecting to a VM's remote shell. You can just pick any function, args, and process and say "trace these for a few seconds if a specific condition happens". None of those individually are earth-shattering, but taken together they are just so pleasant to use. I wouldn't enjoy going back to anything that didn't have those capabilities.
And yes, that restarting of sub-systems (supervision trees) happens automatically as well. There were a number of cases where it turned a potential "wake up at 4am and fix this now, cause everything crashed" into a "meh, it's fine until I get to it next week" kind of problem.
Overall I would say this book is a good start https://www.erlang-in-anger.com/
Supervisors are just a general pattern in Erlang. Any book will have something about it. I like this one: https://learnyousomeerlang.com/supervisors
Restart frequency and limits are just parameters you specify, so you don't need to do anything fancy or special there.
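In Elixir those parameters look like this (a sketch with a made-up trivial worker; `max_restarts`/`max_seconds` are the restart intensity knobs):

```elixir
defmodule MyApp.Worker do
  use GenServer
  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)
  @impl true
  def init(arg), do: {:ok, arg}
end

defmodule MyApp.WorkerSupervisor do
  use Supervisor

  def start_link(arg), do: Supervisor.start_link(__MODULE__, arg, name: __MODULE__)

  @impl true
  def init(_arg) do
    children = [
      {MyApp.Worker, []}
    ]

    # If children crash more than 3 times within 5 seconds, the supervisor
    # itself gives up and crashes, escalating the failure up the tree.
    Supervisor.init(children, strategy: :one_for_one, max_restarts: 3, max_seconds: 5)
  end
end
```

The escalation is what turns "restart forever" into a meaningful signal: a crash loop eventually takes down the subtree, and the parent supervisor decides what to do next.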
Hot code loading might not be as obvious: http://erlang.org/doc/reference_manual/code_loading.html but it is essentially just compiling the module on the same VM version (or close by, no more than 2 versions away) and copying it to the server at the same path as the original. The original can be saved to a backup file. Then do `l(modulename)` to load it.
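From an attached IEx shell the equivalent of `l(modulename)` looks roughly like this (the module name is arbitrary; this assumes the new `.beam` has already been copied over the old one):

```elixir
# Purge the old version of the module, then load the new .beam from
# the code path. This is what l(my_module). does in the Erlang shell.
:code.purge(:my_module)
{:module, :my_module} = :code.load_file(:my_module)
```

Existing processes keep running the old code until they make a fully-qualified call into the module, at which point they pick up the new version.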
For tracing I recommend http://ferd.github.io/recon/. The Erlang in Anger book will also have examples of tracing. http://erlang.org/doc/man/dbg.html has some nice shortcuts too, but be careful using it in production, as it doesn't have any overload protection. So if you accidentally trace all the messages on all the processes, you might crash your service :-)
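With recon available on a node, a capped live trace from a remote IEx shell looks roughly like this (the traced functions and `some_pid` are arbitrary examples; the second argument caps the number of trace messages, which is exactly the overload protection `dbg` lacks):

```elixir
# Print the next 10 calls to :lists.seq/2 across all processes,
# then stop tracing automatically:
:recon_trace.calls({:lists, :seq, :_}, 10)

# Restrict the trace to a single process of interest:
:recon_trace.calls({MyApp.Worker, :handle_info, :_}, 10, pid: some_pid)

# Turn all tracing off early:
:recon_trace.clear()
```

Because the trace shuts itself off after the message cap, it's safe to run against a busy production node.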
* Are the teams that use certain languages comprised of more experienced people?
* How mature is the company and project? I.e., a faster-moving startup cutting more corners, where time was decided to be of the essence (rightly or wrongly), will likely produce more on-call incidents than a slower, more established company that can take its time
The general idea is combining queries from different HTTP requests into a single database query/transaction, amortising the (significant) per-query cost over those requests. For simple use-cases it doesn't add a whole lot of complexity, can reduce both load and latency significantly, and doesn't lose transactional guarantees.
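A minimal sketch of that batching idea in Elixir (the module, the flush interval, and the pluggable `run_batch` function are my inventions; a real implementation would also need error handling and back-pressure):

```elixir
defmodule QueryBatcher do
  use GenServer

  @flush_after 5  # ms to wait for more requests before flushing the batch

  def start_link(run_batch),
    do: GenServer.start_link(__MODULE__, run_batch, name: __MODULE__)

  # Called from each HTTP request process; blocks until the shared batch runs.
  def query(q), do: GenServer.call(__MODULE__, {:query, q})

  @impl true
  def init(run_batch), do: {:ok, %{run_batch: run_batch, pending: []}}

  @impl true
  def handle_call({:query, q}, from, state) do
    # Start the flush timer on the first query of a batch.
    if state.pending == [], do: Process.send_after(self(), :flush, @flush_after)
    {:noreply, %{state | pending: [{from, q} | state.pending]}}
  end

  @impl true
  def handle_info(:flush, %{run_batch: run_batch, pending: pending} = state) do
    {froms, queries} = Enum.unzip(Enum.reverse(pending))
    # One database round trip / transaction for the whole batch.
    results = run_batch.(queries)
    Enum.zip(froms, results)
    |> Enum.each(fn {from, result} -> GenServer.reply(from, result) end)

    {:noreply, %{state | pending: []}}
  end
end
```

Each caller pays at most a few milliseconds of extra latency, and in exchange N concurrent requests share one per-query overhead instead of paying it N times.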
Not 100k/sec writes on my laptop, mind you :-).
E.g. please give me guidance on how to better structure my database model so that it doesn't effectively end up as a huge spaghetti heap of global variables. My personal horror: updating a single database field spurs 20 additional SQL queries creating several new rows in seemingly unrelated tables. Digging in I find this was due to an after_save hook in the database model which created an avalanche of other after_save/after_validation hooks to fire. The worst of it: Asking for how this has come to be I find out that each step of the way was an elegant solution to some code duplication in the controller, some forgotten edge case in the UI, some bug in the business logic. Basically ending up with extremely complex control flows is the default.
So of course, if your code has next to no isolation, batching up queries produces incalculable risks.
One mitigating factor, this sort of optimisation should be applied to frequent queries more than expensive queries. In some use-cases the former kind may be simple ("Is this user logged in?") even if the latter is not.
And on keeping that complexity down: the traditional story has been "normalise until you only need to update data in one place," but often requirements don't line up well with foreign-key constraints etc. The newer story can work, though: "Denormalise until you only have to update in one place, shunt the complexity to user code, and serialise writes." It's anathema to many, but it is becoming more common (usually in places that don't use RDBMSs, though.)
does anyone know how does 100k connections compare with other servers?
The order of magnitude(s) differentiator for server performance really comes down to whether or not the architecture is blocking or non-blocking.
Also, assuming it scales up linearly is a bit risky, although I agree with that kind of conn/s I am sure it will be sufficient.
that being said, this article was pretty informative. The bit about the proposed SO_REUSEPORT socket option was really interesting. Really fun to read about performance bottleneck detection and improvement.
edit: wow, downvoting for making a simple joke about liking elixir. Cool.
Maybe we should add something about this to https://news.ycombinator.com/newsfaq.html.
I think that the inclination towards "meaningless" humor makes it too much like Reddit. These folks want SUBSTANCE! (Well, that's why _I_ come here, at least!)
Also, any update on your previous article? https://news.ycombinator.com/item?id=19094233
> Efficiency in the BEAM is mainly in service of its primary goal of fault-tolerance. If one process crashes unexpectedly, the others should continue. By the same logic, if one process is CPU-intensive or IO-blocked, the others should keep making progress smoothly. And if processes are good for isolating errors and performance issues, they should be cheap enough that we can run a lot of them at once. Those assumptions are baked into how the BEAM manages processes.
If raw speed is your only goal, the BEAM probably isn't the best choice. If consistent speed and stability matter, it may be.
More on this at https://dockyard.com/blog/2018/07/18/all-for-reliability-ref...
- Go wants to be performant at high concurrency scale
- Erlang/Elixir wants to keep running at high concurrency scales, whatever the issues are in your application code. Performance comes second.
There's no clear cut answer to your question; I guess if you trust yourself to write servers that will hold a large number of connections while doing a lot of processing then Go has an advantage, otherwise you should probably trust the man-centuries behind the BEAM VM and follow the various blog posts/presentations explaining how you can fine-tune your machine to get to super large scales.
I want to point out that "performance" is too generalized here.
The BEAM VM also has a goal of low latency, which can be considered a kind of performance. I'm not entirely sure whether Go is aiming for that or not. I would never do any numerical work on the BEAM though; it's very slow.
This article is a bit dated but is an interesting comparison between Go and Erlang:
If it's pure benchmarks, then Go is usually going to come in a little bit ahead.
When you get into comparing language design, underlying architectural decisions, problems solved/created/avoided by those decisions it gets more complex.
I did a big write-up for Codeship a couple of years ago. It had a solid discussion on HN, and the comparison remains fairly accurate.
Both of them give you less flexibility than is necessary to achieve highly efficient use of all threads on a multiprocessor system. For that, you'll need something like a pool of event loops using async/await. This is the system most common in high performance networking in C++, C, and Rust.
Erlang and Go both sacrifice efficiency to improve maintainability and safety by offering a model that allows you to approach concurrency from a more synchronous mindset. Erlang in particular goes beyond Go in that the Actor model is considerably easier to avoid deadlocks and other concurrency bugs in at the expense of a much more opinionated system. Erlang is also less focused on reducing average latency as much as keeping latency predictable at scale.
Long story short, Erlang, Go, and the rest are not apples to apples comparisons, and it takes investment in each language to understand the tradeoffs and use cases for each. You should also view them holistically, as in, what language can my team support, and will the wins from Erlang's message queues outweigh the smaller community, or will Go's mid tier performance be enough to avoid writing on top of the low level libevent and building a custom thread pool or fine tuning Go's scheduler.
The question is whether you can or want to write better concurrent code by rolling it yourself.
Some benchmarks of pure CPU computation:
Erlang is really much slower than Java.
Go and Java now:
You see the big difference: Go and Java are on par, but Go usually takes way less memory than Java.
You can't have everything: message passing and immutability with no performance hit.
Are you juggling lots of messages concurrently and orchestrating across complex topologies of nodes? A BEAM language is going to excel. That's why Whatsapp, Discord, and RabbitMQ use Erlang/Elixir.
Are you trying to go really fast in a straight and simple concurrency scenario? Go/Java/C++/Rust is going to be faster than a BEAM language in those scenarios.
You won't want to implement a complex concurrency run-time in Java whereas Elixir is not a good choice for a 3D game engine.
Still, there's nothing wrong with using both.
EDIT: I believe this is partially due to Go being a lot more CPU efficient overall than Erlang (see below). So for simple servers, Go and Erlang will match performance, but for slightly more complex web servers that need to crunch some data, Go [and Rust] will outperform the Erlang VM. https://stressgrid.com/blog/benchmarking_go_vs_node_vs_elixi...
Can someone educate me on what they might be talking about here?
CPU is ~45% in their final graph. I don't know what network latency means in this context though. Roundtrip for a TCP handshake ? That seems unlikely.
14 core machine comparing .net core with other top webservers:
But since they are benchmarking Elixir, there is some amount of overhead involved in that framework's management of connections and requests. If I knew Erlang/Elixir, that would be a fascinating thing to explore.
Edit: I'm assuming the saturated CPU comes from Elixir and not the OS. It would be strange for 100k/sec to saturate the TCP stack with 36 cores.
To run this test, we used Stressgrid with twenty c5.xlarge generators.
100K/sec was achieved by yours truly 10 years ago on a contemporary xeon with nothing but nginx and python2.6 - gevent patched to not copy the stack, just switch it. (EDIT: and also a FIFO I/O scheduler)
Why does this require 36 cores today??
They are purposely holding the connections open for 1 second (+10%). So first of all, it means that, at a rate of 100k conn/s, they are going to have around 200k open connections after a second. This already imposes a different profile than 100k single-request connections per second.
You are also assuming that they need 36 cores to achieve 100k connections per second, which is likely not the case since they quickly moved the bottleneck to the OS. I am assuming they have other requirements that force them to run on such a large machine and they want to make sure they are not running into any single-core bottlenecks (and having a large amount of cores makes it much easier to spot those).
> What this means, performance-wise, is that measuring requests per second gets a lot more attention than connections per second. Usually, the latter can be one or two orders of magnitude lower than the former. Correspondingly, benchmarks use long-living connections to simulate multiple requests from the same device.
Yes. Which is not what's being discussed here.
Why is this even here, then?
I think the limiting factor might not be the number of cores but something outside of Erlang's scope: the ethernet card they used, the network infrastructure, etc. Even Elixir itself could be something that impacts the tests.
The work in some unknown state is at https://code.google.com/archive/p/coev/
Without the business logic (which was in django IIRC) and deployment details, obviously. Very outdated and some later patches might be missing. No one was interested, you see.
I'd be surprised if there were problems with network, and if there were, that should have been obvious in the metrics.
Maybe the metrics were inadequate.
Can't see how this can be replicated as a controlled experiment nowadays, unfortunately.
But if you define exactly what counts as a request, what counts as a response, and what the connection/response ratio is, let's have a race.
Like, you set the parameters, and whoever serves that on lower-capability hardware wins. Py3 plus low-level C/Rust hacks vs Elixir, say.
I offered to beat whatever you've done by tweaking the Py3 stdlib. Not by writing a plain C implementation.
If you for some reason doubt that this old python thing is of the real world - let me disappoint you. It was done because nothing else could do those 100K rps back then. And it did the thing for five years, until the whole stack was ditched.
As an Elixir user who had to deal with high connections/s in the past, I found it interesting and useful. I use Elixir for reasons that have nothing to do with performance so a language comparison isn’t particularly interesting.
What does that mean? You keep qualifying "connections." It's a connection. It holds onto its connection for X period of time. An HTTP request is just a single-request connection, which is NOT what this article is discussing.
I admit I didn't see at first that they don't actually do any I/O over those connections.
Well, you know, handling x accepts() per second and holding onto y fds is even less than nothing to be proud of.