The isolation is really not that high. erlang:process_info allows one process to observe an awful lot about another, and sys:get_status often works (depending on software design choices) and provides even more data. I would not run two applications that needed security isolation from each other in the same BEAM instance or as part of the same dist network.
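For instance, a minimal sketch of what one process can see about another (the observed process here is just a stub):

```erlang
Pid = spawn(fun() -> receive stop -> ok end end),
%% Any process on the node can read another's mailbox and code location:
{messages, _Queue} = erlang:process_info(Pid, messages),
{current_function, _MFA} = erlang:process_info(Pid, current_function),
Pid ! stop.
%% For OTP processes (gen_server and friends), sys:get_status(Pid)
%% additionally exposes the callback module's internal state.
```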
Not quite: the messages sent are kept in shared memory (which is reference-counted). Cycles can't occur due to immutability. But with everything else, you're correct. One reason for this is that every thread has its own garbage collector! This allows multithreaded garbage collection without stop-the-world pauses.
Source: Did a bunch of research into BEAM for implementing my own multithreaded GC.
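As a small shell-level illustration of that design: each heap can be collected individually, with no effect on the rest of the VM.

```erlang
Pid = spawn(fun() -> receive stop -> ok end end),
true = erlang:garbage_collect(Pid),               %% collects only Pid's heap
{garbage_collection, _Stats} =
    erlang:process_info(Pid, garbage_collection), %% per-process GC settings
Pid ! stop.
```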
BEAM really is beautiful!
This was one of the things that really floored me when I learned about how BEAM was designed. It's incredible that how all the seemingly small changes come together in a way that works really well for the problem domains Erlang was used for.
BEAM is truly a modern engineering marvel.
Only large binaries and slices of them use reference counting; everything else is copied. And of course, if the process is remote, everything is copied :)
Bin0 = <<0>>, %% 1: small binary, lives on the process heap
Bin1 = <<Bin0/binary,1,2,3>>, %% 2: still small enough to be copied; only binaries larger than 64 bytes are stored as shared, reference-counted "refc" binaries
I'm not sure if you have answers, but I have questions:
1. How do they determine what size to start sharing/refcounting at? Now that I'm actually considering it, it's obvious that they'd copy small stuff--an obvious lower limit would be sizeof(<reference_counter_type>) + sizeof(<pointer_type>) because smaller than that, it actually is faster/less memory to copy. But the numbers you're mentioning are higher than that--is there some metadata being stored beyond the reference count and pointer?
2. Is copied data stored in thread storage and garbage collected by that thread's garbage collector? That would be my guess.
Obviously the source of truth for all these questions is the code, but it's been a while since I dug in there. Maybe it's time to go again.
2. See: https://www.erlang-solutions.com/blog/erlang-garbage-collect...
There's no thread-local GC, it's really per process. For instance a scheduler thread that's running a process and hits the need to do an expensive GC may delegate the task to a dirty scheduler (another type of thread, dedicated to long blocking tasks), and go back to running other processes in the meanwhile.
It's frequent for processes to move around between schedulers, they steal work from each other.
I’d bet optimizing local message passing would be a good feature of a BEAM implementation in Rust, since the ref counting and other details would be less worrisome. Another research topic would be how BEAM compares to, say, the Java/.NET runtimes on CPU cores with weaker hardware shared-memory guarantees, relying on message copying between cores instead.
Edit: and for cross-system dev (e.g. debian to windows)
Containers are all about consistency of the environment. For example, maybe your Elixir app wants to log out to something like Splunk or DataDog...if those agents aren't installed on the host, it doesn't matter what facilities BEAM provides. Or maybe your app depends on certain configs, libraries, or files being available at specific paths on the system. Things like that would be captured in the image your containers are running and guarantee consistency across machines and environments.
glibc builds don't work on Alpine, which uses musl instead
IMO, in most use cases it makes sense to just use whichever you're more familiar with and will be more productive with. After all, that is the entire point of said tools...
I say done right because kubernetes doesn't allow hot code reload without restarts.
Running BEAM on docker in kubernetes is great, a lot is added and operations smoothed by doing so.
And horizontal scaling between the two does not overlap either; not sure what you mean. Packing containers (processes) efficiently onto a cluster of nodes is not something Erlang does.
Edit: Erlang/OTP does offer fail over for what it calls "distributed applications" but based on a static set of nodes -- not horizontal scaling, anymore than other languages/frameworks do by letting you spawn new instances...
And with Docker you still need libraries like, y'know, Docker for your code to run.
>Packing containers (processes) efficiently into a cluster of nodes is not something Erlang does.
How much have you used BEAM? It definitely does do that. Scalability is one of Erlang/Elixir's primary benefits, and node clustering is exactly how it does that.
Over a decade professionally.
The closest thing Erlang has to it is pool http://erlang.org/doc/man/pool.html
We never used pool. The nodes were mapped onto heterogeneous machines sharing the host with a 3rd-party daemon. Its configuration changes even took place through a module update hook written in Erlang itself. We both deployed new code and distributed work "manually" across them, entirely on OTP.
It is surprising, or was to me, that there are problems with having a fairly small number of nodes fully connected. I'm lucky enough to have avoided learning this the hard way, but I imagine this could serve as a painful backbone to an "Erlang deployment war story".
I worry about the misconception of what Erlang offers and the resulting frustration, blaming it on the tool, and quitting; I saw this in the first hype phase of Erlang a number of years ago.
Hoping in the chapters I'm working on for https://adoptingerlang.org/docs/production/ I can better explain the benefits of running on k8s (or similar), while also making clear it certainly isn't the right choice in all cases.
There are many cases where having an Erlang cluster and doing ‘naive pooling’ at the application level would get you 80% of what k8s + routing layer would get you. I’m assuming more of an all-Erlang/Elixir environment, which many small startups could get by with. Even then, Docker + Fargate or whatnot would still simplify deployment. Personally I harbor a secret desire to see if I could replicate part of the k8s interface using Nerves images + some OTP tooling. Probably better things to do with my life though. ;)
Also by ‘heart’ are you referring to how ‘epmd’ works or Erlang distribution?
I think part of the confusion is that in general you don't need k8s, but when your size and requirements get to a point where k8s makes sense (and that point arguably lowers as hosted k8s gets better), it is not in conflict with also using Erlang/Elixir.
I find distributed Erlang much nicer in k8s env (I don't have to maintain it obviously) where I get an IP per node, can add a k8s service making it possible to use DNS SRV queries to find all other nodes and letting k8s worry about where pods run and keeping them up.
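For illustration, discovery against such a headless service boils down to one SRV query (the service and node names below are hypothetical):

```erlang
%% Each SRV record is {Priority, Weight, Port, Host}:
Records = inet_res:lookup("myapp-headless.default.svc.cluster.local", in, srv),
%% Connect to every pod found (node naming scheme "myapp@Host" is an assumption):
[net_adm:ping(list_to_atom("myapp@" ++ Host)) || {_P, _W, _Port, Host} <- Records].
```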
Plus there is configuration management and consistent storage (resources in etcd).
So I'd imagine there's potential utility in still running inside docker, as a means to deploy the expected OS, the correct assets, the version of BEAM itself, etc. And there are reasons to isolate all that with VMs, to prevent sharing of machine resources like disk, file descriptors, sockets etc.
As we brought on our first few customers I kept expecting we would need to increase our hosting capacity. Here we are 2 years later and we are still running the whole thing with our initial hosting setup.
It seems it definitely hits a niche market at the current time, but say in 2030, I’d imagine most people will have migrated their phone setups over to a modern, text-supported-by-default one?
> Finally, I've seen various reports that the practical size limit of a BEAM cluster is in the range of 50-100 nodes. The reason for this is that BEAM cluster establishes a fully connected mesh (each node maintains a TCP connection to all other nodes), so at some size this starts to cause problems. As far as I know, the OTP team is working to improve this, but as of OTP 22 it is still not done.
I've run clusters of 1-2k machines at my last job (maybe it was bigger, but I can't remember for sure). Holding a TCP connection to each other node is not a problem --- we certainly had a lot more connected clients than connected servers, though tuning memory for buffers can be an issue on low-RAM systems. Global can get to be a problem: I'm not sure of the state in open-source OTP, but if you have multiple nodes contending on the pg2 global lock for a group, it can get really slow. There are ways to make that better, but you do need to be careful not to introduce new deadlocks. If it's still using the simple method (try to lock everyone; if unsuccessful, unlock, wait a bit, and try again), that doesn't work well under significant contention, or if one or more nodes is unhealthy and running slowly but staying online.
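For context, the pg2 calls involved look like this (a minimal sketch; the group name is hypothetical, and pg2 has since been superseded by pg in newer OTP releases):

```erlang
ok = pg2:create(workers),           %% cluster-wide group; creation coordinates via the global lock
ok = pg2:join(workers, self()),     %% membership changes also coordinate across nodes
Members = pg2:get_members(workers). %% reads are served from a local table, so they stay cheap
```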
The quality of network needed really depends on your tick timeouts and the amount of data you're transmitting. Dist will work with slow and lossy networks as long as it can get a ping transmitted often enough. The default net_ticktime is 60 seconds (a tick goes out every quarter of that interval, and the connection is considered down if nothing arrives within the full interval), so you really just need pings coming through about once a minute, and for your OS not to give up on the TCP connection.
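That timeout is the kernel application's net_ticktime parameter; for slow links you can raise it, e.g. in sys.config:

```erlang
%% sys.config: net_ticktime is in seconds; ticks are sent every quarter interval
[{kernel, [{net_ticktime, 120}]}].
```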
It wouldn't work well for mobile, but between two reasonably connected datacenters, it should be fine. Anyway, dist should only be used between nodes at the same trust level --- anything you can do on one node can be done from the other node; consider it a bidirectional shell. I've debugged plenty of cases where an intermediate link was congested, resulting in very low throughput and tens-of-minutes message delays on dist; it was still working ok --- just anything synchronous would take forever.
This is exactly why I wasn't excited about LiveView. It felt like a step backwards in terms of human-centric design. Another tool that makes us consider our network bars first and our life second.
In general, I'm kind of disappointed that Elixir isn't leading the way on decentralized and offline-first technology, but I guess it's a limitation of BEAM running on small/low-powered devices?
If you need a live server to provide you with updates, what difference does it make whether it's LiveView or some other JSON/websocket setup?
Even things like UI changes require the server connection.
I understand that apps require data anyway, but I find offline-first design to be much more compelling for most apps.
Things shouldn't halt or break if your connection goes out or is spotty.
Maybe I'm wrong and the future will be everyone cooking in a giant microwave of internet waves :)
What does he mean by that exactly? Erlang provides some persistent state storage? Or is he just saying he used Erlang database drivers to access Redis / MongoDB?
Yes, it's called Mnesia.
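A minimal sketch of what that looks like (the table layout is made up for illustration):

```erlang
ok = mnesia:create_schema([node()]),   %% on-disk schema, must precede start
ok = mnesia:start(),
{atomic, ok} = mnesia:create_table(user,
    [{attributes, [id, name]}, {disc_copies, [node()]}]),
{atomic, ok} = mnesia:transaction(
    fun() -> mnesia:write({user, 1, <<"alice">>}) end).
```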
If you need non blocking, concurrency aware memory storage that can eventually be serialized to disk, look at ETS/DETS. They are part of the stdlib.
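For example (table and file names hypothetical):

```erlang
%% ETS: in-memory, safe for concurrent readers with read_concurrency
T = ets:new(cache, [set, public, {read_concurrency, true}]),
true = ets:insert(T, {key, <<"value">>}),
[{key, <<"value">>}] = ets:lookup(T, key),
%% DETS is the disk-backed counterpart:
{ok, D} = dets:open_file(cache_disk, [{file, "cache.dets"}]),
ok = dets:insert(D, {key, <<"value">>}),
ok = dets:close(D).
```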
But you could also use Riak, which is entirely written in Erlang.
Most of the time Erlang and OTP provide what you need already without having to reach for an external tool. (Obviously depending on your use case)
Has anybody here played/worked with this ?
Go has no such guarantees.