> distribution, for example, is a much lauded feature of elixir/erlang but if you look into the implementation it's really just a persistent tcp connection with a function that evals code it's sent on the other end...
I mean, this is just straight up incorrect. Yes, the underlying transport is TCP, but remote evaluation is definitely _not_ the common case. Messages sent between nodes are handled by the virtual machine just like messages sent locally; that is the main benefit of distributed Erlang - location transparency. Yes, you _can_ evaluate code on a remote node, which can come in handy for troubleshooting or orchestration, but it is certainly not the default mode of operation.
> there's no security model
I mean, there is, but it isn't a rich one. If one node in the cluster is compromised, the cluster is compromised, but the distribution channel is very unlikely to be the means by which the initial compromise happens if you've taken even the most basic precautions with its configuration. It would be nice to be able to tightly control what a given node will allow to be sent to it from other nodes (e.g. disallow remote eval, only allow messaging to specific processes), and I don't think there are any fundamental blockers; it's just not been considered a significant enough issue to draw contribution on that front.
> the persistent connections won't scale past a modest cluster size
I mean, there is already at least one alternative in the community for doing distribution with large clusters; Partisan in particular is what I'm thinking of.
> these are both very crude key/value stores with only very limited query capabilities
What? You can literally query ETS with an arbitrary function, you are limited only by your ability to write a function to express what you want to query.
You shouldn't use them in place of a database, but they are hardly crude or primitive.
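To make that concrete, here is a minimal sketch in Elixir. The `:inventory` table and its rows are made up for illustration. The query is written as a match specification, which is the form that `ets:fun2ms/1` compiles a fun into on the Erlang side; the pattern, guard, and result are all evaluated inside ETS rather than by reading everything out first.

```elixir
# Hypothetical table and rows, for illustration only.
table = :ets.new(:inventory, [:set, :public])

:ets.insert(table, [
  {:apple, 3, :fruit},
  {:pear, 12, :fruit},
  {:leek, 40, :vegetable}
])

# A match specification is a list of {pattern, guards, result} clauses.
# This one reads roughly as:
#   fn {name, qty, :fruit} when qty > 5 -> name end
match_spec = [
  {{:"$1", :"$2", :fruit}, [{:>, :"$2", 5}], [:"$1"]}
]

# The filtering happens inside ETS; only the matching names are copied out.
fruit_over_five = :ets.select(table, match_spec)

IO.inspect(fruit_over_five)
```

With the rows above, only `:pear` is both a fruit and above the quantity threshold, so that is the single name returned.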
> elixir/erlang are excellent for software that runs on appliance style hardware where you can't simply add machines to a cluster. it is, in fact, what erlang was designed to do. what this ignores though is that this is a terrible model for a service exposed over the internet that can run on any arbitrary machine in any data center you want
I think you are misconstruing the point of "doing more with less" - the point isn't that you only need to run a single node, but that the _total number of nodes_ you need to run is a fraction of what other platforms need. There are plenty of stories of companies replacing large clusters with a couple of Erlang/Elixir nodes. Scaling them is also trivial, since scaling horizontally past 2 nodes doesn't require any fundamental refactoring. Switching from something designed to run as a bunch of standalone nodes in parallel to something distributed _does_ require different architectural choices, and could require significant refactoring, but making that jump would require significant changes in any language, as it is a fundamentally different approach.
> elixir/erlang's features that increase its reliability on a single machine are a cost you pay not an added benefit. the message passing actor model erlang built its supervision tree features around are a set of restrictions that are imposed so you can build more reliable stateful services on machines that don't have access to more conventional approaches to reliability (like being stateless and pushing state out to purpose built reliable stores)
I'm not sure how you arrived at the idea that you can't build stateless servers with Erlang/Elixir. You obviously can; there are no restrictions in place that prevent that. Supervisors are certainly not imposing any constraints that would make that more difficult.
The benefits of supervision are entirely about _handling failure_, i.e. resiliency and recovery. Supervision allows you to handle failure by restarting the components of the system affected by a fault from a clean slate, while letting the rest of the system continue to do useful work. This applies to stateless systems as much as stateful ones, though the benefits are more significant to stateful systems.
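As a rough sketch of "restarting from a clean slate", the script below supervises a single stateful process and then kills it. `Counter` and `Demo` are hypothetical modules invented for this example, and the polling helper is just one simple way to wait for the restart in a script.

```elixir
defmodule Counter do
  use Agent

  # Hypothetical stateful component: an in-memory counter starting at 0.
  def start_link(_opts), do: Agent.start_link(fn -> 0 end, name: __MODULE__)
  def increment, do: Agent.update(__MODULE__, &(&1 + 1))
  def value, do: Agent.get(__MODULE__, & &1)
end

defmodule Demo do
  # Poll until the supervisor has registered a replacement process.
  def await_restart(name, old_pid) do
    case Process.whereis(name) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        await_restart(name, old_pid)
    end
  end
end

# :one_for_one restarts only the child that failed.
{:ok, _sup} = Supervisor.start_link([Counter], strategy: :one_for_one)

Counter.increment()
Counter.increment()
before_crash = Counter.value()

# Simulate a fault in the component.
old_pid = Process.whereis(Counter)
Process.exit(old_pid, :kill)

# The supervisor starts a fresh process with fresh state.
new_pid = Demo.await_restart(Counter, old_pid)
after_restart = Counter.value()

IO.inspect({before_crash, after_restart, new_pid != old_pid})
```

The counter reads 2 before the crash and 0 afterwards, under a different pid; the script itself, standing in for "the rest of the system", keeps running throughout.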
> the idea that these features are appropriate for a totally standard http api running in aws or digital ocean or whatever backed by a postgres database and a memcache/redis cluster is not really borne out by reality however. if it were surely other languages would have incorporated these features by now? they've been around for 30 years and the complexity (particularly of distribution and ets) is low enough you could probably implement them in a weekend
The reason these features don't make an appearance in other languages (which they do to a certain extent, e.g. Akka/Quasar for the JVM, which provide actors; Pony, which features an actor model; and libraries like Actix for Rust, which try to provide functionality similar to Erlang's) is that without the language being built around them from the ground up, they lose their effectiveness. Supervision works best when the entire system is supervised, and supervision without processes/actors/green threads provides no meaningful unit of execution around which to structure the supervision tree. Supervision itself is built on fundamental features provided by the BEAM virtual machine (namely links/monitors, and the fact that exceptions are implemented in such a way that unhandled exceptions get translated into process exits and thus can be handled like any other exit). The entire virtual machine and language are essentially designed around making processes, messaging, and error handling cohesive and efficient. Could other languages provide some of this? Probably, though it certainly isn't something that could be done in a weekend. No language can provide it at the same level of integration and quality without essentially being designed around it from the start, though, and ultimately that's why we aren't seeing it added to languages after the fact.
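The link/monitor point can be sketched in a few lines of Elixir. This is only an illustration of the primitives mentioned above, not how `Supervisor` is written internally; the error message and the `:shutdown_requested` reason are arbitrary.

```elixir
# Trapping exits turns exit signals from linked processes into messages.
Process.flag(:trap_exit, true)

# An unhandled exception in a linked process arrives as an ordinary exit,
# with the exception and its stacktrace as the exit reason.
pid = spawn_link(fn -> raise "boom" end)

exit_reason =
  receive do
    {:EXIT, ^pid, reason} -> reason
  after
    1_000 -> :timeout
  end

{%RuntimeError{message: message}, _stacktrace} = exit_reason

# A monitor reports a deliberate exit in the same way, without linking
# the two processes together.
{monitored, ref} = spawn_monitor(fn -> exit(:shutdown_requested) end)

down_reason =
  receive do
    {:DOWN, ^ref, :process, ^monitored, reason} -> reason
  after
    1_000 -> :timeout
  end

IO.inspect({message, down_reason})
```

Both failures reach the observing process as plain messages, which is the hook a supervisor needs in order to decide what to restart.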
sending a message to a remote node is just a special case of eval. instead of arbitrary code you're evaling `pid ! msg`. and what is spawning a remote process if not remote code eval?
when i say there's no security model i mean there's no internal security model. you can impose network based security (restricting what nodes can connect to epmd/other nodes) or use the cookie based security (a bad idea) or you can even implement your own carrier that uses some other authentication (i believe there are a few examples of this in the wild) but the default is that any node that can successfully make a connection has full privileges
as for ETS, you can query any data structure with arbitrary functions. that's exactly what i mean when i say there's limited query capabilities. all you can really do is read the keys and values and pass them to functions
my experience and the experience of others is that elixir and erlang are not significantly more efficient than other languages and do not lead to a reduction in the total number of nodes you need to run. whatsapp is frequently cited as an example of "doing more with less" but it's compared to bloated and inefficient implementations of the same idea and not with other successful implementations. facebook certainly wasn't using thousands of mq brokers to power facebook chat. no one is replacing hundreds of activemq brokers with a small number of rabbitmq brokers
you can absolutely build stateless servers with erlang/elixir (and you should! stateless is just better for the way we deploy and operate modern networked services). my point is that many of the "advantages" of elixir/erlang are not applicable if you are delivering stateless services
when i said you could deliver erlang/elixir features in a weekend, i did not mean all of them. i meant specifically distribution and ets. you are right that the actor model, supervision trees and immutable copy-on-write data structures are all necessary for the full elixir/erlang experience. i generally like that experience and think it is a nice model for programs. i don't think however it is very applicable to writing http apis. java, rust, python, go, ruby and basically every other language are also great at delivering http apis and they don't have these same features
> sending a message to a remote node is just a special case of eval. instead of arbitrary code you're evaling `pid ! msg`. and what is spawning a remote process if not remote code eval?
They are not equivalent at all. Sending a message is sending data; evaluation is execution of arbitrary code. BEAM does not implement send/2 using eval. Spawning a process on a remote node only involves eval if you spawn a fun; spawning an MFA is not eval, it’s executing code already defined on that node.
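A small sketch of the distinction, in Elixir. To keep it runnable on a single machine, `node()` is used as the target; with a connected cluster the only change would be a real node name there, and the hypothetical `Echo` module would have to be loaded on that node already.

```elixir
defmodule Echo do
  # Code that must already exist on whichever node runs it.
  def loop do
    receive do
      {:ping, from} ->
        send(from, {:pong, node()})
        loop()
    end
  end
end

# Stand-in for a remote node name (an assumption for this sketch).
target = node()

# Spawning by module/function/args only names code loaded on the target;
# no code travels over the wire.
pid = Node.spawn(target, Echo, :loop, [])

# Sending is data only, and looks the same for local and remote pids.
send(pid, {:ping, self()})

reply =
  receive do
    {:pong, from_node} -> from_node
  after
    1_000 -> :timeout
  end

IO.inspect(reply)
```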
> as for ETS, you can query any data structure with arbitrary functions. that's exactly what i mean when i say there's limited query capabilities. all you can really do is read the keys and values and pass them to functions
You misunderstood: you can _query_ with arbitrary functions, as opposed to reading some data out and then traversing it like a regular data structure (obviously you can do that too).
> my experience and the experience of others is that elixir and erlang are not significantly more efficient than other languages and do not lead to a reduction in the total number of nodes you need to run.
I’m not sure what your experience is with Erlang or Elixir, but you seem to have some significant misconceptions about their implementation and capabilities. I’ve been working with both professionally for 5 years and casually for almost double that, and my take is significantly more nuanced than that. Neither is a silver bullet or magic, but they excel in the domains where concurrency and fault tolerance are the dominant priorities, and they are both very productive languages to work in. They have their weak points, as all languages do; language design is fundamentally about trade-offs, and these two are no different.
If all you are building are stateless HTTP APIs, then yes, there are loads of equally capable languages for that, but Elixir is certainly pleasant and capable for the task, so it’s not really meaningful to make that statement. Using that as the baseline for evaluating languages isn’t particularly useful either - it’s essentially the bare minimum requirement of any general purpose language.
i was not claiming the distribution code literally calls eval, just that it is functionally equivalent to a system that calls eval. you agree that it is possible to eval arbitrary code across node boundaries, yes?
i used erlang for 4 years professionally and elixir for parts of 5. i think both are good, useful languages. i just take issue with the misrepresentation of their features as something unique to erlang/elixir
advocates should talk up pattern matching, supervision trees and copy-on-write data structures imo. those are where erlang and elixir really shine. instead they overhype the distribution system, ets, the actor model and tools like dialyzer which are all bound to disappoint anyone who seriously investigates them
> i was not claiming the distribution code literally calls eval, just that it is functionally equivalent to a system that calls eval.
Not really. The distribution can only call code that exists in the other node. So while the system can be used as if it was an evaluator, it is not because of its primitives, but rather due to functionality that was built on top. If you nuke "erl_eval" out of the system, then the evaluation capabilities are gone.
I agree it is a thin line to draw but the point is that any message passing system can become an evaluator if you implement an evaluator alongside the message passing system. :)
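A rough Elixir illustration of that line, using `:rpc.call/4` against the local node so it runs standalone. `Code.eval_string/1` plays the role of `erl_eval` here, and `:no_such_evaluator` is a made-up module name standing in for a node where the evaluator is not available.

```elixir
# node() stands in for a remote node name (an assumption for this sketch).
target = node()

# A plain remote call names a module, function, and arguments that must
# already exist on the receiving node.
plain_call = :rpc.call(target, String, :upcase, ["hello"])

# Evaluation works only because an evaluator is itself a loaded function
# that can be called like any other.
{evaluated, _bindings} = :rpc.call(target, Code, :eval_string, ["1 + 2"])

# Without an evaluator module on the other side, the same request has
# nothing to run and comes back as a :badrpc error tuple.
missing = :rpc.call(target, :no_such_evaluator, :eval, ["1 + 2"])

IO.inspect({plain_call, evaluated, missing})
```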
> elixir/erlang's features that increase its reliability on a single machine are a cost you pay not an added benefit
Agreed! Erlang/Elixir features should not be used to increase the reliability on a single machine. Rather, they can be used to make the most use of individual machines, allowing you to reduce operational complexity in some places.