What's special about Erlang and Elixir?
45 points by bedobi on May 16, 2023 | 52 comments
I'm intrigued by the claims about them but also don't entirely understand the marketing

"Elixir is a dynamic, functional language for building scalable and maintainable applications."

OK, there are lots of languages that are either or both dynamic and functional, and they can all be used to build scalable and maintainable applications.

"Elixir runs on the Erlang VM, known for creating low-latency, distributed, and fault-tolerant systems."

Let's skip over "low-latency", because virtually all programming languages and VMs are "low-latency".

"distributed" and "fault-tolerant" are what's most often touted as "special".

I'm not sure what "distributed" means here? Any app in most languages can be coded to be "distributed" both on the scale of within a single machine and at the scale of deploying multiple small instances of it. (eg Kubernetes, running it on multiple EC2s etc etc)

As for "fault tolerant": "The unavoidable truth about software in production is that things will go wrong. Even more when we take network, file systems, and other third-party resources into account. To react to failures, Elixir supervisors describe how to restart parts of your system when things go awry, going back to a known initial state that is guaranteed to work"

Don't entirely understand this either. In a run of the mill corporate Java app, if you experience an exception, eg calling another service, unless you're doing something very strange, your app will not stop running; it will just keep error logging those exceptions until the problem is resolved (whether by the target service coming back online or you send out a fix for the call or whatever). Really the only time an app will outright crash and completely stop is if it experiences errors attempting to boot in the first place, and the correct solution there is to not allow instances that don't respond 200 OK on /health to take traffic. You certainly don't want to attempt to restart the system as it won't do any good.

So I'm left wondering, am I missing something or is there really not much to the claims about Erlang and Elixir? Don't get me wrong, this is not me criticizing Erlang or Elixir or the hype machine around them, every language has to have marketing and that's fine. It's also not me advocating corporate Java (shudder) - I'm just using it as an example of a common technology that also seems to tick all the same boxes as Erlang and Elixir as far as these claims go.



This is a great talk that answers many of your questions, https://youtu.be/JvBT4XBdoUE

I am so far merely an enthused observer, so I can't give any insights into the experience of delivering business software with it. Here are my observations, though:

- Engineers that work with this stack love it. The two most common issues I've seen are lack of static typing and difficulty in hiring

- Great for IO-bound tasks due to cooperative scheduling. For CPU-bound tasks, performance is better offloaded to e.g. Rust (see Rustler)

- Exception handling is done by processes having supervisors. My mental model for this is similar to pods & deployments à la K8s, but at any abstraction level you want inside your component.

- Elixir pleases many crowds: ruby, FP, performance, resilience

- Phoenix (an Elixir web framework) is seen as a next step from Ruby on Rails.

- Concurrency model is considered to be simple & powerful


Big thanks for the reply

> Great for IO-bound tasks due to cooperative scheduling

Right, so this seems similar-ish to Loom in Java, coroutines in Kotlin, etc; ie Erlang and Elixir have long had, and fully utilized by default, this concept that is only recently making its way into many other languages?

> Exception handling is done by processes having supervisors. My mental model for this is similar to pods & deployments à la K8s, but at any abstraction level you want inside your component.

Right, based on this and what others are saying about the same thing: basically Erlang and Elixir have long had, and fully utilized by default, this concept that other languages don't really have natively but instead rely on external cloud infra to provide?

In any case I guess my question is if you're eg experiencing a timeout because your code is calling another service that isn't responding, you can shut down that calling process and restart it all you want, it's not going to really fix the problem, you still need to address why that other service isn't responding? (though it may prevent the VM from being eaten up exclusively by waiting for responses that will never come - is that the idea?)


Hiring isn’t really that bad. Definitely fewer candidates, but they tend to be higher quality.


My company uses Elixir exclusively in the backend and we just hire solid backend developers, and teach them to use Elixir. That's how I was a couple of years ago. The learning curve is pretty gentle.


satvikpendem already linked that talk (see other comment)


> Let's skip over "low-latency", because virtually all programming languages and VMs are "low-latency".

Perhaps under minimal load. However, the BEAM VM's style of micro-preemption allows it to retain really low latency under load, much better than traditional VMs on average.

> "distributed" and "fault-tolerant" are what's most often touted as "special".

The distributed piece is that there's not much difference between communicating with a local actor or a remote one. Also you get a lot of built-in tools, so you don't need a message bus, a cache, a task manager, etc. It's convenient if you can stay below ~100 nodes.

Fault tolerance is just how the supervisors encourage structuring the application. Java and C++ land tend to blur things together, and it's difficult to disentangle the errors. Also, error recovery tends to focus less on just "restarting" and more on trying to recover the current state.
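
To make that local-vs-remote point concrete, a minimal Elixir sketch (the node name and the ping/pong messages are made up, and it assumes the two nodes are already connected):

    # Spawn a process on another node; sending to its pid afterwards
    # is the exact same call you'd use for a local process.
    pid = Node.spawn(:"other@host", fn ->
      receive do
        {:ping, from} -> send(from, :pong)
      end
    end)

    send(pid, {:ping, self()})

    receive do
      :pong -> IO.puts("got a reply from the remote process")
    end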


> It's convenient if you can stay below ~100 nodes.

There's not much of a node limit. When I was at WhatsApp (through 2019) we ran thousands of nodes in some of our dist clusters. Unmodified pg2 was challenging at that scale (and modified pg2 was still sometimes challenging), but the new pg module doesn't rely on global:set_lock (or any form of global locking), and eliminates that scaling hurdle. Mnesia would likely get unmanageable with many nodes in a shared schema, but I'm not really sure why one would want to have that many nodes in a shared schema (with persistent nodes, we'd usually run 4 nodes in a mnesia schema cluster, with occasional additional nodes added to facilitate resplitting data across more schema clusters; with ephemeral nodes, I believe it was typically 6 nodes).

I was always confused when other groups suggested a node limit like 100 nodes. I just don't know where it comes from, and it doesn't match my experience. Hard limits come from the atom table and port limits, both of which are pretty easy to expand. Certainly if we could manage on the order of millions of connections to clients (that specific number dropped over time as the server did more work per connection and then when switching to the smaller node types that were the path of least resistance in FB hosting), a couple thousand for dist wasn't going to impinge on port limits.



If I was starting a project from scratch, there are three reasons I would go with the BEAM:

- The runtime has so much devops/SRE stuff built right in. Hot code reloading, trace debugging, a remote shell. It's very easy for developers to do their own ops. Check out the book Erlang in Anger for more info.

- The language design (for all BEAM languages) is top-notch. Subjectively I appreciate the functional style of those languages, but aside from that they are also very expressive and have relatively few sharp edges compared to most languages

- Writing idiomatic Erlang/Elixir code means you will be able to have a system that is responsive even under load.

> In a run of the mill corporate Java app, if you experience an exception, eg calling another service, unless you're doing something very strange, your app will not stop running; it will just keep error logging those exceptions until the problem is resolved (whether by the target service coming back online or you send out a fix for the call or whatever). Really the only time an app will outright crash and completely stop is if it experiences errors attempting to boot in the first place, and the correct solution there is to not allow instances that don't respond 200 OK on /health to take traffic. You certainly don't want to attempt to restart the system as it won't do any good.

I don't have the time to describe it in detail, but I will just say that coming from an Erlang system to a Java/K8s/microservices architecture, it was surprising for me just how much extra work it takes to replicate things that are built into Erlang.


A lot of what Erlang/Elixir offer has been cribbed by cloud architecture, so in terms of direct comparisons you may find that a sufficiently well-architected microservices system has similar properties and will make Erlang/Elixir less desirable.

But Erlang/Elixir can probably accomplish many of those properties at a fraction of the cost.

The core advantages of these languages are (a) a runtime which is highly optimized for running a large set of preemptable concurrent tasks with very minimal shared memory and (b) an ecosystem designed around building fault-tolerant applications out of many independent communicating processes that "supervise" one another to detect failures and gracefully heal them.

There are many details about how those things work, but I'll reiterate: together they offer a system with many robustness properties similar to that of an idealized large microservices architecture but with much less complexity and cost.


> A lot of what Erlang/Elixir offer has been cribbed by cloud architecture, so in terms of direct comparisons you may find that a sufficiently well-architected microservices system has similar properties and will make Erlang/Elixir less desirable.

> But Erlang/Elixir can probably accomplish many of those properties at a fraction of the cost.

And far more elegantly and easily.


Completely agreed.

I'd go further as to say that a "sufficiently well-architected cloud architecture" is nearly unaffordable by most teams and thus, without loss of much generality, will not be obtained by you.

And a similar Erlang/Elixir system can just be made by one person.


I think we can do a nice riff on the Common Lisp aphorism, which is that "Any sufficiently complicated Java, Python, Go, or JavaScript program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Erlang/OTP."



I see, this is super helpful and interesting, big thanks


What's different is not so much the language (though it is likely unlike anything you've ever used before) but the extreme focus on reliability, error handling, OTP, the BEAM, and all the other Erlang goodies. All of the parts are impressive individually, but together they create something that is in my opinion unparalleled in the software world. Read this paper to get a much better idea of what makes this a very unique combination of elements: https://erlang.org/download/armstrong_thesis_2003.pdf

Good luck!


It's an expressive functional language with a Ruby-like feel and very ergonomic. Also the ecosystem, with things like Phoenix, Nerves, etc, is getting better each day. The creator of Elixir (José Valim) put a lot of love into the language and helped organize a lot of the community. It's just a joy to use for me, outside of the use cases of the BEAM and such.


I'm not aware of another systems language that pulls off nine nines of reliability. Having been paid to write Java, C#, and Elixir systems, I've found the BEAM to provide vastly better fault tolerance. My current project has requirements for zero downtime and zero data loss, actively replicates across 4 data centers across the US, and handles the real-time medical data of millions of hospital patients.


> vastly better fault tolerance

I really want to get this but I don't

if you're eg experiencing a timeout because your code is calling another service that isn't responding, you can shut down that calling process and restart it all you want, it's not going to really fix the problem, you still need to address why that other service isn't responding? (though it may prevent the VM from being eaten up exclusively by waiting for responses that will never come - is that the idea?)


An answer stolen from here [1]:

Erlang is fault tolerant with the following things in mind:

- Erlang knows that errors WILL happen, and things will break, so instead of guarding against errors, Erlang lets you have strong tools to minimize impact of errors and recover from them as they happen.

- Erlang encourages you to program for the success case, and crash if anything goes wrong without trying to recover partially broken data. The idea behind this is that partially incorrect data may propagate further in your system and may get written to the database, and thus presents a risk to your system. Better to get rid of it early and only keep fully correct data.

- Process isolation in Erlang helps minimize the impact of partially wrong data when it appears and leads to a process crash. The system cleans up the crashed code and its memory but keeps working as a whole.

- Supervision and restart strategies help keep your system fully functional if parts of it crash, by restarting vital parts of your system and bringing them back into service. If things go so wrong that restarts happen too often, the system is considered broken beyond repair and is shut down.

[1]: https://stackoverflow.com/questions/3760881/how-is-erlang-fa...
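
To make the supervision/restart point above concrete, a minimal Elixir sketch (the Worker module and its state are made up):

    defmodule Worker do
      use GenServer

      def start_link(initial_state) do
        GenServer.start_link(__MODULE__, initial_state, name: __MODULE__)
      end

      # The known-good initial state the supervisor brings us back to after a crash.
      def init(initial_state), do: {:ok, initial_state}
    end

    children = [
      {Worker, %{count: 0}}
    ]

    # :one_for_one restarts only the child that crashed; if restarts happen
    # too often (max_restarts within max_seconds), the supervisor itself gives up.
    Supervisor.start_link(children, strategy: :one_for_one, max_restarts: 3, max_seconds: 5)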


You should really do a write-up if your employer allows this.


We've actually released multiple open source Elixir libraries (I wrote the HL7 one; there's also one for MLLP - those are for medical messaging). My coworker has done several talks on the system, one linked here: https://www.erlang-solutions.com/blog/how-hca-healthcare-use...


You're not missing anything if you can and don't mind building a framework of tools that runs long-running processes, handles the startup, failure, and state of those processes, and lets you introspect the status and internal state of those processes in a standard way.

The way isolated processes can be created and managed allows for the "let it crash" ideology in Erlang. For example, when you visit a Phoenix Framework site, you have your own process. If I was visiting the same site and encountered a state that crashed my process, your process would be unaffected. The "exception" would only affect me, until you ran into the same state that caused the crash.

If I reported the crash to the developer, the developer could fix the bug and soft-start the entire application without affecting your process. This is possible not because of the language per se but really because of the entire Erlang ecosystem around fault tolerance.

*edited for more stuff


I don't use Elixir much anymore due to the lack of static types, but what initially got me into it was watching this talk by the Elixir programmer Saša Jurić, called The Soul of Erlang and Elixir: https://www.youtube.com/watch?v=JvBT4XBdoUE


The creator of Elixir is José Valim, not Saša Jurić.


Ah that's right, I believe Saša wrote the Elixir in Action book and might work closely with the core team.


It does have specs, Dialyzer, and Gradient for type checking, so there is hope...
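
For example, a minimal sketch of a typespec that Dialyzer can check (the module is made up):

    defmodule MyMath do
      @spec add(integer(), integer()) :: integer()
      def add(a, b), do: a + b
    end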


"Systems that run forever self-heal and scale" by Joe Armstrong (2013) [0] is a great talk too.

[0] https://youtu.be/cNICGEwmXLU


A good way to think about Elixir: instead of optimizing for bare metal performance and assembly code with a clunky process and thread API, you optimize for an elegant process and thread API then let the battle-hardened BEAM handle the bare metal performance (it is excellent at this). The magic (pun intended) of Elixir is that by thinking of concurrency first and processors second, your code is naturally structured in a way that makes applications Just Work in our massive distributed world.


Hello, Elixir creator here.

Indeed! Almost everything that Erlang/Elixir (I'll mention only Erlang from now on) offers can be achieved elsewhere - but Erlang offers it all in a unified and cohesive package.

In Erlang, all of your code runs inside cheap, lightweight threads of execution called "processes". They are all isolated from each other and run at the same time, which gives us concurrency. When they need to coordinate, they do so by exchanging messages with each other. Message passing is location transparent too: you can transparently communicate between processes on the same machine or on different machines.

Can you build distributed software in other languages? Yes, but it often requires bringing in additional frameworks and libraries. In Erlang, I do `spawn(othernode@localhost, fun() -> ... end)` to start a process on another node and that's it.

Can you scale in other languages? Definitely. But Erlang offers both vertical scalability (via processes) and horizontal scalability (via distribution) out of the box.

> In a run of the mill corporate Java app, if you experience an exception eg calling another service, unless you're doing something very strange, your app will not stop running

Correct. But this is achieved by having the web server or the framework wrap all user code in try/catch or similar constructs. It is a defensive style of programming where "forgetting a try/catch" somewhere can lead the whole system to fail. In Erlang, everything runs in isolated processes, so the _default_ mechanism is that crashes do not cascade.
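
A minimal sketch of that default isolation (the raise is just for illustration; nothing here is wrapped in try/catch):

    # The spawned process crashes, the crash is logged, and the calling
    # process carries on.
    spawn(fn -> raise "boom" end)

    Process.sleep(100)
    IO.puts("still running")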

---

You probably noticed that I wrote "processes" a lot - and that's the point. You can build scalable, distributed, and fault-tolerant systems in other languages, but it often requires picking specific solutions to each of those problems. A thread-pool/scheduler for certain concurrency patterns, a network library (or an external system for distributed communication), plus defensive coding. Erlang offers a single abstraction, called process, that encapsulates all of these concerns.

I often like to draw comparisons with k8s too. Certainly it was possible to orchestrate and deploy large systems before k8s but putting everything into a unified package does wonders to streamline the experience. I wrote comparisons between the two in the past too (if that's helpful): https://dashbit.co/blog/kubernetes-and-the-erlang-vm-orchest...

I am not sure what we could do better on the marketing - but if you believe we could improve something based on the additional context in this thread, let me know.

PS: +1 to Sasa's talk which others have already mentioned.


The community around Elixir and Erlang is also one of its best features. Everyone I've interacted with, like Jose, is kind and helpful.


This is indeed a huge plus and super important! I'm not gonna name names, but there are certainly some languages that have nicer communities than others!


Thanks for all your hard work on this, it is much appreciated. I wish I had more free time to dedicate to Erlang/Elixir/BEAM. Top of my list for when I decide to stop 'real work'.


Big thanks for the clear and super helpful reply, I understand a lot more now and will check out those resources!


Here is actually a good article comparing the JVM with the BEAM: https://www.erlang-solutions.com/blog/optimising-for-concurr...

--

And some of my own thoughts with a few more links:

The runtime system that Erlang runs on, called the BEAM, is special in that it is engineered for distributed systems and fault tolerance in a way that the JVM isn't. Everything runs in an isolated "process" (the Erlang term for their special green-thread type thing). Talking to a process on another node in the cluster is exactly the same as talking to another process on the _same_ node, so it scales in a way most languages don't. Message passing and the guarantees of processes means you simply never have to worry about mutexes or race conditions in the same way you do in almost any other language.

https://stackoverflow.com/questions/3760881/how-is-erlang-fa...

https://stackoverflow.com/questions/3172542/are-erlang-otp-m...

And as for Erlang vs. Elixir: Erlang is an old, somewhat unapproachable language with weird syntax and obtuse build tools. It's getting better, but it's fairly hard to use. Elixir is a modern language with a lot of nice things that compiles to the same bytecode as Erlang, just like e.g. Scala or Kotlin compile to JVM bytecode. It adds powerful compile-time features like near-Lisp-like macros that allow for powerful DSLs and overall contribute a lot towards a good developer experience.
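
As a rough illustration of those compile-time macros (a made-up example, not from any real library):

    defmodule MyMacros do
      # Expands at compile time into a case expression around the caller's code.
      defmacro unless_nil(value, do: block) do
        quote do
          case unquote(value) do
            nil -> nil
            _ -> unquote(block)
          end
        end
      end
    end

DSLs like the Phoenix router or an Absinthe schema are built out of this same macro machinery, just at a much larger scale.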

Elixir is also awesome because of Phoenix, notably the most-loved web framework from the 2022 StackOverflow dev survey. Here's an excerpt from a recent talk by the original creator of Phoenix giving a demo of stuff that "just works" when using Elixir/Phoenix because of its concurrent nature, that would be a lot harder with any other language.

https://youtu.be/FADQAnq0RpA?t=1153

--

I can also give my own personal experience working with Elixir. I had never heard of it before I applied for a job at a company that used it in early 2021. I have now been at that company working in Elixir full-time for not quite 2.5 years. (I also have 5 years post-college software dev experience working in various other languages, most notably Java, Go, and Node.JS)

At first, functional programming is a paradigm shift and it was difficult to get basic tasks done. I had to learn to think a little bit differently, since the typical "for" loop from other languages is no longer available. But with stuff like streams in Java, you're already learning how to think functionally, so it's not really that different.

The runtime advantages of Elixir are great. Unfortunately it's never quite that simple, since you always have databases, SQS, Kafka, other services, etc. so you do still have to worry about some of the problems around race conditions and distributed systems that you would in another language.

However, it does make things within an Elixir service a lot more robust. For instance, in other languages you can have issues where a rogue regex or bug or whatever can cause runaway CPU consumption and either crash the node, or starve resources from other processes; well, this isn't as big of a deal on Elixir because the BEAM uses pre-emptive scheduling to prevent individual processes from causing big issues.

It also makes anything involving single-node concurrency easier. For instance, if you want multi-processing in Java or Go, you always have to worry about race conditions -- due to how processes and message passing are implemented on the BEAM, this is simply a non-issue. Plus, there are a lot of utilities in the Elixir standard library that make concurrency super easy to use.
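
For example (a minimal sketch), Task.async_stream from the standard library runs each item of work in its own process with bounded concurrency:

    # Squares ten numbers, at most four processes at a time, with back-pressure.
    1..10
    |> Task.async_stream(fn i -> i * i end, max_concurrency: 4)
    |> Enum.map(fn {:ok, result} -> result end)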

The compile-time advantages of Elixir are a mixed bag. While I love having cool DSLs to describe my GraphQL schema and permissions, etc. these tend to lengthen compilation times. Yes, Elixir is a compiled language even though it is also a dynamic language. I mainly work in my company's monolithic codebase with about 800k lines of code, and compilation with even a simple change can sometimes take a full minute. A full re-compilation including dependencies is like 20 minutes. It's a bit much. Although, I should note that on a similar-sized Java project, I would expect similarly slow compile times. I think Go and Node.JS are better. The main issue here is that the Elixir compiler is implemented in Elixir and Erlang, which are dynamic languages... Imagine implementing a Java compiler in Python. You can only optimize it so far.

Last thing: I'm a really big fan of statically typed languages, yet I'm okay with Elixir even though it's dynamically typed. The reason is that there's not such a dependence on methods. In plain JavaScript it's common to have a line of code like this: `myObject.update()`. How are you supposed to know what code is running? It's completely dependent on the runtime type of myObject. However, in Elixir, the same code would normally look like this: `Wimbles.update(myObject)`, and so the majority of the time you know exactly what code is running - it's the update function in the Wimbles module, which is not dependent on the runtime type of myObject.
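
A minimal sketch of what I mean (Wimbles is the made-up module from above):

    defmodule Wimbles do
      # The call site Wimbles.update(my_object) names this module explicitly,
      # so you can jump straight here regardless of my_object's runtime type.
      def update(object) do
        Map.put(object, :updated_at, DateTime.utc_now())
      end
    end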

Anyways, hope this is informative. Happy to answer more questions if desired.


> Here's an excerpt from a recent talk by the original creator of Phoenix giving a demo of stuff that "just works" when using Elixir/Phoenix because of its concurrent nature, that would be a lot harder with any other language.

TBH Phoenix templates are -still- easier for me to grok than ASP.NET Razor, despite my having only 3 years of Elixir experience compared to almost a decade of ASP.NET.

> Last thing: I'm a really big fan of statically typed languages, yet I'm okay with Elixir even though it's dynamically typed. The reason is that there's not such a dependence on methods.

Agreed. Elixir is one of the few dynamic languages where the rest of the language sugar (namely pattern matching) makes the dynamic-ness as painless as could be imagined... as long as you remember to have fallback cases (i.e. if all else fails, have a catch-all clause that will warn/error-log the unhandled case).
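
A minimal sketch of that fallback-clause idea (the Handler module and the message shapes are made up):

    defmodule Handler do
      require Logger

      def handle({:ok, payload}), do: {:processed, payload}
      def handle({:error, reason}), do: Logger.error("failed: #{inspect(reason)}")

      # Catch-all clause: anything unexpected gets logged instead of
      # blowing up with a FunctionClauseError.
      def handle(other), do: Logger.warning("unhandled message: #{inspect(other)}")
    end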

I'll also add, ironically, that with an evolving application (as most are) the knock-on effects of being dynamic are actually -better- than a lot of alternatives; you wind up using maps a lot of the time, and that, if nothing else, means you aren't fighting with serialization issues due to under-scrutinized changes... and when it -does- happen, it's just a matter of adding the right pattern match rather than dealing with a bunch of serialized data that can no longer be properly read.

I will however say that mnesia can have footguns, if you don't pay attention to how it works and proper clustering practices...


> Message passing and the guarantees of processes means you simply never have to worry about mutexes or race conditions in the same way you do in almost any other language

Race conditions are impossible to avoid, so that part is absolutely not true. You may be thinking of data races, which is a (small) subset of all race conditions - but things like dead/live locks can’t be avoided statically with agents, so unless Erlang has some inbuilt runtime system to prevent live locks, this claim is not true in this form.

> Although, I should note that on a similar-sized Java project, I would expect similarly slow compile times. I think Go and Node.JS are better

JavaScript itself doesn't need recompilation, but the JS ecosystem does plenty of code transformations, and it sometimes manages to be slower than full machine-code compilation, which is ridiculous. In my experience, frontend builds almost always take much longer than the order-of-magnitude-bigger backend Java code.


> in other languages you can have issues where a rogue regex or bug or whatever can cause runaway CPU consumption and either crash the node, or starve resources from other processes; well, this isn't as big of a deal on Elixir because the BEAM uses pre-emptive scheduling to prevent individual processes from causing big issues.

Right, so where other languages would rely on external cloud infra to kill nodes that are seizing, BEAM does it out of the box. That's cool. Though I do think eg Kotlin coroutines are probably capable of similar things. (but they are only recently coming out, vs BEAM which has had this out of the box by default for ages)


> Though I do think eg Kotlin coroutines are probably capable of similar things. (but they are only recently coming out, vs BEAM which has had this out of the box by default for ages)

Most other green threads end up only yielding on I/O; at least I've not seen any other system where the number of instructions between yield points is bounded[1]. In that case, if you have enough green threads that want to run a CPU loop, you can lock up all the OS threads that run green threads. If you have just one in a CPU loop and you ever need to do anything across all the OS threads, that could get you stuck, too.

[1] Erlang doesn't have a specific bound; you could make a really long function, but without a loop construct, you'll be forced to call a function eventually, and function calls are yield points.
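
A minimal sketch of what "loops as recursion" looks like (a made-up counter process); every iteration is a function call, and calls are where the scheduler gets its chance to preempt:

    defmodule Counter do
      # The "loop" is tail recursion; each pass through loop/1 is a call,
      # which is where the BEAM counts reductions and can switch processes.
      def loop(count) do
        receive do
          :inc ->
            loop(count + 1)

          {:get, from} ->
            send(from, count)
            loop(count)
        end
      end
    end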


I’m not sure most other languages such as Kotlin implementing coroutines/green threads would be able to support it in the same robust way though. Pre-emptive scheduling means that the runtime will actively stop a resource hungry routine, so that other routines can have a turn. If a goroutine or Kotlin coroutine is spinning in a tight loop, it may just spin forever if it doesn’t do a syscall or whatever the runtime uses to check in and say, hey, do any other processes need time?


> I mainly work in my company's monolithic codebase with about 800k lines of code, and compilation with even a simple change can sometimes take a full minute. A full re-compilation including dependencies is like 20 minutes. It's a bit much.

Agreed. My focus on Elixir v1.15 (rc to be released this week) was to improve the compilation time and I have seen increases of up to 30% in compile and boot times. It requires Erlang/OTP 26 as well. Once you upgrade, I would love to hear if it did change anything for you.


Awesome, glad to hear it! Unfortunately we’re stuck on older versions of Elixir and OTP but eventually we will get past the few things still blocking us from upgrading.

Additionally, another thing making compile times very slow is actually a very large Absinthe schema, so I believe it’s Absinthe that would need to be optimized to improve the situation, more than Elixir itself.


I dropped the ball on this but check the persistent term compiler for Absinthe Schemas. It should make a difference. :)


just saw this, thanks for the suggestion! I'll give that a look.


By the way, we enabled this, and it ended up reducing compilation times by almost a whole minute! Highly recommend for local development.


Do you think this applies to all OOP-centric languages as opposed to FP-centric? OOP languages need types to keep things sane?


> Let's skip over "low-latency", because virtually all programming languages and VMs are "low-latency".

It's not fair to skip that one, because the way the BEAM scheduler works is pretty dang magical.

My (admittedly naive) understanding is that rather than hoping that 'yielding' constructs in a given language play well with the code, erlang/elixir processes are given a certain number of 'quantums' in the scheduler. [0]

The other thing worth considering here is that it is fair; there are benchmarks out there [1] that show other languages may have lower latency in some cases, but the overall fairness of the BEAM scheduler is better and, per above, has fewer traps/footguns.

As for the 'distributed' part, Erlang/Elixir have very specific rules about BEAM process inter-communication; with a few exceptions things are always copied by value, so you don't have to worry about mutation/etc. YES other languages may have similar concepts but BEAM specifically has a certain level of optimization around these cases so that the penalties are less in many cases.

> It's also not me advocating corporate Java (shudder) - I'm just using it as an example of a common technology that also seems to tick all the same boxes as Erlang and Elixir as far as these claims go.

I'd argue that the closest you'd get to Erlang/Elixir in the JVM world is Akka. And while I've only worked on/with Akka.NET directly [2], I can say, also having worked on an Elixir app, that Elixir is far simpler even if the language has its own idiosyncrasies [3].

[0] - My understanding is that yes, poorly written FFI (AKA C library calls) -could- cause problems still, but thankfully the elixir community does a good job of making sure such things are handled well.

[1] - Can't remember where but I've seen them.

[2] - I've definitely looked at enough pre-license-change JVM Akka code to be able to read Scala (if not write), for whatever that's worth.

[3] - tl;dr- Due to Elixir's typing/argument binding model, your best bet in most cases is to use Maps everywhere.


> The other thing worth considering here is that it is fair; there are benchmarks out there [1] that show other languages may have lower latency in some cases, but the overall fairness of the BEAM scheduler is better and, per above, has fewer traps/footguns.

This is a recurring topic in a lot of engineering problems: optimising for the best average case possible, versus optimising for the best worst case possible. Like e.g. a game that averages 200fps but has occasional dips to 20 fps versus a game that is rock-solid at 100fps and never dips. The absence of dips matters more than the peak performance.


> My (admittedly naive) understanding is that rather than hoping that 'yielding' constructs in a given language play well with the code, erlang/elixir processes are given a certain number of 'quantums' in the scheduler. [0]

IMHO, the secret sauce here is that all function calls are potential yield points and construction of BEAM languages prevents avoiding function calls (you have to write loops as recursion), which makes yielding effectively preemptive. In most other languages, you have to choose between OS threads which are actually preemptive but at significant cost, or hoping none of your green threads loop without I/O.

(Yes, you can break this in FFI)


> Let's skip over "low-latency", because virtually all programming languages and VMs are "low-latency".

I believe the latency claims here are about running programs with multiple threads of execution. The Erlang VM (which Elixir runs on) uses preemptive scheduling and (I'm forgetting the exact number atm) each process (which is a userspace process) gets to run for a set period of time (I'm simplifying here) before the VM preempts it and lets another process run. If you have a rogue thread in something like Node, that thread can eat up all your CPU time. In Elixir/Erlang, it can't.

> I'm not sure what "distributed" means here? Any app in most languages can be coded to be "distributed" both on the scale of within a single machine and at the scale of deploying multiple small instances of it. (eg Kubernetes, running it on multiple EC2s etc etc)

The BEAM has a bunch of the "distributed" bits baked in that you then (mostly) don't have to think about when writing your program. If you have two machines, A and B, your program can run on those machines as if it was conceptually a single machine. This is somewhat different from most "distributed" applications.

In general, I find a lot of the patterns around fault tolerance and concurrency to be conceptually easier to work with than languages like Java, C++, etc.

For instance, in Elixir, you might structure a connection to your database as a supervisor and a pool of 10 connections. If something happens to corrupt one of those connections, the process will simply die, the supervisor will see that the process died and spin up a new one to replace it and then re-run any pending queries the dead process was in charge of. You can still handle known exceptions in the process, etc, but the idea is to quickly get back to a good state. The runtime was originally built for managing phone connections so this design choice makes sense.

In terms of concurrency, I find the actor model (which other languages use too!) simpler to reason about vs a process forking model like in Ruby. Essentially, similar to Go, you get parallelism by composing concurrent functions. Unlike Go, however, the composition is at a module level rather than within individual functions. So each function in Elixir/Erlang runs in series, but you can compose modules together to run concurrently and pass messages back and forth.

In general, I would think about Elixir as a Ruby/Node/Python competitor rather than a Java/Rust/C++ competitor. You can very quickly build scalable web applications without a ton of code. My pitch for Phoenix is that it's essentially Rails but built 10 years later; so it has 10 years of learnings (+ solid concurrency support) but doesn't have 10 years of maintenance baggage.

I wrote this very quickly so it's possible some bits aren't clear/precise/etc and I would be happy to elaborate/clarify on some points later when I have time. But hopefully this was helpful. I would also endorse many of the links others have posted if you are interested in more details! Especially the YouTube talk and the Armstrong paper.


I think the aspect you're missing is Erlang's total focus on 'green threads' (Erlang processes, or just "processes") as the basic unit of computation. Just like Unix's "everything is a file" philosophy, it's a mundane description that has wide-ranging benefits for building long-running services.

First, in the most basic situation, the VM will manage all of your parallelism. It will figure out how many physical CPUs and how much memory it has to work with and will maximize those resources. It will dynamically limit the resources consumed by a particular process such that the system as a whole remains responsive. Because Erlang doesn't allow modifying values and uses message passing, memory corruption is extremely rare. This has pretty heavy performance and efficiency costs, but what you gain is extremely high flexibility in terms of scale.

To a first approximation, Erlang actually cares very little about how many machines an Erlang program is running. Once you embrace processes as the basic unit - which machine is running a process, as long as all machines can pass messages, matters very little. Also, again, all the VMs know what resources they have and can re-balance processes to best utilize those resources across many nodes. All without special configuration[1]. The very brave can even live-swap the code running inside a VM - existing processes spawned on the new code will execute and close as normal (no time limits, the VM has the old code still), while you swap your network interfaces over to your new code.

The process-centric design has other nice implications. Tests can be run fully in parallel[2], which keeps a low-efficiency-per-thread VM very snappy in practice. Each process can also fail fully independently, which means that cascading failures are quite rare. There are still lots of reasons not to live-console into production, but in a language that doesn't allow you to change memory values and also gives your shell full isolation, it's a lot more practical than in other languages.

It's worth saying that none of this is "magic" or from a special quirk of the language - any language that totally committed to message passing and allowing the VM to manage greenlet processes would get the same benefits. IMO these tradeoffs are very worthwhile for the kinds of things Erlang / Elixir specialize in (long running network-connected services) and I would encourage anyone who's been using Python for these kinds of things to consider this platform as an alternative.

[1] This is overstating it quite a bit - you do need to care about multi-node issues and you do want to configure things if you're getting into weird resource bottlenecks... but it also will "just work" a lot of the time. Generally you should twiddle settings if you expect a weird workload in the cluster generally, and if you have an occasional spike it will "mostly work."

[2] This isn't totally true in practice, you often have resources outside Erlang (databases, etc) that need to be shared between a large number of processes which limits parallelism.


I read a lot of rationalising in your post and not enough playing with the actual thing. Why is that?

Are we supposed to change your strong preconceptions about something you have never used, can't be bothered to learn from and is literally free?

It is quite rude of you to expect we have to do all the effort of changing your already-set mind for you.

Do the work, then feel free to offer your criticism after you have used it, if you found it lacking. In the past three days there have been a few pro-Elixir posts that are answering this question in detail, yet here you are still unsatisfied, hoping someone spends even more time to entertain your lazy skepticism.



