It's been said Greenspun's 10th Rule, when applied to distributed systems, results in Erlang.
What are the lessons of Erlang, and how do you abstract as much of that as possible away from the language/ecosystem to others? I'm looking for an answer a little more accessible than "you have to learn Erlang to find out."
Part of what makes Erlang Erlang is that a lot of this stuff is integrated very deeply into the language and its libraries. It's not just something you can bolt on in many cases.
However, I think the "why" of Erlang is pretty compelling in and of itself, especially if you're interested in extremely high-availability telecom hardware with an operational life measured in decades.
Joe gives good talks. Here's one he recommends as an intro to the capabilities and philosophy of Erlang: https://www.youtube.com/watch?v=u41GEwIq2mE
The entire Erlang thesis falls out of one simple fact: for a program to be highly available, it must run on more than one piece of hardware.
When you run on more than one piece of hardware, you need to be able to communicate between pieces of hardware, detect failures, and when something fails you need to be able to restart it (hacks: systemd, SMF) or reassign responsibility to a different piece of hardware that hasn't failed yet.
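To make "detect failures and restart" concrete outside of Erlang, here is a minimal sketch in Python. It is not how OTP supervisors are implemented; `supervise` and `FlakyWorker` are hypothetical names invented for this example, and the "process" is just a callable running in the same interpreter, so it shows only the restart policy, not real hardware failover.

```python
from typing import Callable


def supervise(worker: Callable[[], None], max_restarts: int = 3) -> int:
    """Run `worker`, restarting it when it raises.

    Returns the number of restarts that were needed. Gives up with a
    RuntimeError once `max_restarts` restarts have been used, so a worker
    that can never succeed does not loop forever.
    """
    restarts = 0
    while True:
        try:
            worker()
            return restarts
        except Exception as exc:
            if restarts >= max_restarts:
                raise RuntimeError("restart limit reached") from exc
            restarts += 1


class FlakyWorker:
    """Hypothetical worker that crashes a fixed number of times, then succeeds."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    def __call__(self) -> None:
        self.calls += 1
        if self.calls <= self.failures:
            raise ValueError("simulated crash")
```

Calling `supervise(FlakyWorker(2))` restarts the worker twice and then returns normally. The restart limit is the important design choice: without one, a permanently broken worker turns into a busy loop instead of an escalated failure.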
Monitoring and restarting in itself isn't simple. What if what you're monitoring is on another physical machine? You need built-in networking and, for simplicity, built-in location transparency. If I start process Bob on machine A and process Kat on machine B, I should be able to interact with Bob from either machine without specifying where it's running (i.e. RMI that works).
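A toy model of the Bob/Kat idea, in Python, might look like the following. Everything here is an assumption made for illustration: `Registry` and its methods are hypothetical, the "machines" are just labels, and the mailboxes are in-process queues. A real system would need an actual network transport behind `send`; the point is only that callers address a process by name and never mention where it runs.

```python
import queue
from typing import Any, Dict, Tuple


class Registry:
    """Hypothetical name registry: senders use a name, never a location."""

    def __init__(self) -> None:
        # name -> (node label, mailbox)
        self._entries: Dict[str, Tuple[str, "queue.Queue[Any]"]] = {}

    def register(self, name: str, node: str) -> None:
        self._entries[name] = (node, queue.Queue())

    def send(self, name: str, message: Any) -> None:
        # The caller does not know or care which node holds the mailbox.
        _node, mailbox = self._entries[name]
        mailbox.put(message)

    def receive(self, name: str) -> Any:
        _node, mailbox = self._entries[name]
        return mailbox.get_nowait()

    def migrate(self, name: str, new_node: str) -> None:
        # Moving a process changes the registry entry, not the callers.
        _old_node, mailbox = self._entries[name]
        self._entries[name] = (new_node, mailbox)

    def where(self, name: str) -> str:
        return self._entries[name][0]
```

With this shape, code that does `registry.send("bob", "hello")` keeps working unchanged after `registry.migrate("bob", "machine_b")`, which is the property that makes reassigning responsibility after a failure practical.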
And all of this is the Erlang VM so far—nothing to do with the language itself. The language adds nice features like pattern matching and atoms and tuples and cons cells and binaries and bignums, but those features are independent of the "Erlang-ness" of the runtime.
There are a lot of great articles written about rationales behind all these topics and a lot of great talks up on YouTube. Just search YouTube for Joe Armstrong talks and/or general Erlang intro talks. You'll soon see how the basics of Erlang are being re-implemented in 2015, except half thought through, and it'll take another 10 years for these ad-hoc remade components to become "I'd bet the farm on it" reliable.
* L1: Today we are stuck building distributed and highly concurrent systems. We don't have a choice. The era of a single CPU on a single machine, where speed doubles every 18 months, is in the past. Because of the internet and large volumes of data, most systems today are distributed. (How many startups do you know that ship a standalone desktop program these days?)
* L2 : To be useful, distributed and concurrent systems have to be fault tolerant. We don't want a segfault or panic caused by one corner case, triggered by one client, to bring down the rest of the server and the rest of the million connected clients. That would be a horrible page/phone call to get at 4am.
* L3 : For systems to be fault tolerant, they have to be built out of isolated components, so that when one fails, that failure doesn't spread throughout the whole system.
* L4 : Isolation can be achieved in a few ways:
- A runtime system that prevents sharing memory -- OS processes do this. Erlang's VM does this as well, except you can have millions of processes on a single machine.
- Proving memory won't be shared. Rust's compiler can do this. It can prove at compile time that you won't have data races between your threads. Rust is the best and most interesting new thing in languages in the last decade probably.
- Running in a container, VM or a different machine. At the end of the day of course, if your service is running on a simple (non-mainframe-y) single machine, it is not fault tolerant.
* L5 : These isolated concurrent units also send messages to each other to communicate (instead of, say, reading from shared memory with a mutex thrown in there someplace). These units are often referred to as "Actors" and there is a whole class of frameworks and libraries that implement that besides Erlang (Akka, Orleans, etc.)
* L6 : In addition to these basic building blocks, Erlang also comes with:
- Functional programming approach. Your data and variables are immutable. So it is easy to look at a piece of code and understand what is being changed. State updates are very explicit. And that is on purpose.
- A framework of patterns used to build/monitor/distribute these concurrency units. This is called OTP.
- Monitoring and debugging capabilities. You can connect to a running VM node and inspect, trace, debug, and even update code live without stopping the system.
- A decades-long ecosystem and experience building these kinds of systems.
* L7 : If you are afraid of non-curly braces syntax and do not like typing , instead of ; take a look at Elixir. It is a new language but runs on the Erlang VM.
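For anyone who wants to see the L3 to L6 points without learning Erlang first, here is a rough Python sketch of an actor: a mailbox, private state, and a handler that returns the next state instead of mutating anything. The names (`Actor`, `counter`, `fragile`) are made up for this example. One big caveat: Python threads share memory, so the isolation here is by convention only, whereas the Erlang VM enforces it.

```python
import queue
import threading
from typing import Any, Callable, Optional

_STOP = object()  # sentinel message that tells an actor to shut down


class Actor:
    """Toy actor: a mailbox, private state, and a handler returning the next state."""

    def __init__(self, handler: Callable[[Any, Any], Any], state: Any) -> None:
        self._handler = handler
        self._state = state
        self._mailbox: "queue.Queue[Any]" = queue.Queue()
        self.crashed: Optional[BaseException] = None
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message: Any) -> None:
        # Communication happens only by putting messages in the mailbox.
        self._mailbox.put(message)

    def stop(self) -> Any:
        """Ask the actor to finish, wait for it, and return its last good state."""
        self._mailbox.put(_STOP)
        self._thread.join()
        return self._state

    def _run(self) -> None:
        while True:
            message = self._mailbox.get()
            if message is _STOP:
                return
            try:
                # State updates are explicit: old state in, new state out.
                self._state = self._handler(self._state, message)
            except Exception as exc:
                # A crash ends this actor only; other actors keep running.
                self.crashed = exc
                return


def counter(total: int, message: int) -> int:
    return total + message


def fragile(state: Any, message: Any) -> Any:
    raise ValueError("simulated corner case from one client")
```

If you start `Actor(counter, 0)` and `Actor(fragile, None)` side by side, sending anything to the fragile one kills only that actor, and the counter keeps adding up its messages. A supervisor in the OTP sense would watch the `crashed` field (or its equivalent) and start a replacement.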
Hopefully this helps!
OS-level virtualization is also completely orthogonal.
Erlang's VM is like a mini OS for your backend system except that it can handle a million isolated processes while the OS can handle a few orders of magnitude less.
The latest problem Erlang is addressing concerns leap seconds and how they can cause all kinds of strange behavior in applications. In Erlang, this is called Time Warp. I have yet to see another language try to tackle time correctness, since most are still too busy ironing out the performance of their garbage collector or how to do concurrency.
But IIRC the whole premise of TeaTime is not to do time "correctly", but rather to aim for "mostly predictable ordering of events".
Maybe it's time to dust off the old Croquet ideas, and mix the Oculus Rift with a platform based on Elixir?
I discovered it while reading the parent article. I think it's a good checklist that will contribute to resiliency more than a language choice will. Of course, adding a language with strong type and memory safety to such a good development process will certainly drive quality up from there. Efficiency, too, as recent articles indicate.