
If you really want to understand the philosophy that makes Erlang (and Elixir) beautiful (and why it made me a better programmer), this talk by Greg Young is a real eye-opener: https://vimeo.com/108441214 .

You then realize that clustering, hot reload, availability, etc. are not just features but the logical consequence of a beautifully crafted environment that aims at developer productivity.

I'm sometimes amazed at how easily I can achieve things on the Erlang VM that would take at least 10 times as long (if they're possible at all) in a more usual language (Ruby or PHP when you work in the web industry as I do).

My last example was when I needed to batch SQL inserts into an events database. In a normal language I would have needed a queue, the libraries for it, workers, new deployments, and infrastructure for monitoring and supervision. In Elixir, it's done in 20 lines of code.

If you do not need complex calculations, the Erlang VM can basically become most of your architecture. It's already per se a SOA.

The way I find it simplest to "enlighten" people about Erlang's peculiar philosophy is its approach to scheduling:

The Erlang VM is "reduction"-scheduled. This means that the given Erlang process currently running on a scheduler thread can get pre-empted, but only as a result of executing a call/return instruction. (Effectively, the pre-emption is a check inside the implementation of the call/return opcode.) As long as you don't execute any call/returns (don't call functions and don't return from your own function), your function body can run as long as it likes.

This is a design choice: because processes won't be pre-empted "in the middle" of a function, any Erlang process can feel safe executing an instruction that calls into native code, while not having to worry that that native code could itself be pre-empted and leave dirty state in the Erlang process's heap while some other process gets scheduled and tries to then message or introspect that process. It gives you a lot of leeway "for free."

So how does Erlang ensure that processes don't hog a core forever, given that you could theoretically just write a loop that spins forever? Well, in Erlang, you can't write a loop. Instead of loops, you have tail-calls with explicit accumulators, à la Lisp. Not because they make Erlang a better language to write in. Not at all. Instead, because they allow for the operational/architectural decision of reduction-scheduling. Without loops in the language, every function body will execute for only a finite amount of time before hitting one of those call/return instructions, and thus activating the reduction-checker.
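To make the "tail-calls with explicit accumulators" point concrete, here is a minimal sketch in Elixir (same VM, same rule). The module name TailLoop is made up for illustration. Every "iteration" is a function call, which is exactly the place where the scheduler gets its chance to count a reduction.

```elixir
defmodule TailLoop do
  # Public entry point: start the "loop" with an explicit accumulator of 0.
  def sum(list), do: sum(list, 0)

  # Base case: nothing left to visit, so return the accumulator.
  defp sum([], acc), do: acc

  # Recursive case: the call to sum/2 is the last thing this clause does,
  # so it is a tail call and the stack does not grow.
  defp sum([head | tail], acc), do: sum(tail, acc + head)
end
```

Calling TailLoop.sum([1, 2, 3, 4, 5]) walks the list one call at a time; there is no construct in the language that would let the body spin without making a call.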

The Erlang "platform" has been shaped around the choices of how to best construct a production runtime that gives you "hard things" (like calling into native-code libraries while maintaining thread-safety) for free. Or rather, you could say that where everyone else pays these costs when they hit the particular problem, Erlang pays the cost up-front in the design of the language+platform and how you're forced to code at all times, in order to make these hard things easy.

The same is true of so many other Erlang things:

- how synchronous messaging has to be implemented on top of asynchronous messaging with expected reply-refs and timeouts, so as to make the sender process, rather than the receiver process, be the thing that defaults to crashing if the receiver doesn't recognize the message;

- how OTP-framework code has to be structured as delegate functions that return to the framework, so that the framework can "be there" in each process to handle hot code upgrades and process hibernation;

- how sockets either block (when {active, once}), or will saturate a process with packet messages (with {active, true}) until that process crashes on overload--because the network listener is a separate part of the runtime that lives in a hot loop and wants to just be given a place to stuff packets into, and isn't allowed to do anything that's not an O(1) operation, like expanding the size of a process's message inbox.
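The first bullet (a synchronous call layered on asynchronous sends, with a reply ref and a timeout) can be sketched as below. This is not the OTP implementation; as far as I know the real GenServer.call also monitors the server. The module SyncCall, the message shapes {:call, ...} and {:reply, ...}, and the toy "doubler" server are all assumptions made up for the example.

```elixir
defmodule SyncCall do
  # Hypothetical helper: a blocking request built from plain send/receive.
  def call(pid, request, timeout \\ 1_000) do
    # A unique reference lets us pick out the reply to *this* request.
    ref = make_ref()
    send(pid, {:call, self(), ref, request})

    receive do
      {:reply, ^ref, result} -> {:ok, result}
    after
      # No reply in time: the caller is the one that exits.
      timeout -> exit({:timeout, request})
    end
  end

  # A toy server that only understands {:double, n}.
  def start_doubler do
    spawn(fn -> doubler_loop() end)
  end

  defp doubler_loop do
    receive do
      {:call, from, ref, {:double, n}} ->
        send(from, {:reply, ref, n * 2})
        doubler_loop()

      _unknown ->
        # Unrecognised messages are dropped; the server carries on
        # and the sender runs into its own timeout.
        doubler_loop()
    end
  end
end
```

SyncCall.call(pid, {:double, 21}) returns {:ok, 42}, while a request the server does not recognise makes the calling process exit with {:timeout, request} and leaves the server untouched.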


Erlang is not a programming language in the sense that other languages are. Erlang was not designed from the language in. Erlang (ERTS) is a runtime, and was designed from the runtime out, with Erlang being effectively a pure side-effect: the language that ended up being required to interact with the features of the ERTS runtime.

Of course, you can also go back and apply some design sense to the language, and then you get something like Elixir. But, despite large visual differences "in the small", your large Elixir app will end up looking very much like a large Erlang app. And this is because a large part of what you're doing in an ERTS language is not programming using the language, but rather weaving together the features of the runtime. (Contrast: using DirectX vs. OpenGL to manipulate the GPU. Two very different APIs, but one "runtime" they're both speaking to, consisting of features like shaders et al.)

>My last example was when I needed to batch SQL inserts into an events database. In a normal language I would have needed a queue, the libraries for it, workers, new deployments, and infrastructure for monitoring and supervision. In Elixir, it's done in 20 lines of code.

Can you provide more details on this? AMQP is pretty recent and people have been batching SQL inserts for much longer than it has been around. An external queue is not a requirement of non-Erlang languages, but I'm curious specifically about how the implementation in Erlang would substantially differ from, or allow new approaches compared to, that in other languages.

A simple GenServer (an OTP behaviour) linked to an ETS (Erlang's in-memory data store) table would do the trick. Basically, it receives the inserts as messages, and once the counter reaches x or the timer reaches y seconds, it inserts them into the db. Thinking about it, 20 lines of code is already a bit verbose for it :)
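A rough sketch of that idea, with some liberties: it keeps the pending rows in the GenServer state rather than in an ETS table, and the actual database insert is replaced by a caller-supplied function. The module name InsertBatcher and the option names :max_size, :flush_ms and :flush_fun are invented for the example.

```elixir
defmodule InsertBatcher do
  use GenServer

  # opts (all hypothetical names):
  #   :max_size  - flush once this many rows are buffered
  #   :flush_ms  - also flush every this many milliseconds
  #   :flush_fun - 1-arity function standing in for the real bulk INSERT
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  # Asynchronous on purpose: callers never wait on the database.
  def add(server, row), do: GenServer.cast(server, {:add, row})

  @impl true
  def init(opts) do
    state = %{
      rows: [],
      count: 0,
      max_size: Keyword.get(opts, :max_size, 100),
      flush_ms: Keyword.get(opts, :flush_ms, 1_000),
      flush_fun: Keyword.fetch!(opts, :flush_fun)
    }

    schedule_flush(state)
    {:ok, state}
  end

  @impl true
  def handle_cast({:add, row}, state) do
    state = %{state | rows: [row | state.rows], count: state.count + 1}

    if state.count >= state.max_size do
      {:noreply, flush(state)}
    else
      {:noreply, state}
    end
  end

  @impl true
  def handle_info(:flush, state) do
    # Periodic tick: re-arm the timer, then flush whatever is buffered.
    schedule_flush(state)
    {:noreply, flush(state)}
  end

  # Nothing buffered: skip the call entirely.
  defp flush(%{count: 0} = state), do: state

  defp flush(state) do
    # Rows were prepended, so reverse to restore arrival order.
    state.flush_fun.(Enum.reverse(state.rows))
    %{state | rows: [], count: 0}
  end

  defp schedule_flush(state) do
    Process.send_after(self(), :flush, state.flush_ms)
  end
end
```

Put under a supervisor, this would cover the queue, the worker and the restart behaviour in one process. Note that the timer here is a fixed periodic tick, not "y seconds since the first buffered row"; that is a simplification.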

In Elixir, you can use GenStage.

(In fact, I'm writing a GenStage consumer right now.)

True, but I haven't played with it yet. It also seems nice and straightforward.

That was a great video, thanks! :)

How easy is it, even if you do need complex calculations, to get the best of the Erlang VM and call out to, say, Python/Numpy or C when necessary? Can these external processes still be supervised, for example? Are decent-sized matrices (for example 100x20000 floats, so an 8MB data structure) easily movable around the Erlang VM via message passing?

I.e., is it still viable in your opinion to use Erlang as a system for distribution and routing of lots of heavy calculations to many users, if said calculations are performed outside of BEAM? I am looking at building a multivariate financial calculation engine, which must be interactive for up to 1000 users, with a large firehose of real-time data coming in, being massaged, and then distributed, with the calculation graph being customizable for each user interactively.

It is possible to start external processes from BEAM and interact with them. I've blogged a bit about it at http://theerlangelist.com/article/outside_elixir

You can also write NIFs (native implemented functions), which run inside the BEAM OS process (see http://andrealeopardi.com/posts/using-c-from-elixir-with-nif...). The latter option should be the last resort though, because it can violate the safety guarantees of BEAM, in particular fault-tolerance and fair scheduling.

So using a BEAM-facing language as a "controller plane" while resorting to other languages in special cases is definitely a viable option.

I spent 30 minutes looking at NIFs, but I was scared away. My understanding is that if the NIF crashes then BEAM crashes. Which leads me to think that if you need a NIF then you need safety guarantees on the native side that C can't provide.

Think of NIFs as Erlang's equivalent to Rust's unsafe{} blocks. It's where you write the implementations of library functions that make system calls, and the like. But, like unsafe{} blocks, you do as little as possible within them.

For example, if you want to call some C API from Erlang where the C API takes a struct and returns a struct, you'll want to actually populate the request struct--and parse the return struct--on the Erlang side, using binary pattern matching. The C code should just take the buffer from enif_get_binary, cast it into the req struct, make the call, cast the result back to a buffer and pass it to enif_make_binary(), and then return that binary. No C "logic" that could be potentially screwed up. Just glue to let Erlang talk to a function it couldn't otherwise talk to. Erlang is the one doing the talking.
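The "populate and parse the structs on the Erlang side" half of that advice might look like the sketch below, written in Elixir. The struct layouts are invented (and assumed to be packed and little-endian, which a real C compiler will not give you without care about padding), and the module name StructCodec is made up too. The NIF itself, which would only cast and forward the buffer, is not shown.

```elixir
defmodule StructCodec do
  # Assumed (hypothetical) packed, little-endian C layouts:
  #   struct req  { uint32_t id; uint16_t flags; double price; };
  #   struct resp { uint32_t id; int32_t status; double value; };

  # Build the request buffer that would be handed to the NIF.
  def encode_request(id, flags, price) do
    <<id::little-unsigned-32, flags::little-unsigned-16, price::little-float-64>>
  end

  # Parse the response buffer the NIF hands back. Anything that does not
  # match the expected 16-byte layout is rejected instead of misread.
  def decode_response(
        <<id::little-unsigned-32, status::little-signed-32, value::little-float-64>>
      ) do
    {:ok, %{id: id, status: status, value: value}}
  end

  def decode_response(_other), do: {:error, :bad_layout}
end
```

All the validation lives in pattern matching on the BEAM side, where a mismatch is an ordinary {:error, ...} value or a process crash, not a segfault.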

On the other hand, if you have a big, fat library of C code, and you want to expose it all to Erlang? Yeah, that's not what NIFs are for. (Port drivers can do that, but you're about the right amount of terrified of them here: they're for special occasions, like OpenSSL.)

The "right" approach with some random untrusted third-party lib, is to 1. write a small C driver program for that library, and then 2. use Erlang to talk to it over some IPC mechanism (most easily, its stdio, which Erlang supports a particular protocol for.)
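A minimal sketch of the stdio approach from the Elixir side, using a port with 4-byte length-prefixed packets. To keep it self-contained, the Unix `cat` program stands in for the small C driver (it simply echoes the frames back), so this assumes a Unix-like system with `cat` on the PATH; the module name EchoPort is made up.

```elixir
defmodule EchoPort do
  # `cat` is only a stand-in for the small C driver program. A real driver
  # would read and write the same 4-byte length-prefixed frames on stdio.
  def open do
    cat = System.find_executable("cat") || raise "this sketch needs `cat` on PATH"
    Port.open({:spawn_executable, cat}, [:binary, {:packet, 4}, :exit_status])
  end

  # Send one frame and wait for one frame back.
  def request(port, payload, timeout \\ 1_000) when is_binary(payload) do
    Port.command(port, payload)

    receive do
      {^port, {:data, reply}} -> {:ok, reply}
      # If the external program dies, we get a message, not a VM crash.
      {^port, {:exit_status, status}} -> {:error, {:exited, status}}
    after
      timeout -> {:error, :timeout}
    end
  end
end
```

The important property is in the second receive clause: a crash in the external program shows up as data the owning process can react to.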

If you need more speed, you can still keep the process external: in the C process, create a SHM handle, and pass it to Erlang over your IPC mechanism. Write a NIF whose job is just to read from/write to that handle. Now do your blits using that NIF API. If the lib crashes, the SHM handle goes away, so handle that in a check in the NIF. Other than that, you're "safe."

Precisely, which is why I always advise to consider ports first :-)

However, in some situations the overhead of communicating with a port might be too large, so then you have two options:

  1. Move more code to another language which you run as a port.
  2. Use a NIF.

It's hard to generalize, but I'd likely consider option 1 first.

If you go for a NIF, you can try to keep its code as simple as possible which should reduce the chances of crashing. You can also consider extracting out the minimum BEAM part which uses the NIF into a separate BEAM node which runs on the same machine. That will reduce the failure surface if the NIF crashes.

I've also seen people implementing NIFs in Rust for better safety, so that's another option to consider.

So there are a lot of options, but as I said, NIF would usually be my last choice precisely for the reason you mention :-)

Aren't dirty NIFs on the horizon as well, which should help with the scheduling issues currently associated with NIFs?

Dirty schedulers can help with long running NIFs, but they can't help with e.g. a segfault in a NIF taking down the entire system.

Apparently people are working on this by writing NIFs in Rust: https://github.com/hansihe/rustler

Love your blog and book, Sasa. Could you elaborate on how NIFs disrupt fair scheduling? I don't recall ever reading about that.

Thanks, nice to hear that!

Basically a NIF blocks the scheduler, so if you run a tight loop for a long time, there will be no preemption. Therefore, invoking foo(), where foo is a NIF which runs for, say, 10 seconds, means a single process will get 10 seconds of uninterrupted scheduler time, which is way more than other processes not calling that NIF get.

There are ways of addressing that (called dirty schedulers), but the thing is that you need to be aware of the issue in the first place.

If due to some bug a NIF implementation ends up in an infinite loop, then the scheduler will be blocked forever, and the only way to fix it is to restart the whole system. That is btw. a property of all cooperative schedulers, so it can happen in Go as well.

In contrast, if you're not using NIFs, I can't think of any Erlang/Elixir program that will block the scheduler forever, and assuming I'm right, that problem is completely off the table.

As linked elsewhere here, tight loops that never preempt are being fixed in Go 1.8/1.9[0]. It looks like a flag called "GOEXPERIMENT=preemptibleloops" may have been added to Go 1.8 that adds a preemption point at the end of a loop. It's behind a flag for performance/testing reasons, but they are working on it.

[0] https://github.com/golang/go/issues/10958

Won't pre-emptible loops lead to more irreproducible race conditions as a negative consequence, unless the preemption is done deterministically?

Are you asking about BEAM or Go? Preemption already works in BEAM and doesn't lead to race conditions because of shared-nothing concurrency.

I was asking about Go. I understand BEAM's advantages in that area.

There are libraries that allow C programs and Java programs to interact with Erlang as though they were Erlang processes. From the Erlang Interoperability guide: http://erlang.org/doc/tutorial/overview.html#id61008

The team at GitHub took another approach, described here: https://github.com/blog/531-introducing-bert-and-bert-rpc It's an old article, so I have no idea if it's still in use. The GitHub sources haven't been updated since 2010.

I don't know how effective it is to chuck around 8MB data structures via message-passing. I have no experience of this myself.

> I don't know how effective it is to chuck around 8MB data structures via message-passing.

It's my understanding that above a certain size and within the same running Erlang VM (machine), a reference is passed instead of a full copy.

Only for binaries. But if your matrix is mainly modified outside of Erlang, it may make sense to only use it as an opaque binary inside.
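As a small illustration of the "opaque binary" idea in Elixir: pack the floats into one flat binary and pass that around as a single term. The module name MatrixBin and the row-major, native-endian 64-bit layout are assumptions for the example. As far as I know, binaries larger than 64 bytes are reference-counted and shared between processes on the same node, so a multi-megabyte matrix sent this way is not copied per message (the toy matrix in the test is far below that threshold).

```elixir
defmodule MatrixBin do
  # Pack a list of rows into one flat binary of native-endian 64-bit
  # floats, in row-major order.
  def pack(rows) do
    for row <- rows, x <- row, into: <<>>, do: <<x::float-64-native>>
  end

  # Read a single cell without unpacking the whole matrix.
  # `cols` is the number of columns; `r` and `c` are zero-based.
  def at(bin, cols, r, c) do
    offset = (r * cols + c) * 8
    <<x::float-64-native>> = binary_part(bin, offset, 8)
    x
  end
end
```

A receiving process can forward the binary to a port or NIF untouched, or peek at individual cells with MatrixBin.at/4.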

Or compile Elixir to native code with HiPE by adding:

    @compile [:native, {:hipe, [:verbose, :o3]}]
to the top of the module with the arithmetic. It might not be as performant as C but it can be about 10-15X more performant than pure elixir (in my tests anyway).

I've no idea of the relative performance against numpy...

You can write NIFs in Rust (which is made even simpler by supporting libraries like Rustler). scrogson, for instance, is using this to experiment with lower-overhead JSON.

Re: supervising external processes, an easy hack if you're writing the processes is to add a deadman's switch to both sides, and then launch the processes from a port in BEAM-land.

This effectively makes them supervised: kill the BEAM process and the external process will die, and whether they both get relaunched depends on the restart strategy the supervisor launched the child with.

Another solution is to make your non-Erlang process run as a foreign Erlang node (an example of a Go library that implements this: https://github.com/goerlang/node). There are similar messaging libraries for Python and Ruby.

It's doable. Erlang is used to talk to hardware and C code. You can build C functions (NIFs) or a driver (for I/O, for example), spawn an external process, or implement the Erlang distribution protocol (what is used to talk between VMs) in C.

With 20.0 coming up it will be even easier. A nifty feature called "dirty schedulers" will become stable, so building long-running calculations in C will be much easier. Previously you had to take care not to block the running scheduler thread.

All-in-all Erlang is really good at connecting to and managing things, C and hardware being one of those things.

Very easy; there are many solutions for doing just this. Porcelain is one such library for Elixir that lets you call C executables and interact with CLIs. I have a lib called pricing on my GitHub that uses it to price options with a simple C executable.

You may want to have a look at talks like this one:

https://www.youtube.com/watch?v=xj3smNjGLaE It is a common pattern to use Erlang as a controller plane.

This has been a very informative sub-thread. Many thanks to all contributors for your thoughts and experience. You have given me the confidence to move forward with Erlang/Elixir (basically BEAM) for the distribution and routing side of my enterprise-scale soft-realtime data-interpretation project. A great testament to the quality of contributors on HN. I will post on progress at a later stage.
