Why Erlang

jlouis · on Sept 28, 2008

Statelessness: Erlang is not side-effect free. Sending a message is a side-effect. You depend on receives of messages from other processes as well. Truth is that you can write side-effect-free code in any language and those functions are easier to debug. The functional semantics of Erlang make this easier, but beware that you can't cheat and get a mutable variable.

Tail recursion: Even better. Erlang has Tail-call-optimization. Any tail-call reuses the stack-frame.

Lightweight processes: Large chunks of (immutable) binary data are shared among processes. This is crucial for keeping memory usage down. Also, there are ETS tables, which can be seen as a shared in-memory datastore among processes.

Messaging: Sending a message is fast. But remember you are doing a copy of the data into the other process memory space (unless you are sending a binary, see above). For small messages, this doesn't matter, but you shouldn't be tempted to send that one million node binary tree ;)

Hotpatching: The hotpatching is mostly centered around the ability to runtime-upgrade between versions of phone-exchange software. People tend to think of Common Lisp style hotpatching where you can dynamically build up your application from scratch without restarting the CL environment. Erlang doesn't really support that, but you can hack it to give you somewhat of the same.

Fast: Erlang uses a threaded-code bytecode interpreter called BEAM (by default; there is also the HiPE project). Now, the rule of thumb is that interpretation is a factor of 10 slower than compilation. The claim of a factor of 3 is wildly exaggerated. Floating-point numbers are all boxed, so it is easy to get a factor of 30 in difference there. The persistent memory model also costs compared to C. Some data structures are impossible to get directly. A string of 100 bytes takes up 1600 bytes on a 64-bit machine.

And my own addition:

Parallelism: The language semantics map really nicely onto a parallel world where we explicitly control the parallel processes. But the SMP interpreter in the Erlang kernel is still pretty young. Note that the language was originally built for concurrency rather than parallel computation.

Conclusion: Erlang is a really cool language, but it is not built for everything. If people think they have solved the multi-core problem by using Erlang, the world of parallel computation is just dawning on them. Remember, there is no silver bullet.

tsuraan · on Sept 28, 2008

Just a few clarifications (upon reading, it doesn't look like hn supports html markup; I'm leaving the tags so that quotes are at least visible...):

<i>Messaging: Sending a message is fast. But remember you are doing a copy of the data into the other process memory space (unless you are sending a binary, see above). For small messages, this doesn't matter, but you shouldn't be tempted to send that one million node binary tree ;)</i>

Two points here: as you stated in the point above this one, large binaries are shared, so sending a large binary as a message isn't terribly expensive. For everything else, copying messages is an implementation detail, not a specified behaviour. If you run erl -hybrid, then all processes do share a single heap and messages are not copied (unless sent between nodes, of course). Hybrid mode is still very unstable and you don't want to use it unless it really does happen to work for you, but it's proof that the copying of message data isn't a requirement of Erlang, but instead is just a current implementation detail.

<i>A string of 100 bytes takes up 1600 bytes on a 64-bit machine.</i>

Erlang strings (i.e. any string literal, such as "hi there") are implemented as linked lists of integers. If you're doing string processing on large strings, you probably want to use binaries, which are 8 bits per character (plus the binary head, which is a varint, IIRC). These can be stated as <<"Hi there">> and are much more memory-efficient, although not as convenient to work with as linked lists.
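A rough back-of-the-envelope check of the 1600-byte figure, as a Python sketch. It assumes each list element costs a two-word cons cell (one word for the character, one for the tail pointer) and that a word is 8 bytes on a 64-bit machine. The binary header size below is a hypothetical placeholder, not a measured value.

```python
WORD_BYTES_64 = 8     # assumed word size on a 64-bit machine
CONS_CELL_WORDS = 2   # assumed: one word for the character, one for the tail pointer


def list_string_bytes(n_chars, word_bytes=WORD_BYTES_64):
    """Approximate heap cost of a string stored as a linked list of integers."""
    return n_chars * CONS_CELL_WORDS * word_bytes


def binary_string_bytes(n_chars, header_bytes=16):
    """Approximate cost of the same text as a binary: one byte per character
    plus a header. header_bytes is a hypothetical placeholder."""
    return n_chars + header_bytes


if __name__ == "__main__":
    print("list string, 100 chars:", list_string_bytes(100), "bytes")
    print("binary, 100 chars:     ", binary_string_bytes(100), "bytes")
```

Under these assumptions the list form costs 16 bytes per character, which is where the 100-to-1600 ratio quoted above comes from.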

PieSquared · on Sept 28, 2008

If you want italics, put whatever you're italicizing in asterisks. This is italics, and I typed that by writing "(asterisk) This is italics (asterisk)"

tsuraan · on Sept 28, 2008

Cool, thanks.

yariv · on Sept 28, 2008

"People tend to think of Common Lisp style hotpatching where you can dynamically build up your application from scratch without restarting the CL environment. Erlang doesn't really support that, but you can hack it to give you somewhat of the same."

I (almost) never restart my Erlang server, and that lets me develop just as you described. Why do you say it's not supported?

(Btw, it really sucks to go back and use languages that don't have hot code swapping. Developing in them feels terribly slow.)

Fast: if you put is_float() guards on your function definitions, Erlang will optimize their computation. If you worry about memory consumption of strings, use binaries.

"Note that the language was originally built for concurrency rather than parallel computation." I don't follow this argument. Isn't the point of concurrency to implement parallelism?

jlouis · on Sept 28, 2008

I am not too sure that an is_float() guard will yield a performance increase in the BEAM interpreter. It will give an increase with HiPE though I am sure. As per strings and memory consumption: Yes, you can use binaries or even better: iolists, but I think, then, that the string type has to go. A good (immutable) string library based upon binaries in OTP would be nice.

Concurrency and parallelism: No, they are not the same. A concurrent problem is one where you want to cope with concurrent activities: web servers, phone switch exchanges, routers, etc. A parallel problem is one where you worry about how to split the problem across multiple CPUs/Nodes so that you utilize as many resources as possible while achieving the best speedup possible.

The latter is usually number crunching on big clusters of several hundred machines -- or number crunching on a grid of computers loosely spread all over the world. The first is about coming up with a nice model for computation that makes you model the problem in a logical and straightforward way.

Erlang has a very nice concurrency-model. Together with the rock-solid and stable VM you have a very good outset for solving concurrent problems. But -- you can implement concurrency well on a single CPU, the key point is that you have a nice model in which to work when trying to explain the problem to the computer.

In a parallel problem, it is crucial how you split the problem up. Some problems have subproblems independent of each other. These are called 'embarrassingly parallel' problems. All of these are map-reducible. But how about a problem which is not? Numerical solving of differential equations via successive over-relaxation is a classical problem because each subpart needs to communicate its findings to neighboring subparts. You are worrying about the speedup when adding more CPUs or, following Gustafson, about how much bigger a problem you can solve when you keep the time frame constant and add more CPUs.
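The two ways of looking at speedup mentioned here can be sketched in Python. This is a minimal illustration using the textbook forms of Amdahl's law (fixed problem size) and Gustafson's law (fixed time frame); the parallel fraction 0.95 is an arbitrary assumption, and note that the fraction is measured against the serial run in the first formula and against the parallel run in the second.

```python
def amdahl_speedup(p, n):
    """Fixed problem size: speedup on n CPUs when a fraction p of the
    serial run time can be parallelized."""
    return 1.0 / ((1.0 - p) + p / n)


def gustafson_speedup(p, n):
    """Fixed time frame: scaled speedup on n CPUs when a fraction p of the
    parallel run time is spent in parallel work."""
    return (1.0 - p) + p * n


if __name__ == "__main__":
    p = 0.95  # assumed parallel fraction, for illustration only
    for n in (2, 8, 32, 128):
        print(n, round(amdahl_speedup(p, n), 2), round(gustafson_speedup(p, n), 2))
```

The Amdahl figure flattens out as CPUs are added, while the Gustafson figure keeps growing because the problem is allowed to grow with the machine.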

Erlang, being about 30 times slower than C on numerical computation, needs at least 30 times the speedup to compete. Depending on the problem, that can be achieved with, say, 32 CPUs, or perhaps not at all (if the problem has no inherent parallelism), where the C program only uses a single CPU. That is the gap Erlang is currently competing with. If you take the de facto standard, MPI, and use that with C, you pay a lot of development time for a message-passing interface which even allows for asynchronous messaging (crucial for hiding the latency of computation).
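To put a number on "depending on the problem", here is a small Python sketch that asks how parallel a program has to be before a given CPU count makes up a 30x single-core slowdown. It assumes Amdahl's law as the speedup model and ignores communication cost entirely, so it is an optimistic bound; the function names are made up for this example.

```python
def amdahl_speedup(p, n):
    """Speedup on n CPUs when a fraction p of the work is parallelizable."""
    return 1.0 / ((1.0 - p) + p / n)


def min_parallel_fraction(n_cpus, slowdown=30.0):
    """Smallest parallel fraction p for which n_cpus give a speedup equal to
    `slowdown` (assumed > 1). Returns None when even a perfectly parallel
    program could not get there with that many CPUs."""
    if n_cpus < slowdown:
        return None
    # Solve 1 / ((1 - p) + p / n) = slowdown for p.
    return (1.0 - 1.0 / slowdown) / (1.0 - 1.0 / n_cpus)


if __name__ == "__main__":
    for n in (16, 32, 64, 256):
        print(n, "CPUs -> required parallel fraction:", min_parallel_fraction(n))
```

With 32 CPUs the required fraction comes out very close to 1, so under this model the 32-CPU case only breaks even for problems that are almost perfectly parallel.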

I hope this makes the difference clear.

qhoxie · on Sept 28, 2008

Despite this being largely common knowledge to those who have researched or used Erlang, he gives a really solid overview of the perks of the language. Great summary read for the uninformed about Erlang.

noahlt · on Sept 28, 2008

Is there a good online introduction to Erlang, like Dive into Python or _why's poignant guide to ruby?

tsuraan · on Sept 28, 2008

I think the best intro is http://www.pragprog.com/titles/jaerlang/programming-erlang, which comes in PDF form but also costs $22.00. As far as free resources, there's http://erlang.org/doc/getting_started/part_frame.html , which isn't nearly as fun as _why's guide, but is informative. I personally just bought the book; it's really well written and actually got me interested in using the language.

sammyo · on Sept 28, 2008

Funny name?

Cheap & Easy? Woo! (thread creation that is)

I think it will not really take off until the general infrastructure (millions and millions of cores) is more generally pervasive.

bstadil · on Sept 28, 2008

The name is in honor of a Danish mathematician who invented queueing theory, a key component earlier in the life of telephony: how to man switchboards and the like.

The telecom connection is via Ericsson, which invented the language.

http://en.wikipedia.org/wiki/A._K._Erlang

heyadayo · on Sept 28, 2008

He lost me when he said Erlang is less than 3x slower than C

jmtulloss · on Sept 28, 2008

3x is an exaggeration, but it compares pretty well with very fast languages like Java and C#.

http://shootout.alioth.debian.org/u32q/benchmark.php?test=al...

gm · on Sept 28, 2008

Answer: Who gives a crap? Since when did a variation of an answer matter more than the problem?

If a new computer language solved a problem otherwise unsolvable, then it would raise eyebrows. If Erlang had a chance, Scheme and all those stack-based languages would be popular. Anyway...

nostrademons · on Sept 28, 2008

Erlang solves a problem that's currently very difficult, namely massively concurrent, distributed, fault-tolerant systems.

gaius · on Sept 28, 2008

Well, that's not strictly true. What Erlang does is make the knowledge of a niche that does know how to do that - telco switch engineers - available to the masses. I think even Erlang advocates sometimes forget that it's not new - it's been a serious production language for years now.