What's all this fuss about Erlang? (pragprog.com)
155 points by DanielRibeiro on Aug 5, 2012 | 99 comments

Some neat Erlang projects:

* ErlGuten, PDF generator: https://github.com/hwatkins/erlguten

* OpenCL binding: https://github.com/tonyrog/cl

* gen_smtp, SMTP library: https://github.com/Vagabond/gen_smtp

* erl_img, image manipulator: https://github.com/evanmiller/erl_img

* ErlyDTL, Django template compiler: https://github.com/evanmiller/erlydtl

* BossDB, evented database abstraction layer: https://github.com/evanmiller/boss_db

* Jerome, RTF/HTML/BBCode/Textile processor: https://github.com/evanmiller/jerome

* Cowboy, web server with WebSockets: https://github.com/extend/cowboy

* Chicago Boss, full-featured web framework: http://www.chicagoboss.org/

Those are just my favorites; there are many more listed here: http://erlagner.org/

Things I would use Erlang for:

* Network programming

* Web programming

* Sending a lot of emails

Things I would not use Erlang for:

* Numerical programming

* Video decoding

* Desktop or mobile applications

Basically Erlang is really good at receiving, parsing, and sending data, and bad at anything remotely processor-intensive (or graphical). I think this situation might change with Erlang's OpenCL binding (second link above), which lets the programmer skip the overhead of Erlang's run-time and go straight to the CPU or GPU for numerical tasks.

Message passing is fantastic... when you can decompose your problem into lots of small, independent jobs, like a web server. However not many algorithms are like that. If you can't decompose your problem into small independent jobs then you will be left with some kind of shared job. For example, an airline seat booking system, where at some point you need to modify a shared set of seat states (no, you can't divide the seat state up, because then you can't take massive bookings if needed). Another example is sorting algorithms. If you divide and conquer, at some point you need to combine results.

In Erlang, the normal way to do this is to have a single actor handle that shared job. That then becomes a huge bottleneck, and with more cores it gets to be more and more of a bottleneck. This is Amdahl's law. This is where Erlang breaks. In a language with shared state (that includes Haskell or ML with mutable references) I can use very fine grained locks, or perhaps transactional memory, over this shared part of the problem, and continue to scale beyond what Erlang is capable of. Probably not near n-times for n cores, but much closer than what Erlang would allow.

Erlang's message passing creates bottlenecks! And then there's no way out. In shared state you can be nice and message passing, but then also have the power to use very fine grained locks or TM to break down those bottlenecks.
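To put rough numbers on the Amdahl's law argument above, here is a minimal sketch. The 10% serial fraction is a made-up figure for illustration, not a measurement of any Erlang system, and `amdahl_speedup` is a hypothetical helper name.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Upper bound on speedup when `serial_fraction` of the work cannot run in parallel."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # Amdahl's law: the serial part stays fixed, only the rest is divided among cores.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


if __name__ == "__main__":
    # Assumed workload: 10% of the time is spent waiting on the single shared actor.
    for cores in (1, 2, 8, 64):
        print(cores, round(amdahl_speedup(0.10, cores), 2))
```

Under that assumption the bound never exceeds 10x no matter how many cores are added, which is why shrinking the serial part matters more than adding hardware.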

Sorting algorithms? Let's take mergesort, for example. In Erlang a process could divide its portion of the collection in half, hand each half to another process, wait for two sorted collections to come back, merge them and pass the results up the chain. Conceptually simple and good multicore use. The only time you're down to a single core is on the final merge, between two sorted halves of the whole collection.
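The divide, hand off, and merge structure described above can be sketched as follows. This is not Erlang: it uses Python threads as stand-ins for processes, the `depth` parameter is an assumption added to cap how many workers get spawned, and in standard CPython the global interpreter lock means it shows the shape of the algorithm rather than a real speedup.

```python
import threading


def merge(left, right):
    """Sequentially merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def parallel_mergesort(items, depth=2):
    """Sort `items`, handing each half to its own worker for the first `depth` levels."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    if depth <= 0:
        # Below the spawn limit, recurse in the current worker.
        return merge(parallel_mergesort(items[:mid], 0),
                     parallel_mergesort(items[mid:], 0))

    results = [None, None]

    def worker(slot, chunk):
        results[slot] = parallel_mergesort(chunk, depth - 1)

    workers = [
        threading.Thread(target=worker, args=(0, items[:mid])),
        threading.Thread(target=worker, args=(1, items[mid:])),
    ]
    for w in workers:
        w.start()
    # Wait for both sorted halves to come back, then merge and pass the result up.
    for w in workers:
        w.join()
    return merge(results[0], results[1])
```

The final `merge` at the top level is the single-core step the comment mentions.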

The merge is an inherently sequential process. Are you claiming that somehow you can merge two sorted arrays in parallel using fine-grained locks? How exactly?

No, the parallelism in that algorithm is fundamentally limited. That bottleneck is not going away, and was an example of where neither Erlang nor any other model of parallelism will help you.

Do you have any good ideas about how I can solve the seat booking example using Erlang's model of parallelism? There is a block of seats. I want to be able to handle seat bookings in parallel. The seats that people might want to book could be in large groups - as large as all of the seats.

You shouldn't; you should use Erlang's model of having a single process ultimately manage the state, which makes implementing transactions easy. Several processes can concurrently try to reserve seats, but ultimately they all have to get in line and submit their reservations, receiving either e.g. {ok, NewState} or {conflict, NewState}.
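A minimal sketch of that single-owner pattern, written in Python rather than Erlang. The `SeatManager` class, its mailbox layout, and the tuple replies are assumptions made for illustration; they mirror the `{ok, NewState}` / `{conflict, NewState}` replies only loosely.

```python
import queue
import threading


class SeatManager:
    """One thread owns the seat state; callers reach it only through messages."""

    def __init__(self, seat_count):
        self._free = set(range(seat_count))
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        # Requests are handled strictly one at a time, so each is trivially atomic.
        while True:
            message = self._mailbox.get()
            if message is None:
                break
            seats, reply_to = message
            if seats <= self._free:
                self._free -= seats
                reply_to.put(("ok", frozenset(self._free)))
            else:
                reply_to.put(("conflict", frozenset(self._free)))

    def reserve(self, seats):
        """Submit a reservation and block until the owner replies."""
        reply_to = queue.Queue(maxsize=1)
        self._mailbox.put((frozenset(seats), reply_to))
        return reply_to.get()

    def stop(self):
        self._mailbox.put(None)
        self._thread.join()
```

Any number of threads can call `reserve` concurrently, but every request passes through the one mailbox, which is the "get in line" behaviour being discussed.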

That's the limitation of Erlang, right there. They get in line, creating a bottleneck. You could have 10,000 Erlang processes, but they all wait in line for this one process to service them.

If you use transactional memory with threads and mutable state, then all your booking processes can access the board state in parallel with ACI properties. Obviously, parallelism is still limited, as it always is, but as long as your booking processes are accessing different parts of the seat data structure, they can run in parallel. If they try to access the same part of the structure at the same time, all but one will be rolled back, but that one still gets through. Hopefully, they access different parts of the structure and sail through.

Your suggested Erlang program has no parallelism in it at all (for the isolated part of the seat booking of course), and will not scale to 2 cores, let alone the 64 cores I have in my machines at work.
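For comparison, here is a rough sketch of the shared-state alternative. Note that it uses per-seat locks (the "very fine grained locks" option raised earlier in the thread) rather than transactional memory, and the `SeatMap` class is a hypothetical name. Bookings that touch disjoint seats never wait on each other; overlapping ones serialise only on the seats they share.

```python
import threading


class SeatMap:
    """Shared seat state guarded by one lock per seat instead of one owner."""

    def __init__(self, seat_count):
        self._locks = [threading.Lock() for _ in range(seat_count)]
        self._taken = [False] * seat_count

    def book(self, seats):
        """Book all of `seats` or none of them; return True on success."""
        wanted = sorted(set(seats))
        if any(not 0 <= seat < len(self._locks) for seat in wanted):
            raise ValueError("unknown seat number")
        # Always lock in ascending seat order so overlapping bookings cannot deadlock.
        for seat in wanted:
            self._locks[seat].acquire()
        try:
            if any(self._taken[seat] for seat in wanted):
                return False  # all-or-nothing: leave the map unchanged
            for seat in wanted:
                self._taken[seat] = True
            return True
        finally:
            for seat in reversed(wanted):
                self._locks[seat].release()
```

A booking for every seat still has to take every lock, so the worst case is no better than the single-owner design; the gain is only for bookings that do not overlap.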

Erlang is not about parallelism, it's about fault tolerance. It's about "making reliable distributed systems in the presence of software errors" (as Armstrong's thesis is titled).

How will your STM example behave if the computer managing the transactions suddenly has a massive hardware failure?

I completely agree with you about fault-tolerance. However Erlang, CSP, actors and message passing are heralded as the key to parallelism. The Erlang book still promises n-times speedup for n-cores, with not enough of a caveat.

I'm not familiar with the plane scheduling problem, there is probably room to parallelize it further. I suspect people are far less likely to book seats on both sides of the aisle(s) at the same time, for example. Knowing more about those things, airline regulations, and so on will probably present other opportunities.

You could do as jlouis suggested and have each plane or flight manage its own seats/places. The state of different flights or planes isn't necessarily related and can be handled differently, giving you a smaller isolated unit than "everything at once".

This gives place to parallelism as soon as you have more than one plane to contend resources for in your concurrent model, which is most likely the case in the real world.

If you have only one plane, then you likely won't need to scale.

Here are some ideas for your two problems.

First, the problem is sorting. One very efficient approach is a parallel sampling sort. The idea is to start P processes and sample randomly to obtain P-1 splitting elements. Then the array of P-1 splitters is distributed to each of the P processes, and the input data is sent around so that each process has its batch of data. Next, each process sorts its own batch. It turns out to be pretty fast in practice to run this kind of operation and it may be fast enough.

Note the distinct advantage of message passing here: It works, even if a process is running on another physical machine - shared memory doesn't. Also note that Erlang is not really built for parallel computation, and your example is one. My guess is that you can achieve a pretty good speedup with a sampling sort variant on multiple cores.
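The sampling sort steps above can be sketched like this. It runs in one Python process with a thread pool standing in for the P workers, so it shows the data flow rather than a real speedup; the `oversample` factor and the way splitters are picked from the sorted sample are assumptions, since the comment does not specify them.

```python
import bisect
import random
from concurrent.futures import ThreadPoolExecutor


def sample_sort(items, workers=4, oversample=8, seed=None):
    """Sort `items` by splitting them into `workers` key ranges sorted independently."""
    items = list(items)
    if workers < 2 or len(items) < workers:
        return sorted(items)

    rng = random.Random(seed)
    # Step 1: sample randomly and choose P-1 evenly spaced splitters from the sample.
    sample_size = min(len(items), workers * oversample)
    sample = sorted(rng.sample(items, sample_size))
    step = len(sample) / workers
    splitters = [sample[int(step * k)] for k in range(1, workers)]

    # Step 2: send every element to the bucket whose key range contains it.
    buckets = [[] for _ in range(workers)]
    for item in items:
        buckets[bisect.bisect_right(splitters, item)].append(item)

    # Step 3: each worker sorts its own bucket; no merge step is needed afterwards
    # because bucket k only holds keys below everything in bucket k+1.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_buckets = list(pool.map(sorted, buckets))

    result = []
    for bucket in sorted_buckets:
        result.extend(bucket)
    return result
```

Unlike mergesort there is no sequential merge at the end, only a concatenation; the cost moves into choosing good splitters and shipping data to the right worker.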

As for your flight seat problem, it really isn't. Keep the state of a plane in a process. That is, have a single process assigning seats in the plane. On very large planes you may have 500-1000 seats. Hardly something which will dwarf a single process. Chances are, however, that you may want to assign seats to multiple planes at the same time. In that case you get your concurrency and a possible speedup.

As for getting linear speedup in Erlang systems:

First, your problem must expose the necessary amount of work. That is, there must always be something to do for an idle processor. In Erlang, this is easy to achieve if there are multiple agents interacting with the system at the same time, or you can create it yourself by spawning off some processes for each incoming job.

Second, your program must avoid contention around a single process. That is, you must ensure processors are not all looking for the same resource. We have tools like percept, the lock counter and dtrace for that.

An Erlang process works very much like transactional memory if written correctly. But TM has the same problem: contend around a resource and you are in trouble. It might be that the breaking point for contention is different due to different characteristics in overhead, but in principle the contention problem is also there.

Finally, there is a "cheating" possibility. Erlang provides ETS which is a tuple space for erlang terms. It allows parallel read and write access outside the process context. And very fast such RW access. For certain problems it can be used to allow multiple processes read access to the same data which can in turn speed their work up considerably.

But ETS is not distributed, oh the horror! Enter mnesia, which distributes ETS over multiple machines and optionally provides persistence. Mnesia is really a memcached/redis distributed in memory key/value store with DB-like query properties. Limited - but oh so powerful for certain problems.

Akka (Scala actor library on the JVM) allows for STM on multiple actor states.

Most of the comments here talk about additional advantages of Erlang without commenting on the point of the original article -- which is that servers would move towards multi-core systems where Erlang excels. Of course, this hasn't really happened, because we've moved to cloud services instead, which typically offer a single virtual CPU.

I'm not arguing against Erlang. It's a great language and folks could certainly choose to run it on their own multi-core system or scale across multiple virtual servers. But startups often pick the technology they can get running with minimal investment, then scale from there: a VPS or two with a single CPU and constrained RAM. So although Erlang might be a great long term investment, it requires a lot of RAM up front and hyped features like multi-core support don't matter as much to young startups choosing technology at the early stages.

It would be great to see more discussion about how well Erlang scales across multiple VPS systems -- and not just theoretically, but how many systems you can practically scale across, how it impacts performance, RAM usage, etc. Plus how this would compare to just using an independent messaging queue/bus to accomplish the same thing in another language.

I'm by no means an expert, but I have written production Erlang applications (the instapaper-like functionality on bostonglobe.com). We initially had it running as a network of 5 VPS systems (three web-facing servers with copies of the entire db in memory and two machines keeping disk copies), which performed more than adequately. Even with the db in memory, RAM usage really wasn't an issue for us. The runtime was initially designed to run embedded in telephone switches so it's pretty lightweight, all things considered.

I've heard talk that things start to get a little hairy once you scale to around 100 nodes, due to the sheer amount of messages getting passed around (there are ways around this, like creating hidden nodes that communicate between clusters). Performance across multiple VPS systems is really going to vary dramatically between different applications and VPS setups. An app that has to make lots of requests to other nodes, which are located across the globe, is going to suffer a lot more than one that has relatively little internode communication between VPSs in the same server rack, but that's just common sense. I'd be really interested to know what companies like Basho (creators of Riak) and Facebook have experienced with large setups, as horizontal scaling is one of the language's selling points.

More so than scaling across multiple cores and machines, I find that one of its strongest points is its scheduler. If one request to a server is particularly processor intensive (or hangs), it's not going to affect other requests, which is an awesome feature to have even with a single core.

Erlang's definitely not for everything, but once you get the hang of it, it's really easy to prototype in, scales well, and is generally lightweight enough to run on a cheap cloud instance (or embedded on a Raspberry Pi).

Most of Erlang's goodness evolved before SMP was available. IIRC, Erlang didn't support smp before its R11 release. Before then if we wanted to light up all the cores on a box, we'd fire up multiple nodes on a machine and use Erlang's distributed messaging.

Here are some benefits to single core erlang, off the top of my head.

* Erlang's VM is phenomenal at handling concurrent I/O even on a single core. Free epoll/kqueue backed async i/o and iovec ops.

* OTP, applications, supervision trees, state encapsulation, fault isolation, cheap GC.

* Built-in distribution, node/process monitoring, code hot-loading, mature tools and introspective capabilities on runtime state/performance.

* Some degree of "free" scale when a company is ready to deploy that app on a machine that has more cores.

I'm not sure what you mean by requires a lot of RAM up front -- compared to C/C++, yes.. but, relative to ruby, python, or the JVM? That's not been my experience.

I suspect that most PaaSes use a single virtual CPU because all the "high-productivity" languages are single-core (Ruby, Python, Node).

This article is now 5 years old - any Erlang programmers who can enlighten me about the current state of Erlang compared to the predictions in this article?

The library support and community have both deepened. Riak has been built, Facebook Chat has proven its scalability, open source projects have mostly moved to GitHub, #erlang and #erlounge IRC channels on freenode.net are very friendly and well populated.

In the web space, there are now several options if a small team wants to build something to scale to 100k+ concurrent connections on a single server. Hackers in #erlounge are regularly benchmarking systems with 500k+ users. It's crazy talk.

Actually, I was talking to a few facebook guys a couple weeks ago and they rewrote the chat app in c++ a few years ago. The primary motivator was that nobody knows Erlang, while most of their engineers already know c++. I found that really surprising coming from a company that prides itself as having the smartest people around, and, as awesome as Erlang seems, after hearing that I might think twice about using it for a project if there's a possibility that I'd be hiring other engineers to work on it with me.

Or you can choose to use it and base your hiring policy on hiring great people, rather than hiring "I can only speak C++" engineers.

I think that certain realities take over when you get as big as facebook is. It's not the little start-up that could anymore, it's now the 800 lb Gorilla that churns through staff. While you might be able to have the smartest people around at your beck and call you'll want to keep the system accessible to as many of them as possible so your options are always open and there are always going to be more C++ programmers around than Erlang programmers.

This is weird though because erlang is way easier to wrap your head around than C++. A C++ programmer should be able to learn erlang in a couple of weeks max. If they can't do that then I probably don't want them doing C++ code for me either. Now if they rewrote it in java, because java programmers are easier to find, that might be less weird but C++?

erlang is a quite old language, as such, it doesn't move very fast.

Erlang R15 brought some nice features, like (FINALLY!) line numbers in error reports, there will be a better (built-in) dict rather soon.

> (FINALLY!) line numbers in error reports

Are...you serious?

Yup - because the language extensively uses tail call optimisation, inlining and other rewriting in the execution it used to be hard to reconstruct the original state of the code on a run time failure. It is less of a pain than it seems not to have line numbers in error reports - but it is still a pain.

Actually, the inliner is not used unless you explicitly enable it (for the core). The runtime BEAM backend may decide to inline as well, but that is another part entirely.

But yeah, it is not much of a problem having no line numbers when your functions tend to be of a 4-5 line size.

For the people who don't like Erlang's syntax, there are alternatives. I've used Elixir (http://elixir-lang.org/) with some success. It has a Rubyish syntax and gets rid of a lot of the warts newcomers seem to dislike.

The language is actually the easiest part, the biggest hurdle seems to be organizing a complicated application to take advantage of OTP. Erlang is really about OTP; not the language.

I feel like Erlang had a lot of publicity a few years ago... but it doesn't seem like it has experienced increased popularity.

Does anyone experienced with Erlang have some insights on why that is? Or am I wrong and is it being used in more and more important things?

I think the model makes sense and maybe forces will yet conspire to make it more popular. But I've heard that it feels "old", i.e. lacking in conveniences you would expect in a modern programming language. And it seems like it should interact better with other programming languages -- i.e. you shouldn't have to assume that every node is written in Erlang. There's no migration path to Erlang in that case.

I think of Erlang as a very specific tool.

Many projects, you could write an initial version in Ruby, Python, Java or whatever, and you're mostly going to be ok, at least while you explore the product/market fit. Maybe you'll have to redo some of it later, but that's likely anyway, so the best approach is to use something that can do a lot of different things reasonably well, and lets you develop quickly.

Erlang always feels to me like it is the best in the world at a few things, and not so good at all for others, meaning it's a bit more dangerous: veer out of its sweet spot, and you're probably better off switching to something else.

Here's something I wrote about it a while back:


I also think that Erlang is more along the lines of Smalltalk or Lisp: pioneered some great ideas, but someone's likely to take them and do them in a more palatable format for the masses. Go and Scala, for example seem to borrow some concurrency ideas from Erlang. Node.js is also a "worse is better" competitor which has a huge advantage in terms of the language it uses (Javascript is orders of magnitude more common than Erlang), even if the actual code written for it is not nearly as elegant as Erlang.

Erlang has experienced increased popularity. Yet, it has not experienced the avalanche popularity increase you see with e.g., Ruby or Javascript. I think Erlang is lucky that this has not happened since the community would not be able to keep its current excellent condition if that happened.

Erlang's forte is in a very specific kind of system. If your problem is

* Stateful

* Concurrent

* State has to "act on its own"

* Distributed over multiple machines

Then Erlang is a nice tool for the job.

The language is not particularly old. Note you are not allowed to call a language old if the "newer" language is based on C or its semantics :)

> The language is not particularly old. Note you are not allowed to call a language old if the "newer" language is based on C or its semantics :)

What newer language?

In my opinion

Java/C#/Ruby/Python share few semantics with C (other than some basic types).

Have you built something cool which fits the bill? I'd love to hear about any successes!

There are plenty of successes out there: Riak, Ejabberd, Whatsapp messenger, Klarna, Demonware, Soundrop, Wooga... Just to name a few

I wonder if those apps run n-times faster on n-cores. I think probably not, due to inherent limitations in the parallelism available in them, but perhaps I should do a study to verify that. Erlang claiming n-times faster on n-cores is snake oil. First of all, not all problems have n-times parallelism in them!

> Erlang claiming n-times faster on n-cores is snake oil. First of all, not all problems have n-times parallelism in them!

I keep reading this claim in Erlang articles, and I wonder why it continues to be made. It's obviously false for the theoretical reason of the limited amount of parallelism in a problem you just stated. It is empirically false on every benchmark I've seen on a real-world architecture (i.e. actual multicore, with actual memory and IO, as opposed to virtual processes). What's strange is that the OP seems to be completely aware of this, but still leads the article with the claim in bold letters before qualifying it later on.

To my mind, Erlang is pretty cool as it is, even before a discussion of multicore performance. But I guess if you "have been waiting 20 years for this to happen, and now it’s payback time" (payback for what?) you are OK with a bit of hyperbole.

I was half of the team that created the instapaper-like functionality on bostonglobe.com using Erlang. It fit the bill perfectly and has run (mostly) untouched since it launched. Features like being able to get a REPL on a production node and quickly add and remove machines from a cluster were quite nice to have.

Sounds like a cool project!

"is it being used in more and more important things?"

Riak is written in Erlang.

Same with CouchDB, which was very fashionable when the article came out.

That was sort of my point -- erlang was fashionable like couchdb, but neither ended up living up to the hype :). In that sense couchdb doesn't count.

We use both Erlang and CouchDB at Cloudant and have great success with both

We use Riak extensively at ideeli. It looks poised to stick around for the long-haul.

I wrote another comment above questioning whether some of the early multicore promises of Erlang have become less relevant with the rise of single-emulated-core VPS servers. That could have dampened the hype.

Also, I'm not an Erlang expert but Erlang does seem to have had a much larger influence than just the uptake of the language. Scala and Clojure have certainly borrowed heavily from it, and that has had a further trickle-down effect into other higher order languages (eg, the actor model).

This is probably a HN phenomenon rather than being Erlang specific. I suspect that as HN ramped up in users, we learned all these great new technologies and now they are 'old' to the community.

That's why you don't see as many posts about nginx, node, nosql, erlang, haskell, etc. At least not nearly as much as 2, 3, or 4 years ago.

The most popular language I see now on HN is clojure, which tells me a lot about which language stuck with people.

Is the Erlang ecosystem and supporting libraries strong enough to support committing/betting resources on using it for production?

We're using Clojure now for our back-end processes because it's a language I can live with, and I get the safety of java interop.

I saw a great talk by Jonas Bonér (guy behind Akka) at the S.F. Scala group a few years ago. He said he fell in love with Erlang but eventually gave up trying to persuade people to deploy it and that Akka was basically an attempt to bring the core Erlang principles over to the JVM.

Personally I think people are too paranoid and should worry less about deploying a new VM but a lot of commercial IT shops are very conservative.

He must have taken a detour through Groovy because he wrote this just before Akka.


Facebook chat is based on Erlang, at least 1-2 years ago when they made a presentation [1] about it at CUFP.

[1] http://cufp.org/videos/functional-programming-facebook

Clojure is a toddler compared to Erlang w.r.t age. Clojure started around 2007 or so, admittedly on a Lisp basis with Rich Hickey as the designer (which is to its benefit!)

Erlang has been in production since the mid-90s in real systems. For what it solves, it solves it really well.

Erlang has been in production, in mission critical situations, for longer than the world wide web itself.

In terms of web platforms and databases, etc, the answer is again yes.

For a sampling of one company's clients, check out: http://basho.com/company/production-users/

From what I've heard and seen Erlang is great, but this article doesn't tell me anything about why it is great. Message passing? Come on! You can do message passing in C++, or Python, or any other mainstream language. It's a matter of convention and system architecture. It's easier to hire developers for a mainstream language, and it's probably easier to teach them these conventions than a whole new language + development environment. So, what's all the fuss about Erlang?

PS. I don't mean to be critical about Erlang, I just don't think the article addresses the question in its title.

Well, it all depends on what you mean by message passing. If you mean call a method or a function passing it a value type, then yes you can do message passing in any language.

If on the other hand you mean passing data between completely isolated active entities (processes) that get scheduled for execution independently based on available CPU cores, I/O operations, receiver mailbox size... Then you need a multitasking operating system, which is in fact what the Erlang VM is.

> If on the other hand you mean passing data between completely isolated active entities (processes) that get scheduled for execution independently based on available CPU cores, I/O operations, receiver mailbox size...

Or you can use Python multiprocessing on Linux, or Scala actors, or Haskell actors, or Clojure agents...

I was wondering the same thing about using the model itself rather than a language. I asked this on StackOverflow and received some interesting feedback http://stackoverflow.com/questions/10323393/is-the-actor-mod...

The short version: apparently you can do this in other languages, but erlang (and possibly other languages) have this concurrency baked-in much better.

p.s. also worth mentioning RabbitMQ on the list of famous erlang projects

I think people are upvoting on title alone.

I really like Erlang and have been using it on and off for about a year. Recently I started playing around with Golang and much to my surprise goroutines are a very good homage to Erlang's concurrency model.

From what I can tell, Go seems to have about the same concurrency primitives as Erlang, but is much more approachable. Could someone start a new project in Go that would have required Erlang a couple years ago?

They are both based on CSP, but Go channels are a little bit different: http://www.youtube.com/watch?v=f6kdp27TYZs&t=4m57s

The main difference between CSP in Go and the actor model in Erlang is that actor mailboxes are many-to-one whereas CSP channels are many-to-many. If you were to model CSP using actors, channels would be actors.
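A small sketch of that difference, using Python's `queue.Queue` as a stand-in for both a mailbox and a channel. This is only an analogy: an unbuffered Go channel makes sender and receiver rendezvous, whereas `queue.Queue` is buffered, and the function names here are made up for the example.

```python
import queue
import threading


def actor_style():
    """Many senders, one receiver: the mailbox belongs to a single actor."""
    mailbox = queue.Queue()
    senders = [threading.Thread(target=mailbox.put, args=(n,)) for n in range(4)]
    for s in senders:
        s.start()
    for s in senders:
        s.join()
    # Only the owning actor (here, the calling thread) ever reads the mailbox.
    return sorted(mailbox.get() for _ in range(4))


def channel_style():
    """One channel, several receivers: the channel is a value any worker may read."""
    channel = queue.Queue()
    received = queue.Queue()

    def receiver(name):
        while True:
            item = channel.get()
            if item is None:  # sentinel: no more work for this receiver
                return
            received.put((name, item))

    receivers = [threading.Thread(target=receiver, args=(name,)) for name in ("a", "b")]
    for r in receivers:
        r.start()
    for n in range(4):
        channel.put(n)
    for _ in receivers:
        channel.put(None)
    for r in receivers:
        r.join()
    # Which receiver got which item is not deterministic; the set of items is.
    return sorted(item for _, item in (received.get() for _ in range(4)))
```

In the actor version the address you send to identifies the receiver; in the channel version it identifies only the queue, and whoever reads next gets the message.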

And you can send a channel over a channel or store it in a data structure or pass it to a function.

There is not much that is CSP about Erlang.

One of the things the author failed to mention, when he asks if Erlang is difficult, is its verbosity for anything non-trivial. Just try building something using WebMachine for example. You will end up repeating the same tedious boilerplate code for each and every resource.

Very, very, fortunately the Elixir project seems to be making some headway and may just bring some decent language usability to Erlang. Erlang is great architecturally, but in a lot of ways it's a toy language. It is going to need to evolve a bit to become palatable to the mass web development market.

Hey, being verbose has never impeded languages like Java and Python.

I don't know much Python but as far as I know it is capable of extension via mixins and Java can do similarly through AoP and inheritance. Erlang can do neither of these out of the box.

What I'm really referring to is how dry a set of modules is capable of being, avoiding repeated boiler plate code being employed each and every time.

In the Erlang world, you do that by creating behaviors.

About this section in his article

" The world IS concurrent. It IS parallel. Things happen all over the place at the same time. I could not drive my car on the highway if I did not intuitively understand the notion of concurrency; pure message-passing concurrency is what we do all the time."

I don't get it. I don't think this is correct. When I drive my car into a gas station, I wait for the person before me to finish filling his gas tank, as he is using (locking) the gas stand. Does anyone think what I'm saying is incorrect?

Now that we're seeing lots of cores on the machines of ordinary people, what would be really nice is if languages like Erlang, which are by design better for parallel computing, could efficiently target JavaScript while taking advantage of web workers.

As more and more application development becomes web based, it's really a shame to see almost-always-idle core after almost-always-idle core added to user end CPUs.

Doesn't the N-times speedup on N cores really require you to write perfect Erlang? I feel like that's a really optimistic prediction. Even the authors said they needed to give their original programs some tweaking, and they're Erlang pros. At this point, I remain skeptical of this claim.

Not as much as you would expect. The natural way you build systems in Erlang is across erlang-processes (micro processes). Even if you are building an application running on a single core, you use erlang-processes extensively. This means for a little TCP/IP server, you use one erlang-process per incoming connection on one core or 32 -- the only difference is how they are scheduled. When you are on 32 cores, the Erlang VM starts up more threads and then schedules the erlang-processes across the VM threads, and those threads are scheduled across physical processors.

The trick is -- when you build software in Erlang -- you start up new processes for the true concurrency (in a web application, the true concurrency is number of connections, so they get their own processes). This means that Erlang applications routinely have hundreds of thousands of processes -- which makes it easy to schedule across a mere 32 or 64 or 128 physical cores.

I'm blogging my way through Seven Languages In Seven Weeks. Reading this makes me really look forward to the Erlang chapter. Only the last bit of Prolog and Scala stands between Erlang and me now!

I had to use Erlang for a uni project and didn't mind it but it's not my favourite language.

Erlang is a secret weapon.

Erlang is never going to be the really "hot" or "cool" language because it has an unusual syntax and that syntax is enough to keep it from becoming fashionable.

But this is a feature, actually. Because the really good engineers, they will invest the week or two (really!) to learn the language and once they do that they come to love it, and as a result, erlang has some of the best engineers working with it.

What Erlang does, you simply can't do with a library or bolt on solution, or anything involving ruby, python or the JVM. And what erlang does-- concurrency done right-- is so valuable these days that anyone solving real problems has run into it.

Erlang has been quietly winning in production and building out just about every library you could want. And because the language is so elegantly designed, for many projects it's really accessible: you can dive into the code and comprehend it. (I get lost in the riak sources, because it does so much it overflows my stack, but everything else is easy.)

Erlang is not only a secret weapon, but it is a productivity multiplier.

> building out just about every library you could want

It's unlikely that an open source language is both "secret", and has oodles of libraries - things just don't tend to work that way.

Here's a simple anecdote: Erlang doesn't seem to have a good image manipulation library/interface to Graphics/ImageMagick. That's something most popular languages have at least one of.

That said, when you have a task that's a good fit for Erlang's sweet spot, Erlang is very good. The runtime is a very fine bit of software craftsmanship.

>It's unlikely that an open source language is both "secret", and has oodles of libraries - things don't tend to work that way.

That is true, but, at the same time, irrelevant. For example, at my company we do e-mail archiving. Does Erlang have any libraries to facilitate that? No. Not even close. But Apache Lucene does. So, what we do is use the Erlang OTP Java interface [1] to allow Erlang to handle the message passing and parallelization, while Lucene handles the actual indexing and search.

The way I see it, Erlang makes it very easy to add a message passing layer on top of whatever single-threaded process you currently have. This allows you to scale your applications without having to muck about with their internals too much - you launch multiple instances of your existing single threaded code and use Erlang to handle all the messiness of message passing.

I think that is what makes Erlang such a secret weapon. The fact that you can very easily add a message passing layer to anything with Erlang greatly simplifies the job of parallelizing existing programs.
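As a rough illustration of that idea in Python (an analogy only, not Erlang and not the Jinterface API): several instances of an unchanged single-threaded function are fed through an inbox queue and reply through an outbox queue. `legacy_index` is a hypothetical stand-in for the existing code, and threads are used here only to keep the sketch short; separate OS processes would be closer to what is described above.

```python
import queue
import threading


def legacy_index(document: str) -> int:
    """Stand-in for existing single-threaded code (hypothetical): count words."""
    return len(document.split())


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Run the legacy function on each message until a None sentinel arrives."""
    while True:
        message = inbox.get()
        if message is None:
            break
        doc_id, document = message
        outbox.put((doc_id, legacy_index(document)))


def process_all(documents: dict, num_workers: int = 4) -> dict:
    """Fan documents out to several worker instances and collect the replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for item in documents.items():
        inbox.put(item)
    for _ in threads:
        inbox.put(None)  # one stop sentinel per worker
    for t in threads:
        t.join()
    # Every worker has exited, so exactly one reply per document is waiting.
    results = {}
    for _ in documents:
        doc_id, count = outbox.get()
        results[doc_id] = count
    return results
```

The legacy function never learns that it is being run in parallel; all coordination lives in the queues around it.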

[1] http://www.erlang.org/doc/apps/jinterface/java/com/ericsson/...

> Erlang doesn't seem to have a good image manipulation library/interface to Graphics/ImageMagick

Ordinarily this would be a problem, but ImageMagick (since you mention it), in my experience, is something you do not want to link directly to. Despite being a quite old and (theoretically) mature project, it's quite brittle, leaky and buggy, prone to segfaulting and weird freezes. These days I just fork and exec the `identify` and `convert` programs to do image conversion (and simple manipulations), and for graphics drawing I use other, less buggy libraries such as Cairo.

> It's unlikely that an open source language is both "secret", and has oodles of libraries - things just don't tend to work that way.

It really depends on the history and the typical domain that language is used in. Yes, Erlang doesn't have a good image manipulation library, but it has built-in support for a multi-node distributed system setup: applications with automatic failover running on 2 nodes, a distributed database, and so on. These are not trivial things to just whip up in other languages, but Erlang has them in the standard library.

It has also been used for telecom applications for a long time before it became open source, and thus it accumulated a decent size library over the years, which it might not have had it been a "secret" language started from scratch by a PhD student 5 years ago...

Jose Valim made an awesome language for the erlang vm called elixir


I have been playing around with it and it simplifies a lot of the things that bother me about erlang. It breaks some functional conventions for code readability, such as allowing the same variable to be rebound.

Thank you for this! Ruby syntax on top of the erlang VM - I wonder why this is not more popular.

It's really, really new. We've been playing w/it a lot @ Inaka (and we do about 30% Erlang consulting projects). It will catch on, I'm sure.

Could you tell us what you are typically using it for? Coming from a ruby-centric world, I am having difficulty visualizing how ruby and erlang could co-exist. What can't you get from EventMachine?

Erlang is a powerful language, IMHO it stands out because of 3 features:

- Actor Model

- Pattern Matching

- Hot code replace

Yet HotSpot is a more performant VM than erlang's VM is. Now, if those features could be implemented on top of the JVM it would (to me at least) render erlang irrelevant.

So let's examine what is currently known.

- Actor Model: An example of a large effort to implement the actor model would be scala's Akka framework http://akka.io/

- Hot code replace: I don't know whether HotSpot's limited code swapping implementation can be fixed to act like erlang's does. http://scala-programming-language.1934581.n4.nabble.com/how-...

- Pattern Matching: As for pattern matching, to my understanding this can be implemented on top of the JVM, if one wishes to. http://news.ycombinator.com/item?id=2784086

The main power of Erlang comes from the ability to build fault tolerant and highly available systems. Fault tolerance was the main motivator behind Erlang's existence, not any of the things you mention. Two of those things (actor model and hot code replace) fall out of that one requirement. Fault tolerance needs process isolation, and that leads to the actor model. Minimization of shared state leads to immutable data structures + a functional approach. High availability and low tolerance for upgrade downtime lead to hot code replace.

And fault tolerance is exactly what the JVM doesn't have. All its threads share a common garbage collector, and there are other things.

That is the main problem with those who claim "oh here is a faster Erlang library in language X". Yes, one can copy some of the features into language X, but unless full process isolation is copied, it is not really an Erlang replacement library/language.

You can add the soft realtime characteristics of the VM to the mix. That's not something you can just add to the JVM as a library either.

You're either realtime or you're not. If you have hard and fast time requirements to perform certain actions then you're real time. (e.g. 3 ms. to decide whether or not to turn the cruise missile or it hits a hospital). There is no 'soft' realtime.

There is soft realtime. You don't have to have hard and fast time requirements, or nothing, you can have more relaxed statistically defined requirements. That is soft realtime.

Systems are not divided into cruise missile control systems and everything else. There is a continuum in between. A phone system where the calls need to be routed in 100 milliseconds or less 99.999% of the time is acceptable, for example, while one where no such guarantees can be provided at all is not acceptable. It is not a hard realtime system but it is pretty close, so it would be called soft realtime.
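To make "statistically defined" concrete, here is a small Python sketch of such a requirement check. The function name and the way latencies are collected are assumptions for illustration; the defaults simply reuse the 100 ms / 99.999% figures from the phone example.

```python
def meets_soft_deadline(latencies_ms, deadline_ms=100.0, required_fraction=0.99999):
    """Return True if a large enough share of samples met the deadline."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    on_time = sum(1 for latency in latencies_ms if latency <= deadline_ms)
    return on_time / len(latencies_ms) >= required_fraction


def meets_hard_deadline(latencies_ms, deadline_ms=100.0):
    """Hard realtime, for contrast: a single late sample is a failure."""
    return all(latency <= deadline_ms for latency in latencies_ms)
```

One late call in a million passes the soft check and fails the hard one, which is the whole difference between the two.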

Erlang does not have a common runtime for its processes? What happens to the isolated Erlang processes when the VM fails to allocate more memory from the OS? Does the whole VM crash, or just the process that caused the memory allocation attempt?

That is one of the few cases that will take down the whole VM (at least, as of a few versions ago). Due to Erlang's concept of process isolation, there's not really a sensible course of action for the VM to take if it runs out of memory. Maybe just killing the individual process or running a GC pass on every process and trying again might be a better approach than hosing the whole runtime, maybe they've fixed it. Either way, the frequency with which this happens in production Erlang apps seems to be low enough that it's essentially a non-issue for most of the Erlang community.

> What happens to the isolated Erlang processes when the VM fails to allocate more memory from the OS?

It suspends your process, then logs into newegg.com and buys more memory sticks for your server. You know it is just a regular VM and doesn't have magic in it. If you use up all the memory on your server you are probably screwed.

You could set up a monitor to see which erlang process is eating memory and at what rate and then decide what to do: kill the process, restart the whole node, or report it someplace.

EDIT: see http://www.erlang.org/doc/man/memsup.html

You're forgetting about OTP, Erlang's battle tested framework that covers just about any edge case conceivable in a concurrent world. By themselves, you could probably produce hacky actors, pattern matching and hot code swap implementations in most dynamic languages (and on VMs like .NET or the JVM) nowadays, but it's the whole package that has remained unmatched so far.

Not to mention Erlang's live console and logging capabilities which are indispensable when it comes to diagnosing crashed services or inspecting live systems and gathering stats. The fact that Erlang's data structures are immutable allows OTP to give you a very detailed report containing the exact state before the crash and the message that led to it, while in Java and most other languages your best bet is just an exception trace - and then you would have to guess what exactly went wrong because you don't have the system state anymore and can't recreate it easily for that matter. You can imagine that this makes debugging an Erlang application a joy - at least as joyous as it can get.

One other important aspect of OTP is supervisor hierarchies. Erlang is built around the idea that if something can fail, it will, so systems should be designed in such a manner that failure in one component doesn't tear down the whole system. Services are carefully subdivided into autonomous submodules, and if a module crashes OTP will attempt to restart it and continue its operation if possible. Imagine a png conversion submodule crashing because of a faulty png header - in a Java system this would most probably throw an IndexOutOfBoundsException and down comes your whole backend. Erlang/OTP would just note that the converter has crashed and then restart it. It can do that because the converter doesn't affect or depend on some interconnected, global, "complected" state and thus can be easily started and stopped at will.
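A very reduced sketch of the restart-on-crash idea, in Python rather than Erlang and without any real OTP behaviour: `convert_png`, `worker_loop` and `supervise` are hypothetical names, everything runs in one thread, and the "restart" is simply calling the worker loop again with the same inbox. The job that caused the crash has already been taken out of the inbox, so the restarted worker moves on to the next one.

```python
from collections import deque


def convert_png(header: bytes) -> str:
    """Hypothetical converter: crashes on a malformed header."""
    if header[:4] != b"\x89PNG":
        raise ValueError("bad PNG header")
    return "converted %d bytes" % len(header)


def worker_loop(inbox: deque, results: list) -> None:
    """Drain the inbox; any exception escapes and ends this worker run."""
    while inbox:
        job = inbox.popleft()  # the job is consumed before it can crash us
        results.append(convert_png(job))


def supervise(jobs, max_restarts: int = 3):
    """Restart the worker after each crash, up to max_restarts times."""
    inbox, results, restarts = deque(jobs), [], 0
    while True:
        try:
            worker_loop(inbox, results)
            return results, restarts  # worker finished normally
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise  # too many crashes: give up and let the failure propagate
```

The restart limit plays the role of a supervisor's restart intensity: a worker that keeps crashing eventually escalates the failure instead of looping forever.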

Of course, Erlang's solution to concurrency is no silver bullet - you can still create deadlocks if you aren't careful and you can still create global state by abusing named processes, but it makes a lot of things a lot, lot easier.

edit: one more thing - Erlang's hot code swap support is really one of the most beautiful things I've seen when it comes to language features. Now, swapping code by itself isn't such a big problem - you can do this in C as well by just reloading a DLL or in Java by un/reloading a specific class - but the tricky part is about swapping state, and this is where most imperative languages fail hard.

Said otherwise, when you update your service, you not only want to update your code but also your data structures. For example, if you have a class/struct like this:

    struct person {
        string name;
        int age;
    };

and would like to add an additional field `ageString` to it, for whatever purpose, how would you go about doing this in a more conventional language? You would have to track all usages of `person` somehow, and then replace the in-memory representation with the new version while making sure that no thread is reading or writing data, then swap the code while being careful that no new code is reading old data or vice versa. In fact, in most systems I've worked on such a thing would be almost impossible.

Erlang's processes and immutable data structures allow you to do just that - and as painlessly as possible. You provide a function that converts old state to new state (in this case containing the field `ageString`) and the new code which operates upon the updated data structures, and let Erlang do the rest. All messages that come in while the swap is taking place will be queued up and delivered later.

Moreover, if the update introduces a bug, you can as easily downgrade as you did upgrade. And all this while your service is up and running!
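The mechanics can be imitated in a few lines of Python. This is only a toy model, not how Erlang or OTP implements it: `Server` is a hypothetical single-threaded stand-in for a process with a mailbox, and `upgrade` / `downgrade` are the state conversion functions for the `person` example, with state kept as plain dicts that are never modified in place.

```python
class Server:
    """Toy stand-in for a process: a handler, some state, and a mailbox."""

    def __init__(self, handler, state):
        self.handler = handler  # function(state, message) -> new state
        self.state = state
        self.mailbox = []       # messages wait here until processed

    def send(self, message):
        self.mailbox.append(message)

    def run(self):
        """Process every queued message with the current handler."""
        while self.mailbox:
            self.state = self.handler(self.state, self.mailbox.pop(0))

    def swap(self, new_handler, convert):
        """Replace code and state together; queued messages stay queued."""
        self.state = convert(self.state)
        self.handler = new_handler


def handler_v1(state, years):
    return {**state, "age": state["age"] + years}


def handler_v2(state, years):
    age = state["age"] + years
    return {**state, "age": age, "ageString": str(age)}


def upgrade(state):
    """Old state -> new state: add the ageString field."""
    return {**state, "ageString": str(state["age"])}


def downgrade(state):
    """New state -> old state: drop the ageString field again."""
    return {key: value for key, value in state.items() if key != "ageString"}
```

A message sent just before `swap(handler_v2, upgrade)` sits in the mailbox and is handled by the new code against the converted state; `swap(handler_v1, downgrade)` takes the same path backwards.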

Exceptions can be quite problematic. This is why JActor has taken such pains in its exception handler. One particularly nice feature is that, since JActor supports 2-way messages, the default exception handler passes exceptions to the active exception handler of the requesting actor. https://github.com/laforge49/JActor

Good stuff. Reminds me of Common Lisp's http://clhs.lisp.se/Body/f_upda_1.htm#update-instance-for-re...

Scala and Akka solve many of these problems on the JVM, with a strongly typed language to boot.

I love Erlang, but it's not all roses either.

Take a look at JActor, a high-throughput Java actor framework. It gains its speed by operating mostly synchronously and delivers between 80 and 200 million messages per second, depending on the mode of delivery. https://github.com/laforge49/JActor

Hotswapping: JRebel.

"What Erlang does, you simply can't do with ... anything involving ... the JVM."

As a matter of fact, Clojure runs atop the JVM and has a concurrency model that parallelizes even better than Erlang's, although Erlang's model of concurrency is presumably more fault tolerant.


"While we make heavy use of the core of Clojure, we don't use its concurrency primitives (atoms, refs, STM, etc.) because a function like pmap doesn't have enough fine grained control for our needs."

Not sure what you mean by "parallelizes better". However Erjang and Kilim on the JVM have managed to outperform the Erlang VM (thanks to a shared heap they provide zero-copy messaging unlike the Erlang VM) and with InvokeDynamic Erjang should be getting a nice performance boost as soon as they integrate it.

How so?

By conceptually separating synchronization from mutual exclusion and abstracting concurrency into separate models (atoms, agents, refs) that correspond to separate use cases (synchronous and independent mutable access across threads, asynchronous and independent access, and synchronous and coordinated access). It's higher level than using message passing for all types of concurrency and, therefore, more expressive.

When the goal of concurrency is parallelization, I'd want the most abstract and powerful tool for the job, which for me means Clojure over Erlang. When the goal of concurrency is reliability, I imagine I'd want separate lightweight processes for fault tolerance, which is where Erlang seems to excel.
