Hacker News
Erlang/OTP by Example (erlangbyexample.org)
218 points by jxub 4 months ago | 105 comments



If you get to the part about monitors, consider erlang:demonitor(Ref, [flush]) in your own code. This also removes the monitor message from your mailbox if it arrived in between, which is a real problem in an async setting.

Though in some situations, it is better to just ignore the spurious message when it arrives, by tracking which monitors you have enabled in the process state. Unknown monitors are then gracefully ignored. The same pattern is useful with timeouts as well: cancel the timeout, but if it fires into your mailbox while the cancel is happening, you can detect the stale timeout and ignore it.


demonitor with flush can be harmful in some instances, as it invokes a selective receive to yank the DOWN message from your process's mailbox. In the event that you are demonitor+flushing on a process that has a growing message queue, every flush gets more and more costly - especially if the monitor has already delivered a DOWN message for it to yank.

it's almost always better to ignore down messages that you don't care about.
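A minimal sketch of that pattern, assuming a gen_server whose state record carries a `monitors` map keyed by monitor reference (`handle_down/2` is a hypothetical helper, not a real OTP callback):

```erlang
%% Track live monitors in the process state; a 'DOWN' message for a
%% reference we no longer know about is stale and can simply be
%% dropped, with no selective receive needed.
handle_info({'DOWN', Ref, process, _Pid, Reason},
            State = #state{monitors = Mons}) ->
    case maps:take(Ref, Mons) of
        {Tag, Rest} ->
            handle_down(Tag, Reason),                 % still relevant
            {noreply, State#state{monitors = Rest}};
        error ->
            {noreply, State}                          % stale: ignore
    end.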


In addition, two other highly informative resources:

* http://spawnedshelter.com/

* http://beam-wisdoms.clau.se/en/latest/


This is also a fantastic Erlang learning resource: https://learnyousomeerlang.com/


I'm kind of sad that Elixir is getting all the love and the cool resources since I much prefer Erlang's syntax. Thanks for this page!


I love Erlang too, but I don't think Elixir's popularity is stealing love away from Erlang. If anything, I think Elixir complements Erlang and provides a path for people to enter the world of Erlang, BEAM, and OTP that they didn't have before. Elixir helped to lower the bar of entry. My understanding is that the Elixir team has also helped drive some improvements in Erlang, which is nice.

Personally, I love both Erlang and Elixir and hope to ride out the rest of my career on these platforms.


I agree wholeheartedly. Elixir gave me an opportunity to start to play around with Erlang and OTP. While I prefer using Elixir, its ability to mix and match with Erlang has made it a great gateway.


As someone who has worked two jobs now writing, deploying, and operating Erlang clusters, I recommend switching to Rust. Erlang requires a lot of TLC to get right, it's super slow, and it's hard to burst. Like, super hard to burst. Erlang nodes are meant to cluster as a complete graph and never go down. Modern ops, especially container ops, does availability through ephemerality of services. The BEAM just doesn't like to be treated like cattle. Also, Erlang has notoriously bad error messages, very little abstraction, and converting between binary strings and lists is a pain. Gaining Erlang operational knowledge also takes a while. We eventually had to rewrite things like gen_server, ditch mnesia, etc. as we scaled.

So why Rust? Like Erlang, it's damn good at concurrency and enables functional programming. It also enables event driven programming through tokio, which is a better fit for web servers than green threads (you're mostly waiting on the network). Unlike Erlang, it's super fast (even at math), has a great type system, amazing error messages, low memory usage, and the community is already quite a bit bigger.


I'm going to disagree with you on Rust. It's a very different, very verbose language. It doesn't have any of the stories around immutability that Erlang does.

You say you had to rewrite Mnesia, but Rust doesn't even have transactional memory to start with.

Suggesting that event-driven programming is anything like language-level threading like Erlang and Go is crazypants. A common error in event-driven languages is that you end up writing code that gets slow, and blocks the entire event loop, and everything falls apart, and you get paged at 2 AM, until you add another event loop.

One of the best parts of BEAM is that since processes are isolated and preemptively scheduled, you don't have to manage your own callbacks by hand, and although things may get slow, they'll typically only get slow for that one given process.

In addition to this, the GC in Erlang is great, compared to the lack of GC in Rust. I think most of us can agree that unburdening yourself of having to write memory management code is a good thing.

Of course, BEAM isn't perfect; after all, it hasn't had anywhere near as much investment as the JVM and CLR, but I believe its semantics are right for writing predictable, low-latency code.

Also, containers have nothing to do with ephemerality. It's cluster management systems which dynamically schedule containers that make them ephemeral.

Erlang isn't really a dataplane runtime. Oftentimes, you implement your control plane in Erlang, and farm out your dataplane to NIFs, ports, or something else entirely.

You're right, disterl is a fucking mess. But it's better than nothing, and better than having to write your own IPC.

I suggest you read Joe Armstrong's thesis, or a History of Erlang for more.


>You say you had to rewrite Mnesia, but Rust doesn't even have transactional memory to start with.

He said ditch Mnesia, not rewrite.


> I'm going to disagree with you on Rust. It's a very different, very verbose language.

Erlang is incredibly verbose. It has very few abstractions. Rust has a lot of abstractions. I've rewritten a few Erlang projects in Rust now, and I've been able to come out with close to or under the same LOC (I always use specs, though).

> It doesn't have any of the stories around immutability that Erlang does.

let expressions are immutable by default. You specifically have to ask for mutability and even then, the Rust borrow checker will always enforce a single writer. Rust definitely has a story around immutability.
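For what it's worth, a minimal illustration of those defaults:

```rust
// Bindings are immutable by default; mutation must be opted into with
// `mut`, and the borrow checker allows at most one active writer.
fn demo() -> i32 {
    let x = 1;
    // x = 2; // compile error: cannot assign twice to immutable variable

    let mut y = 1;
    y += 1; // fine: `y` was declared `mut`

    let w = &mut y; // exclusive (single-writer) borrow of `y`
    *w += 1;
    // While `w` is live, any other borrow of `y` is a compile error.

    x + y
}

fn main() {
    assert_eq!(demo(), 4);
    println!("demo() = {}", demo());
}
```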

> You say you had to rewrite Mnesia, but Rust doesn't even have transactional memory to start with.

Which is good, because Mnesia is crap and I've had to deal with chucking it out the window several times now. Rust has good library support for STM that is completely optional.

> Suggesting that event-driven programming is anything like language-level threading like Erlang and Go is crazypants.

They're both models of concurrency, and you can achieve parallelism through either. Oftentimes one is better than the other for a given task (usually determined by the bottleneck), such as serving web requests bottlenecked by IO.

> A common error in event-driven languages is that you end up writing code that gets slow, and blocks the entire event loop, and everything falls apart, and you get paged at 2 AM, until you add another event loop.

Most high performance web servers use event loops. See the paper "An Architecture for Highly Concurrent, Well-Conditioned Internet Services" for an overview. There are lots of issues with green thread models. See some of the work done by Bryan Cantrill for examples, and why it may be a bad idea to bake them into a language.

> One of the best parts of BEAM is that since processes are isolated and preemptively scheduled, you don't have to manage your own call-backs by hand, and although things may get slow, they'll typically only get slow for that one given process.

I think you're caught on the idea that Rust is Javascript or Python. In Rust you can parallelize an iter chain by changing one method call. You can also use multiple event loops to dispatch to handlers. There is no one-size-fits-all solution to concurrency.
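The "one method call" refers to the rayon crate, where `data.iter()` becomes `data.par_iter()` in an otherwise unchanged chain. A dependency-free sketch of the same idea, splitting the work by hand with std scoped threads:

```rust
use std::thread;

// Sequential version: data.iter().map(|x| x * x).sum().
// With rayon (not used here) the parallel version is the same chain
// with `par_iter()` swapped in; below we split the slice manually.
fn parallel_sum_of_squares(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        let l = s.spawn(|| left.iter().map(|x| x * x).sum::<u64>());
        let r = s.spawn(|| right.iter().map(|x| x * x).sum::<u64>());
        l.join().unwrap() + r.join().unwrap()
    })
}

fn main() {
    let data: Vec<u64> = (1..=10).collect();
    assert_eq!(parallel_sum_of_squares(&data), 385); // 1² + … + 10²
    println!("{}", parallel_sum_of_squares(&data));
}
```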

> In addition to this, the GC in Erlang is great, compared the lack of GC in Rust. I think most of us can agree that unburdening yourself of having to memory management code is a good thing.

Disagree strongly. See Steve Klabnik's latest posts on static garbage collection in Rust for an enlightening take. Rust does have a GC, and it has no runtime performance hit.
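The "static" framing is that ownership fixes deallocation points at compile time: destructors run at scope exits the compiler has already determined, not when a collector decides. A small illustration (the `Guard` type is invented for the example):

```rust
use std::cell::RefCell;

thread_local! {
    // Records the order in which Guards are destroyed.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

struct Guard(&'static str);

impl Drop for Guard {
    fn drop(&mut self) {
        LOG.with(|l| l.borrow_mut().push(self.0));
    }
}

// Destruction points are known at compile time: both guards are freed
// at the closing brace of the inner block, in reverse declaration
// order, with no runtime collector involved.
fn drop_order() -> Vec<&'static str> {
    LOG.with(|l| l.borrow_mut().clear());
    {
        let _a = Guard("a");
        let _b = Guard("b");
    } // `_b` then `_a` dropped here
    LOG.with(|l| l.borrow().clone())
}

fn main() {
    let order = drop_order();
    assert_eq!(order, vec!["b", "a"]);
    println!("{:?}", order);
}
```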

> Of course, BEAM isn't perfect, after all, it's had no more as much investment as the JVM, and CLR, but I believe its semantics are right for writing predictable, low-latency code.

Erlang isn't low latency. It has predictable latency.

> Also, containers have nothing to do with ephemerality. Cluster management systems which dynamically schedule containers may result in scheduling.

This is pedantic. Any non-trivial container deployment will have to deal with ephemerality. If you're replacing an Erlang cluster, it will be even more of an issue because you'll need some level of fault tolerance from the orchestrator.

> Erlang isn't really a dataplane runtime. Often times, you implement your control plane in Erlang, and farm out your dataplane to something NIFs, ports, or something else entirely.

NIFs are extremely dangerous. We've had critical bugs that have taken down entire clusters thanks to NIFs. The architecture you're describing is also exceedingly rare. Most Erlang deployments are handling soft real-time workloads like routing chat messages, queueing, and serving web requests, with no language separation between control and data.

> You're right, disterl is a fucking mess. But, it's better than nothing, and having to write your own IPC.

In many cases it is better than nothing. It will take a huge amount of wasted effort to fix some of the scaling issues I'm currently having with our Erlang cluster.

> I suggest you read Joe Armstrong's thesis, or a History of Erlang for more.

I've read Joe's thesis. I'm assuming your point is that I somehow don't know anything about Erlang, despite having worked on it for years professionally, attended multiple Erlang Factories, and given talks on the subject.


Nah, Joe's thesis and the History of Erlang are about the reasons why Erlang is good at a very specific subset of things.

It's awesome to meet a Rustacean who's familiar with Erlang.

So, I've deployed Erlang pretty happily, and I've tried to pickup Rust for a hobby project -- this was around 12 months ago. I tried again about 6 months ago.

I found myself trying to write immutable code, but ended up spending a bunch of time writing out blah.copy, passing it into a lambda, or another very short function to hand it over.

This only got worse when I tried to use libraries that expected Cell, and dealing with Rc.

I also found myself trying to use channels in Rust as a way to do message passing between threads, which started to feel really awkward. My fear with locks, or on-the-fly unwrapping is that I'll end up in a shitty, impossible to debug situation. Is there a more structured approach to concurrency, and immutability together which prioritizes safety over speed?
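For what it's worth, one structured shape for the channel approach you describe is actor-style: a single thread owns the state, and everything else talks to it only by message passing, so there are no locks to misuse. A minimal sketch:

```rust
use std::sync::mpsc;
use std::thread;

// Messages the "actor" understands. Replies travel back on a channel
// carried inside the request, much like a return address.
enum Msg {
    Add(u64),
    Get(mpsc::Sender<u64>),
}

fn spawn_counter() -> mpsc::Sender<Msg> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut total = 0; // state owned by this thread alone
        for msg in rx {
            match msg {
                Msg::Add(n) => total += n,
                Msg::Get(reply) => {
                    let _ = reply.send(total);
                }
            }
        }
    });
    tx
}

fn main() {
    let counter = spawn_counter();
    counter.send(Msg::Add(2)).unwrap();
    counter.send(Msg::Add(3)).unwrap();

    // Per-sender FIFO ordering guarantees the Get arrives after the Adds.
    let (reply_tx, reply_rx) = mpsc::channel();
    counter.send(Msg::Get(reply_tx)).unwrap();
    assert_eq!(reply_rx.recv().unwrap(), 5);
}
```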

In my attempt to find off-the-shelf libraries for concurrency and I/O, I found that I was doing callback-hell, or callback chaining out the wazoo. Are there any good examples of being able to write sequential code (a la OS-level threading, Go, or Erlang) while handling I/O and compute concurrently?

As far as async / event-based programming goes -- I agree, that the underlying implementation probably wants to use a event loop in some manner, but I think having to reason about the preemption during compute is kinda annoying.

I'm going to challenge you on containers though. I've been doing containers for a while now -- as long as I've been doing Erlang in fact. Containers and schedulers are two different things. If you want to look at statically-scheduled containers, plenty of people run Docker, and LXC (or LXD) without a reactive scheduler.

Generic schedulers don't provide the registration logic, nor the fault-recovery logic that Erlang does of treating every request independently. There are cluster managers, like Akka's, or Apache Helix, that do this, but the closest thing I know of to a generic scheduler that does this is Kubernetes. That comes with a whole lotta other baggage.


If you don't mind, can you give some more details on what you don't like about mnesia? Everybody says it's awful, and it surely has warts, but I've seen it scale ok to 512GB+ datasets in disc_copies tables, and it works alright with a couple caveats.

a) Not sure if it changed, but mnesia startup was very brain-damaged -- fixing up the local data from disc, then throwing it all away to load from peers, is a lot of wasted time. It's much faster to remove all the local tables on disk before starting the node so it short-circuits to copying from the peers. Even when it's faster, sending over half a terabyte of data takes a while. Some sort of persistent transaction log for peers would be nice.

b) network partitions aren't fun at all

c) we direct mnesia read and write for a key into a specific process to enforce serialization, and then we use dirty read/writes; so we skip all the locking.

d) we've certainly patched a lot of things in transaction sending and receiving over long distances, especially needed if your network isn't clean.

Anyway, thanks for your thoughts all over this thread.


- One big issue is that when there is a replication stream going between peers it will bottleneck how many fragments can be updated.

- Large tables make restarts really slow (have to read everything from ETS).

- Table dumps cause spikes in load.

- Uses a lot of RAM.

For net partitions, I just alert on it and fix it manually. Mnesia isn't bad as a cache, but it is bad as a database.


disterl is definitely worse than writing your own ipc. just use a language that has good sctp or h2 support and something like gRPC for the application layer. for peer discovery you have consul, etcd, zookeeper...


That's essentially what we're doing now.


Your complaints about Erlang give me the impression that you may have picked the wrong language for your problem domain, and paid for that mistake by having to do things like rewriting gen_server.

Erlang was designed to be reliable and scalable. It was not designed to be fast. If fast is a hard prerequisite for you, you're right, you should go with another language. But Erlang also does a ton of things correctly (again, in the domains it was designed and optimized for), and is battle-tested far beyond Rust, since it has been around for much, much longer. That's not to say Rust is a bad language, but one should probably not judge its merits by its current popularity. That holds true for any language or tool.

Anyways, if you want abstraction (the lack of which was one of your complaints about Erlang), you should take a look at Elixir. Specifically, Phoenix is excellent for web programming, especially if you need real-time messaging at scale.


I'm aware of all of these claims. Any Erlang developer has heard this shtick a thousand times. The reality is that you do eventually have to twist the BEAM once you reach a certain level of scale, and there are plenty of bugs. Like I said, it's also difficult to burst, so it's not always as black and white as you say. It's also missing a lot of the introspection you need to operationalize at scale.

We've had to patch the BEAM many times to keep moving forward, and work with super dangerous NIFs to talk to other systems.

The issues I've had with Erlang had nothing to do with the products I've worked with. Erlang was a 'good fit' for them. Elixir and Phoenix doesn't really solve any of these issues, and neither of them are particularly 'battle tested'.


This is exactly what whatsapp experienced, but with more recent OTP versions I would have assumed that those went away and the main problem once you reach 'scale' is that the distributed erlang network topology gets saturated. I'm curious what difficulties you ran into?

edit: nevermind I saw your other post, but confirms what I expected.


Correct


As an Erlang fan, who hasn't used it in production services, I'm wondering if you can let us know some specific scaling issues you encountered.


Out of the box, it's only really designed to scale to a certain point. It's all generally nice, predictable, and low-latency up until that point. But because the nodes are fully meshed, the TCP heartbeats alone kill performance once you go past it. So for example, 100 nodes gives you 4950 TCP connections (99+98+97+...+1, i.e. n(n-1)/2). As parent says, there are methods to deal with this (Riak Core would be an example). But they're non-trivial.
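For concreteness, the full-mesh link count is just the handshake formula:

```rust
// Each of the n nodes holds one connection to every other node, and
// each connection is shared by two nodes, so a full mesh needs
// n * (n - 1) / 2 links.
fn mesh_links(n: u64) -> u64 {
    n * n.saturating_sub(1) / 2
}

fn main() {
    assert_eq!(mesh_links(2), 1);
    assert_eq!(mesh_links(100), 4950);
    println!("100 nodes -> {} links", mesh_links(100));
}
```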

+ you possibly need to bear in mind the design goals: that Erlang is designed to run as a highly reliable, self contained system in _a single geographic location_, with that system possibly left to run on its own for long periods of time (years)


So, at one level, 5000 tcp connections is a lot, but at another level, some teams (including mine) are running hundreds of thousands of tcp connections to our clients from our front end Erlang nodes.

I've never thought about the dist heartbeats as a scaling problem. If you have thousands of dist nodes, and your nodes have small memory, the dist buffers for each connection add up -- I think the default is 8MB; you can tune it, but it's a scaling concern, especially if you have nodes far apart from each other.

Really, the root design of Erlang was for two nodes colocated in a single chassis. That said, it turns out the design scales pretty well to much larger numbers of nodes, and nodes farther apart, but you have to be careful with some things. pg2:join and leave operate under a global lock, which will be slow if you have contention on the lock, or if one of your nodes has some problem where it's still up but very slow. Mnesia doesn't do well with queuing without a lot of help, and schema operations under queuing are definitely a bad idea as well.

If you want to run Erlang at larger scales, you will need to be ready to poke around in OTP, and occasionally in BEAM as well. If you're running big systems, IMHO it makes the most sense for your Erlang nodes to fill your physical nodes, so I don't see much need for containers, but if you do use containers, you need to figure out how to get their names consistent for Erlang, or it's going to be confused. (OTP has a concept of a 'diskless' node which would seem to be a good fit for an ephemeral systems environment, but I must admit I haven't played with that)


> If you want to run Erlang at larger scales, you will need to be ready to poke around in OTP, and occasionally in BEAM as well.

That's essentially what I've had to do in my career as an Erlang engineer. Erlang requires way more massaging and work than the stories people tell about it would lead you to believe.


I don't think it's that much work. It's just that when you hit a wall, you have to fix it yourself. But many of the fixes are easy -- OTP usually does things in a very simple way, and sometimes something more complex is needed to scale beyond.

I think this is the case regardless of what languages or systems you use, but more well used systems may have more experts and more documentation to lean on.

For things that are a good fit for Erlang, it seems worth it to train up a couple people with deep internal knowledge of the VM you're using. As you said in another part of the thread, Erlang doesn't have a lot of abstraction -- most scaling problems aren't too many layers deep.


Yep, I kinda meant it as an illustration - the heartbeats are just the base operation that has to occur between meshed nodes, not that that itself is generally going to be the issue (the inter-node communication is likely to have a bit more going on than just that!).

Containers are where I've had issues, not necessarily anything drastic, but I've found myself dropping half of the things I really want from an Erlang system (mainly making as much as possible non-stateful rather than stateful, not using supervision trees to their full potential) to buttress against the ephemeral nature of containers (I haven't really looked at diskless nodes in much detail either, though)


I can't really go into it in detail, but at a high level Erlang default methods of distribution and security don't scale very well. There are people working on better mechanisms for this, and I know of several companies that have custom solutions for clustering nodes. One big issue is you cannot easily burst Erlang nodes to handle peak traffic. The number of nodes is usually relatively static in a deployment.


Bursting is an interesting thought. I think, if you planned it out, it could be done -- subject to some constraints.

It would be hard to burst stateful (mnesia) nodes --- schema operations require a lock across all the nodes in the schema, and that lock requires that the nodes not be in the middle of the 'log dumping' process (where the global transaction log gets divided into per table logs and such), which means long delays in high volume situations, and even longer delays if doing multiple schema operations. This could probably be patched around, but... In my team's experience, our mnesia nodes were generally ok under higher than normal load, expansion was driven by data size. Expansion could be a lot nicer, but I haven't heard of many database systems that handle expansion off the shelf.

So that leaves stateless nodes. I don't see why you couldn't burst those, especially if using standard dist. Bring up the host, push your software, connect to one dist node, and get meshed automatically, once you see all the pg2 groups you need to operate, enable traffic.

That said, we never did too much of that, we're in bare metal hosting so we don't have an incentive to run different server counts at different times of day, and provisioning isn't fast enough to handle incidental spikes -- we have a pretty good model of what spikes to expect, and provision to handle that load being mindful of the possibility of a load spike during a network or datacenter availability incident.


Our nodes are stateful. We also wrote our own dist (I didn't write it, it could be a lot better). We also have good metrics on our spikes, we just have black friday where we need to burst like crazy.


Oh yeah, black Friday is crazy. For us, we've always been seeing that our annual big spike load ends up being our sustained daily peak in a few months, so we will need those nodes anyway.


Rust has too big of a learning curve, and it's quite complex for writing service oriented systems. Granted, having a static compiler eliminates many types of bugs found in other systems.

In my opinion, a more sensible approach is Elixir + Rust (via rustler). You get the elegance and productivity of Elixir and the concurrency and fault tolerance of Erlang/OTP, while still being able to write super fast, low-level code in Rust.

Plus, you shouldn't use a single language across your whole stack. When the only tool you have available is a hammer, everything looks like a nail.


I don't buy this argument. Erlang does have a steep learning curve, but it isn't the semantics of the language, it's understanding how to build and deploy OTP applications. Understanding apps, releases, clustering, mnesia, process registries, circuit breaking etc. all take time. It also will cost you hours, and hours, and hours down the road when you have a large sprawling app with very little abstraction and no -specs, because dialyzer is frustrating.

I was able to learn Rust in about a week by reading the O'Reilly book. Compared to C++ it's almost tiny (I have Bjarne's C++ tome and the Meyers books right next to my Rust books, actually), and unlike C++, the compiler will essentially teach you the language if you didn't learn enough from the book to write code that compiles.

Once you've put in the initial investment, Rust code is way easier to maintain, read, and scale. It's also easier to onboard people. Onboarding people for Erlang is hard, and it's hard to hire for. Rust on the other hand is familiar to all of our FP people who like Haskell and Erlang, and our Java and C++ devs.

I've also found it extremely suitable for writing services. We have many services in Rust running in production right now. We have a service toolkit for our cross cutting concerns. We can run our IDL through a compiler and get a client and server that are on the order of 20x faster than their Erlang counterparts.


For the record, I would prefer not to get into a programming language flame war. Having said that, I will add the following: It's much harder to reason in Rust, as there are many concepts one needs to keep in their head (borrowing, lifetime, pointers, etc). Plus Rust is much more difficult to read. Whereas Elixir, and I emphasize Elixir over Erlang here, is much easier to reason in, concise, and simpler to read.

At the end of the day, use whatever language you prefer, just keep in mind that software needs to be (1) shipped, (2) maintained (by multiple people who read each other's code) and (3) evolved. I would also add (0) experimentation; before one even ships any code, one ought to easily experiment with various ideas.


I contend Erlang can be very hard to reason about. I've seen some extremely gnarly Erlang where the author didn't write a -spec and it was almost impossible to tell what shape a tuple parameter would take. In Rust the compiler enforces all of this for you. Rust gives you:

- Maintainability: Rust signatures not only tell you the types of arguments, but also their lifetimes. A signature in Rust is an extremely strong abstraction. Traits can further constrain types making it essentially a game of plugging the right blocks in to the right holes. Erlang is like having the blocks but all the holes are under a tablecloth and you just have to guess how to fit them in.

- Correctness: Without a static type system, Erlang does very little for correctness. Rust has pattern matching as well, and can enforce exhaustiveness. Not even Haskell does that. Lifetimes guarantee safety even for shared pieces of data.
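To illustrate the exhaustiveness point (the `Event` enum here is invented for the example):

```rust
// `match` must cover every variant. Add a new variant to `Event` and
// every non-exhaustive `match` becomes a compile error, so call sites
// cannot silently fall out of date.
enum Event {
    Connect,
    Data(Vec<u8>),
    Disconnect,
}

fn describe(e: &Event) -> String {
    match e {
        Event::Connect => "connect".to_string(),
        Event::Data(bytes) => format!("{} bytes", bytes.len()),
        Event::Disconnect => "disconnect".to_string(),
        // deleting any arm above: error[E0004]: non-exhaustive patterns
    }
}

fn main() {
    assert_eq!(describe(&Event::Data(vec![1, 2, 3])), "3 bytes");
    println!("{}", describe(&Event::Connect));
}
```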

- Speed: Obviously, Rust is a compiled, manually memory managed language with near C++ performance.

On every metric, Rust offers a lot more than Erlang. As someone who has spent years with Erlang, Rust is simply better for professional development.


Rust however does not offer nearly the same level of power when it comes to runtime introspection that BEAM does.

Coming from a company that uses elixir heavily and has made a significant investment in rust, I don’t think we would ever use solely rust on our distributed systems. However, we have rewritten some code that elixir was too slow at in rust and exposed it as a NIF on BEAM - and that has worked well. (Blog post on that soon hopefully).

I do admit, we are also going to be ditching mnesia for one of our clusters for our own in-house simpler system (ETS replication with different consistency/netsplit guarantees for our use-case), we've had to write our own cross-node process monitoring solution (at peak we see 200M+ cross-node monitors on our cluster), and we've also had to overcome the limits of message-fanout on distribution as well (https://github.com/discordapp/manifold).

However, for operating at our scale (peak 9m ccu @ 5m events/sec fanout to clients), we run a surprisingly small number of servers for our real time system (~120).

EDIT: I can't reply to your post below, but I think the runtime introspection we run into is not dealing with OS level metrics, but application level introspection. Introspecting the state of processes, writing code in the repl to debug issues within the cluster, benchmarking to find hot functions or where specific processes are spending a lot of their time. Capturing traffic to replay on a test cluster to simulate production load, all becomes very trivial with BEAM.


Yes, we can do that through flamegraphs. For any container running on our system, we can get the call stack of the running process, time taken, resources used etc. We can attach to containers, proxy traffic etc, all through Kubernetes.

I said this earlier, but we've essentially separated operation concerns from our applications, and that opens us up to relying more on knowledge of Linux which is easier to hire for, and we can reuse that knowledge and all our tooling with any other languages we want to use.


We run between 200k and 2m processes per beam VM. I don't know how it'd be possible to get as precise metrics as we need just from relying on linux utilities. And in the same cluster, some processes although identical in code have dramatically different workloads.


We couldn't get enough introspection. Now we use Rust and we can use a much larger toolkit. We have BCC, kprobes, flamegraphs... When you use Rust the OS is your machine. With Elixir you need to understand the BEAM, a slightly esoteric platform with a rather small community working on it.


> and enforces exhaustion. Not even Haskell does that.

Maybe it doesn't throw an error by default, but GHC does warn about it when you use the -Wall flag. I believe you can get the behaviour you want by turning on only that specific warning (-Wincomplete-patterns) and using -Werror.


Seems like you don't get it. In the comment above there is a suggestion that Erlang, being well-researched and based on the right principles of FP, is good enough to have built almost half of the 4G platforms despite having crappy syntax, a crappy VM, crappy everything but OTP itself. This is the point and the big deal.

Rust is a mess of amateurish, overcomplicated, poorly understood, hype-driven crap. Tokio is utter bullshit (look at how Erlang or Go solve the same problems with an order of magnitude fewer lines of code), etc.

Rust, it seems, is repeating the story of Ruby, where a crowd of overly excited (for no reason) amateurs quickly (without understanding) piles up "solutions" to really hard problems which have been researched by the best minds for the last 5 decades or so.

For example, all the concurrency bullshit could be boiled down to the well-understood concept of a software interrupt, which is hardware-assisted to be a lightweight isolated process. No sharing, no threading, no cooperative multitasking, no bullshit.

On the other hand, there are Streams and Futures, which have also been well understood and researched.

Finally, the Actor Model defines how to build distributed systems the right way - the way Mother Nature does (isolated entities communicating by message-passing) - which is at the core of Erlang and things like Akka.

Erlang and Go are the best examples of how small, uniform and simple systems could be when based on the right principles and proper abstractions. Rust is the opposite of this.


> Rust is a mess of amateurish overcomplicated poorly understood hype-driven crap.

Please be more specific about the amateurism.

> Tokio is utter bullshit (look at how Erlang or Go solve the same problems with an order of magnitude fewer lines of code),

No, they work in fundamentally different spaces.


Piling up features instead of reduction and unification is a clear sign of amateurism.

For example, ML's (and Erlang's) unification of bindings via pattern-matching everywhere is a major achievement and a canonical example. Haskell's unified approach to typing is another. Scheme's everything-is-an-expression, and even Lisp's unified representation of code and data, were the great discoveries of old times.

PL design is hard; design of good runtimes (OTP, Go) is even harder. Ignoring almost everything which was good and true in the PL field is definitely amateurism.

I don't even want to start on what kind of nonsense Tokio is. Universal event-driven frameworks are the same madness as J2EE. On the other hand, ports, typed channels, futures, or pattern-matching on receive - support for fundamental concepts in the language itself - is the right way.


Rust is a simpler language than Haskell. You can't write an OS in Haskell. Rust is a replacement for C and C++, and it's well suited for that. Rust embraces zero-cost abstractions, which none of the languages you mention do.

You need to wait a bit. Tokio is a low-level abstraction. People will build on top of this. Let's be real, it's the most promising language of late.

How long did it take for Erlang to mature? Rust has come insanely far in the little time it's had. Give it five more years; you'll be surprised.


You are comparing the learning curve of a complete ecosystem (Erlang with clustering) to a bare-bones programming language (Rust). Isn't this totally unfair? ;) Learning another framework for Rust (if it exists) to solve the same problems as Erlang (transparent clustering of everything) will add many hours too.


That is the language, though. Erlang is a DSL for writing massively concurrent, distributed programs. Rust is a general-purpose language that enables systems programming. Learning how to do either one of those activities with their respective language is all factored into my assessment.


Erlang's syntax can be learned in a few hours. The language is very small. How is that a "complex learning curve?"


Erlang the language is a small part of Erlang the platform. Erlang the language is simple; Erlang + OTP + BEAM VM, which is what Erlang actually is, less so.


I have been following this thread with interest. Wouldn't Go (over Erlang/Rust) be the ideal language here for scalable microservices? It comes with built-in concurrency, all the aforementioned container orchestrators are written in Go, and it has first-class support for gRPC. Very stable, very easy to learn, a terrific concurrency model, great libraries for network programming, productive right off the bat, and it compiles to container-friendly binaries.


Would you say that learning Erlang/OTP has given you transferable knowledge about building concurrent and distributed system? Did you transfer any of that knowledge to your Rust designs?

Do you have an opinion about Scala/Akka?


I don't have experience with Scala/Akka. Erlang is a good language for learning clustering/concurrency, SMP, queuing theory, etc. Absolutely. If you learn about concurrency through Erlang, you'll have the right mindset. But you should also read Simon Marlow's book about concurrency in Haskell. You should learn pthreads and futexes so you can understand the building blocks of mailboxes and channels and higher-level patterns.

I am a big believer that you should continually invest in learning and mastering new languages. Each language gives you a different perspective on how to solve problems.

But as far as the right tool to build professional software, there are caveats. Safety, correctness, speed, and maintainability are all important factors in choosing the right tools.

Edit: I should also say I've had plenty of interviewees who couldn't explain the difference between concurrency and parallelism, and whose knowledge of these concepts was limited to spawning a pthread and locking shared data. Needless to say, these people tend to do really poorly at explaining how to scale a distributed system.

Contrast that with the Erlang developers we interviewed, who think in terms of scale. Their answer is almost always something that could accept 10 requests/s or 10 million. Same thing with our Haskell interviewees. They'll answer our algorithms questions by writing out the types and deriving an answer with a single expression. I had an interviewer who was extremely confused by this. We hired the guy that confused him; not only does he understand concurrency, he also understands laziness, and we love lazy developers :)


> I've had plenty of interviewees who couldn't explain the difference between concurrency vs parallelism

Is there a good answer when it seems people don't even agree on what those terms mean and apply to in CS?


In computer science, concurrency refers to the ability of different parts or units of a program, algorithm, or problem to be executed out-of-order or in partial order, without affecting the final outcome.

Parallel computing is a type of computation in which many calculations or the execution of processes are carried out simultaneously.


So much this. I don't see myself deploying anything big in Erlang anytime soon but learning it (and building one or two small projects) gave me intuition about message passing concurrency. This knowledge transferred very nicely to Android programming with Kotlin coroutines. My colleague was astounded learning how easy it is to model concurrency with actors and channels and how much of it just works.


Very happy with Akka. It is fast (thanks to the JVM), has lots of built-in building blocks / modules, and the documentation covers everything; the community is also very helpful.

https://akka.io/docs/


The JVM can't be really fast, since it does lots of context switching when doing I/O, with locking and busy-waiting, JNI string conversions, useless buffer copying, etc.



What's wrong with Erlang, the topic at hand?


Nothing's wrong with Erlang per se. I personally found Erlang's syntax hard to follow, especially considering I'm working with multiple, more traditional languages on a regular basis. For instance, in Erlang, variables are capitalized, whereas in other languages that's how you name a class. But hey, others swear by its syntax, so there's that.


One thing that's pretty frustrating about your comments, is how you both have an answer for everything but don't actually give any specifics, which makes it nearly impossible to actually address or consider your advice.

Also, fundamentally, I think you're way out of the norm in terms of system time to build. Getting the kinds of reliability and business value guarantees out of Rust is enormously harder than what you get out of the box with the BEAM. Having to hand-roll everything would take literally months, where I can be done and providing business value in hours in the BEAM ecosystem. Now, could Rust get to this place in another year or two? Completely. But it isn't there now, and, it still ignores my second issue...

...which is difficulty. I've been programming for just under 7 years, and while I've not learned C++, Rust is not only by far the most difficult language I've encountered thus far, it's exponentially more difficult. Months to grok the basics, likely years to be effective in it. You might've had an easy time hiring for this, but that might say more about your social group and hiring channels than the actual availability of talent.


> One thing that's pretty frustrating about your comments, is how you both have an answer for everything but don't actually give any specifics, which makes it nearly impossible to actually address or consider your advice.

What is it you want to know? I'm giving you my personal experience, not an essay. It's hard to find people who have professional experience with Erlang, so I thought it would be useful to other people who are evaluating it to hear from someone who has been there. I'm more than happy to dig into things more if there's a pointed question.

> Getting the kinds of reliability and business value guarantees out of Rust is enormously harder than what you get out of the box with the BEAM. Having to hand-roll everything would take literally months.

It depends on your needs. My issue is that it is enormously hard to fix the BEAM once you have built something successful with it. It's hard to hire for. Working with it involves lots of esoteric knowledge.

We also know that people are adopting highly reliable and scalable systems. It's just not Erlang. Cloud enables this. Containerization enables this. Instead of Erlang's IPC, you can use an IDL and an RPC compiler. Instead of pids, you can have a service registry using etcd or DNS. You don't have to hand-roll any of that.

What Rust gives you is the ability to quickly build correct, fast, and maintainable systems that are easy to scale because concurrency is a central theme of the language.

> I've been programming for just under 7 years, and while I've not learned C++, Rust is not only by far the most difficult language I've encountered thus far, it's exponentially more difficult.

What languages have you learned? Did you learn about manual memory management in school/camp/on your own? What's been difficult about it for you? I'll admit, I know a lot of languages. Most of the things in Rust are familiar to me outside of lifetimes, which I don't consider to be too difficult to learn. It's just making explicit something you used to handle implicitly.

We have hired three new grads who we started on Rust, and we got them up and running in a few weeks. They did have the books, mentorship, and assigned work involving small tasks to ramp up on.


> What is it you want to know?

What kind of projects have you built? What were the primary needs? What were the constraints?

> My issue is that it is enormously hard to fix the BEAM once you have built something successful with it.

There are a lot of community members who are more than happy to give advice for free on this stuff. The fixes that usually need to be provided are easy to implement, which wouldn't be the case for a hand-rolled system. It's hard to believe you hit a genuinely unique problem unless it was pre-2015, and the world of today's BEAM is an entirely different place than before then, which should itself be considered.

The dismissal of Elixir I think says a lot about your view of the ecosystem- more time and energy has been put into that half of the community in the past few years than Erlang had in the previous decade (which says a lot about just how good Erlang was before the Elixir community came along).

> We also know that people are adopting highly reliable and scalable systems. It's just not Erlang. Cloud enables this. Containerization enables this.

This really seems like a complete misunderstanding of the kinds of easy bonuses and guarantees you get in the ecosystem. Implementing the kinds of things you're describing literally requires teams- teams! - of people to build and run. My employer is currently in the ramp-up to this, and the amount of time, energy, and rough edges in the ecosystem right now is out of control. Containerization is great, cloud is great, but they don't automatically give you hot deploys, easy data handoff, or automatic introspection. Conversely, they're difficult to build and debug (since the tools for such are designed for a very different Unix world), have poor separation of concerns (half the tools that handle this stuff also do 2 or 3 other things, all in different ways, and all with very blurry boundaries), and I really don't think finding knowledgeable people for them is much better considering half the tools have existed for less than 4 years (coincidentally the same length of time as Erlang's newcomer).

> Instead of Erlang's IPC, you can use an IDL and an RPC compiler.

To me, this sounds like saying, "Instead of this ultra-fast and agile big rig, you can put together a raft with these here twine". Once again- you've gotta build the world, you lose the niceties of the ecosystem, and it's just time time time.

That's not to say that an RPC can't be excellent- it absolutely can- but it isn't easy without a lot of infrastructure.

> Instead of pids, you can have a service registry using etcd or DNS. You don't have to hand roll any of that.

Who uses pids anymore when you've got `bitwalker/libcluster` providing service discovery via whatever mechanism you want (including etcd, Kube DNS, Kube selectors, Consul, EC2 tags...)?

> What Rust gives you is the ability to quickly build correct, fast, and maintainable systems that are easy to scale because concurrency is a central theme of the language.

While I definitely believe that (Rust is nothing if not a real marvel of engineering), it's got a long way to go before it starts removing any reason to use the BEAM. You call those systems maintainable- but how long have you been maintaining them? What team sizes do you often work in, and what do your projects require? BEAM systems are famous for scaling to tens or hundreds of millions of concurrent users, with no downtime, on engineering teams that could share a couple of pizzas.

I think Rust will get there- and that's the whole reason I've spent so much time in it- but we're just barely starting to get quality Actor system implementations, and they haven't been made easy to use either.

> What languages have you learned? Did you learn about manual memory management in school/camp/on your own? What's been difficult about it for you? I'll admit, I know a lot of languages. Most of the things in Rust are familiar to me outside of lifetimes, which I don't consider to be too difficult to learn. It's just making explicit something you used to handle implicitly.

I didn't learn about manual memory management until very late (~3 years ago), so that's definitely where a lot of the introductory difficulty has been. But in addition: the depth and complexity of the type system, how that type system interacts with its memory model, the inconsistencies of different types because of pre-implemented traits, etc. Simply reading the Rust book took well over a month of serious study- learning Elixir via "Programming Elixir" (which I consumed before learning Erlang) taught me the majority of the language in an afternoon, and the basics of OTP by the end of the book (later that week of light reading).

Now, years later, I wouldn't consider OTP that difficult to learn or understand. But this one I'll forfeit, if only because I seem to have understood it naturally a bit faster than most (because for me, the concepts honestly seemed to "just make sense". I remember several times thinking, "This is exactly how I would build this.", which actually allowed me to forget a lot of what I'd learned in order to understand and effectively program in other languages, especially around concurrency).

> We have hired three new grads who we started on Rust, and we got them up and running in a few weeks. They did have the books, mentorship, and assigned work involving small tasks to ramp up on.

Admittedly, I've had several things blocking me on this:

1. The Rust Book, while good, was long and didn't always use the best examples.

2. I only have one serious Rustacean that I can access regularly (a former Mozilla employee).

3. I've struggled to find projects that made me genuinely think, "Rust would be perfect for this", outside of small callouts to it from other languages. I'm just now on one that is going to have some unusual math that needs to be performed quickly and continuously that I might use it for, but even still.

Because of this, I'd definitely concede that it might be possible to do it much faster if one had adequate support, but without prior experience with memory management and deep type systems, I doubt it'd be sub two months without a constant pair (which might do the trick, and which I advocate in most regards anyhow).

I apologize; reading over this, I see many places where it likely comes off as hostile, and that's not my intention. It's just that a lot of your comments honestly surprise me, and directly contradict both my lived experience with the ecosystem and that of teams I hold a lot of respect for.


> What is it you want to know? I'm giving you my personal experience, not an essay. It's hard to find people who have professional experience with Erlang, so I thought it would be useful to other people who are evaluating it to hear from someone who has been there. I'm more than happy to dig into things more if there's a pointed question.

While this whole comment thread has turned into a bit of a flame war, I appreciate your candidness and think you've tried to fairly express your opinion. As someone who likes semi-obscure languages and systems, it's valuable to see when and where system designs fail for people. Personally, without Elixir I wouldn't want to delve into Erlang/OTP, for many of the problems you mentioned. The Erlang syntax seems "elegant" upfront on small problems, but digging into, say, CouchDB or Riak, it's a bit of a pain to follow. The lack of good namespacing, etc., I find to be a pain.

Elixir is overall a syntax that works well for me, at least as well as Rust's. The Elixir team has made great strides in providing good compiler error messages, and is actively improving the distribution/packaging story. Hex and mix are fantastic, and on par with what I've used in Cargo.

> It depends on your needs. My issue is that it is enormously hard to fix the BEAM once you have built something successful with it. It's hard to hire for. Working with it involves lots of esoteric knowledge.

Not sure I follow this. I've dug into the BEAM source and found it to be well designed and relatively easy to follow, especially compared to CPython, though not as nice as Lua. Presuming you mean fixing dist_erl and such, that'd make more sense. But how would it be any different from, say, tweaking Consul?

> We also know that people are adopting highly reliable and scalable systems. It's just not Erlang. Cloud enables this. Containerization enables this. Instead of Erlang's IPC, you can use an IDL and an RPC compiler. Instead of pids, you can have a service registry using etcd or DNS. You don't have to hand-roll any of that.

Exactly! Except it works in two directions. Having to learn and deploy, say, k8s and Consul, and then learn gRPC and figure out how to route messages, etc., is a lot of work. While I've not deployed a large cluster on BEAM, it's clear that I wouldn't want to scale a BEAM cluster beyond a few dozen nodes. However, a few dozen nodes can handle the workload for probably 90% of companies. Being in small startups, if I can effectively get binary RPC, distributed namespacing, etc., for free, it saves a lot of trouble and effort. I've also found it's not too hard to entirely replace the distributed namespace mechanism with projects like Lasp or Swarm, or heck, likely shunt it off to Consul in the future.

All that said, I like Rust and plan on using it in the future where I can, likely in conjunction with Elixir via Rustler. My work is primarily in IoT, where Rust is slowly evolving. I'd love to have a Rust "OTP" and actor system for IoT devices where BEAM doesn't fit.


> Elixir is overall a syntax that works well for me, at least as well as Rust's. The Elixir team has made great strides in providing good compiler error messages, and is actively improving the distribution/packaging story. Hex and mix are fantastic, and on par with what I've used in Cargo.

I'm working on making the dialyzer error messages better, too, in Dialyxir[0] and Erlex[1] =).

[0] https://github.com/jeremyjh/dialyxir [1] https://github.com/asummers/erlex


Wow, thanks! I've kept up with your project and am grateful for the work.

Out of curiosity, have you seen anything in Dialyzer for dealing with typing GenServer messages and handlers? I haven't figured out a way to spec message handlers, as Dialyzer seems to only check that you have some `handle_cast` implemented, and adding a behaviour doesn't compose. Given the key role GenServer plays in OTP, it's a big lack IMHO. I've wondered if the new named guards in Elixir could help with it somehow.


Nah because as soon as you deviate from the GenServer behaviour you'll get errors from not matching the @callback. The trick there is to provide a narrower API that calls into the GenServer functions with the proper @spec and treat the handle_cast and friends as private functions that you don't call directly. So like a pop/0 might call cast(:pop) and you'd have the associated handle_cast for :pop.


Yah, that's the error I ran into. Creating function wrappers to send messages works well for client APIs. However, it doesn't help verify that you've implemented the `handle_cast` properly, or with handling messages from other places. Normally it's not too big of an issue.


Yeah ideally those APIs are well scoped enough to be able to unit test them, but I agree there's some piece missing there.


> However, a few dozen nodes can handle the workload for probably 90% of companies. Being in small startups if I can effectively get binary RPC, distributed namespacing, etc, for free it saves a lot of trouble and effort.

I think this is key. There are issues, and there are parts of the platform that are crap, but Erlang (via Elixir as well in my case; I'm not sure I'd feel the same way if it were purely Erlang) is, on balance, the most useful and usable tool I've ever used for building networked services.


> I’d love to have a Rust "OTP" and actor system for IoT devices where Beam doesn’t fit.

You might want to check out Grisp - https://www.grisp.org/ - Erlang VM ported to RTEMS. They also offer devboards with ample resources designed to run this thingy.


Actix is trying to be a decent actor system for Rust, but I haven’t dug into it enough to know how it would compare to OTP.

Also, you should take a big look at the Pony language if you haven’t yet. It’s much earlier days than Rust, but comes from people who know and love Erlang.


I’ve heard about Actix. For it to compare with OTP, you’d presumably need a lot more tools for dealing with supervision trees. For example, if you customized the `try!` macro to "kill" an Actix actor and let a supervisor restart it, that’d be more OTP-like.


> As someone who has worked two jobs now writing, deploying, and operating Erlang clusters, I recommend switching to Rust.

I'm surprised to read this, Erlang and Rust have almost nothing in common. It's understandable that BEAM doesn't suit your needs but suggesting that everyone should write Rust instead is grossly misleading. They are suited to solving different problems!


They have a lot in common syntactically, semantically, and philosophically. They are both languages designed to promote safety and robustness. They're both suitable for writing highly concurrent, massively scalable applications. The future is lots of cores, and the Rust community is on board with this.

I'd also like to point out that at both of my jobs, we used Erlang for use cases that are most heavily associated with the language, and for which popular open-source alternatives in Elixir/Erlang already exist. We have had better success scaling, operating, and hiring for Rust. We've saved man-hours, time, and money.

So I am saying that Rust is a replacement for Erlang. It's not a tool for a different problem.


Early Rust was much closer to Erlang, but it lost more and more of it over time.


You’re comparing the out-of-the-box, network-transparent scaling solution that is fully meshed and doesn’t claim to solve everyone’s problems forever with a solution in Rust that probably involves a completely hand-rolled implementation of lots and lots of the different things OTP is doing. You have made me want to learn Rust, though; how does it compare with Go?


We moved the orchestration from the Erlang runtime to our container orchestrator, though. We have a better separation of concerns, and we can scale any language. The registry and process communication are just services and protocols running in containers.

So compared to Go, the Rust folks seem to have paid close attention to the things people liked about Go, and we have most of those things: great tooling, auto-formatting, concurrency primitives (although with freedom of choice; green threads are a third-party library, not part of a runtime). We even have an animal mascot.

From a language standpoint, Go isn't anything remarkable (our Haskell colleague calls it a step backwards). If you listen to the Go Time podcast, Brad Fitzpatrick, one of the Go authors, even says as much. Go's big contribution was essentially web programming and goroutines.

Rust, on the other hand, has a lot. It has typeclasses from Haskell. It has lifetimes from the PL research community. It has immutability and pattern matching just like in Erlang. It has all the familiar control structures and enums/structs from C++.

Performance of Go in terms of CPU and memory pressure is usually somewhere near Java, if not slightly better. That's pretty damn good. The JVM has improved a lot over the years. Rust performance is a hair shy of C/C++. That's amazing. All those abstractions, and the speed of C++?


> It also enables event driven programming through tokio, which is a better fit for web servers than green threads (you're mostly waiting on the network)

This is a really strange thing to say. Erlang is still an event-driven I/O runtime; it just doesn't inflict the burden of continuation-passing on the user like Tokio does. What is the benefit of doing that yourself? Even with futures it's far more verbose and awkward than it needs to be. And since the entire ecosystem is not built on this I/O system, you will always be finding libraries - even very popular ones, such as Diesel - that are incompatible with it, forcing you into threadpools. Every library in use on the BEAM uses async I/O in exactly the same way.


This is an implementation detail that you have no control over. From the point of view of a process, I/O is synchronous. This is very limiting. It's better to keep decisions about concurrency in libraries rather than bake them into the language.


Can you elaborate on the limits this imposes? What is it that you can do with direct access to an event loop that you cannot do in Erlang?


The world of ephemeral containers really has changed the dynamic. Erlang solves a lot of problems with good solutions, but writing good software is only part of the battle. Lifecycle management, monitoring, etc make up a huge operational burden and should always be thought of as first-class problems in language design. Rust makes that better in many ways but not uniquely better than alternative languages.


In my experience any compiled language is easier to manage. There is no disconnect between your runtime and the system. You're investing in systems knowledge rather than knowledge of a particular VM's bytecode/abstractions as well.

Erlang does provide good solutions for lifecycle management. The problem is that whenever someone picks a solution for you, you're now locked in, and it can be difficult to migrate to a new paradigm like containers.


Totally agree regarding compiled languages. I’m surprised Python et al. don’t offer a way to check a file/script for “correctness” without execution. It’s so bad that I will often add a “-h” flag to scripts I encounter just so I can run them and make sure no one introduced a silly bug (such as a typo, a broken import, etc).


Yep, I've seen my fair share of Python stack traces. I completely sympathize, and it is one of the reasons I believe in statically typed, compiled languages as well.


> The BEAM just doesn't like to be treated like cattle.

So don't. Deploy using edeliver and give up docker for the Erlang / Elixir parts of the project. If you really can't, you might want to switch to a language with a runtime that plays more nicely with the container idea of the world, but I know companies that deploy Erlang with docker and they are happy with that. Purists frown at the idea but it works.


It's a major pain to run it in containers, and that extends beyond Docker. That's why I do advocate switching away from Erlang.

Giving up on containerization isn't an option unless we want to manage two completely separate infrastructures, which we definitely do not want to do.


I think that those people I wrote about are running BEAM in containers as if it were PHP. No connections between nodes and a load balancer in front of the containers. That should leave them with the niceties of the language, most but not all of OTP, and almost none of the deployment woes people have when unfamiliar with BEAM.


I also find Erlang does not play well with modern ops workflows like Docker and Kubernetes. But a lot of people use Erlang for building massive in-memory stateful services, with a very fault-tolerant system on top. I'm interested in both languages, but haven't started playing with Rust yet. I tried Akka and found it pretty good, though the virtual machine is not as good as BEAM, and Scala is a powerful language. I'm curious about Rust. Could you please share some insight into whether Rust is capable of the kind of stateful services Erlang handles, or whether it has other advantages in terms of statefulness/fault tolerance?


This is a post on `X` by example.

I don't think it's constructive to the discussion here to say X has drawbacks (and then go on at it), use Y (and then go on at it).

I have not used Erlang or Rust, and when I read this comment, it seemed flamewar-ish to me.


Isn't the whole point of the BEAM languages to keep distributed state reasonable to work with through the actor model? If you can put a service in a Docker container and spin it up or down without losing data, then it doesn't sound like you need the BEAM at all.


>and it's hard to burst. Like, super hard to burst

What exactly does 'burst' mean in this context?


Bursting is when you need to add additional capacity to handle expected but higher than normal load for a brief period of time.


What is TLC?


Tender loving care


What do you think of Elixir?


Elixir doesn't affect most of the operational aspects of Erlang, and it certainly in no way affects performance. You're still running on the BEAM, except now you actually have more apps running (the Elixir runtime runs as an app).

Syntactically it cleans some things up, adds some niceties, and annoyingly makes atoms require a prefix of ":" (annoying because atoms are central to Erlang's readability).

Personally, I don't find the changes to make a big difference when using the BEAM, and I prefer the familiarity of Erlang's syntax. It's also the language all of the documentation for the BEAM will use.

If you absolutely can't get your coworkers to ditch dynamic/gradual typing, Elixir or Erlang are still great choices and way better than Python/Ruby. Otherwise, try Rust.


Are you actually using Rust for professional web development? I've tried it earlier this year and my experience wasn't that great, so I'm quite surprised to read this comment.


What issues did you have? We haven't had very many. We rebuilt some internal tools, but for the most part it's been smooth sailing. We do use hyper and h2.


My main issue was with the ecosystem, which seemed too green and volatile/experimental. IIRC I used Actix and Diesel, for a toy web app.


What did you end up using for Rust? The actix ecosystem is pretty good, and quite fast.


I did like Rust (as a language), but I'm using Elixir, Phoenix and Ecto right now.


There's always Pony.


Pony has a lot of promise, but it's missing a few key libs. If it had quality HTTP and Postgres libs it would probably be perfect for web dev, but it doesn't have either yet.


Cool site!

PS: Content looks cropped on an iPhone 7 no matter how you resize it.


Can't read the left half of any code examples on Android chrome.


Yep, completely unusable on mobile



