Erlang Garbage Collection Details and Why They Matter (hamidreza-s.github.io)
115 points by byaruhaf on Sept 15, 2015 | 23 comments

I would expect this kind of article to start with numbers showing that Erlang's GC has shorter GC pauses than, say, Java or Go and then explain why this is so.

Instead, it merely states that Erlang's GC has better latency than other systems, and we're supposed to just believe it.

It's fine as a description of how Erlang GC works but totally useless as an argument that Erlang has short GC pauses.

Here is a semi-recent benchmark for Phoenix, which is written in Elixir (runs on the Erlang VM, BEAM). I think it will give you a decent idea of the differences: https://gist.github.com/omnibs/e5e72b31e6bd25caf39a

I would like to see this run with the latest versions of the various platforms, as a lot has changed since Phoenix 0.13, and I'm sure it has for the other platforms that were benchmarked as well. Initially, it seems like there is not a lot of difference between most of the platforms, but when you're working at 180k rps, a variance of 1-2 ms is pretty substantial.

Would it be possible to update node.js to v4.0.0 and rerun the Express Cluster benchmark? node.js v0.12.x was slower than v0.10.x, while v4.0.0 is supposed to be faster than both. Sorry for making this request if it's out of your field of interest, but I couldn't resist asking :)

EDIT: My mistake, I thought you were the author of the benchmark

I think there is a bit of information left out here (presumably because it is assumed to be understood by the reader), but it is not uncommon for Erlang applications to have millions of different processes active at the same time. As such, since memory management is (mostly) isolated within a single process, you have a de-facto sharded garbage collector. This, as the article explains, is a core trait that helps with the soft realtime guarantees Erlang brings.
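
To make the "sharded" point concrete, here is a toy model written in Go. It is not Erlang and says nothing about how BEAM is actually implemented; it only assumes, for illustration, that a stop-the-world collection costs time proportional to the live data in the heap being collected. The function names, the cost model, and the numbers are all made up.

```go
package main

import "fmt"

// pauseCost is a hypothetical cost model: one time unit per live word
// in the heap being collected. Real collectors are more subtle than this.
func pauseCost(liveWords int) int {
	return liveWords
}

// sharedHeapPause models a single global heap: one collection has to
// walk the live data of every process, and everyone waits for it.
func sharedHeapPause(heaps []int) int {
	total := 0
	for _, live := range heaps {
		total += live
	}
	return pauseCost(total)
}

// perProcessWorstPause models isolated heaps: each process collects only
// its own heap, so the worst pause is bounded by the largest single heap.
func perProcessWorstPause(heaps []int) int {
	worst := 0
	for _, live := range heaps {
		if c := pauseCost(live); c > worst {
			worst = c
		}
	}
	return worst
}

func main() {
	// One million small processes with 300 live words each (made-up numbers).
	heaps := make([]int, 1000000)
	for i := range heaps {
		heaps[i] = 300
	}
	fmt.Println("shared heap pause:", sharedHeapPause(heaps))
	fmt.Println("worst per-process pause:", perProcessWorstPause(heaps))
}
```

Under that assumption the shared-heap pause grows with the total amount of live data, while the per-process pause depends only on the biggest individual heap, which is the intuition behind calling it a sharded collector.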

How much of this is truly "Erlang" GC and how much just happens to be the current implementation found in Erlang? Does Erlang the language specify that this is how the GC acts, or is the language free to use whatever GC implementation it likes? In short, if you are writing in Erlang, can you be certain that the GC will always act in the manner described?

Erlang basically has BEAM, which is similar to CPython or YARV in that it is both the official implementation and the de-facto standard, because the actual standards are either non-existent or don't cover a lot of edge cases (making the implementation an unofficial spec).

That said, I believe that if an official third-party specification for Erlang were ever created, it would likely have to specify a similar GC and threading system because this is necessary to allow the normal Erlang programming style.

I enjoyed this [unexpectedly] informative read right up until the conclusion:

    Even if we are using a language that manages memory itself
    like Erlang, nothing prevents us from understanding how memory
    is allocated and deallocated. Unlike Go Language Memory Model
    Documentation Page that advices "If you must read the rest of
    this document to understand the behavior of your program, you
    are being too clever. Don't be clever.", I believe that we must
    be clever enough to make our system faster and safer, and
    sometimes it doesn't happen unless we dig deeper into what is
    going on.
The potshot at Go in the conclusion seems counter-productive. The author is certainly taking the quote in a different spirit than it was intended.

Erlang and Go have functional overlap, sure. However, the trade-offs between the guarantees and the operational realities of the two languages leave each one best suited to a different domain. Thus it is a pointless waste of time to argue and criticize one or the other in this manner.

While I may personally appreciate Go's opinionated-ness, it seems to lead to excessive amounts of conflict and drama. I wonder if it's an overall win or loss. On one hand, the controversy sparks interest which helps spread Go. On the other hand, it alienates many folks who may otherwise be open to getting involved.

Food for thought.

TLDR: It looks rather silly to compare languages with very different design goals.

I'm incredibly tired of this argument from Go folks. Your language makes terrible design decisions, period, because its designers can't be arsed to learn about the last 2 decades of language design. When people compare your language to other languages it's because other languages have better ways of achieving the same goals.

But let's take your argument seriously for a second and pretend that all these terrible designs are actually the best way to achieve Go's design goals: to make a language that compiles fast for giant codebases while maintaining reasonable performance for a huge infrastructure (Google's). If that's the case, why are you using Go? There are a very small handful of companies where these design goals even make sense as goals even if they did achieve them. If you're going to claim that this is about different design goals, then I'm just going to claim that those very design goals are why you shouldn't be using Go.

I have been hemming and hawing for years over which programming language I should invest time in learning next, and I found this comment extremely informative. After verifying that what you said about the design goals for Go wasn't a strawman, I got curious about criticism of Go. I did some googling and found this:


That article is interesting. I agree that if Go were just bad, people wouldn't criticize it so much. But I don't think it's an identity issue; maybe it is for other critics of Go, but that's not why I criticize it.

First off, I'm not an ML-language guy: I have written a small amount of Haskell and respect the work those guys have done, but I'm more of a Scheme and Python guy. And I'm not super ideological about functional programming either; I write a good amount of C and some assembly.

I am all for making pragmatic compromises. Part of what offends me about Go is that they've coopted the idea of pragmatism as their own when in fact the design decisions made by the Go language aren't pragmatic for most users. Pragmatism is rooted in knowledge, having thoroughly investigated solutions and chosen the best based on your goals, not just giving up on implementing stuff because it's hard to implement (which I suspect is what the Go team actually means when they say something's not pragmatic).

In the end though, I think the reason I care is that Go's popularity poisons the industry as a whole. It eliminates places I'm willing to work at, and worse, it lowers the caliber of programmers I can work with. Programmers who use `go generate` aren't going to learn how to use code generation that integrates with a type system. Programmers who unconditionally claim generational garbage collection is too heavyweight aren't going to be able to write programs that allocate larger amounts of memory. Programmers who think "goroutines" are the newest, most innovative, one true way to do threading are going to shoehorn that model into their code in other languages, the same way I shoehorn threading models from other languages into my Java code, except I have a variety of models I've used and I can choose the right threading model for the right situation. And I'm going to have to work with these people at some point, and screen them out in interviews, and I'd rather we just didn't let this shitty language into our industry in the first place.

By my reading, he's not knocking Go. Rather, he's citing a bit of advice in their documentation and explaining why he disagrees.

By my reading the author accidentally took a potshot at his own ideas. He didn't convince me of his philosophy on cleverness.

Thanks for reading my post and taking time to write about it.

As the title of the post emphasizes with "WHY GC matters", I tried to explain why we should understand the underlying workings of a system, be it Erlang, Go, Java, or anything else. Unfortunately, there is usually little documentation about these internals, and sometimes we are even advised not to read it.

I do agree with you that comparing languages with very different goals is totally wrong, but what I did was compare the advice of the Go documentation's author with mine, which is about being clever enough to understand your system.

The Go docs don't have that because they don't want you to know, they have that because they want to be able to change it without blowing up your programs.

Erlang has had similar corners over time, such as the way it has handled large binaries, where they've needed to reserve the right to change it. (IIRC the docs promise that the large binaries won't be in your process heap, which has been true for a long time, but beyond that, the details have changed significantly.)

> The Go docs don't have that because they don't want you to know, they have that because they want to be able to change it without blowing up your programs.

That's a very charitable interpretation. If that's what they want to say, they could say that. I just don't buy it.

It also doesn't really make much sense. The Go Language Memory Model page is about memory synchronization, not memory management. This article is about memory management, not low-level details of when writes and reads are visible across goroutine boundaries.
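
For anyone who hasn't read that page, here is a minimal sketch of the kind of question it answers. The function name `publish` is made up for illustration. One goroutine writes a variable and another reads it, and the read is only guaranteed to see the write because a channel operation orders the two.

```go
package main

import "fmt"

// publish writes msg in one goroutine and reads it in another.
// The memory model guarantees that closing a channel is ordered before
// a receive that returns because the channel is closed, so the read
// after <-done must observe the write made before close(done).
func publish() string {
	var msg string
	done := make(chan struct{})

	go func() {
		msg = "hello" // write in the spawned goroutine
		close(done)   // synchronization point
	}()

	<-done     // returns only once the channel has been closed
	return msg // guaranteed to see "hello"
}

func main() {
	fmt.Println(publish())
}
```

Without the channel (or some other synchronization), the read would be a data race, and that is the territory the "don't be clever" line is warning about. None of it has anything to do with how the garbage collector allocates or frees memory.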

Ugh, seriously? The page says: "If you must read the rest of this document to understand the behavior of your program, you are being too clever. Don't be clever."

That's an idiotic thing to say. I don't care if you're saying it about memory synchronization or memory management, it's an idiotic thing to say either way.

Redirecting all criticisms of Go and pretending that they don't apply because Go isn't for anything or about anything seems to be the only defense Go users have. Instead of telling us all the things Go isn't for, or all the things the Go docs aren't talking about, why don't you actually use a non-shitty language that's actually useful for something?

The advice goes somewhat deeper than this. Design goals aside, you really shouldn't get too clever with GCs, as they are subject to change. If the GC changes (because of platform upgrades, or simply using different runtimes), your application can suddenly start showing strange or bad memory performance characteristics.

Either way, a less controversial way to put the Go advice would be:

> If you must read the rest of this document to understand the behavior of your program, you are being untrusting. Don't use things you don't trust.

If you care about memory performance characteristics that much, you really should be using a language that gives you control over them.

That's not to say that all GC advice is bad: good GC advice exists and usually sticks to the formal definition of a GC (an infinite memory simulator), instead of implementation details which, again, can change.

Then why don't they just say that? Something like, "Hey, we may change how this works in the future. Clever optimizations now may not work down the road."

The problem with depending on specific optimizations like that is that it lock-steps you to updates that boast improvements (and that break your implementation-dependent code).

The author argued that the GC details matter, not whether they will or won't change.

> It looks rather silly to compare languages with very different design goals.

Well, not THAT different. Different philosophies maybe, but the goals were quite similar: a high-level language to create network services in (and facilitate concurrency and/or parallelization).

> Unlike Go Language Memory Model Documentation Page that advices "If you must read the rest of this document to understand the behavior of your program, you are being too clever. Don't be clever."

I see a language barrier here. (no pun intended)
