For Better Computing, Liberate CPUs from Garbage Collection (ieee.org)
433 points by mbroncano on May 9, 2019 | hide | past | favorite | 449 comments



IMO garbage collection is the epitome of sunk cost fallacy. Thirty years of good research thrown at a bad idea. The reality is we as developers choose not to give languages enough context to accurately infer the lifetime of objects. Instead of doing so we develop borderline self-aware programs to guess when we're done with objects. It wastes time, it wastes space, it wastes energy. If we'd spent that time developing smarter languages and compilers (Rust is a start, but not an end) we'd be better off as developers and as people. Garbage collection is just plain bad. I for one am glad we're finally ready to consider moving on.

Think about it, instead of finding a way of expressing when we're done with instances, we have a giant for loop that iterates over all of memory over and over and over to guess when we're done with things. What a mess! If your co-worker proposed this as a solution you'd probably slap them. This article proposes hardware accelerating that for loop. It's like a horse-drawn carriage accelerated by rockets. It's the fastest horse.


> It wastes time, it wastes space, it wastes energy.

But all of these are much cheaper than developer labour and reputation damage caused by leaky/crashy software. The economics make sense.

Anecdotally, I spent the first ~6 years of my career working with C++, and when I started using languages that did have GC, it made my job simpler and easier. I'm more productive and less stressed due to garbage collection. It's one less (significant) cognitive category for my brain to process.

Long live garbage collection!


> But all of these are much cheaper than developer labour and reputation damage caused by leaky/crashy software.

And that's why so much effort has gone into making it fast and low-latency. But it's a false dichotomy: we can have memory safety without garbage collection.

- Automatic reference counting: Most people know about Objective-C's efforts in this space, but it's admittedly less automatic than programmers would like so perhaps it doesn't get enough attention. And yet it should, since q/kdb+ uses reference counting exclusively, and it holds top performance numbers on a large number of data and messaging problems.

- Linear lisp[1] asked functions to fully consume their arguments, making (cons x y) basically a no-op. Again, no garbage collection, and no fragmentation (at least as long as all objects are cons), and yet no matter how promising this path looked, garbage collection got much more attention.

- Rust's borrow checker/tracker makes ownership explicit; somewhat of a middle-ground between the two...

There are other scattered efforts in this space, and I don't know about all of them, but for more on "everything we know about languages is wrong", also consider that Perl5 uses plain old reference counting, executes the AST directly, and still outperforms python for data and IO[2]!

I think the thing to take away is that memory management has to be automatically managed for developer sanity, not that garbage collection is the way to do it.

[1]: http://home.pipeline.com/~hbaker1/LinearLisp.html

[2]: The asyncio stuff in Python looks really promising though...


This comment is just bad and misinformed all over.

(1) Automatic Reference Counting doesn't work; its equivalent in interpreted languages is, well, reference counting, which can be optimized quite a lot (though has some issues with multithreading), but cannot collect cycles.

(2) therefore, if you want reference counting, you have to either also have GC (for cycles), or program carefully to avoid creating cycles (which is then only marginally better than C++)

(3) your comment on Python vs Perl5 is just nonsense, Python uses reference counting as well (along with the occasional GC to collect cycles)

(4) linear / uniqueness types (not exactly the same, but both can be used to ensure safe memory management) impose significant mental overhead on programmers as they prohibit many common patterns

(5) Rust features a lot of "cheating" - pointers that allow mutation from multiple threads, reference-counted pointers, etc. - so obviously just ownership&borrowing isn't good enough

conclusion: you can't have your cake and eat it too (at least according to current cutting edge research and implementations) - you either have GC or you have to be very careful/restricted when writing code
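A minimal Rust sketch of the cycle problem from points (1) and (2), using `Rc` (Rust's non-atomic reference-counted pointer); the `Node` type is made up for illustration:

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

// Build two nodes that point at each other.
fn make_cycle() -> (Rc<Node>, Rc<Node>) {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(a.clone())) });
    *a.next.borrow_mut() = Some(b.clone());
    (a, b)
}

fn main() {
    let (a, b) = make_cycle();
    // Each node is held by a local variable *and* by the other node,
    // so both counts are 2. Dropping the locals only brings the counts
    // down to 1, so neither node is ever freed: a leak.
    println!("a strong = {}", Rc::strong_count(&a)); // 2
    println!("b strong = {}", Rc::strong_count(&b)); // 2
}
```

When `a` and `b` go out of scope, each count falls to 1 rather than 0, so the destructors never run; plain reference counting cannot reclaim this.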


Please don't call names in arguments here. Your first sentence is clearly against the site guidelines, and was quite unnecessary.

https://news.ycombinator.com/newsguidelines.html


Sorry. I get emotional about programming languages...


Me too, as do many others.


> (1) Automatic Reference Counting doesn't work; its equivalent in interpreted languages is, well, reference counting, which can be optimized quite a lot (though has some issues with multithreading), but cannot collect cycles.

This is what weakrefs (or better data structures) are for. The Linux kernel uses reference counting incredibly effectively for almost every structure. I think that pretty much discounts any argument that reference counting cannot work.
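The kernel-style fix is to make back-pointers weak. A small Rust sketch (the `Parent`/`Child` types are hypothetical, just to show the pattern):

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Parent {
    child: RefCell<Option<Rc<Child>>>,
}
struct Child {
    parent: RefCell<Weak<Parent>>, // weak back-pointer: no cycle
}

fn make_tree() -> Rc<Parent> {
    let parent = Rc::new(Parent { child: RefCell::new(None) });
    let child = Rc::new(Child {
        parent: RefCell::new(Rc::downgrade(&parent)),
    });
    *parent.child.borrow_mut() = Some(child);
    parent
}

fn main() {
    let parent = make_tree();
    // The back-pointer doesn't contribute to the strong count, so the
    // whole structure is freed when `parent` goes out of scope.
    println!("parent strong = {}", Rc::strong_count(&parent)); // 1
    // Upgrading the weak ref succeeds while the parent is alive:
    let up = parent.child.borrow().as_ref().unwrap().parent.borrow().upgrade();
    println!("upgrade ok = {}", up.is_some()); // true
}
```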

> (5) Rust features a lot of "cheating" - pointers that allow mutation from multiple threads, reference-counted pointers, etc. - so obviously just ownership&borrowing isn't good enough

"unsafe" isn't cheating and isn't even related to whether garbage collection is a good idea or not. Yes, Rust's ownership and borrowing model doesn't cover all programs and hence "unsafe" is required -- but "unsafe" doesn't use a garbage collector so the existence of "unsafe" doesn't change that Rust works without one.


Weakrefs require “careful programming” i.e. knowing which ref must be a weakref.

By “cheating” I don’t mean “unsafe” but all kinds of pointers that aren’t just unique or borrowed such as different reference-counted pointers, the existence and use of which demonstrates that ownership+borrowing isn’t a complete memory-management technique. You need reference counting, therefore you need GC or careful programming.

Hopefully we discover some technology that makes these needs go away, but not so far AFAIK.


The point with weakrefs, though, is that they still require developer intervention.

Also, you can use most memory management facilities in Rust (including reference counting) without `unsafe`.


Correct, leaks are considered "safe" -- it's premature deallocations that are considered unsafe.
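This can be demonstrated directly: `Box::leak` is a safe function precisely because leaking violates no memory-safety guarantee, whereas freeing early (use-after-free) is what safe Rust rules out.

```rust
fn main() {
    let buf = Box::new([0u8; 16]);
    // Leaking is "safe": the allocation is simply never freed, and we
    // get a reference that is valid for the rest of the program.
    let leaked: &'static mut [u8; 16] = Box::leak(buf);
    leaked[0] = 42;
    println!("{}", leaked[0]); // 42
}
```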


I think the poster to whom you're responding means things like Cell and/or RefCell, which allow mutation behind the scenes while still being safe.


Right, but I think their main complaint is that RefCell is implemented using "unsafe" -- hence my point about "unsafe".


Its API is not `unsafe` though. Otherwise, every safe API depends on some `unsafe` call down the call graph.
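To illustrate: `RefCell` enforces the borrow rules at runtime through its safe API; a conflicting borrow yields an `Err` (or a panic with the non-`try` methods), never undefined behaviour.

```rust
use std::cell::RefCell;

fn main() {
    let cell = RefCell::new(vec![1, 2, 3]);
    let r1 = cell.borrow();
    let r2 = cell.borrow(); // multiple readers: fine
    // A writer while readers exist is rejected at *runtime*,
    // through the safe API:
    assert!(cell.try_borrow_mut().is_err());
    drop(r1);
    drop(r2);
    cell.borrow_mut().push(4); // readers gone, writing is allowed
    println!("{:?}", cell.borrow()); // [1, 2, 3, 4]
}
```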


I think we're both in agreement, I was phrasing things in a manner which I thought would better reference how GP thinks of unsafe. Obviously the API itself is safe, but the "cheating" GP was referring to was the fact that Rust requires "unsafe" under the hood in order to implement some safe operations (because the memory model doesn't cover all cases). You and I agree that this isn't a problem.


Reference counting is a garbage collection algorithm. It just isn't a very good one compared to semi space because its runtime is the number of dead objects not live objects.


Reference counting is barely classifiable as garbage collection. The runtime is O(1) because you free the object immediately when it dies; there's no stage where you start looping over dead objects.


>Rust features a lot of "cheating" - pointers that allow mutation from multiple threads

You're confounding GC and synchronization of multi-threaded access here. What Rust does with things like Arc, other languages need some other synchronization primitive to enable; their GC or manual memory management does not guarantee the multi-threaded semantics. That's one of the advantages of the Rust memory model: some patterns become multi-threaded without extra primitives (read-only borrows), and all of the primitives are compile-time checked for correctness.

>reference-counted pointers, etc. - so obviously just ownership&borrowing isn't good enough

In practice ownership and borrowing gets you a long way. I've only had to use a reference counted primitive (Arc) when that was the exactly correct semantics. I had a cache and both the caller and the original owner could drop the value first based on external input. No other static decision can be made and the ergonomics of writing that are very good. Rust seems already quite a bit better than having to be "very careful/restricted when writing code".
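A toy version of that cache pattern (names are made up; the real code presumably differs): both the cache and the caller hold the value via `Arc`, and either side can drop its copy first.

```rust
use std::collections::HashMap;
use std::sync::Arc;

fn main() {
    // Hypothetical cache: the cache and the caller each hold a
    // reference, and no static decision about which drops first
    // is possible.
    let mut cache: HashMap<&str, Arc<String>> = HashMap::new();
    let value = Arc::new(String::from("expensive result"));
    cache.insert("key", Arc::clone(&value));

    assert_eq!(Arc::strong_count(&value), 2); // caller + cache

    cache.remove("key"); // the cache evicts first...
    assert_eq!(Arc::strong_count(&value), 1);
    // ...and the caller's copy is still valid; the allocation is
    // freed only when the last reference is dropped.
    println!("{}", value);
}
```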


No, Rust is conflating memory safety and multithreading safety.

JVM state-of-the-art GC guarantees multithreaded memory safety.

AFAIK many wait-free data structures are pretty much impossible without a GC.


You start your response with "No" but don't actually argue against anything I've said.

> Rust is conflating memory safety and multithreading safety.

Rust doesn't conflate anything. It just gets some multi-threading safety for free as a result of its memory model. And then gets quite a few other multi-threading safety characteristics that no other mainstream language has from the type system. That several features in the language work together well is just a sign of good design.

> JVM state-of-the-art GC guarantees multithreaded memory safety.

This is where you're mixing the two unrelated concepts. Garbage collection does nothing for multi-threading safety. While the object is live and two threads hold a reference to it you need to synchronize access somehow. GC can't do anything about that.

> AFAIK many wait-free data structures are pretty much impossible without a GC.

This is unrelated to the previous points I made but isn't true either:

https://aturon.github.io/blog/2015/08/27/epoch/


By "conflating" I mean that Rust's memory management technology provides just one kind of multi-threading safety (i.e. "multiple readers/single-writer" model), but to implement other kinds of multi-threading-safe algorithms/data structures, you need to transcend beyond ownership+borrowing.

I think "epoch-based memory reclamation" (which Linux kernel also uses IIRC) is still a (primitive / limited / contained) form of GC. For those that aren't familiar with it, a quote from the article @pedrocr posted:

> The basic idea is to stash away nodes that have been unlinked from the data structure (the first source of reachability) until they can be safely deleted. Before we can delete a stashed node, we need to know that all threads that were accessing the data structure at the time have finished the operation they were performing.

So basically, the runtime needs to keep track of allocated objects until it's certain there are no more references to them... sounds like GC (the difference between this and a real GC is that (1) you know there's no cross-references between the nodes, so you don't need to mark&sweep but just need to keep track of the epoch, and (2) the nature of lock-free data structures is such that threads will move to the next epoch fairly soon (unless they infinite-loop)).


> By "conflating" I mean that Rust's memory management technology provides just one kind of multi-threading safety (i.e. "multiple readers/single-writer" model), but to implement other kinds of multi-threading-safe algorithms/data structures, you need to transcend beyond ownership+borrowing.

That's not "conflating" anything. You're just saying that "to do other things you need other stuff". Sure, to implement multi-threading you also need locks and semaphores and atomic variables sometimes. Rust provides all of those. There's nothing specific to GC vs non-GC there.

> So basically, the runtime needs to keep track of allocated objects until it's certain there are no more references to them... sounds like GC

Reference counting is not GC by most definitions. And if you want to define it like that Rust already has it.

> (the difference between this and a real GC is that (1) you know there's no cross-references between the nodes, so you don't need to mark&sweep but just need to keep track of the epoch, and (2) the nature of lock-free data structures is such that threads will move to the next epoch fairly soon (unless they infinite-loop)).

So the difference between this and GC is that it's nothing like GC... If reference counting, RCU, etc are all GC then there's nothing to discuss as it turns out languages like Rust have all the GC you need.


I agree, the model of lock-free data structures is pretty much perfect for reference counting (no cycles, shallow reference tree... just needs to be thread-safe). I wonder then why epoch-based approaches are used, which (sound like they) are more difficult / tricky to implement... I honestly have no idea. The only thing that comes to mind is that they seem to require another atomic primitive (atomic increment, otherwise lock-free data structures only require compare-and-swap), which might not be available on all platforms...


> AFAIK many wait-free data structures are pretty much impossible without a GC.

That makes no sense. You can implement a garbage collector in a manually managed program (or even just the specific parts that gain you the attributes required to accomplish something), so even if a GC made certain data structures easier, there's nothing to prevent their use in a manually managed language. The argument you're making is sort of like saying that certain things can only be accomplished in C or Java, and not in assembly.

More specifically to your actual assertion, anything a GC does for lock free data structures could be handled by a special routine or manually in a language where you handle memory manually.


(1/2): It most certainly does work: q/kdb is an interpreted language and doesn't have cycles. Being an array language, nesting is uncommon, so chasing pointers isn't an issue either.

(3): I didn't say otherwise; Python does reference counting as well. Python, however, compiles to bytecode and has a GC, while perl5 does neither and executes the AST directly. And yes, perl5 is faster for many problems[†]. I think it's quite common for people to believe that GC and bytecode are a win, and this is a simple and accessible example where that is not true.

(4): More significant than what? q/kdb is implemented in this style, and it's a huge advantage (reducing bugs) having your library handle allocation/deallocation over having the caller do these things, so I tend to do it as well. What exactly are you comparing it to? The mental load of figuring out what's causing pauses is quite high since it requires considering the entire program, but programming in a linear style is very easy, and it obviates a lot of the need for other kinds of memory management techniques.

(5): Good enough for what? Why is mixing static analysis with runtime analysis "cheating"? What would be "fair" in your mind?

[†]: https://benchmarksgame-team.pages.debian.net/benchmarksgame/...


It is weird; I do not feel restricted when I write non-GC code. Btw, you still need to pay attention in GCd languages not to leak references that cannot be GCd; Java has this sort of problem. Rust made it easy and simple to deal with memory above the malloc level while not having GC. Erlang, on the other hand, has a VM that makes GC much faster than in other languages. I can live with that because, done right, it does not impact end-user latency. For me, both Rust and Erlang give the cake + eat it feeling.


"I can live with that because if you do it right it does not impact end user latency."

The vast majority of GC use does not impact end user latency. Thinking otherwise is either the availability heuristic at work, or you work in AAA videogames or another ultra-high-real-time-performance industry and don't realize you're not in the center of programming work, you're on an extreme. (A perfectly valid and sizeable extreme.)

This may also be why you don't feel restricted; if you happen to work in any of the many places where a tree of references is adequate, you'll find Rust-style management to be a treat. If you work pervasively with graph structures in the core of your program, though, the experience can change. With pervasively graph-shaped structures, the question is generally just how you want your automated garbage collection to work, because I'm pretty sure getting the manual management correct otherwise is effectively impossible without an algorithm that's going to look a lot like a GC.

That said, despite my disagreement, I also disagree with the people who, as of this writing, have downvoted your comment into grey. I feel like I've been seeing a burst of "downvote as disagreement" in the past couple of weeks.


"Downvote as disagreement" is such a pervasive pattern that I find myself doing it automatically even when I know that's not what downvote is "supposed" to be for. I suspect it's because I'm more likely to view posts that I disagree with as being poorly thought-out.

This even extends to platforms where downvoting is not a thing - witness 4chan's struggles with "sage is not a downvote" ("sage" being a function which allowed you to reply to a thread without bumping it to the top and extending its lifespan - again, intended to mark a thread as low quality) until they finally gave up and made saging invisible.

Given how thoroughly inescapable downvoting-to-disagree seems, I wonder if it might not be better to just declare defeat, allow downvotes and upvotes to mean nothing beyond a signal of protest or agreement, and have separate "this comment is low quality / this comment is high quality" buttons.


PG long ago said that it is ok to use downvotes (in part) to signal disagreement.


> still need to pay attention with GCd languages not to leak references that cannot be GCd. Java has this sort of problems.

Unless you're doing weird things with class loaders, this is a non-issue.

> Erlang on the other hand has a VM that makes GC much faster than in other languages

What makes Erlang's VM make GC faster than other languages?


That's a very loaded question.

I don't think it's easy to point to one thing Erlang is doing and say that is why it's faster, partly because I think the way Erlang programmers are encouraged to write their programs also has an effect; the language itself may have a greater effect on its GC performance than the GC implementation itself.

As an example: In Erlang, we have function calls like F(X) but we also have message sends like F!X. Message sends get one type of memory treatment, while function calls get another one. The lifetime of objects you send via messaging is going to be different, and yet Java only has function(method) calls, so there's no way to hint this to Java. Maybe this information can be recovered with static/dynamic analysis, but in Erlang we have syntax to tell the VM what's going to happen.

If you want to learn more about Erlang's internals, I can recommend Joe's original paper on the subject:

http://citeseerx.ist.psu.edu/viewdoc/summary;jsessionid=0397...

since it describes a lot of the design that went into Erlang early, and:

http://www.it.uu.se/research/publications/lic/2005-001/

which, while long, describes in depth many of the sub-optimisations that can be realised on top of it.


Thanks for the links. That being said, new developments in GCs on the JVM (like Shenandoah and ZGC) are yielding very impressive results. Things like < 1ms pause times for heaps in the TB range. No other free GC for any other language approaches this as far as I'm aware, including golang's, which people claim is fast but which gets completely destroyed even for heaps in the GB range (I did some benchmarks).


>It is weird, I do not feel restricted when I write non-GC code.

Some people don't feel C has any safety issues either -- perhaps they think all others are idiots and their code is mistake proof because they do it right.


> (4) linear / uniqueness types (not exactly the same, but both can be used to ensure safe memory management) impose significant mental overhead on programmers as they prohibit many common patterns

Such as?

I've found that linear style is generally just good coding style. When I get caught up on something I can't express linearly, it generally means I was trying to plough through with a dodgy approach; stopping and thinking through would have resulted in better code even in a GCed language.


You came in guns blazing and then covered yourself in embarrassment already in the first two points you made.

The thing you said doesn't work is in use on millions of devices, working just fine. Or maybe you'd like to be more precise than "work".

Avoiding cycles and thinking about ownership is not programming carefully, it's merely programming. Really, manual memory management is not rocket science folks, just error prone. And RAII is reliable to the point that anyone that is capable of writing software can wield it.


>Really, manual memory management is not rocket science folks, just error prone

Doesn't have to be rocket science to be a cognitive burden. And we want to eliminate "error prone" factors...


And we did eliminate them.

But no one can seriously claim that GC is a magic bullet, because it's only clearly winning against plain C.

Carefully designed, popular languages are choosing RAII and reference counting implementations and those are very competitive.


?

C++ and Objective C inherited C's memory model and thus don't have a good memory management option. Rust is explicitly a systems language, meaning stronger memory models are out of scope, and then inherited most of C++'s idioms.

Other than those, what popular languages are specifically choosing reference counting as a primary GC method?


Swift.


Swift requires reference counting for compatibility with Objective-C code.

Apple could have gone the .NET way (RCW) to interface with COM reference counting, but that naturally would complicate the bindings to Objective-C runtime and existing libraries.


Which inherits its memory management from Objective C.


You say

> but it was a false dichotomy: We can have memory safety without garbage collection.

but then say

> Automatic reference counting

(ARC henceforth). Naive ARC is AFAIK very expensive, as it causes a lot of updates at each pointer 'take' even if it's read-only (chasing pointers), trashing caches. Possibly even worse if multithreading is used, as a memory barrier may have to be issued. Also, it does not collect cycles. Also, I've read it causes memory fragmentation (take that with a pinch of salt, that's from my memory!).

ARC has some advantages but it has some real costs.

Also and even more, if ARC is what you think it is, how is it not an implementation of GC?

> Perl5 uses plain old reference counting, executes the AST directly, and still outperforms python for data and IO[2]!

Ok... 1) outperforming python is not a high bar. 2) IO is usually handed off to the underlying OS so talking about IO performance is probably misleading. 3) Python now has a fallback non-ARC GC to pick up cycles. 4) GC in python is IIRC an implementation detail, not language-mandated. Finally, if it outperforms it, please provide a reference.

I'll check out linear lisp if I can find the time.


> Naive ARC is AFAIK very expensive as it causes a lot of updates at each pointer...

Yes, and I think that's why people shy away from it. Language design can help a lot though: Array languages use reference counting almost exclusively because nesting objects is uncommon. That means no pointers to chase the vast majority of the time.

> if ARC is what you think it is, how is it not an implementation of GC?

Garbage Collection involves crawling live objects from roots to determine what's still alive.

If you're not doing that, you might still be doing automatic memory management, but you're not doing GC.

There are tricks (gardens, barriers, treadmills, colouring, etc.) to do less work, or less work at a time, but it remains predicated on the assumption that you have more garbage than you have live objects, so you're doing less work by only crawling the live objects. If this is true, then GC should be a win, but it's not true an awful lot of the time, which is why manual memory management always beats GC for performance (latency and throughput)[†].

[†]: notwithstanding your earlier point about pointer chasing...

> I'll check out linear lisp if I can find the time.

I can recommend everything Baker has written on this subject: it's all gold.


> Garbage Collection involves crawling live objects from roots to determine what's still alive.

> If you're not doing that, you might still be doing automatic memory management, but you're not doing GC.

Whether GC and automatic memory management are [significantly] different terms depends on who you ask, I don't think there is a consensus on that. In many cases however people tend to say GC as shorthand for tracing GC.

I think in many cases we see ARC is "good enough", and in a JIT scenario you could remove many RC changes by effectively collapsing/merging scopes. OTOH it is quite clear that for the same workload tracing inevitably performs better.


> Naive ARC is AFAIK very expensive as it causes a lot of updates

It also means you can't really rely on CoW after fork(). After a while your whole heap will be duplicated in every process because of the RC updates.


A simple trick gets you around this: Don't use your system malloc() but instead map a tempfile (or a memfd) with MAP_SHARED and use the STM primitives to update your references. If you combine this trick with __attribute__((address_space(256))) your "pointers" are also the offsets in the tempfile.

Or don't: Maybe you want two heaps. fork() gives you the option!


Lots of things that people do don't work after `fork()`, though, to the point where `fork()` has become largely irrelevant in new code. For example, threads are not generally compatible. You can also easily run into issues with open files, file locks, etc.

Of course, your point might still be relevant in "legacy" environments with poor threading support anyway. I would argue that those are gradually going away, as even interpreted languages without exposed threading support are increasingly using threads or adopting threads to accommodate performance needs.


This is not legacy. Any application in Ruby, Python, Perl, ... is affected. Majority of the backends of current web have the issue of duplicated heap in all the workers. It's made slightly better by Python introducing gc.freeze in 3.7 (https://docs.python.org/3/library/gc.html#gc.freeze) and Ruby doing something similar soon, but not great.


That's a nice point. yikes.


> Naïve ARC is very expensive…

So is/was garbage collection, and we’ve spent thirty years trying to squeeze every possible bit of performance out of it.

Maybe it’s time to look at other approaches.


Even more so from reference counting, given that it was the very first kind of garbage collection algorithm.


I don't think that's true.

Garbage collection (to my knowledge) was invented by McCarthy back in the late 1950s[1] in the following three paragraphs of his paper "Recursive Functions of Symbolic Expressions and Their Computation by Machine":

Nothing happens until the program runs out of free storage. When a free register is wanted, and there is none left on the free-storage list, a reclamation cycle starts.

First, the program finds all registers accessible from the base registers and makes their signs negative. This is accomplished by starting from each of the base registers and changing the sign of every register that can be reached from it by a car − cdr chain. If the program encounters a register in this process which already has a negative sign, it assumes that this register has already been reached.

After all of the accessible registers have had their signs changed, the program goes through the area of memory reserved for the storage of list structures and puts all the registers whose signs were not changed in the previous step back on the free-storage list, and makes the signs of the accessible registers positive again.

That's mark/sweep!
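For those unfamiliar, the three quoted paragraphs translate almost directly into code. A toy sketch in Rust, marking reachable cells from a root and sweeping the rest onto a free list (the arena layout is made up for illustration):

```rust
// Toy mark&sweep over a fixed arena of cons cells, following
// McCarthy's description: mark everything reachable from the roots,
// then sweep the unmarked cells onto the free-storage list.
#[derive(Clone, Copy)]
struct Cell {
    car: Option<usize>, // index of another cell, or None (an atom)
    cdr: Option<usize>,
    marked: bool, // stands in for the "negative sign"
}

fn mark(heap: &mut [Cell], idx: usize) {
    if heap[idx].marked {
        return; // already reached via some other car-cdr chain
    }
    heap[idx].marked = true;
    if let Some(car) = heap[idx].car { mark(heap, car); }
    if let Some(cdr) = heap[idx].cdr { mark(heap, cdr); }
}

fn sweep(heap: &mut [Cell]) -> Vec<usize> {
    let mut free = Vec::new();
    for (i, cell) in heap.iter_mut().enumerate() {
        if cell.marked {
            cell.marked = false; // "makes the signs positive again"
        } else {
            free.push(i); // back on the free-storage list
        }
    }
    free
}

fn main() {
    // Cells 0 -> 1 -> 2 are reachable from root 0; cell 3 is garbage.
    let mut heap = vec![
        Cell { car: Some(1), cdr: None, marked: false },
        Cell { car: None, cdr: Some(2), marked: false },
        Cell { car: None, cdr: None, marked: false },
        Cell { car: Some(2), cdr: None, marked: false }, // unreachable
    ];
    mark(&mut heap, 0);
    let free = sweep(&mut heap);
    println!("{:?}", free); // [3]
}
```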

He adds as a footnote "We already called this process ``garbage collection'', but I guess I chickened out of using it in the paper--or else the Research Laboratory of Electronics grammar ladies wouldn't let me." but doesn't explain where the term came from. I suppose it's possible he didn't invent it, but I've never seen anything on that and I think most of the existing strategies used for memory allocation at the time were very explicit -- FORTRAN didn't even have ALLOCATE yet!

Do you have a reference for that?

[1]: http://www-formal.stanford.edu/jmc/recursive/node4.html


>it causes a lot of updates at each pointer 'take' even if it's a read only

This isn't required in the read-only case. If you're only borrowing the value, for example, taking a const& in C++ and not holding onto it, then you can cast to the underlying type and use the reference. In Rust you can also just borrow:

https://play.integer32.com/?version=stable&mode=debug&editio...
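A self-contained version of the idea (a sketch; the linked playground may differ): a read-only consumer takes a plain `&str` borrow, so the reference count is never touched.

```rust
use std::rc::Rc;

// Read-only use takes an ordinary borrow: no count traffic at all.
fn read_only(s: &str) -> usize {
    s.len()
}

fn main() {
    let shared = Rc::new(String::from("hello"));
    let before = Rc::strong_count(&shared);
    let n = read_only(&shared); // deref-coerces &Rc<String> to &str
    assert_eq!(Rc::strong_count(&shared), before); // count untouched
    println!("{}", n); // 5
}
```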

>trashing caches

ARC is usually atomic reference counting. Indeed that will trash caches because it must invalidate the cache line where the reference count lives in other cores whenever it's updated. RC without atomicity won't invalidate cache lines on other cpus.

FWIW, GCC has an extension to shared_ptr that doesn't use atomic reference counting:

https://gcc.gnu.org/onlinedocs/libstdc++/manual/memory.html#...

> Also I've read it causes memory fragmentation

Any heap allocations not bound to an arena and not part of a compacting garbage collector are likely to cause memory fragmentation. Stack allocations for example are implicitly compacted.

>Also and even more, if ARC is what you think it is, how is it not an implementation of GC?

This is a "can submarines swim" type of question and is completely uninteresting (imo).

>Finally, if [perl5] outperforms [python], please provide a reference.

It doesn't. Not by a long shot. Reference: https://bitbucket.org/ewanhiggs/csv-game/


> This isn't required in the read-only case

which is why I carefully used the word Naive, however I had little time to expand on it so I can't blame anyone for overlooking that.

> If you're only borrowing the value...

Then you give examples of manually offloading memory management to the programmer. It should be done, and I believe the book I reference covers it (can't find it now, sorry), but definitely not by the programmer else it's not automatic GC.

> ARC is usually atomic reference counting.

That won't make any necessary inter-core or inter-socket coherency traffic disappear.

> Any heap allocations [...] is likely to cause memory fragmentation.

My memory was that refcounting was particularly bad but feel free to disregard that as I can't justify it. I may well be wrong.

> This is a "can submarines swim" type of question and is completely uninteresting (imo).

eh? Puzzled...

> it doesn't.

thanks, will check.


In regards to memory fragmentation, you probably think of GC with compaction. Not all GCs are like that, but that is a plus for GCs if you use the heap a lot and need cache friendliness.


Compare the "Unified Theory of Garbage Collection" (https://www.cs.virginia.edu/~cs415/reading/bacon-garbage.pdf). Reference counting is a variant of garbage collection. But yes, you can take ideas from there to make your garbage collection better.

Linear/uniqueness typing is interesting. Clean is another language that uses something like it, and you can implement these ideas in Haskell as well.

Though in practice you want affine typing, not linear typing. (Basically, you want to be able to drop arguments without explicitly having to consume them.) The important bit about linear typing / affine typing is that you use your arguments at most once, not that you have to use them at least once.

That's how you can model mutation in a pure FP way, too. Not just GC.


I think you mean "Automatic Garbage Collection", rather than just "Garbage Collection". The latter is exactly what you do in C when you use `free()`, the former is an automated system that does `free()` on your behalf at runtime. Indeed, the fact that you consider Rust's borrow checker to be part of the non-GC methods of handling memory makes it clear that you do indeed recognise this distinction, even though it is merely implicit.

For the record, Automatic Reference Counting can't collect self-referential objects, and it is also considered a method of automatic garbage collection. I haven't heard of any research on ARC being done statically; if you have some, that would indeed be interesting.

Linear Lisp indeed looks extremely promising, I don't remember finding any code examples when I looked it up, it seems to be something that's languished in the theoretical department. HBaker has a series of really interesting papers, by the way, it's worth just going through his website :) There are a lot of nuggets of gold there.

It might also be worth your while to read up on the Mercury compiler -- if you haven't already -- which boasts compile-time memory allocation: https://mercurylang.org/documentation/papers.html#mazur-thes...


> I think you mean "Automatic Garbage Collection", rather than just "Garbage Collection". The latter is exactly what you do in C when you use `free()`, the former is an automated system that does `free()` on your behalf at runtime.

To me, the difference between garbage collection and other memory management systems is that gc scans memory looking for memory that can be reclaimed. I don’t consider free() to be garbage collection. free() is the final step you do in any memory management scheme, it’s not a specific technique of memory management.


With moving/compacting collectors, you never even call free(), and this is a huge advantage over a mark/sweep:

First mark touches all the live objects, then sweep touches all the objects that aren't live. You're touching every object every cycle. Yuck.

A moving collector touches all the live objects, but now they have new memory addresses. We never need to touch the old garbage at all! Most advanced GC techniques are around trying to revisit live objects less frequently, but they all have this basic theme.

Other memory management techniques explore the other side: Just touching the garbage. That's what happens when you malloc+free manually, or do (automatic) reference counting.

So the value-question restated is this: Is there more ham than spam?

If we have more garbage, then the ideal GC strategy wins. If we have more living objects (and few/no garbage) then literally anything else wins.

We do seem to have more garbage in atom languages like Java, Erlang, Python, and so on, so it makes sense that GC gets much more love than the other side of the fence, but I wonder often if we're missing a trick.
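To make the mark/sweep asymmetry concrete, here's a toy collector over an index-based heap (my own sketch, nothing like a production GC): mark recurses through the live objects only, while sweep has to visit every slot to find the garbage.

```rust
// Toy mark/sweep over an index-based heap: objects are slots in a Vec and
// "pointers" are just indices. This is an illustrative sketch only.
struct Heap {
    refs: Vec<Vec<usize>>, // outgoing references of each object
    marked: Vec<bool>,
}

impl Heap {
    // Mark touches only the objects reachable from the root (the live set).
    fn mark(&mut self, root: usize) {
        if self.marked[root] {
            return;
        }
        self.marked[root] = true;
        for child in self.refs[root].clone() {
            self.mark(child);
        }
    }

    // Sweep has to look at every slot, live or dead -- the "yuck" above.
    fn sweep(&self) -> Vec<usize> {
        (0..self.marked.len()).filter(|&i| !self.marked[i]).collect()
    }
}

fn main() {
    // Object 0 references 1; object 2 references 0 but is unreachable garbage.
    let mut heap = Heap {
        refs: vec![vec![1], vec![], vec![0]],
        marked: vec![false; 3],
    };
    heap.mark(0);
    assert_eq!(heap.sweep(), vec![2]);
    println!("ok");
}
```

A moving collector skips the sweep loop entirely by copying the marked objects to a fresh region, which is exactly the "never touch the garbage" win described above.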


Your definition of garbage collector excludes reference counting, which does not require scanning memory. Reference counting is a form of garbage collection. The type of garbage collection you are referring to is the sub-type of "tracing" garbage collection.

However, I agree, "free()" is not garbage collection, but rather a form of memory management, of which garbage collection is a subset, and manual (free) is another.


It depends who you ask. In my point of view, "reference counting" is a strategy that has nothing inherently to do with memory management and can even be manual instead of automatic. GC as the name implies should involve a collector of sorts.


Plain old refcounting cannot resolve circular references, which are inevitably leaked. To avoid leaks, most refcounting garbage collectors have a heuristic to occasionally run traces to clear out the cruft.
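In Rust terms, a minimal illustration (the `Node` type is made up): two `Rc` values pointing at each other never see their strong counts reach zero, so neither destructor ever runs.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Two nodes that end up referencing each other.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

fn main() {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(a.clone())) });
    *a.next.borrow_mut() = Some(b.clone()); // close the cycle: a -> b -> a

    assert_eq!(Rc::strong_count(&a), 2); // `a` itself plus `b.next`
    assert_eq!(Rc::strong_count(&b), 2); // `b` itself plus `a.next`
    // When `a` and `b` go out of scope, each count only drops to 1,
    // so neither Drop runs and the pair is leaked.
    println!("ok");
}
```

Rust's answer is to break such cycles manually with `std::rc::Weak`, which is exactly the kind of programmer intervention plain refcounting otherwise papers over with trace heuristics.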


I think the most widely accepted definition for "garbage collection" is something which frees memory automatically, as opposed to manually freeing memory (or other memory management schemes). In fact, the second paragraph of Wikipedia's page on garbage collection[1] says that "Garbage collection is essentially the opposite of manual memory management, which requires the programmer to specify which objects to deallocate and return to the memory system".

[1]: https://en.wikipedia.org/wiki/Garbage_collection_(computer_s...


Reference counting is garbage collection. It's part of the mix of potential strategies for automatic collection of memory. As is static analysis of data flow, fwiw.


In taxonomy, I say we have "automatic memory management" when we have some kind of `malloc()` type device and no need to `free()` it.

Garbage collection is about visiting live objects to figure out what's alive and ignoring the dead stuff. Every theory about why GC can work (or be useful) is predicated on the idea that there is more garbage than live objects so you're touching less stuff.

If all you do is mark/sweep, you're visiting the garbage as well, so GC research has led neatly into moving collectors (which don't ever touch the garbage). That's a win, and that's why GC is worth study.

There are other kinds of "automatic memory management" though: reference counting, arena allocators (at a stretch), and object memory (I've only seen this on mainframes), and they're all worth some attention.


> Garbage collection is about visiting live objects to figure out what's alive and ignoring the dead stuff

That's a description of tracing GC, not a definition of GC.

GC is whatever technology that enables what you've called automatic memory management: a kind of illusion that there is infinite memory, from the programmer's perspective.

That might involve some mix of reference counting (Python is probably the biggest example), tracing (what you seem to think is all of GC), data flow analysis (e.g. escape analysis in some Java GC / JIT integration), etc.


> GC is whatever technology that enables what you've called automatic memory management: a kind of illusion that there is infinite memory, from the programmer's perspective.

That's definitely not what McCarthy meant by it. I understand that's probably a common view nonetheless, but I can use sbrk() and the fact my computer has gobs of ram/swap, and I'm not going to call that garbage collection. I don't think this definition is useful.

If you re-read what I wrote now understanding what I mean when I say "garbage collection", perhaps we can talk about strategies for providing that infinite view of memory instead.


Actually, I think process termination is pretty useful as garbage collection. It's normally much more reliable for releasing resources, for one thing. And there's little sense in cleaning up RAM when you're just going to quit.


The kernel still deallocates those pages as an explicit [manual] result of a free-like call -- one that just so happens to be named exit().

To seriously tell people that every C program running on unix has automatic "garbage collection" built-in seems like it would create very confusing conversations.


GC has always meant tracing GC while reference counting (ever since McCarthy invented it) was considered another form of automatic memory management (like using arenas, doing escape analysis, etc...). Only recently in hacker news have they become all merged into the same word, for some reason.


> Only recently in hacker news have they become all merged

This is demonstrably not true. The Garbage Collection book by Jones is the key text in this area, and it covers reference counting under that term. It was published in 1996, before Hacker News existed.

There is also Bacon's Unified Theory of Garbage Collection, 2004, which said the two are on a spectrum that meets in the middle.

> all high-performance collectors (for example, deferred reference counting and generational collection) are in fact hybrids of tracing and reference counting


The updated Garbage Collection Handbook does as well.


That reason would be the academic books describing garbage collection algorithms; I can provide a couple of them.

Let's start with the Garbage Collection Handbook, chapter 5.


Reference counting is garbage collection, and it's not a very good algorithm.


Garbage collection also performs other performance-improving functions, like memory compaction.


> Automatic reference counting

> Rust's borrow checker/tracker makes ownership explicit

So no compaction, no pointer-bump allocations? You have to reimplement these manually, aka reinvent GC while having ref-counter/region-based-mm overhead?


In the ~15 years I spent building software in C++ I don't recall a single time that I wished for garbage collection. By using RAII techniques, it was always possible (nay, easy!) to write code that cleaned up after itself automatically. I always found it easier to reason about and debug programs because I knew when something was supposed to be freed, and in which thread. The only time that use of simple reference counted smart pointers did not work was when you had cyclic references: but cyclic object graphs caused other problems in any case - so I would always refactor my design to avoid creating them in the first place.

In contrast, in the ~10 years I spent working in Java, I frequently ran into problems with programs which needed an excessive amount of memory to run. I spent countless frustrating hours debugging and "tuning" the JVM to try to reduce the excessive memory footprint (never entirely successfully). And of course, as soon as your JVM needed to interact with native code (e.g. a quant library written in C++), you were even worse off, because there was no language support whatsoever for ensuring when, or even if, some native memory would be freed. It made it even harder to debug these leaks - because the "leak" was always very complicated to reproduce.

Garbage collection is an oversold hack - I concede there are probably some situations where it is useful - but it has never lived up to the claims people made about it, especially with respect to it supposedly increasing developer productivity.


While RAII is a very good technique, it is not the solution to all issues here. I've read that some concurrent data structure implementations would not be possible without a GC. Furthermore, when a project reaches a certain level of complexity, it ends up implementing GC anyway (see Unreal Engine).


I agree that RAII does not solve everything - in particular, issues with memory fragmentation. However, I prefer the direction being taken by Rust and suggested by the top level poster: rather than having to run a separate thread which tries to infer which resources are no longer needed, with potentially large runtime costs, instead augment the language semantics with a clear ownership model, and thereby give the compiler enough information to generate code that runs in a stable, predictable memory footprint.


That reminds me of the VLIW argument: we will make languages really expressive and compilers really smart, and then we can get rid of fancy CPU hardware that does instruction scheduling dynamically.

It doesn’t always work. A dynamic tracing GC (or reference counting or using arenas) can be more efficient in ways that static approaches probably won’t be able to effectively emulate. Rust is a great experiment for seeing how far we can go in this area, so we will see.


RAII has nothing to do with memory fragmentation.

Solving memory fragmentation requires moving objects around. Usually you get that via a moving (or compacting) garbage collector. But there's no reason RAII style resource allocation couldn't move things around.

And there are plenty of garbage collection strategies that don't move objects around.


I was simply stating that RAII does not solve the memory fragmentation issues. Many GC's do solve the memory fragmentation issue.


C++ code is full of use after free problems. You can avoid those with garbage collection.


If you have a use-after-free it means that you made a wrong assumption in your code about the lifetime of the object. A GC would hide the issue and potentially lessen the gravity of the bug but it doesn't resolve the core issue which is that you probably have a fundamental logical problem in your code.

I'm a bit of a "GC hater" so obviously I'm heavily biased, but to me it just means that GCs make it easier to write sloppy, poorly architected code where you have a mess of a data-dependency graph and nobody knows who owns what and when things need to be deleted, so you let the GC (hopefully) figure it out. In my experience this leads to all sorts of issues independently of memory management, such as having obsolete data lingering in certain parts of the program because it's not expunged properly.

IMO, GC or not, having clear ownership semantics is necessary to properly architect anything more complicated than a script. And if you have ownership semantics, then why do you need a GC anyway?


The nicest point about GC is that, for the vast majority of data, you no longer need to have a concept of ownership at all.

Ownership is not a fundamental architectural property - it is "only" a powerful technique for tracking data with a specific lifetime (of course, even in GC languages you still need proper ownership semantics for some pieces of data).


Ownership itself might not be, but lifetimes are and the two concepts are tightly related.

That's why Java and C# both had to introduce a way of (semi-)automatically closing resources; waiting for the GC was a catastrophe.


Common Lisp, CLU, Mesa/Cedar, Modula-3 and many other system languages with GC always supported those kind of mechanisms.

Java and C# did nothing new there, only catching up with what was already the state of the art before they were born.


That’s what I find ironic about C#. In some ways I worry more about and do more manual memory management than I did in C++.


> If you have a use-after-free it means that you made a wrong assumption in your code about the lifetime of the object. A GC would hide the issue and potentially lessen the gravity of the bug but it doesn't resolve the core issue which is that you probably have a fundamental logical problem in your code.

The basic assumption with a GC'ed language is that the lifetime ends once the object isn't referenced anymore. You rarely have to deal with it at all, so those "logic errors" happen much less frequently.

If we look at the major real-world languages without GC (C/C++), it is your job to figure out the lifetime. You will fuck this up unless you structure your program in a very disciplined way, and that's frankly too much to ask of the vast majority of programmers.


What if your domain problem (say, compilers, symbolic algorithms) doesn't have a clear notion of ownership because everything is a shared DAG that may survive for arbitrarily long? Not every problem has clear ownership imho (and tools in C or C++ in these domains resort to their own refcount/custom GC implementation anyway).


imo you solve these problems by segregating the ownership logic and relationship logic of the data structure.

In other words the DAG would never be traversed to allocate or free memory, each vertex would be held by a vector or other container and the edges represented by an adjacency list/matrix.

It's faster and safer than pointer chasing.
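As an illustration, a minimal index-based graph might look like this (a hypothetical sketch, roughly the shape of what petgraph does internally): the `Vec` is the sole owner of every node, and edges are plain indices with no ownership of their own.

```rust
// Nodes live in a Vec (the single owner); edges are plain indices, so there
// is no pointer chasing and no shared ownership to manage or free.
struct Graph<T> {
    nodes: Vec<T>,
    edges: Vec<Vec<usize>>, // adjacency list per node
}

impl<T> Graph<T> {
    fn new() -> Self {
        Graph { nodes: Vec::new(), edges: Vec::new() }
    }

    // Adding a node returns its index, which acts as a cheap, copyable handle.
    fn add_node(&mut self, value: T) -> usize {
        self.nodes.push(value);
        self.edges.push(Vec::new());
        self.nodes.len() - 1
    }

    fn add_edge(&mut self, from: usize, to: usize) {
        self.edges[from].push(to);
    }
}

fn main() {
    let mut g = Graph::new();
    let a = g.add_node("a");
    let b = g.add_node("b");
    g.add_edge(a, b);
    assert_eq!(g.edges[a], vec![b]);
    println!("ok");
}
```

Dropping the whole `Graph` frees everything at once, which is the "segregated ownership" point: lifetime is handled by the container, not by the graph's topology.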


> A GC would hide the issue and potentially lessen the gravity of the bug but it doesn't resolve the core issue which is that you probably have a fundamental logical problem in your code.

No, it solves the problem because resource lifetime is no longer a concern. A field advances as we decrease the number of irrelevant details we need to attend to.

Turns out, most of the time, memory is one such concern. You can argue that we need better approaches when memory is a concern, but that doesn't discount the value of GC as the right default.


Yes, but you can also avoid that and the cost of garbage collection by having an explicit ownership model akin to Rust.


> You can avoid those with garbage collection.

Until you need to close a socket or file or unlock a mutex or respond to an async response handler.

$(cat /opt/propaganda/rust/raii-in-rust.txt)


> C++ code is full of use after free problems.

Once upon a time, this statement may have been true, but it isn't any more. ASAN and Valgrind are widely available, and runtime test suites are much more prevalent than they used to be.


> But all of these are much cheaper than developer labour and reputation damage caused by leaky/crashy software. The economics make sense.

Of course, with traditional languages, that's the trade-off we're being asked to make. That's my point! We need to develop languages that accurately encapsulate lifetimes statically so that we can express that to the compiler. If we do, the compiler can just make instances disappear statically when we're done with them -- not dynamically! Like Rust but without the need for Rc and Arc boxes. Rust gets us 80% of the way there.

The truth is with most of the Rust I write, I don't have to worry about allocation and deallocation of objects, and it happens. It's almost ubiquitous. We need to extend that to 100%.

> Long live garbage collection!

Long live the rocket powered horse!


This only works if your language is strict. With a lazy language like Haskell, you can't statically infer (at least naively with respect to scoping rules) when something lives and dies.

Performing this statically is hard, and GHC tries to do it (Demand analysis).

But without a GC, these sorts of languages would be a no-go, as would logic-based languages like Prolog which need to GC their internal data structures.

While we now have the expertise for eliminating the GC from single threaded strict programs (multi threading in Rust is still quite complex, and you do see Rc<Box<..>> more often than not in these settings, which is essentially refcounting/GC), this _does not scale_ to all models of languages.


GHC infers these kinds of things all the time. It can't do so in general, but neither can you in a strict language.

GHC even has a strictness analyzer.


> If we do, the compiler can just make instances disappear statically when we're done with them -- not dynamically!

Not possible generally. It would be easy to create a situation where some kind of refcount is a necessary final fallback.

> The truth is with most of the Rust I write, I don't have to worry about allocation and deallocation of objects, and it happens.

If Rust is the answer, why are you pushing for new langs when Rust is sufficient? Something doesn't add up here.

> Long live the rocket powered horse!

Your response to someone giving their years of experience is that? Well, we're all convinced now.


> If Rust is the answer, why are you pushing for new langs when Rust is sufficient? Something doesn't add up here.

Maybe because he said, twice, that Rust is not all the way there?


You're quite right, thank you, upvoted.

However if he says "Rust gets us 80% of the way there" as you point out then he is obliged, IMO, to point out explicitly the 20% where it fails or we can't even begin to discuss it.


He mentions Rc and Arc which I assume is what he means by the 20%. I tend to agree, although I'm not sure how you would get rid of them.


>> If we do, the compiler can just make instances disappear statically when we're done with them -- not dynamically!

> Not possible generally. It would be easy to create a situation where some kind of refcount is a necessary final fallback.

Either refcount or a GC. But those situations might be rare in practice with a good system.

If you think about it, generational garbage collection implements lots of those heuristics. (But it's closer to a JIT optimizer, not a static optimizer.)


How do you see multi-threaded anything working without Arc? Some owners (threads) are created and destroyed dynamically. Are you thinking Arcs get added wherever necessary?


There are solutions for that, but it requires a more powerful type system / logic than Rusts: http://www0.cs.ucl.ac.uk/staff/p.ohearn/papers/permissions_p...


Or a multi-threaded app where plenty of dynamically created closures operates on persistent data structures? ;)


Oh to be clear I don't have an answer for this off-hand, I'm not that smart. However, if we'd given the 30 years of people developing faster for-loops the task of figuring out how to manage memory statically, I'd wager we'd not be having this conversation right now. It is by no means an easy problem, however.


I think there are essentially two ways of making the compiler statically aware of lifetimes: you can annotate the program with lifetimes, as in Rust, or you can have the compiler infer all of the lifetime information somehow.

Adding annotations requires more programmer effort and reduces the level of abstraction of your programming language. Rust takes this approach, but tries to keep the level of annotation somewhat reasonable, and provides the "unsafe" escape hatch for when you cannot convince the compiler that what you're doing makes sense. (Although I do believe that it would be possible for the Rust compiler to infer most/all lifetimes).

For instance, in a functional programming language, you are typically creating closures all over the place, like one of the other posters mentioned. It seems quite unlikely that there is any kind of reasonable lifetime annotation system that would capture all of the complex lifetime patterns that you would see in such a language.

One can say "Oh, but surely someone smarter than me must be able to come up with something that works for those situations", but I don't think that is a very constructive line of thought, since you could say that about pretty much everything. Rust is an example of a language where lifetimes work reasonably well, but Rust already makes some tradeoffs that reduce its level of abstraction to some degree. It doesn't provide the same level of abstraction that, say, Haskell does.

Program analysis would allow you to keep your level of abstraction and ideally require few or no annotations. However, since any non-trivial program property is undecidable, program analysis is fundamentally incomplete and it is again unlikely that it would be able to deal with all lifetime patterns that you see in practice.

Therefore, it seems to me that there is no free lunch here. Either you sacrifice abstraction by requiring annotations or you don't require any annotations but then you have an incomplete program analysis.

(Also, you seem to think 30 years of good research has been wasted on the "obviously bad idea" of garbage collection, but somehow none of these researchers were good enough to realize this and come up with the silver bullet that makes GC unnecessary? After all, people were already thinking about GC-less languages long before Rust was a thing)


   possible for the Rust compiler 
   to infer most/all lifetimes)
I'd be interested to learn why you think this is possible. As you point out yourself, by Rice's theorem, precise lifetimes are not statically decidable. Rust's lifetimes are essentially based on nesting (think well-balanced brackets) and tracked by enriching the standard Damas-Hindley-Milner approach with a notion of affine-ness, which is a weakening of Girard's linearity. Circular data structures immediately violate affine-ness. What mechanism do you propose to get around this conundrum?


"All" might not be possible, but every Rust release tends to see improvements in lifetime elision. I just went through a few Rust apps I wrote 6 months ago and was able to get rid of nearly every explicit lifetime annotation thanks to improvements.

So there's definitely room for improvement, but also good progress being made.


I think that for function definitions in Rust you would in principle be able to leave out lifetime annotations on arguments/return values in more cases than is currently allowed by lifetime elision. I believe they don't allow this since it can become very confusing if the lifetimes are not written out when there are multiple reference arguments. I'm not entirely sure about this, though.


I know this is possible because Rust already does this. Rust's types in closures can be fully inferred. Try replacing all Rust functions with closures. It goes surprisingly far.


I'm not sure this would be possible for a standalone library, compared to a self-contained program, either.


It's a little bit contradictory that you simultaneously believe that the people who worked on GC are much smarter than you, but at the same time they were wrong to choose to work on GC rather than your research program ;-) Not all problems can be solved by throwing smart people at it. There has been research along the lines you suggest. Look at the MLkit region inference research, or escape analysis research, for example. It doesn't work very well in practice. It can correctly infer like 95% of memory allocation/deallocation, but that remaining 5% leaks memory which means you need a GC anyway. It's still valuable research though. Go, for example, made the bet that they can use a low-pause low-throughput GC by reducing the pressure on the GC by doing escape analysis. Maybe that would work even better with the far more powerful region inference of the MLkit.


> Like Rust but without the need for Rc and Arc boxes. Rust gets us 80% of the way there.

I think you're qualifying Rc/Arc as kludges to get around the type system. They may end up being used like that sometimes but when well used they are actually encoding extra semantics, just like &Type and &mut Type.

A simple example is a cache get() method:

pub fn get(&self, key: &K) -> Option<Arc<V>>

https://docs.rs/multicache/0.5.0/multicache/struct.MultiCach...

This allows you to get and use a cache value across your program and depending on outside input the value can disappear first in the cache by eviction or in the consumer. So Arc<V> is already expressing to the compiler the exact semantics. It just so happens that the actual lifetime is runtime defined from outside input so it must be managed dynamically.

I don't see a general way around that without copying but maybe a lot more can be inferred automatically so you get similar ergonomics to a GC language by introducing Rc/Arc/etc automatically at compile time. I don't think that's a good fit for a system programming language like Rust though where you don't pay extra costs without asking for them explicitly. But maybe there's space for a higher level language that does all that magic and doesn't use a GC. Someone trademark Rustscript.
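To make the eviction scenario concrete, here's a stripped-down sketch (this `Cache` type and its methods are made up for the demo, not the real multicache API): the consumer's `Arc` clone keeps the value alive even after the cache evicts it.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// A minimal stand-in for a cache whose get() hands out shared ownership.
struct Cache {
    map: HashMap<String, Arc<String>>,
}

impl Cache {
    fn get(&self, key: &str) -> Option<Arc<String>> {
        self.map.get(key).cloned() // clone the Arc, bumping the refcount
    }

    fn evict(&mut self, key: &str) {
        self.map.remove(key); // cache drops its own reference
    }
}

fn main() {
    let mut cache = Cache { map: HashMap::new() };
    cache.map.insert("k".to_string(), Arc::new("value".to_string()));

    let held = cache.get("k").unwrap(); // consumer takes its own Arc
    cache.evict("k"); // eviction happens while the consumer still holds it

    assert!(cache.get("k").is_none());
    assert_eq!(*held, "value"); // the value lives until `held` is dropped
    println!("ok");
}
```

Whichever side drops last frees the value, which is exactly the runtime-defined lifetime the `Arc<V>` in the signature encodes.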


It isn't clear that what you're asking for is even theoretically possible, and it's far from obvious that the end result would be comparable to GC in terms of productivity.


In Rust, does the standard way of implementing a graph structure still use a vector of nodes and indices for edges?

I think I saw a Rust implementation of an arena. Is it working?


>In Rust, does the standard way of implementing a graph structure still use a vector of nodes and indices for edges?

You'd probably want to do this whenever you implement a graph in a performant language, it's faster and safer (depending on the problem). Implementing edges as vectors of pointers is an exercise in memory leaks and data races, imo.

You can see an implementation with adjacency lists here:

https://github.com/bluss/petgraph/blob/master/src/graph_impl...

>I think I saw a Rust implementation of an arena. Is it working?

https://docs.rs/typed-arena/1.4.1/typed_arena/


Anecdotally for me, I spent more time in my first 0.5 years of Java tuning GC and debugging GC issues than I spent chasing memory leaks in my previous 5 years of C++. The worst part is that you quickly learn to not introduce too many memory leaks, but as far as GC tuning goes the skill-set feels more like voodoo than science and is constantly obsoleted by GC code changes (or wholesale replacement).

I can see how this could be true in some cases and especially for throw-away code and when perf doesn't matter, but for anything at scale and/or with perf requirements, GC is terrible and wastes tons of developer time fighting it (debugging, tuning semi-opaque configs) and working around it (object pools, off-heap buffer managers, controlling object lifetime to avoid certain characteristics, controlling object size, etc.)


To steelman the idea a bit: giving more context for the GC is not the same as manual memory management.

Better static analysis of programs can precompute a lot of GC decisions.

Allocation on the stack is one example of that kind of optimisation. But you can imagine more sophisticated optimizations.


I think C++ could get most GC usability benefits if there was a better/simpler syntax for shared_ptr and unique_ptr. I have written code with extensive shared_ptr usage and it was quite pleasant if you ignore the ugly syntax. On the other hand even after years of C# I still miss deterministic destructors.


One downside of reference counting is the long delete delay you get when you remove the last reference to a large data structure. That's pretty easy to fix, but then you lose deterministic destructors.
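For illustration, that "easy fix" usually looks something like this sketch (`drop_in_background` is a made-up helper): hand the last reference to another thread so the cascade of frees happens off the hot path, at the cost of no longer knowing exactly when destruction finishes.

```rust
use std::thread;

// Move a value to a background thread and let it be dropped there.
// The caller returns immediately; destruction is no longer deterministic
// from the caller's point of view -- exactly the trade-off noted above.
fn drop_in_background<T: Send + 'static>(value: T) -> thread::JoinHandle<()> {
    thread::spawn(move || drop(value))
}

fn main() {
    // A moderately large structure standing in for "the last reference".
    let big: Vec<Box<[u8; 1024]>> = (0..10_000).map(|_| Box::new([0u8; 1024])).collect();
    let handle = drop_in_background(big);
    handle.join().unwrap(); // joined here only so the demo can observe completion
    println!("ok");
}
```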


That would have to be a really large structure to get a noticeable delay :-)


My first 8-9 years was full of very competitive, bleeding edge C++ and I can tell for sure that your productivity increase was not caused by garbage collection, but by using a more orthogonal language. I feel it in delphi the same way I feel it in go and python.


> But all of these are much cheaper than developer labour and reputation damage caused by leaky/crashy software.

Exactly! And that is the biggest advantage of not relying on a GC: it makes awkward usage easier to detect and requires more thought upfront (but no more than what is actually needed).

This results in fewer bugs and easier maintenance.


C++ is not bad. I think it sits in the middle between C and GC languages. I actually like what C++ offers. Languages with GC are actually easy to debug. I remember once I had my own memory management code, and suddenly a write beyond a boundary was not segfaulting any more; it took me more time to figure out the problem.


Ridiculous.

The problem is not that you don't know when/where the lifetime will end — that can usually be characterized by a terse "English" description. The problem is that this lifetime is dynamic in nature.

The end of the lifetime of an object may coincide with some user input, for instance.

At this point, either you go back to manual management, with the potential for errors (and for what it's worth, I think manual management is fine in many cases) ... or .... or what, exactly?

Even Rust doesn't solve that problem at all. If your objects can survive outside of a "linear" execution, which is the case in interactive or multi-threaded programs, you're falling back on reference counting — garbage collection.

No one has ever devised a scheme that lets you specify very dynamic lifetimes AND statically checks that no leak can occur.

I'd stake a lot on that being impossible in general, and the best we can hope for being an AI that searches for a way to insert the manual memory management calls (mostly, "free" and a minimal set of reference counts) assorted with a proof that this change will not cause any leaks or invalid accesses. We're a long way off that yet.


I don't see your point. Of course sometimes the lifetime of an object is not tied to code scope but actually to something dynamic. Let's say for instance when you close a tab in your browser you expect the resources to be freed (ignoring caching to simplify the argument).

Clearly somewhere in your code you have to explicitly handle tab closing and break the references to allow the GC to do its job. Why not free the resources here while you're at it?

When you say "manual memory management" if you're thinking C-style malloc-free then you have a point, it's very easy to forget a free() somewhere. But any language with destructors can handle these situations without much more overhead than GC-based approaches. Just remove your object from whatever data structure was holding it and let the language automagically run the destructor for you. I find RAII a lot easier to model and think about than "you drop this reference and one day maybe your object is destroyed, but don't really rely on that".

>No one has ever devised a scheme that lets you specify very dynamic lifetimes AND statically checks that no leak can occur.

GC doesn't solve that either. If you're not careful and let a reference to old data dangle somewhere in your code you can effectively "leak" data just fine. When there's no clear ownership and nobody knows who's supposed to clean up what, it's very easy to end up with data dangling everywhere through complex ownership graphs. GC may mitigate some use-after-free problems (generally by masking the problem), but those can be solved without GC as well.

I'm not saying that applications with (justifiably) extremely complicated ownership semantics that also require very high performance couldn't benefit from using a GC, which might even outperform manual memory management by, for instance, batching object deletions. What I'm saying is that such applications are few and far between; basically 100% of the code I've written in my ~15 years as a professional software developer didn't have such requirements.

GCs should be an opt-in niche tool used to solve specific problems, not a kludge to let developers write sloppy code and get away with it.


> Why not free the resources here while you're at it?

Because you do not have an exclusive reference to all of them; we could be freeing something that is still in use somewhere.

So why not reference-count? Because reference counting is slow, and must be meticulously maintained (here, languages help with constructs like smart pointers), and doesn't handle cycles in the object structure.

> I find RAII a lot easier to model and think about than "you drop this reference and one day maybe your object is destroyed, but don't really rely on that".

The RAII model doesn't do away with this "maybe" in any shape or form. We drop this reference, and the smart pointer decrements the refcount; maybe it has hit zero so the object is destroyed, maybe not. Maybe the zero refcount triggers a whole cascade of thousands of other objects hitting their zero refcount, which takes many cycles to execute, or maybe it doesn't.


RAII has nothing to do with reference counting, it stands for resource acquisition is initialization. In C++ that means the lifetime of a resource is tied to the creation and destruction of an object. Often when people talk about RAII and C++ they are also referring to destructors being called when an object goes out of scope, so if an object that contains an object containing some allocated resource goes out of scope, the resource is released as the destructor path is called.

If an object with a smart pointer goes out of scope, then of course there's the overhead of the smart pointer, but the previous comment wasn't referring to that situation


RAII is commonly used for representing smart pointers that manage references to objects that themselves have otherwise pervasive lifetimes.

RAII not managing any shared resources is trivial and uninteresting. The lifecycle of an object that is confined to a single lexical scope can be handled manually and verified by visual inspection (though of course it is more reliable to automate that with RAII or macros, not to mention less verbose).


> Let's say for instance when you close a tab in your browser you expect the resources to be freed

That's basically region-based memory management. Which is a great idea, but unfortunately only applicable in narrow scenarios (another example besides browser tabs is user requests in a HTTP server).

The real issue, however, is what happens if you can't keep all of the current task's data in RAM. E.g. browser tabs still have GC for JavaScript... Going a bit meta, you can easily think of a hierarchy of such "regions" (e.g. computer > process > browser tab), each self-contained in the sense that when you turn it off/terminate it/close it, all its resources are reclaimed. But the real issue is when they need more resources than you have... Then you're back to needing GC.

I agree with your second point; GC doesn't solve the problem of reference leaks... Whereas regions mostly do! So maybe more language support for regions (finer-grained than "whole process" - think Erlang) would be useful (each region can still of course optionally have its own GC).


> Clearly somewhere in your code you have to explicitly handle tab closing and break the references to allow the GC to do its job. Why not free the resources here while you're at it?

You are missing the point. The main point behind GC is removing the complexity of writing the software. Writing your own destructors, thinking about when to free your memory, writing lifetime annotations, needing to design your app in a specific way to satisfy the borrow checker, etc. in languages like Ada/C/C++/Rust are all additional complexity and mental overhead that is removed by GC. Your point of "just do this X and you don't need to use GC" is a reason people use GC languages: because they don't want to think about or write this "X".


My point is that you need a "destructor" even with a GC. Continuing with my example you need to pop your tab from the data structure containing your tabs and you have to make sure that all references are dropped so that the GC will do its work.

When I write Rust code I basically never have to explicitly free anything outside of FFI code dealing with C pointer or the like. The most straightforward way to allocate anything in Rust is to reify a Box which frees its contents when it's dropped. The data is freed when I remove it from containers or when a Box gets dropped. The closest I get to garbage collection is using a reference-counted pointer here and there (and only after careful consideration).

I really don't see what complexity the GC removes in this situation, besides letting you write code without having to care about ownership which, as I stated earlier, I consider to be an antipattern.

>"just do this X and you don't need to use GC" is a reason people use GC languages, because they don't want to think about or write this "X"

A lazy programmer is a good programmer, but it's also important to recognize when something can't be pushed under the rug. I don't want to think about or write documentation and unit tests either, yet here we are. If I had to maintain an application and asked the coder "when exactly are these objects deleted?" and they answered "dunno lol, I let the GC figure it out", I'd be extremely worried.


> it's also important to recognize when something can't be pushed under the rug.

If you haven't heard of it, you might be interested in the Waterbed Theory of Complexity[1]. The idea behind it is that it's not that you can't push down complexity in an area, but often that complexity just pops up in a different area. For example, you can do away with the vast majority of allocating and freeing memory, but the complexity of doing that pops up again when you start having to tune your GC or use very special patterns to avoid/minimize GC freezes.

That's not to say there's not a benefit to that sometimes. If you can push the complexity away from the majority of use cases to a few special use cases, that might be a net win. It doesn't always feel like that though if you're the person stuck trying to coax that last few percent of performance optimizations out of a runtime that only gives you so many levers and dials to work with.

I think some of the same concerns that make a dynamic/scripting language a poor choice compared to a compiled and static one are the same things that might make a GC language a poor choice compared to a manually memory managed one, and it's just a matter of scale that separates them. Is performance important? A scripting language might be a poor choice. Is performance really important? Then a GC language might be a poor choice. On the other hand, if I'm writing some program to run correctly once or twice ever, and it might actually take more time to write than it will ever run for (e.g. a script to parse and/or transform data on disk for me to fix a problem), then is anything really gained from manual memory management? Writing that in Perl or Bash is a valid option there, and that's about as far away from memory management as you can get. For portions of my career, I've spent half or more of my time writing programs like that to help with sysadmin tasks.

1: https://en.wikipedia.org/wiki/Waterbed_theory


The Waterbed theory is stupid, because a given requirement can be handled in one place in the system, or it can proliferate into every module, giving rise to more complexity to achieve the same functionality.

So that is to say, when we push down a "bump" of complexity in one place, seventeen bumps at least as large can crop up elsewhere. Moreover, that seventeen is just today; we might have paved the way for more such bumps now being necessary going forward as functionality is added, whereas if we kept the original bump, it would stay the same. Water is not a good model for complexity because it is incompressible for all practical purposes; if we push down a liter of water here, the places where it pops up add up to just a liter.

For example, virtual memory is considerably simpler than individual modules of a program each implementing their own scheme for swapping code and data to disk and back.


> when we push down a "bump" of complexity in one place, seventeen bumps at least as large can crop up elsewhere.

I think you've taken the metaphor about a waterbed and design advice a bit too literally. :)

When you push complexity down in one area and it pops up in another, there's no guarantee that it's of the same overall magnitude, as it would be in a real waterbed. Sometimes it's larger, and sometimes it's smaller.

The same idea applies to whether we should all write assembly or use a higher level language to accomplish our computing needs. We push down the complexity of knowing exactly what the processor is doing at any one moment for an easier-to-manage overall view of what we want to accomplish. The net effect is a lessening of complexity for most people, but for those that want or need to know exactly what is going on, there is more complexity. Still, overall I think the net effect is less complexity for most people, vastly outweighing the time and effort a few put into figuring out why something isn't as expected.

On the other end of the spectrum, using a dynamically typed scripting language might save people a lot of time initially, or when prototyping, or on smaller programs, but as those programs become larger, the lack of constraints (which provide a small but immediate complexity hurdle) can build up and overwhelm a large project that might otherwise have had an easier time if some more constraints had been required along the way.

No one tool or paradigm is perfect for every task, and we should not expect one to be. We don't usually commute in dump trucks, and we don't usually haul lots of debris in compact cars. I see no reason why it should be any different for programming languages. Sometimes all you need is a 10 line bash script, and given that information, I doubt there's a huge level of complexity that's going to rear its head later because I didn't write it in some lower level language.


The waterbed theory in fact explicitly states that the amount of complexity is preserved no matter how it is distributed. The Wikipedia definition of it explicitly talks about how the volume of water remains constant due to its incompressibility.

If that isn't taken literally, it informs of nothing we don't already suspect. It tells us that for each functional requirement, there is at least one piece of code in the system somewhere. If we remove these lines of code from one place, or places, something has to be added elsewhere (possibly way more lines, or way fewer) so that the requirement remains implemented.


> If the waterbed metaphor isn't taken literally it informs of nothing we don't already suspect.

It informs you of nothing you didn't already suspect. I think it's useful to remind people of something that's fairly obvious and self-explanatory if you think about it, but not everyone does, and having an easy and slightly odd metaphor to point to helps the point get across. The fact that you think it's obvious is actually one of its major benefits: you can immediately start exploring the consequences of it instead of spending time proving it. Honestly, of all the criticisms to level at a metaphor, "too obvious" isn't one I would consider worth bringing up. Obvious metaphors are used to good effect all the time.

Edit to address your prefixed paragraph:

> The waterbed theory in fact explicitly states that the amount of complexity is preserved no matter how it is distributed.

If you read carefully, it says that the minimum level of complexity may not be reduced. Who is to say the minimum level of complexity has already been reached? Or that pushing down a level above the minimum in some area won't cause only the minimum level to rise elsewhere (or vice-versa)?


I like the waterbed theory, thanks, I hadn't heard of it before.

You're absolutely right, in the GC context you:

(1) Have to manually reference count / manage external resources like sockets and files because you can't count on the destructor to ever run.

(2) Lose determinism and then have to spend huge amounts of effort tweaking GC parameters to fix your issue, but to your point, it becomes a Jenga tower where you touch one part and the whole thing falls down again.

(3) Can easily still leak huge portions of your object graph by, for instance, throwing them into long-lived dictionaries.


In what memory management paradigm can you free the object in a long-lived dictionary without removing it from that dictionary, and do you advocate this paradigm?


> mental overhead that is removed by GC

and it is replaced by GC when your program behaves unexpectedly because, whoopsie, that's the one time the GC decided to run. Good luck debugging or even reproducing that.


>that's the one time the GC decided to run. Good luck debugging or even reproducing that.

GC usually runs on allocations, so it's quite predictable. The amount of time spent on collection is pretty much as indeterminate as the time spent on malloc/free.


When will a correctly-implemented GC ever alter your program's functional correctness, aside from timing-related bugs that aren't fixed by manual memory management either?


Just run your program normally and use eBPF/bpftrace with something like a flamegraph to track your system and its GC. This isn't an easy problem by any stretch of the imagination, however, it is a solved problem.


> The main point behind GC is removing the complexity of writing the software

And more importantly, reading it.


Having spent years writing code in both GC’ed and non-GC’ed languages, it’s pretty clear that programmers spend about the same amount of time worrying about memory management with either paradigm.

The main differences are that people using GC systems spend their time doing magical incantations to manage the GC wizard, and they’re a lot more smug about the goodness of their system.


>programmers spend about the same amount of time worrying about memory management with either paradigm

I have no idea how you've come to that conclusion. For the vast majority of memory allocation in any GC'd language I can think of, you spend zero time thinking about memory management 98% of the time.

whereas in a manual language, you must consider it every time you allocate memory. That's not a bad thing, it's often dead simple in the vast majority of contexts, but still.

I've also very rarely seen code in professional codebases that's meant to poke or prod the GC into running.


>For the vast majority of memory allocation in any GC'd language I can think of, you spend zero time thinking about memory management 98% of the time.

>whereas in a manual language, you must consider it every time you allocate memory.

This is like bizarro world to me. I genuinely can't comprehend that.

Here's how memory management looks like to me, using concrete examples in languages featuring RAII like C++ or Rust:

- I have a function that needs to compute a hash so I need to allocate a hash context for it. If it's small I'll put it on the stack, otherwise it'll go on the heap. In either case when the function returns the destructor is automatically called.

- I have a function that takes the name of a file as a parameter and returns its contents in a buffer. Clearly the buffer needs to be allocated on the heap, so I allocate a large enough vector or string, read into it, and return that. The caller will then either use it and drop it immediately or store it somewhere to be dropped later, either on its own or as part of the structure it belongs to. In either case the destructor will be called automatically and the memory will be freed.

That's easily 99.9% of what memory management looks like in the programs that I write. I have absolutely no idea what using a GC would change at any point here. Note that while this is technically "manual" memory management I never have to explicitly free anything, and in the vast majority of cases I don't even have to bother implementing the destructor myself (I basically only need to implement one if I need to release external resources like, for instance, raw OpenGL handles, file descriptors and the like; the GC wouldn't help there either).

It's genuinely a non-issue as far as I'm concerned. I literally never think "uh, I have no idea when this object won't be used anymore, I wish I had a GC to figure it out". I can't even imagine when such a scenario would crop up.


Here's your example with GC:

- I have a function that needs to compute a hash so I need to allocate a hash context for it. At some point after the function has returned the destructor is automatically called.

- I have a function that takes the name of a file as a parameter and returns its contents in a buffer. I allocate a large enough vector or string, read into it, and return that. The caller will then either use it or store it somewhere. In either case the destructor will be called automatically and the memory will be freed.

See how it's simpler?


>The caller will then either use it and drop it immediately or store it somewhere to be dropped later, either on its own or as part of the structure it belongs to.

You're glossing over a lot of complexity there. That's the entire point of GC, you don't have to think about when it's dropped.

Also, I wasn't arguing for the blanket necessity of GC in all contexts and for all problems.


Can you point out a specific example then? I'm genuinely not trying to play dumb. I very honestly not see that complexity that should be obvious to me given that I spend most of my time writing code in non-GC languages.

Maybe it's just Stockholm syndrome and I'm so used to working that way that I don't even see the problem anymore, but there is now a long string of replies talking abstractly about the overhead of programming in a non-GC environment and I simply don't relate at all. It's a complete non-issue as far as I'm concerned.


>store it somewhere to be dropped later, either on its own or as part of the structure it belongs to.

Wherever you're storing the reference, you have to eventually free the memory. That propagates memory-management code throughout your codebase, and is error-prone.

I totally agree that if you're allocating on the stack it's a non-issue. If you're returning a reference and always freeing in the caller, that's pretty easy too, but you have to have that logic in every single caller. If you're storing a reference on the heap somewhere, you suddenly have dynamic lifetimes with indeterminate reference counts, and it gets quite hard to 1) correctly manage memory and 2) verify that you're correctly managing memory.

The alternative is that in all three instances in a GC language, I don't have to care about memory management. Barring unusual edge cases, I can't forget to deallocate something and end up with a memory leak.


”If you're storing a reference on the heap somewhere, you suddenly have dynamic lifetimes with indeterminate reference counts, and it gets quite hard to 1) correctly manage memory and 2) verify that you're correctly managing memory.”

It is axiomatic that GC programs leak memory. I’ve never encountered one that didn’t. The difference is that it’s harder to see, and most programmers who have grown up with GC systems don’t know how to fix it.

”The alternative is that in all three instances in a GC language, I don't have to care about memory management.”

Again, no. The alternative is thar you aren’t paying attention, and don’t see the leaks until they become fatal. Just like cockroaches, you have them, somewhere.


For 95% of the code you write it's not a concern; that last 5% sure can eat up time if your software has any concern about performance.


Totally agree. I don't think that tradeoff is particularly out-balanced by the time it takes to do manual memory management.


”I have no idea how you've come to that conclusion.”

30 years of experience.

”For the vast majority of memory allocation in any GC'd language I can think of, you spend zero time thinking about memory management 98% of the time.”

No, you think about 98% of the code 0% of the time. The remaining 2% ends up causing so many problems that your amortized time spent on memory management is the same.


> GCs should be an opt-in niche tool used to solve specific problems.

That is true. Sciter (https://sciter.com) contains implementation of DOM(tree), CSS and script.

DOM is a regular tree structure - each element of the tree has strictly one parent and no cycles on the tree in principle. It does not need GC at all and is not using it for DOM tree management.

CSS is a collection of collections of name/value pairs. No loops. It does not need GC at all.

And only the script uses GC, for a simple reason: in UI code, the ownership graph of script objects frequently contains cycles, yet the lifespan of those objects is unknown at compile time; it depends on user interactions.

Even a purely native GUI application benefits from GC when that GC is used only in the places where it is needed. GC is just one way (of many) of doing memory management.

Ideally there would be a multi-paradigm programming language and environment that allows both explicit memory management (and so RAII, memory pools, etc.) and GC:

    int *ptr = …;
    gcable int* gptr = …;
 
But unfortunately such dualism is hard to achieve.


This is very hard to achieve because GC + RAII do not play nice together. It's hard to reason about whether an RAII resource has been cleaned up since managed objects have non-deterministic destruction. This non-determinism is unsuitable for common RAII patterns like scoped mutex locks. Now, the question being begged is "Can this be avoided by not using destructors with managed objects?" Yes, but not reasonably. Any unmanaged objects that are owned in the entire managed object graph now also have non-deterministic destruction. Thus, the only way to reasonably follow this constraint is to have the compiler disallow/warn when an object with a non-trivial destructor (explicit or implicit) is used with GC. If that wasn't enough of a pain, any runtime-polymorphic types may also need to be destructed and these can't be known at compile time if dynamic linking is used. In essence, these concepts are incompatible to a point where any language supporting both will likely make extensive trade-offs.

I find this really sad because many developers may never be exposed to how elegant RAII can be if they are never exposed to non-memory-managed languages. Sure it can be approximated using `with` constructs, but those don't allow resource allocations to cross scope boundaries.


Rather, GC should be default, and you opt-out of GC for specific optimization problems.


> When you say "manual memory management" if you're thinking C-style malloc-free then you have a point, it's very easy to forget a free() somewhere. But any language with destructors can handle these situations without much more overhead than GC-based approaches.

What happens when that object is shared and still referenced from somewhere else? Now you have a use after free error waiting to happen.


That's what smart pointers/ref counting is for


C++ is the flagship language for smart pointers and UAF memory corruption flaws are endemic to it.



Doesn't seem related though?

Unless you want to pick on the specific implementation?


That's literally GC.


I mostly agree with your point on RAII.

But I don't think OP was advocating RAII.

The typical argument for GC is that it is less error-prone than even RAII. And I agree with that point. It's a trade-off on which you have to decide on. I think the tiny perf penalty is often worth it.

You can still retain stale data under a GC, but real leaks (unreachable allocated memory) are precluded — which is usually considered (not unreasonably) to be the more common and serious problem.

I think GC is a sane default because it's less error prone, and god knows the average developer will make enough errors as it is. I'd trust myself with manual memory management, but not a random group of developers I don't know. What about you?


You can of course introduce a space leak in a honest GC environment. Make a linked list, and keep adding items to it while keeping its head referenced. Of course you do it by mistake: you kept two references, the main and the auxiliary, and the main reference is duly gone.

This is why weak references were invented.


>I find RAII a lot easier to model and think about than "you drop this reference and one day maybe your object is destroyed, but don't really rely on that"

Heh, I just realized this is what we depend on to remove commits from Github. "Uh, just delete any branches that contain it, and, uh, it'll get removed ... like, eventually." (of course you have to assume any secret data is compromised)


This is fundamental to git's model; see `git help gc`.


I know I can clean up locally, I meant deleting the commits on the github server/repo.


If the lifetime can be expressed with a state machine then it needn't be garbage collected so long as the language is sufficiently expressive.

Reference counting isn't garbage collection, as most people consider it. There is no need to sweep or otherwise traverse memory to discover objects that are not referenced.

Claiming that ref counting is gc is a bit like claiming malloc/free is gc.


I've always seen automatic reference counting classified as GC.

e.g. https://users.cecs.anu.edu.au/~steveb/pubs/papers/rc-ismm-20...

> Reference counting and tracing are the two fundamental approaches that have underpinned garbage collection since 1960.

https://www.cs.virginia.edu/~cs415/reading/bacon-garbage.pdf

> Tracing and reference counting are uniformly viewed as being fundamentally different approaches to garbage collection that possess very distinct performance properties.

This last one is great if you want to understand GC tradeoffs btw, highly recommend it.

---

And as the first paper implies, refcounting is often slower (because it trashes caches). The issue is having to propagate the diminution of refcounts amongst a graph of references. This also creates de facto "pauses", much like tracing GC (although arguably more predictable!).

And this "propagation of diminished refcount" is very much a memory traversal — (probably) smaller in scope than (non-incremental) tracing, but also much less local.


Interesting! I'll be reading that today.


"If the lifetime can be expressed with a state machine then it needn't be garbage collected so long as the language is sufficiently expressive."

One programmer here told me he does that in Ada and/or SPARK Ada using enumerated types encoding states. Now there's two of you. Most stuff I see like this are linear/affine/dependent types or separation logic. I rarely see people talking about simple solutions like this. Do you have any links that go in-depth on it with examples? It would be good to have something to pass along to people interested in this.


Any lifetime that depends on an IO loop or user input cannot be determined within the loop. Period. Even if you have an ASIC designed for garbage collection. It cannot even be determined dynamically.

Basically in these cases you have to extend the lifetime to beyond the point of indeterminism into the point of determinism or in other words move the life time outside of the loop.

Worst case scenario you move the lifetime to right before termination of the program.


I think that garbage collection has proven to be the best idea in programming languages in the past decades, and probably the only one that has made an impact at all. It's getting so good these days that few are willing to consider "moving on." In terms of throughput, it beats manual memory management (in fact, in principle, the cost of GC can be made arbitrarily low; a famous Andrew Appel paper shows how it can be cheaper than a stack), and in terms of latency we now have GCs with worst-case pauses of ~1ms on terabytes of heap, with an acceptable impact on throughput.

> Think about it, instead of finding a way of expressing when we're done with instances, we have a giant for loop that iterates over all of memory over and over and over to guess when we're done with things. What a mess!

It would be a mess if that's how modern GCs worked, but they don't. The problem with "expressing when we're done with instances" is that it's just wasteful, especially when concurrency is involved. This is one area of computing where the computer can offload a human task, and do so quite efficiently.


Good throughput with GC seems to come with significant extra memory use:

"We compare explicit memory management to both copying and non-copying garbage collectors across a range of benchmarks using the oracular memory manager, and present real (non-simulated) runs that lend further validity to our results. These results quantify the time-space tradeoff of garbage collection: with five times as much memory, an Appel-style generational collector with a non-copying mature space matches the performance of reachability-based explicit memory management. With only three times as much memory, the collector runs on average 17% slower than explicit memory management. However, with only twice as much memory, garbage collection degrades performance by nearly 70%."

https://people.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf

So you're still paying for it, just in a different dimension.


> Good throughput with GC seems to come with significant extra memory use

Of course. RAM is the price we pay for GC. Good thing it's so cheap on servers.

BTW, just note that the paper doesn't compare GC (never mind that the algorithms have much improved since then) to real explicit memory management, but to an oracle (i.e. explicit memory management by an all-knowing God). Paying nothing other than for 3x RAM to get performance that's 17% worse than what God could achieve is a bargain, and why GC is considered one of the greatest success stories of modern computing.


Memory you are wasting for GC is Memory you are not using, for example, for caching which directly impacts performance.

Mind you, I think GC is great, but saying that RAM is free is misguided.


FWIW, one mitigating factor is that GC isn't the only approach that wastes memory. You can also end up with memory waste in manual memory management. Due to heap fragmentation, for example, or retaining objects for longer than necessary due to programmer caution or error. I don't (personally) dare estimate their relative sizes, because I'm guessing a huge, perhaps dominant, share of differences in memory usage across languages is driven by features other than GC. Dynamic typing, for example.


I didn't say it's free, only that it's relatively cheap. That RAM can simply be used for caching is incorrect, though. There are very-non-neglible costs to maintaining caches in distributed systems.


It’s not clear what you mean by a distributed system in this context, but RAM is used for caching, costs or no.


My point is that you can't increase the cache size indefinitely to get performance improvements, because you also have to handle cache invalidation. There is an optimal working set.


Using RAM for "caching" sounds extremely wasteful. There could be applications that want to use it for something more useful than a 1% performance improvement.


Caching is way more impactful than a 1% speed up. It is hard to know exactly, because it is impossible to disable on most modern operating systems, but I wouldn’t be surprised if it were an order of magnitude improvement in some situations.

By the way, caching doesn’t prevent RAM for being used by applications. If an application wants more memory, the os can always just evict some of the cache.


This is not only a dated paper (a decade and a half old), but it relies on assumptions that do not necessarily hold in practice.

If you test actual allocation performance on actual current day hardware, you may end up with completely different results, e.g.:

https://github.com/rbehrends/btree-alloc


That's a very fresh repo. I wonder how GHC would fare. I guess I should contribute.


Thanks, but I do not plan to extend this project to other languages. I put it together a while ago as an illustration that conventional wisdom regarding GC cost is not what many people think.

If I were to expand it, I'd look at other allocation patterns rather than more languages; I've already got a fairly good cross-section of garbage collectors and malloc() implementations, so I don't really need any more.


Fair enough. Well, too late, I just hacked together a Haskell version based on the OCaml version. It's naive code, but 16 times faster (0.235s vs 3.818s).

So I expect some GHC-specific optimizations are kicking in. E.g. GHC is probably smart enough never to construct the whole tree at once. Whether that counts as the kind of static analysis of object lifetimes someone else in this thread talked about, I don't know.

Update: I think it's just common subexpression elimination kicking in..


> I think that garbage collection has proven to be the best idea in programming languages in the past decades, and probably the only one that has made an impact at all.

Depends on how you set your thresholds. Paying more attention to side-effects / purity / const has been pretty useful, too. And so have mainstream languages adopting first class functions. And making it easy to compose data structures. (Compare eg how Python allows you to return essentially two values via a tuple, and how cumbersome the same task is in Java; so people are tempted to use side-effects instead.)


> Paying more attention to side-effects / purity / const has been pretty useful, too...

I don't know. Studies haven't been able to find any big effects, and the industry isn't seeing them, either. You may like those things, but unlike GCs, it doesn't seem like they've made a real impact.


Interesting. Do you have links to these studies? Thanks!


If anyone is interested in reading the paper, it’s available at this FTP server: ftp://ftp.cs.princeton.edu/reports/1986/045.pdf


I tend to take the position that arguments about the merits of GCs are proxies for a different issue at a higher level of abstraction that is rarely tackled directly. One of the correctly argued advantages of GC is that it automatically handles some difficult edge cases like ambiguous lifetimes and cyclical references. In my opinion, there is an interesting discussion to be had around why these edge cases need to be solved in real software systems.

At least in my experience, the cases that GCs solve uniquely well are a code/architecture smell. Every time I've written some complex code that creates the edge cases GCs solve well, it has been a red flag for poor design. I've never run into one of these cases where there was not a different, better design that made these edge cases go away in a sensible manner. In a sense, GCs primarily help developers by allowing them to ignore issues of poor design (which may not be their fault) in order to get things done. While there are obvious merits to the "just get things done" view, the reality is that the sloppy design that leads to these kinds of situations tend to be strongly correlated with other software problems that a GC won't paper over.

Programming languages strive to enable and encourage good design with their feature set. There is an interesting philosophical debate about language features that exist almost solely to mitigate the effects of poor design, which IMHO is where GCs fit. Does availability lead to enablement in practice?


This hits on something I've been wondering about lately:

For those of us who might want some of the performance characteristics of Rust, but not to the extent that we're willing to take on all of the strictures it imposes, what's the feasibility of building a compiler that can avoid GC for objects whose lifetime can be determined statically, while still retaining GC for the rest?

My (wild) guess is that it wouldn't actually gain you any real practical advantage over a good generational GC, but still, I'm curious if anyone has looked into something like that.


Compilers of managed languages already do this optimization when they can prove that a variable has a static lifetime. Look up scape analysis.

The trouble is, this is hard to do for non-trivial code (and in the general case is uncomputable).


I'm not finding how to edit, but it's called escape analysis


I think this was the promise of D. For some reason it's not very popular.


Knowing when you’ll be finished with an object solves the issue of knowing which objects to free but not the issue of heap fragmentation.

The main reason that eg C programs don’t suffer so much from heap fragmentation is that writing a C program which uses ephemeral objects in the way a GC’d language might is so incredibly painful that one mostly avoids the convenient programming techniques which lead to fragmentation.

The usual solution in a language like C (or a safe language like Rust) for data where it is hard to know when it should be freed is reference counting, yet reference counting is typically slower than an actual GC (though this needs a quite advanced GC), especially on a multithreaded system, where reference counting needs atomics. This leads some C programs to use GC when allocating and tracking these objects, because it is cheaper than the overhead of malloc and reference counting.
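
In Rust terms, the split mentioned above is `Rc` (plain counter bumps, single-threaded) versus `Arc` (atomic counter bumps, required for cross-thread sharing). A minimal sketch, just to make the cost model concrete:

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

fn main() {
    // Single-threaded: Rc increments a plain integer count, no atomics.
    let shared = Rc::new(vec![1, 2, 3]);
    let alias = Rc::clone(&shared);
    assert_eq!(Rc::strong_count(&shared), 2);
    drop(alias); // count drops back to 1; the Vec is freed when it hits 0
    assert_eq!(Rc::strong_count(&shared), 1);

    // Cross-thread: Arc must use atomic increments -- the extra
    // cache-coherency traffic the comment above refers to.
    let shared = Arc::new(vec![1, 2, 3]);
    let alias = Arc::clone(&shared);
    let handle = thread::spawn(move || alias.len());
    assert_eq!(handle.join().unwrap(), 3);
}
```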


E.g. Redis is a well-known example that has to battle the fragmentation problem:

https://github.com/antirez/redis/pull/3720


Fragmentation is solvable. Mesh can compact C programs using malloc/free without any source changes.

https://arxiv.org/abs/1902.04738


Garbage collection is godsend when it comes to concurrency algorithms.

Determining liveness is hard on its own; ABA issues are totally eliminated in GC-enabled setups.

As for energy costs, I'd bet generation/copy collectors are cheaper than malloc/free. They are way cheaper than ref. counting which requires deeper pipelines + branch predictor extra load; ref. counting with concurrency requires atomics, cache coherency traffic, flushing of the write queue and memory barriers.


> ABA issues are totally eliminated by GC enabled setups

Doesn't the same approach work with a thread-safe reference-counting approach to memory-management?

The big difference between tracing GC, and reference-counting, is the ability to handle reference-cycles. Is that relevant here?


> Determine liveness is hard on its own, ABA issues are totally eliminated by GC enabled setups.

FWIW, Herlihy has examples of ABA issues on algorithms implemented in Java.


I wonder what that would be... It can happen with primitives only, and the usual solution is: increment only, and use prepare/publish with two distinct counters (for circular buffers and the like).

Do you have any references?


Herlihy, The Art of Multiprocessor Programming. I would quote chapter and verse, but somehow the book has disappeared from my desk.


I tried to have a quick look: 10.6 Memory Reclamation and the ABA Problem

"There are several reasons we might want to do this. Languages such as C or C++ do not provide garbage collection. Even if garbage collection is available, it is often more efficient for a class to do its own memory management, particularly if it creates and releases many small objects. Finally, if the garbage collection process is not lock-free, we might want to supply our own lock-free memory reclamation."

There is =absolutely= no reason to attempt to recycle nodes (for queues/stacks) in Java, esp. 'small objects'. It'd perform worse than pretty much any GC. The solution of using "AtomicStampedReference" is a weak one, as it does not enjoy JVM intrinsics. It's an example of course [and AtomicStampedReference has its uses], but the better option is just to keep creating new objects and let them trivially die in the young gen.

Again, ABA does not exist in Java unless you attempt to reuse objects (pooling direct memory would be the closest case; direct memory [direct ByteBuffer] is one of the areas where GC sucks hard).

Personally I have not read the book, so I cannot comment on its contents. I have quite extensive experience coding lock free data structures.


IMO manual memory management is the epitome of sunk cost fallacy. Sixty years of good programmer effort thrown at a bad idea that has produced an endless stream of security bugs and performance problems that have only hidden the cost of memory management behind non-obvious barriers and done severe violence to systems languages with an absurd obsession with custom allocators and baked ownership into otherwise straightforward APIs and made all programming harder.

> we have a giant for loop that iterates over all of memory over and over and over

I don't think you understand how garbage collectors work.


How is that not how garbage collectors work? You start at a gc root and chase pointers all through memory. Then do the same thing again and again. (I'm assuming we're talking about a standard, generational mark-and-sweep gc.)


>> we have a giant for loop that iterates over all of memory over and over and over

> How is that not how garbage collectors work?... I'm assuming we're talking about a standard, generational mark-and-sweep gc

GCs do not scan "all memory", but a small fraction of it. In the case of generational GC, the scanned fraction is (usually) limited to a single generation. Even without the generational approach, scanning the heap itself is frequently avoided in favor of scanning a separate data structure with a highly compressed representation of the object set.

GCs do not generally iterate over memory just because they can. They either reclaim space for new allocations, move things around to reduce fragmentation or fire periodically in response to increased allocation rate. If your program does not make allocations, it may never incur a GC at all.

The grandparent comment makes it sound like garbage collection is a simple effort, conducted solely by distinct GC code ("giant for loop"). This is often not the case: for example, JVM may generate additional memory barriers in any code, that uses references (exact nature and purpose of memory barriers depends on GC being used [1]). Augmenting the code with those barriers allows GC to operate more efficiently and quickly: achieve smaller pauses, scan less memory, collect memory for some threads without disturbing others.

[1]: https://shipilev.net/jvm/diy-gc/#_barriers
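
To make "trace from the roots" concrete: here's a toy mark phase (sketched in Rust over an index-based heap; no production collector looks remotely like this). The point is that only reachable objects are ever visited, and unreachable cycles are reclaimed without any "guessing":

```rust
// Toy mark phase: objects live in a Vec "heap"; each object holds the
// indices of the objects it references. Marking starts from the roots
// and visits only reachable objects.
struct Obj { refs: Vec<usize> }

fn mark(heap: &[Obj], roots: &[usize]) -> Vec<bool> {
    let mut marked = vec![false; heap.len()];
    let mut stack: Vec<usize> = roots.to_vec();
    while let Some(i) = stack.pop() {
        if !marked[i] {
            marked[i] = true;
            stack.extend(&heap[i].refs); // follow outgoing references
        }
    }
    marked
}

fn main() {
    let heap = vec![
        Obj { refs: vec![1] }, // 0 -> 1 (reachable from the root)
        Obj { refs: vec![] },  // 1
        Obj { refs: vec![3] }, // 2 -> 3: an unreachable cycle...
        Obj { refs: vec![2] }, // 3 -> 2: ...never even touched by mark
    ];
    let marked = mark(&heap, &[0]);
    assert_eq!(marked, vec![true, true, false, false]);
    // A sweep (or evacuation) would now reclaim objects 2 and 3.
}
```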


While I can't agree with the GP at all, I also can not agree with you. There are more options for memory management than fully manual and fully runtime.


I think he was simplifying to make a point.


I think he was simply wrong.


You say a lot of things without justification: "IMO garbage collection is the epitome of sunk cost fallacy. Thirty years of good research thrown at a bad idea." "Garbage collection is just plain bad." And more.

> The reality is we as developers choose not to give languages enough context to accurately infer the lifetime of objects.

Possibly because it's hard, and until compilers + maths developed enough, it wasn't possible. Maybe. I dunno, but there's probably a reason Rust didn't appear in the 1970s. Such languages are probably harder to use than conventional ones, judging by what I read about Rust's borrow checker (dunno, Rust's on my todo list so no actual experience).

> we have a giant for loop that iterates over all of memory over and over and over to guess when we're done with things

This is from memory so I may be wrong, but a conservative GC will iterate over the whole memory; if you're using one of those you've probably got other problems, because yes, conservative GCs do 'guess'. (Edit: and when it guesses wrongly it can fail to free, causing leaks. Also, some optimisations that temporarily make pointer values 'disappear' can cause premature frees of the object pointed to; I believe an example of the latter is using branch-free XOR on pointers to swap them.)

Most modern systems will use something more sophisticated and known-correct such as tracing and will trace just the live stuff. Mark-compact for sure only touches live data which makes it efficient for functional langs (for some definition of 'efficient', I have questions).

I'd recommend reading a book ('The Garbage Collection Handbook: The Art of Automatic Memory Management' 2nd edition, Richard Jones et al) before abusing GC. I actually agree with some of what you're saying but you need to say why.

> It wastes time, it wastes space, it wastes energy.

I quite possibly agree! (see a recent previous post where I call lack of optimisation in scala 'immoral') But if you just give your opinion without some solid backup I can't endorse you. Please read the GC book, it really is very good indeed. The author's a nice guy too, met him years ago.


   Thirty years of good research
Historical aside: the first paper on GC is [1] which was written in 1963, by Minsky, who's now famous for AI research. So we have GC research for > 50 years.

[1] M. L. Minsky, A LISP Garbage Collector Algorithm Using Serial Secondary Storage. https://dspace.mit.edu/bitstream/handle/1721.1/6080/AIM-058....


See the chapter 6.3 'The Free-Storage List and the Garbage Collector' in the LISP I Programmer's manual from March 1960.

http://bitsavers.org/pdf/mit/rle_lisp/LISP_I_Programmers_Man...


Thanks. That's amazing. I always though GC must have been in Lisp from Day 1, but the Minsky article used to be the oldest one I was aware of. Has anybody dug into the original LISP sources (do they still exist?) to see when working GC first arrived?


I'm pretty sure that is the first implementation of Lisp. Earlier "versions" of Lisp were hand-compiled, because it was considered too difficult to produce a compiler. Then Steve Russell realized that McCarthy's "eval" function could be implemented in assembly language (which apparently hadn't occurred to McCarthy, who considered it merely theoretical) and produced the first Lisp interpreter. Note that the manual is from March 1960 and McCarthy's first paper on Lisp was published in April 1960, so nobody outside his circle at MIT would have known about Lisp at the time.

The manual also states that it's for "a version of Lisp being prepared for the IBM 709", implying the manual was written while the interpreter was still being developed, and all references I've found state that the IBM 709 version was the first practical implementation.

More on the very early history of Lisp here: http://jmc.stanford.edu/articles/lisp/lisp.pdf


According to Prof. Herbert Stoyan, probably the oldest GC algorithm description is in 'J. McCarthy, M.L. Minsky: Artificial Intelligence. Quarterly Progress Report No. 53, Res. Lab, of electronics MIT, Cambridge, April 1959.' Prof. Stoyan then mentioned that the 'first garbage collector was implemented by Dan Edwards during June/July of 1959'.

So we have 60 years of GC research.


You can find the source code here: http://www.softwarepreservation.org/projects/LISP/lisp15_fam...

The same site has the sources for plenty of other versions.


In my opinion, shared memory mutability is the actual bad idea that's going to eventually die. It's not just error prone, it doesn't scale; try mutating the same cache line from two threads on an Intel desktop chip and see how that goes. Let alone across mesh buses, chiplets, dual sockets or (God forbid) the network.

Garbage collection on immutable data is vastly easier; it's intrinsically non blocking, concurrent and parallel. You can build a garbage collected immutable data store over arbitrary numbers of computers.


Au contraire, as memory, and hence workloads, get bigger, the cost of copying increases, and copying rather than sharing becomes ever more expensive. I just spoke with a physicist who told me that his current experiment produces 200 Gigabytes of data per second! Copying that amount of data around for immutability is inconceivable. Indeed the single biggest factor for performance for such workloads is minimising the amount of copying.

Erlang-style shared-nothing is brilliant if you can afford it, but it's affordable only for small data.


Erlang style immutable reference counted shared memory is not at all state of the art. Persistent data structures are common and effective. I have implemented several myself, to store and update trees that don't fit in main memory. If you're updating one byte of your physicist's (say) 1 petabyte dataset, it requires ten memory allocations (log[branch factor] of the size of the dataset).

If you're updating the whole thing linearly, do consider that updating something in place or creating an updated copy is identical in terms of performance; a storage device doesn't care whether the bits streaming back are overwriting the same locations or new ones. Though of course you end up with both copies at the end.


Are there high-performance matrix multiplication libraries, e.g. for fluid-dynamics simulations, that are implemented using persistent data structures?


Are there any where multiple threads are mutating overlapping sections of memory? How exactly do you think a GPU matrix multiply violates my initial assertion?


Usually when working with immutable data you don't need to copy it around, you can just pass it around via reference, and the data structure will support cheap "updates". You can see this in F# and Clojure where immutable maps are supported by tree structures, and operations like add/remove return a new reference without touching the original while being reasonably performant.
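
The F#/Clojure maps mentioned above do this with trees; the same structural sharing is easiest to see with a cons list. A sketch (in Rust here, since the thread's examples span several languages): "updating" by prepending shares the entire old tail instead of copying it.

```rust
use std::rc::Rc;

// A minimal persistent list: prepending creates a new head that
// points at (and shares) the existing tail.
enum List { Nil, Cons(i32, Rc<List>) }
use List::{Cons, Nil};

fn len(l: &List) -> usize {
    match l { Nil => 0, Cons(_, rest) => 1 + len(rest) }
}

fn main() {
    let shared_tail = Rc::new(Cons(2, Rc::new(Cons(3, Rc::new(Nil)))));
    // Two "versions" built from the same tail; neither touches the other.
    let a = Cons(1, Rc::clone(&shared_tail));
    let b = Cons(0, Rc::clone(&shared_tail));
    assert_eq!(len(&a), 3);
    assert_eq!(len(&b), 3);
    assert_eq!(Rc::strong_count(&shared_tail), 3); // shared, not copied
}
```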


Am I wrong in thinking that a system comprised of entirely immutable memory doesn't even need a garbage collector? If you built such a system in Rust would not all memory just be deallocated once no longer utilized? The Rust principle (many readers XOR one writer) leads to fully static memory management. This is my point. We need to go back and re-think our language design principles keeping memory management in mind instead of pretending it's a solved problem.

IMO you make a good case.
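
For reference, the deterministic deallocation I mean looks like this in Rust: the point of release is fixed at compile time, when the owner leaves scope, with no collector involved.

```rust
use std::cell::Cell;

// Tracker flips a flag when it is dropped, so we can observe
// exactly when deallocation happens.
struct Tracker<'a>(&'a Cell<bool>);
impl<'a> Drop for Tracker<'a> {
    fn drop(&mut self) { self.0.set(true); }
}

fn main() {
    let dropped = Cell::new(false);
    {
        let _t = Tracker(&dropped);
        assert!(!dropped.get()); // still alive inside its scope
    } // owner leaves scope: Drop runs right here, statically determined
    assert!(dropped.get());
}
```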


   Am I wrong ...
Yes.

Unless your memory is infinite. Rust also automatically deallocates memory that is no longer used, so it's some kind of GC, but one where the point of deallocation is either statically determined (based on nesting of lifetimes) or dynamically (using reference counting). The problem with static dealloc is that the compiler cannot always work out when memory becomes free and must make conservative approximations (see Rice's theorem [1]); the problem with ref-counting is circular data structures (and also the cost in space, time, and synchronisation across threads of maintaining the counts).

[1] https://en.wikipedia.org/wiki/Rice%27s_theorem
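
The circular-structure problem is easy to demonstrate. A sketch in Rust: two nodes that point at each other with strong `Rc` edges keep each other's count above zero forever (a leak); the conventional workaround is making one direction a `Weak` edge.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    next: RefCell<Option<Rc<Node>>>, // strong forward edge
    prev: RefCell<Weak<Node>>,       // weak back-edge: doesn't keep target alive
}

fn main() {
    let a = Rc::new(Node { next: RefCell::new(None), prev: RefCell::new(Weak::new()) });
    let b = Rc::new(Node { next: RefCell::new(None), prev: RefCell::new(Weak::new()) });
    *a.next.borrow_mut() = Some(Rc::clone(&b)); // strong edge a -> b
    *b.prev.borrow_mut() = Rc::downgrade(&a);   // weak edge  b -> a
    assert_eq!(Rc::strong_count(&a), 1); // only the local binding
    assert_eq!(Rc::strong_count(&b), 2); // local binding + a.next
    // Had b -> a also been a strong Rc, neither count could ever reach
    // zero once the locals are dropped: the classic ref-counting leak.
}
```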


In high-performance systems, software architectures have actually been moving toward non-shared mutability. Shared immutability tends to waste a performance-limiting resource (memory bandwidth).


I'm a bit confused by your phrasing. Non-shared mutability makes perfect sense (and is not the thing I believe needs to go), but at some point you have to share the results of a computation.

If not shared immutability, how is this done in the systems you're describing? I assume message passing, queues or similar?


Sorry, I wasn't clear. There are two common software architecture schools for scaling practical concurrency.

The first is the copy-on-mutate style commonly used in functional programming paradigms. This is what I assumed was meant by "shared immutability".

As you surmised, in the second model individual threads own all mutable state, and operations on that state are requested via message passing (usually SPSC queues within individual servers, which are scalable and extremely efficient). This makes all the code effectively single-threaded. The key technical element for these designs is that there often needs to be a mechanism to sporadically transfer ownership of state between threads to balance load, also arbitrated over the SPSC queues. This has a number of straightforward solutions.

The second model has come into favor mostly because it is very memory-friendly, due to excellent natural cache locality and requiring minimal copying of state.


That's interesting to hear, thanks!

My rant was only really aimed at shared mutable memory; passing a piece of it around with one owner at a time is totally fine. One writer, multiple readers has its uses too.

I am a fan of persistent data structures and I think that as we keep slapping on cores they're going to be more appealing for general purpose computing, but they're certainly not one size fits all, and the performance tax is substantial.


Utterly unfounded and myopic. Yes, you're right to stress new languages, but being liberated from memory management in FP languages like haskell and scala is amazing and I will never even consider going back to manual management unless I absolutely need the last ounce of performance.


  The reality is we as developers choose not to give languages 
  enough context to accurately infer the lifetime of objects.
The reality is we as developers choose not to give languages enough context to just do what I mean.

  instead of finding a way of expressing when we're done with instances
So, when are you done with an instance? How to enumerate all possible situations and devise a comprehensible encoding for them? Without requiring developers to keep a lookup table of lifetime declarations or other non-local reasoning in mind? Concurrency. Laziness. Runtime code replacement.

You're handwaving over fundamental research problems here. You don't know this is even possible in a way that doesn't completely hose the comprehensibility of a language.


It wastes time, space and energy, but it also lowers the bar for programming ( don't have to know or care much about memory ) and removes the potential for errors. It's like programming on training wheels. Sure, there are costs upfront, but it also saves costs down the line.

Also, garbage collection doesn't iterate over all memory. There are lists, structures and variables that the runtime/GC manages.

And we already have languages that don't use GC and give us ways of "releasing memory" - C being one of the most famous ones. But C also has its share of problems.

The more control you give to the programmer, the lower the footprint but the higher the learning curve and the costs of bugs. The more control you give to the runtime, the higher the bloat and lower the learning curve, but lower the cost of bugs. The GC exists because it removes a class of bugs that human programmer seem prone to write and it theoretically allows programmers to work on more business/productive aspects of software development.

Software development is such a vast space that I think there will be room for GC and non-GC. And as you said, sunk costs ( on both sides ) almost dictate they both be around for a very long time.


> Rust is a start,

Rust is not a start, it's a next iteration over old ideas (look at Ada, Cyclone etc). All of those languages were niche and will stay niche for a reason.

> Garbage collection is just plain bad. I for one am glad we're finally ready to consider moving on.

We are not moving anywhere. You still need lifetime annotations in Rust and have to design your application in a specific way to satisfy the borrow checker. You can't just write it the same way you would in a GC language. You trade productivity for complexity to avoid GC. Simplicity and productivity will always win over complexity in the current market. Developers' time is money, a lot more money than what is spent on energy and hardware. That's why companies throw money at hardware and use GC languages: it's a far cheaper, simpler, and more productive way to create software.

So unless you've solved the halting problem, GC is here to stay.


Productivity has many facets and is hard to measure. A lot of people feel very productive in rust because it catches a lot of problems early on. GC allows you to be sloppy in reasoning about the lifetime of an object, which is often an important design point that affects the overall system.

Also, you typically have to think about lifetimes when writing libraries, not so much when writing applications that use the libraries.


> The reality is we as developers choose not to give languages enough context to accurately infer the lifetime of objects.

Well, if we force devs to describe context, why not simply make them manage memory manually then?

There is nothing wrong with manual memory management. It is not hard at all.


> It is not hard at all.

proceeds to cause trillion-dollar damages because literally Terence Tao couldn't prove major manually-managed programs are correct


> If we'd spent that time developing smarter languages and compilers (Rust is a start, but not an end) we'd be better off as developers and as people.

What about Swift? Automatic Reference Counting avoids the typical problems of mark-and-sweep garbage collection. It has its own trade-offs of course, but the benefits include more consistent latency, efficient memory usage, and cache friendliness. Like Rust, it uses a smarter compiler (LLVM).

Really interesting stuff happening on the server side of Swift right now. It’s early days but I’d keep a close eye on it. Long-tail latency is a big, big server problem.


> we have a giant for loop that iterates over all of memory over and over and over to guess when we're done with things. What a mess!

Yeah, that's like having dedicated busboys clean up customers mess and clear tables when they are done in a restaurant. Why can't the customers get up and remove dishes as courses are served? Or the waiter could do it. Why does the poor busboy have to ask "are you done with this?" Now this article is proposing busboys on powered skateboards. What a mess!

Counter-point, for you.


Maybe, but you forgot something: memory fragmentation. If this HW solution enables "bump the pointer" allocation and also allows "real time" deallocation (I don't know, I cannot access the paper), the result could produce better latency than "normal" manual memory management, and better resource usage than the "allocate everything at the beginning" pattern that "real time" software uses.


An idea so bad it has made it into hardware!


Hardly the first time this has happened. Anybody remember Jazelle?


I think you might have missed the point of having software tell you what mistake you made without crashing?

Not to mention seamless hot-deployment or concurrent parallelism that is very hard without a VM.

I think it's hard to convince non-server coders of why VMs are good.


I bet you complain about interpreted languages too. The trade off of computer vs developer time happens by design.


Cyclic data structures, memory safety, memory management without tracing. Pick 2.


Well, you can use weak pointers appropriately in a safe setting. It's not fun though. Alternately, you can use a layer of indirection, such as throwing all your graph nodes in a big vector and having them use indices to refer to one another.

So I would perhaps revise that statement to "easy cyclic data structures, memory safety, memory management without tracing, pick two".
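
A minimal sketch of that second option (names are made up): nodes live in one `Vec` and refer to each other by index, so cycles are fine and the whole graph is freed when the `Vec` is dropped. The trade-off is that a stale index behaves like a use-after-free, albeit a memory-safe one.

```rust
// Index-based graph arena: adjacency lists store indices into `nodes`
// rather than pointers, so cyclic references need no Rc/Weak gymnastics.
struct Graph { nodes: Vec<Vec<usize>> }

impl Graph {
    fn add_node(&mut self) -> usize {
        self.nodes.push(Vec::new());
        self.nodes.len() - 1 // the new node's "handle" is its index
    }
    fn add_edge(&mut self, from: usize, to: usize) {
        self.nodes[from].push(to);
    }
}

fn main() {
    let mut g = Graph { nodes: Vec::new() };
    let a = g.add_node();
    let b = g.add_node();
    g.add_edge(a, b);
    g.add_edge(b, a); // a cycle: perfectly fine with indices
    assert_eq!(g.nodes[a], vec![b]);
    assert_eq!(g.nodes[b], vec![a]);
} // entire graph, cycles included, freed here when `g` is dropped
```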


Fair enough. But with the vector approach, now you can have something that looks a lot like a use-after-free error!


Yes, you can, but it doesn't result in type confusion, which reduces its severity.


Naive question here, as many in this discussion are way more experienced with the inner workings of languages:

It seems like doing away with GC is somewhat akin to the Halting Problem, in that it's impossible in the general case but may be practical in the common case. Obviously there are languages without GC, but they often end up reimplementing the patterns of GC for particular problems, so I really mean doing away with GC-like patterns and concerns (memory management).

With the Halting Problem, the specific feature that triggers it is unbounded loops. The solution is that most languages have two classes of loops. Ones that are bounded and ones that aren't. We still need unbounded loops for a language to be Turing-Complete, but frown upon their use unless it's absolutely necessary.

To the question: it seems to me that the equivalent feature that triggers the need for GC is passing/scoping items by reference rather than value, because it violates the simpler memory lifetime model of local variables in the stack frame. Could it be that the solution would be a language that makes it easier to explicitly reason about, detect, and limit objects that are passed around by reference?

I'm thinking the initial reaction will be no way, it's way too convenient to pass by reference, all languages pass larger objects by reference to avoid unnecessary copies, etc., but am wondering if it's the next "GOTO considered harmful".

Perhaps there's a way to rethink the way program flow and object persistence interact. Not in the general case, but as a way to make it more explicit which things might leak, in the way that most languages now allow you to mentally discount "foreach" loops as not subject to the halting problem and only worry about the rare "while" or recursive call.

This may be just me groping back to the idea of functional purity, as pure functions don't mutate external state, but perhaps there are other, less restrictive ideas that would also help, such as a notion of larger flow control regions that guarantee they won't return to some distant part of the program and thus will not reference an object used there again.

Overall, I'm just thinking there are no silver bullets, but that perhaps the cases where the problems occur can be corralled into a corner, as probably the majority of a lot of programs have no issues with memory management and would benefit from making that explicit, so it's easier to focus on the bits that do.

Edit: I'm guessing a lot of what I'm groping towards is exactly what Rust and its kin deal with, but should probably learn one of those kinds of languages before I armchair about this stuff.. :)


> If your co-worker proposed this as a solution you'd probably slap them.

So instead you'd suggest to research better languages? What would you say to your boss when he asks you about a time estimate?


You should seriously start this discussion by actually checking out how garbage collection works, instead of touting random garbage about how it is just a prediction game.

Garbage collection has multiple variations, multiple optimizations, multiple tradeoffs. Different variations are powering the most popular and the most used languages all around the world (see also: C#, Javascript, Java).

To think that that is just "research thrown at a bad problem" is sheer insanity.

I propose we split this discussion up, because dealing with leaks is a completely different problem than dealing with garbage collection. You can have memory leaks in manually managed languages too; it has nothing to do with GC. But let's not throw the whole field out just because you do not want to deal with automatic memory management tradeoffs.

