
Garbage Collection is Wrong - downscout
http://www.lb-stuff.com/gc.html
======
danbruc
Yes, in some scenarios you can associate the lifetime of a resource with the
lifetime of a storage location, but this simply does not work in all cases,
probably only in a small fraction of all cases. And then? How do you handle
resources that have no single obvious owner? How do you determine if the
resource is still in use when you are done with it in one place? You implement
some kind of reference counting? You keep the resource alive until you reach a
point where you definitely know it is no longer in use? You turn your code
upside down and force it into a single owner structure destroying the
semantics of your code on the way?

~~~
mpweiher
"Yes, in some scenarios you can associate the life time of a resource with the
life time of a storage location"

In what cases is this not possible? I model the resource as an object and that
object has a memory location.

(Yes, in a non-deterministic GC, you can't do this for expensive resources,
but that's a problem of GCs, not an in-principle problem).

EDIT: I am guessing the parent actually meant "associate the life time of a
resource with the life time of a _single, non-/never-shared_ storage
location". The statement as written is one I have seen proponents of
non-deterministic GCs make.

~~~
mikeash
For example, two lists both contain a reference to the same object. Neither
one is a sole owner, and you can't safely destroy the object until both lists
are destroyed.
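
A minimal C++ sketch of that situation, assuming `std::shared_ptr` stands in for the reference-counting mechanism. `Resource` and the function name are hypothetical, introduced only for illustration:

```cpp
#include <cassert>
#include <list>
#include <memory>

// Hypothetical shared object; the name is for illustration only.
struct Resource {
    int value = 0;
};

// Returns true if the resource survives the destruction of the first list
// and is destroyed once the second list is gone as well.
bool destroyed_only_after_both_lists() {
    std::weak_ptr<Resource> probe;  // observes the object without owning it
    bool alive_after_first = false;
    {
        std::list<std::shared_ptr<Resource>> second;
        {
            std::list<std::shared_ptr<Resource>> first;
            auto r = std::make_shared<Resource>();
            probe = r;
            first.push_back(r);
            second.push_back(r);
            assert(r.use_count() == 3);  // r, first, and second each hold one reference
        }  // r and first are destroyed here; second still holds a reference
        alive_after_first = !probe.expired();
    }  // second is destroyed here, releasing the last reference
    return alive_after_first && probe.expired();
}
```

Neither list is the sole owner here; the object goes away only when the last of them does.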

~~~
letzjuc
Then the lists should only have a weak_ptr to the object.

Something is handling both lists [1]; that something could own the object, or
maybe something external to that could. Giving ownership of the object to both
lists is a design error [2].

[1] e.g. if the two lists are an implementation detail of a data structure, the
data structure itself could own the objects in the lists.

[2] Do non-deterministic garbage collectors that handle cycles allow you to
have a resource with multiple owners? Yes. Should you do it? No, god, please
don't.

~~~
ori_b
It's only a design error when you don't have a GC.

Imagine you have an arbitrary long lived connected cyclic graph that can be
incrementally updated from multiple short lived worker threads. With a GC,
this is a no brainer: just put the objects in the graph. No workarounds, no
extra tracking.

Without a GC, on removing a node, you essentially have to walk the graph and
do a mark/sweep to find dead nodes that were connected through the node you
removed.

~~~
letzjuc
Have a vector of shared_ptr that own the objects in the graph and build a
graph with weak_ptr ?

Removing an object is just as easy as removing an element from the vector. (If
you test the weak_ptrs on use, that's actually the only thing you would need
to do).
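
A rough sketch of that layout, under the assumption that nodes are identified by an integer id. `Node`, `Graph`, `link_nodes`, and `live_neighbours` are hypothetical names, not taken from any library:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical node: edges are non-owning, so cycles cannot keep a node alive.
struct Node {
    int id;
    std::vector<std::weak_ptr<Node>> edges;
    explicit Node(int i) : id(i) {}
};

// Hypothetical graph: the vector is the only owner of every node.
struct Graph {
    std::vector<std::shared_ptr<Node>> owners;

    // Hands out a weak_ptr so callers do not accidentally become owners.
    std::weak_ptr<Node> add(int id) {
        owners.push_back(std::make_shared<Node>(id));
        return owners.back();
    }

    // Removing a node is just erasing its owning pointer from the vector.
    void remove(int id) {
        for (auto it = owners.begin(); it != owners.end(); ++it) {
            if ((*it)->id == id) {
                owners.erase(it);
                return;
            }
        }
    }
};

// Adds an undirected edge between two nodes that are still alive.
void link_nodes(const std::weak_ptr<Node>& a, const std::weak_ptr<Node>& b) {
    auto pa = a.lock();
    auto pb = b.lock();
    assert(pa && pb);
    pa->edges.push_back(b);
    pb->edges.push_back(a);
}

// Counts neighbours that still exist, testing each weak_ptr on use.
std::size_t live_neighbours(const Node& n) {
    std::size_t count = 0;
    for (const auto& e : n.edges) {
        if (!e.expired()) ++count;
    }
    return count;
}
```

Note that a node stays alive until something explicitly calls `remove`; nothing here detects nodes that have merely become unreachable.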

~~~
mikeash
Isn't that equivalent to using a shared_ptr directly, just unnecessarily
complicated?

Reference counting works fine as long as you don't have cycles, of course.

~~~
letzjuc
The solution above works even if your graph has cycles. Of course if you know
that it doesn't you can just build the graph with unique_ptrs.

~~~
mikeash
It works with cycles, as long as you can know exactly when you want to remove
something from the graph (as opposed to having it be removed when no longer
referenced). Reference counting works fine that way too, though. It's a bit
more work, but you just dive into the structure and manually remove references
which breaks any cycles it may be involved in.
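
As a small illustration of breaking a cycle by hand, here is a sketch using `std::shared_ptr`. `Peer`, its `alive` counter, and `leaked_after_drop` are hypothetical and exist only to make the effect observable:

```cpp
#include <cassert>
#include <memory>

// Hypothetical type with an owning link; two Peers pointing at each other form a cycle.
struct Peer {
    std::shared_ptr<Peer> other;
    static int alive;  // counts live instances so destruction can be observed
    Peer() { ++alive; }
    ~Peer() { --alive; }
};
int Peer::alive = 0;

// Builds a two-node cycle, then drops the local handles.
// When break_cycle is true, one link is cleared first.
// Returns how many Peers are still alive afterwards (that is, leaked).
int leaked_after_drop(bool break_cycle) {
    const int before = Peer::alive;
    {
        auto a = std::make_shared<Peer>();
        auto b = std::make_shared<Peer>();
        a->other = b;
        b->other = a;
        assert(a.use_count() == 2);  // the local handle plus b->other
        if (break_cycle) {
            a->other.reset();  // manually remove one reference in the cycle
        }
    }
    return Peer::alive - before;
}
```

With the link cleared, both objects are destroyed when the handles go out of scope; without it, the two objects keep each other alive.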

------
lucian1900
Meh, GC has its uses. It helps that the vast majority of resources are indeed
memory, so it makes sense to special-case.

Although in some situations it isn't acceptable and it's interesting to look
at what languages like Rust do about it. It ends up guaranteeing safety, but
with not much more effort than GC.

------
chrismorgan
Idiomatic Rust code follows the RAII pattern and does not use garbage
collection; we've been finding that it works very effectively.

~~~
azth
Of course it helps that Rust has notation for object lifetimes. :)

------
dalke
Zero-suppressed Binary Decision Diagrams (ZDDs) require reference-count-based
garbage collection in order to be efficient. RAII isn't a viable replacement.

See for example
[http://ashutoshmehra.net/blog/2008/12/notes-on-zdds/](http://ashutoshmehra.net/blog/2008/12/notes-on-zdds/) - "ZDD-bases,
which though conceptually easy to understand, are non-trivial to implement
efficiently because behind the scenes, lots of things have to be taken care
of: o New nodes are born and old ones die — nodes have to be efficiently
allocated, ref-counted and garbage-collected."

Or from
[http://www.ecs.umass.edu/ece/labs/vlsicad/ece667/reading/som...](http://www.ecs.umass.edu/ece/labs/vlsicad/ece667/reading/somenzi99bdd.pdf)
:

> One would like to release the memory used by those BDDs, but there are two
> problems. First, some subgraphs may be shared by more than one function and
> we must be sure that none of those functions is of interest any longer,
> before releasing the associated memory. Second, BDD nodes are pointed from
> the unique table and the computed table, as well as from other BDD nodes.
> There are therefore multiple threads and one cannot arbitrarily free a node
> without taking care of all the threads going through it [12].

> A solution to these two problems is garbage collection.

Similarly,
[http://vlsi.colorado.edu/~fabio/CUDD/node3.html](http://vlsi.colorado.edu/~fabio/CUDD/node3.html)
("The CUDD package relies on garbage collection to reclaim the memory used by
diagrams that are no longer in use. The scheme employed for garbage collection
is based on keeping a reference count for each node.").

------
dmitrygr
not even sure where to start...

1. plenty of garbage collectors do not pause all threads or wait for memory
to be full.

2. you CAN ask for gc anytime you want in java and in C# (you need not wait
for it to happen)

3. new and delete are still integral to C++ (check out any large codebase,
like llvm)

~~~
rian
1. Can you give an example of some of these systems?

2. Designing your code in such a way that requires it to invoke the GC seems
counter-productive. If your algorithm is producing a bunch of garbage that is
statically known, why not just release that memory explicitly? Invoking the GC
is way more expensive than necessary here.

3. New/delete will always be used in performance critical code but I think
the point is that in general it's not a good practice.

~~~
dmitrygr
1. [http://www.azulsystems.com/presentations/qcon-london-2011](http://www.azulsystems.com/presentations/qcon-london-2011)

2. fair but irrelevant. the argument was that there is no way to make GC
happen. It was wrong. I never objected to the (perhaps more interesting)
argument that one should be able to manually delete objects

3. then the parent should have said "in toy codebases, whose function is to
look pretty, new and delete have no place" instead of a sweeping
generalization that all uses of new and delete are archaic and wrong. That one
generalization was an insult to every llvm & WebKit developer out there.

Disclaimer: i have no dog in this race - i prefer to work in C and think
anyone who cannot free their own memory should maintain a safe 5 meter
distance from any compiler. (this is my opinion only, of course)

------
emeraldd
_In modern C++, using new or delete in your code is wrong. It's not done.
Nobody writes code like that anymore in C++._

It's been quite a while since I wrote any significant C++ code, but this
statement seems wrong. Is this really "state of the art" for C++?

~~~
banachtarski
Professional C++ coder here.

Seeing new and delete is a _very_ heavy code smell, and they can be avoided
99% of the time.

~~~
mcphage
What have you done to your language that the native syntax for creating and
deleting objects is a code smell?!

~~~
kd0amg
Included an alternative to new/delete's form of allocation that covers a good
range of its use cases and is less prone to screwup.
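
Presumably this refers to smart pointers and their factory functions. A minimal sketch, assuming C++14's `std::make_unique`; `Widget` and `make_widgets` are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical heap-allocated type.
struct Widget {
    std::string name;
    explicit Widget(std::string n) : name(std::move(n)) {
        assert(!name.empty());
    }
};

// Allocates widgets on the heap without writing new or delete.
std::vector<std::unique_ptr<Widget>> make_widgets() {
    std::vector<std::unique_ptr<Widget>> widgets;
    widgets.push_back(std::make_unique<Widget>("first"));
    widgets.push_back(std::make_unique<Widget>("second"));
    return widgets;  // ownership moves to the caller
}
```

Each `Widget` is freed when the vector that owns it is destroyed, so there is no matching `delete` to forget.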

------
mildtrepidation
As someone who's not interested in following every acronym and resource
allocation strategy there is... is this even remotely useful? Lately I've
gotten back into Java for Android, and after a very long time with Python,
it's the first exposure in over a decade I've had to this argument...

...which, for some of us, _isn't_ an argument, as we have basically no choice
in the matter. But I'm still interested to know whether this is actually
useful input or whether it's more holier-than-thou posturing.

To me, this is missing the forest for the trees. It's easy to argue about
language implementation semantics when, in practice, it means almost nothing
to anyone as we're stuck with the idiosyncrasies of the platforms we work on,
_even when we choose the platform._

I guess it's good that someone is looking at these things, but I can't see the
utility in it. Best case scenario, this becomes a paradigm shift in new
languages (which doesn't help us or anyone else for years) or gets implemented
in the next major revision of language X (which doesn't go mainstream for
years).

~~~
xg15
I disagree. I believe even if you're stuck with a certain situation, it's
important to know whether the status quo is actually a good thing and you
should work to keep it or if it's unavoidable for the moment but generally bad
and you should do your best to move away from it if you can.

------
rian
This is totally on point. Ref-counting / RAII is the only sane way to do
resource management. It's super lightweight and easy to understand.

The vast majority of resources are short-lived and don't create cycles. Our
garbage collection "systems" should be designed for this case.

Cycles are a special case required for few data structures. They are not the
norm and we shouldn't ship a huge heaping mess of a garbage collection system
and make everything else slower to account for this rarely used special case.

Anyway people programming today shouldn't be thinking in terms of pointers and
references. We should be thinking in terms of VALUES. Finite values have no
cycles! The Haskell and C++ communities have already embraced this; everyone
else is still catching up.

Yet another reason why Java is a horrible language holding people back and the
JVM is basically a hamster wheel keeping itself busy.

~~~
voyou
RAII is great, and should be used where appropriate. I'm not sure what it has
to do with reference counting, though. Reference counting is just a
particularly slow and unreliable form of garbage collection; I don't see why
you would ever prefer it to a proper garbage collector.

~~~
rian
Classic RAII in C++ is a limited form of ref-counting (only one ref).

I have to disagree with your second statement, ref-counting is extremely fast.
If you consider it a GC then it's the _fastest_ GC. It's also deterministic
and does not pause.

~~~
pcwalton
> If you consider it a GC then it's the fastest GC. It's also deterministic
> and does not pause.

No, it's not, not unless you use a lot of cleverness. "We find that an
existing modern implementation of reference counting has an average 30%
overhead compared to tracing…" (They did perform a lot of optimizations to get
it up to speed with tracing garbage collection... however, these are far
beyond what shared_ptr does.)

[http://users.cecs.anu.edu.au/~steveb/downloads/pdf/rc-ismm-2...](http://users.cecs.anu.edu.au/~steveb/downloads/pdf/rc-ismm-2012.pdf)

~~~
rian
I'm a bit skeptical of the results of that paper without seeing the source
code and in what contexts they are performing the comparison. One can always
find situations where one scheme is faster than the other but I'm not totally
sure if micro benchmarks are representative of the real world. In long-lived
servers, ref-counting can be preferable because it avoids random pauses. Maybe
it's a latency vs throughput performance dichotomy. But yeah, thanks for
posting that.

~~~
danbruc
Reference counting doubles the number of memory accesses - you have to update
the reference and the counter every time. That is a big performance hit.
Reference counting may randomly halt your code, too, because you never know
when you hit zero and the resource gets freed.

~~~
mcguire
And you don't know how many resources will be freed when you drop the last
pointer to that giant tree structure.

Of course, you could queue up and lazily delete the resources, but then you're
back to nondeterministic behavior.
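
To make the cascade concrete, here is a sketch that counts how many nodes a single release destroys. It assumes `std::shared_ptr` as the reference-counting scheme; `TreeNode`, `build`, and `nodes_freed_by_one_reset` are hypothetical names:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical tree node that owns its children through reference-counted pointers.
struct TreeNode {
    std::vector<std::shared_ptr<TreeNode>> children;
    static int destroyed;  // counts destructor calls so the cascade can be measured
    ~TreeNode() { ++destroyed; }
};
int TreeNode::destroyed = 0;

// Builds a complete tree with the given depth and fanout.
std::shared_ptr<TreeNode> build(int depth, int fanout) {
    assert(depth >= 0 && fanout >= 0);
    auto node = std::make_shared<TreeNode>();
    if (depth > 0) {
        for (int i = 0; i < fanout; ++i) {
            node->children.push_back(build(depth - 1, fanout));
        }
    }
    return node;
}

// Drops the only reference to the root and reports how many nodes that freed.
int nodes_freed_by_one_reset(int depth, int fanout) {
    auto root = build(depth, fanout);
    const int before = TreeNode::destroyed;
    root.reset();  // one pointer release, but every node in the tree is destroyed
    return TreeNode::destroyed - before;
}
```

The work done by `root.reset()` grows with the size of the tree, which is why the release of one pointer can stall the caller for an unpredictable amount of time.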

------
codeka
When was this written? It seems to be at least half a decade out of date.
Certainly nothing it espouses is actually new...

~~~
mcguire
Indeed. The "garbage collection" described seems to have been state-of-the-art
in the late '80s. And, of course, it's compared to features that weren't
officially added to C++ until 2011.

Yes, garbage collection is inappropriate for things with finalizers---for
resources other than memory. Other than that, I can't see anything useful
here.

------
dllthomas
Returning closures from functions without massive copying and/or massive
headache seems to pretty well demand a GC.

------
eigenrick
I skim-read the article, but I think the author is absolutely right, though
_only_ for those resource-specific instances he pointed out that need to have
an explicit lifetime.

For everything else, there is garbage collection.

------
troebr
Isn't it simply a form of garbage collection? Instead of garbage collecting
all the unused objects at random moments, an object is garbage collected as
soon as it goes out of scope.

The way I understand it, no garbage collection means you take care of cleaning
after yourself, whereas garbage collectors do it for you. It sounds like what
he calls resource acquisition.

~~~
rgo
No. I think cleaning up, manually or automatically, after objects go out of
scope is not the same as garbage collecting. "Garbage" here implies objects
that linger beyond their usage scope, from construction to destruction, and
now have become garbage taking up memory that can only be reclaimed by an
extraneous routine, i.e. the GC.

------
shmerl
I surely prefer RAII to any garbage collection. That doesn't mean GC is
completely wrong, though, or that it has no uses at all. It's just of limited
usefulness, which is why languages that overuse it and claim it's always needed
(like Java) are too limiting. If anything, GC should be optional, not
mandatory.

~~~
AnimalMuppet
Well... if C and (early) C++ showed us anything, it's that programmers aren't
very good at managing memory. (It's not just leaking memory. Accessing deleted
memory is much worse.)

So, what are you going to do? Just say "Most programmers are lousy, deal with
it"? Introduce garbage collection, which solves 90% of the problem (memory)
with no programmer intervention, but does nothing whatsoever to help the other
10% of cases (files, sockets, mutexes, etc.)? Rely on RAII, which in turn
relies on programmer education and discipline? Or do you have another
alternative?

Optional GC could be interesting - maybe some kind of a switch that you set at
compile time that says "this program uses GC" or "this program uses
destructors". But then you're essentially talking about two similar, but
different languages (like, say, Java and C++).

~~~
shmerl
You are right, in the case of C++ RAII alone doesn't make the language safe,
since one can always fall back to manual new / delete, etc. But in newer
languages this can be avoided.

Rust actually does that, and GC there is optional by the way.

------
cammsaul
Worth a mention: In Objective-C with ARC, the compiler treats every pointer to
an Obj-C object as a std::shared_ptr or std::unique_ptr (static analysis is
used to optimize out unnecessary reference counting), unless it's marked
otherwise with keywords such as __weak, __unsafe_unretained, etc.

------
pikachu_is_cool
GC is not 'wrong'. It's just an overused method of memory management. There
are plenty of valid uses for GC where ref counting wouldn't make sense. Lua
immediately comes to mind.

------
mandelbulb
I think the author's core issue is that in most cases GC isn't optional.

An optional opt-out would invite a wider range of developers.

------
Nate630
Sometimes GC is nice, sometimes it isn't. I like langs that give you options.

------
Antiquarian
> Don't believe me? Then don't skim-read.

My hero.

~~~
lafar6502
Yup, some people think they can teach others as soon as their first C++
program compiles without errors.

------
saimey
Why is your nickname in green color?

~~~
mpweiher
Very fresh (16h old at the time I wrote this comment).

------
bananaoomarang
What's the catch?

~~~
taybin
You can run out of space on the stack, which has a smaller size than the heap,
where new'd objects are placed. This can easily happen with large objects that
own other large objects and it is difficult to debug without knowledge of the
implementation of all of the child objects.

~~~
dmunoz
That's too strong of a critique. You can have just a handle on the stack, with
the data on the heap. When the destructor of the handle is executed, it
deallocates the data on the heap. This is what almost all resource holding
library or user-defined types do in C++.
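
A minimal sketch of such a handle, written with an explicit destructor to show the mechanism. `Buffer` is a hypothetical type; real code would more likely use `std::vector` or `std::unique_ptr`:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical RAII handle: the object itself is small and can live on the
// stack, while the storage it manages lives on the heap.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new char[n]()), size_(n) {
        assert(n > 0);
    }

    // The destructor releases the heap storage when the handle goes out of scope.
    ~Buffer() { delete[] data_; }

    // Copying is disabled so two handles never free the same storage.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    char* data() { return data_; }
    std::size_t size() const { return size_; }

private:
    char* data_;
    std::size_t size_;
};
```

The handle occupies only a pointer and a size on the stack regardless of how large the buffer is, so stack exhaustion is not a concern for the managed data.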

~~~
taybin
Sure, but at that point, it's not RAII all the way down. RAII is a great
wrapper around new and delete, but I saw people talking about how they rarely
use new and delete anymore.

Edit: Actually, reading about auto_ptr<> and friends I see what you're getting
at. The codebase I worked on some years ago didn't use those so I'm not
totally familiar with that idiom for wrapping heap allocation.

------
lafar6502
GC was developed because too many programmers were morons, not the other way.

~~~
thirsteh
That's a pretty moronic way of looking at it.

------
lispm
1-

