
The Memory Management Reference - bjourne
http://www.memorymanagement.org/index.html
======
millstone
The FAQ at [http://www.memorymanagement.org/mmref/faq.html#mmref-
faq](http://www.memorymanagement.org/mmref/faq.html#mmref-faq) reads more like
an advocacy piece for garbage collection than an unbiased review.

I am by no means opposed to garbage collection, but I have noticed a
"strategy" that GC advocates use that I think is dishonest.

An empirical fact about garbage collection is that it has "more knobs to
turn." See for example ghc's RTS options for its GC, or the enormous list of
HotSpot GC options. This is because the space of GC algorithms is very large
(conservative, copying, etc.), and the algorithms make different, stark
tradeoffs.

And because it is so large, every objection to garbage collection can be met!
Garbage collection is too slow? Here's a two-space copying collector,
optimized for allocation throughput. Memory usage too high? Here's a mark-and-
sweep compacting collector. Latency too high? Here's a pauseless concurrent
collector. Want GC in C? Here's a conservative collector.

More options are good! But here's the sneakiness: you can't have all of them.
You aren't going to have a compacting conservative collector, or a pauseless
copying collector. Etc.

For example, the FAQ asks, "Can I use garbage collection in C++?", and answers
yes, you can! It then goes on to say that "garbage collection is often faster
than manual memory management. It can also improve performance indirectly, by
increasing locality of reference and hence reducing the size of the working
set, and decreasing paging."

The implication is that if you use GC in C++, you will see better locality of
reference and reduced paging. But of course you will not.

~~~
loup-vaillant
> _The implication is that if you use GC in C++, you will see better locality
> of reference and reduced paging. But of course you will not._

I have seen quite a lot of C++ code that manages memory in a very naive way,
with no thought put into the memory layout. Just stuff your data in some class
hierarchy, then let RAII do the management. In this case, I wouldn't rule out
the possibility for better locality or reduced paging if a GC were used
instead. (One caveat: if you use std::vector as your default container, as you
probably should, then a GC will probably not meliorate anything.)

But to me, naive manual memory management is symptomatic of a deeper problem:
thinking you need performance, when you actually don't. Basically, unless your
performance needs are so stringent that you require fancy stuff like custom
allocators or memory pools, then a GC is probably fast enough for you.

The vast majority of C++ code I have seen doesn't use this fancy stuff.
Therefore, the vast majority of C++ code I have seen would have been fast
enough if a language with GC had been used instead.

~~~
Verdex
So I suppose there are two false modes of thought: that manual memory
management is intrinsically faster, and that we really need this performance
(sans profiling, of course). On the manual memory management side the
interesting thing to investigate is everything that malloc and free actually
do. It was quite the eye opener when I saw an allocator implementation for the
first time. I guess that's one of the reasons super-critical real-time
applications don't allow heap allocations (e.g. check out rule 5 of the JPL
coding standard [1]). On the performance side the standard plea continues to
be invaluable: please profile before optimizing.

On the other hand, data locality and prefetching are like getting the
invincibility star in Mario (which you touched on with your std::vector
caveat; also check out about 24 minutes into [2] for some pretty graphs and an
explanation). And sometimes you really do need the ability to get all of the
performance your machine can give you.

The above dichotomy is very frustrating to me because regardless of which side
you fall on there are embarrassing failures. Obviously, if you don't need the
performance it's not good that a bunch of needless energy is spent on
"performance improvements". But if you do really need the performance, then
you get to deal with the "we're doing it in c/c++ therefore it's fast"
discourse. And finally assuming that your group really does appreciate c/c++
and understands how to achieve high performance, then you still have to deal
with c/c++ in the non-performance related aspects of your product.

For this reason, I've been glancing hopefully in Rust's direction. Maybe it
(or some other future language like it) can help alleviate the pressures
between GC and manual memory management. A system where both are possible and
neither are required sounds ideal. And if we are very lucky maybe it will
spark additional dialogue in our industry concerning when we need manual
memory management and when we need GC.

[1] - [http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf](http://lars-
lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf)

[2] -
[http://channel9.msdn.com/Events/Build/2014/2-661](http://channel9.msdn.com/Events/Build/2014/2-661)

------
hga
The bibliography stops in 2002.

Here's a 2011 book that's sort of an advanced, vs. updated, version of its
1996 predecessor: [http://www.amazon.com/The-Garbage-Collection-Handbook-
Manage...](http://www.amazon.com/The-Garbage-Collection-Handbook-
Management/dp/1420082795/) It covers the recent state of the art up to its
publication (e.g. Azul's Pauseless but not C4).

------
cousin_it
From the FAQ:

> _I've heard that GC uses twice as much memory_

> _This may be true of primitive collectors (like the two-space collector),
> but this is not generally true of garbage collection. The data structures
> used for garbage collection need be no larger than those for manual memory
> management._

Is that true? What GCs use less than 2x memory and have good performance?

~~~
hga
Generational garbage collectors achieve good/better performance by more
frequently sweeping one or more heaps holding younger items, promoting the
survivors to an older heap, based on the thesis that most garbage is short
lived.

Modern marking collectors are evidently pretty fast, and many collect mostly
concurrently with running code, but eventually the whole heap has to be
collected and the "world stops". The standard (being replaced?) HotSpot
collector, without any special hardware tricks, was said in the Pauseless GC
paper to sweep 1 GiB/second on standard x86_64 hardware.

As in, when it has to do a full collection and stops everything else, that is
how slowly it goes, which can be way too slow for many users with 10s or 100s
of GiB of memory.

Pauseless and C4 use tricks to avoid stopping the world: Pauseless relied on
Azul's custom hardware, while C4 runs on standard hardware using things like
large pages and batching mass VM operations into one (added to Linux) call,
only invalidating the TLB at the end. Oh, yeah, however efficient/inefficient
they are, they collect concurrently with running code.

------
molixiaoge
great

