Hacker News new | past | comments | ask | show | jobs | submit login

> it's deterministic

Sort of. It's only deterministic in the sense that as soon as an object is not referenced it is freed, but it's not entirely deterministic because you don't know exactly when (or even in what thread) that "as soon as" happens.

> it detects new garbage immediately after it becomes garbage, which keeps the resource usage at a minimum.

Yes, but that comes at a cost to performance. One of the observations that tracing GCs make is that you may not need to do any work to deallocate memory at all. For example, a simple copying algorithm used to manage the young generation in many generational collectors is based on the fact that more objects die young than survive, and it only does work to keep the (fewer) surviving objects alive and does not do any work to free (the many) dead objects. As a result allocation itself becomes much faster. Tracing indeed requires more memory, but it automatically uses that memory to reduce the cost of memory management. In fact the (amortized) cost of memory management can be reduced arbitrarily by enlarging the heap and using a tracing collector, so much so that it could, in principle, be made cheaper than even stack allocation.




> but it's not entirely deterministic because you don't know exactly when (or even in what thread) that "as soon as" happens.

But you do know when and where: it's at the exact moment the last reference to the resource was released, and in the exact thread that did this last release. In many cases, this will be a single place and thread; in other cases, it will be a well defined set of places and their corresponding threads, with the one coming temporally last being the one where the resource is freed (that is, the resource is held for as long as it's needed, and no longer).

> > it detects new garbage immediately after it becomes garbage, which keeps the resource usage at a minimum.

> Yes, but that comes at a cost to performance.

I was talking not only about memory, but also about other (more limited) resources like file descriptors. But yes, for memory there is a time/space tradeoff: reference counting minimizes memory usage, but for that it has to do more work by constantly freeing and allocating memory.

> Tracing indeed requires more memory, but it automatically uses that memory to reduce the cost of memory management.

Which occupies memory which could be used instead to reduce other costs. For instance, this extra memory could have been used by the operating system's disk cache.


> in other cases, it will be a well defined set of places and their corresponding threads

But that's also true for GC, only the set of places and threads is larger :) So I agree that refcounting is a more deterministic garbage collection than tracing, but it's still not deterministic. Deterministic is when the "set of places and threads" is of size 1.

> Which occupies memory which could be used instead to reduce other costs. For instance, this extra memory could have been used by the operating system's disk cache.

This is not quite true. A given machine, with a given set of processors P, running application X, has a certain processing capacity beyond which it cannot go. To reach that maximum capacity it has a certain working set size, M(P, X), that it can put in RAM. It can cache it from disk or from the network, but increasing that cache beyond that point won't help you with performance any more (especially for distributed caching, where the distributed workload would just cause lots of cache invalidation). Caching anything beyond M would not help (and perhaps could even hurt). But you can increase RAM by any amount beyond M to reduce the cost of memory management almost arbitrarily low with a tracing GC (the increased throughput may increase M, too, so that more memory could be put to other uses, not less).

This is why so many people want a low-latency GC. You can get machines with 6TB RAM (probably even more now). Very few applications can benefit from an effective working set of that size (unless they have 500 cores or something).

Of course, in environments where RAM is constrained tracing may not be the best choice.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: