> our running programs may exceed the amount of space we want them to take
The majority of real-time systems are the small embedded ones where both speed and size are usually highly constrained, so this doesn't look as useful as it may seem. It's well known that GC overhead decreases with increasing available memory, so the result shouldn't be so surprising. A relevant phrase I've heard is "garbage collection is free only if memory is worth nothing."
Limiting the amount of work the GC can do also means that some quite subtle bugs can arise from exceeding the "allocation rate" that it can handle, which is something that might not be as easy to determine in GC-using code.
> but not the amount of space that we estimate they could possibly take
I think it's a bad sign when the word "estimate" is used in talking about a real-time system... this is an area of guarantees and proofs, not educated guesses.
It is not only useful for embedded systems with hard real-time constraints but may also, for example, improve server systems with (very) soft real-time constraints by reducing response-time jitter due to garbage collection cycles.
> It turns out that RTGC is very real, and research on it is very active.
I remember being promised the same stuff at JavaOne in 1997. It's a solved problem, we're just waiting on an implementation.
We're still waiting.
That doesn't mean any of this stuff isn't interesting (though I gotta be honest it's getting less so for me every year), or that such a thing is impossible. But that title is dancing around a rather different definition for "real" than most of us use.
At least a commercial JVM with a nonblocking GC is available and people are using it for real time stuff in the finance industry - http://www.azulsystems.com/
I think more languages should adopt Swift and Objective-C's automatic reference counting. The only downside to ARC is retain cycles, which rarely happen in my experience. A deterministic object life cycle just feels right.
Automatic reference counting has its own issues:
1) allocating short-lived objects is not cheap, because they call down into an underlying malloc/free;
2) as a consequence of (1), throughput is lower;
3) getting acceptable overhead for manipulating references in the heap/stack requires compiler optimizations to elide those reference counts, which adds a layer of complexity to your mental model of the program;
4) implementing the reference-count operations in a multi-core system requires pretty heavy-weight atomic primitives, and race conditions can result in incorrect reference counts.[1]
One of the more interesting avenues of research, especially on mobile devices where you don't want to pay the power cost of a heap 2x larger than your live size, is the various hybrids of garbage collection and reference counting. A neat, easy-to-understand one is: http://users.cecs.anu.edu.au/~steveb/downloads/pdf/rcix-oops....
Also interesting is what you can do when you have more information about what pointers can point to (as in Rust). It's not so much that reference counting in Rust is cheap, but that the language offers a lot of tools to let you avoid reference counted pointers in the first place, in favor of references with static lifetime guarantees.
[1] What Swift or Obj-C does when you overwrite a pointer-valued field is an objc_release() for the old pointer and an objc_retain() for the new pointer. If two threads write to a field at the same time, you can corrupt the reference count (even if the objc_release()/objc_retain() operations are themselves atomic!) As far as I know, Apple's obj-c runtime does not attempt to handle this situation. See: http://clang.llvm.org/docs/AutomaticReferenceCounting.html#o....
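To make that failure mode concrete, here is a rough C++ analogue, using std::shared_ptr as a stand-in for an ARC reference (an illustrative assumption; this is not the Objective-C runtime). The control-block increments and decrements are atomic, but two threads assigning to the same pointer-valued field is still a data race, which is exactly the corrupted-count scenario described above:

```cpp
#include <memory>
#include <thread>

struct Widget { int value = 0; };

struct Holder {
    // Plays the role of the pointer-valued field; assignment does the moral
    // equivalent of "release the old pointer, retain the new one".
    std::shared_ptr<Widget> field;
};

int main() {
    Holder h;
    h.field = std::make_shared<Widget>();

    // Both writers race on h.field itself. Even though the reference-count
    // updates inside shared_ptr are atomic, the read-old/write-new sequence
    // is not, so the old object can be released twice or never.
    std::thread t1([&] { h.field = std::make_shared<Widget>(); });
    std::thread t2([&] { h.field = std::make_shared<Widget>(); });
    t1.join();
    t2.join();
}
```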
I prefer well known memory lifetimes, but I'm undecided about ARC (or somewhat equivalently: std::shared_ptr, etc.). It can be a little confusing and overwrought at times, and I don't think the lifetimes are as clear as they could be.
IMO, more languages shouldn't assume that there is a single memory allocator. That's one of the worst assumptions I see in systems languages -- even C++ (before C++11) got this entirely wrong. Swift gets this wrong, Go gets this wrong. Rust is probably a little better because it actually has a static idea of memory scope, but I haven't seen a way to swap out the memory allocator in various contexts.
Most projects I've been involved with have used region/arena allocators. Not only do you mostly avoid the non-determinism of your average GC, but you avoid the hassle of fine-grained reference counting (in most cases). This relies on you choosing the scopes for your regions appropriately, but there usually is a clear scope to attach things to (e.g. frames, iterations of an event loop, etc.).
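A minimal sketch of that region/arena idea, assuming a per-frame scope (the FrameArena name and sizes are illustrative, not from any particular project):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class FrameArena {
public:
    explicit FrameArena(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Bump-pointer allocation: O(1), no per-object header, no free list.
    // `align` is assumed to be a power of two.
    void* allocate(std::size_t size, std::size_t align = alignof(std::max_align_t)) {
        std::size_t aligned = (offset_ + align - 1) & ~(align - 1);
        if (aligned + size > buffer_.size()) return nullptr;  // arena exhausted
        offset_ = aligned + size;
        return buffer_.data() + aligned;
    }

    // "Free everything" in O(1): no destructor walk, no fragmentation.
    void reset() { offset_ = 0; }

private:
    std::vector<std::uint8_t> buffer_;
    std::size_t offset_;
};

// Usage: one arena per frame (or per event-loop iteration), reset at the end.
// FrameArena arena(1 << 20);
// while (running) { /* allocate scratch data from `arena` */ arena.reset(); }
```

The caveat, of course, is that anything placed in the arena must not need a non-trivial destructor, since reset() never runs them.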
Yes, but (according to the spec) it was stateless -- which made it worthless for the purposes I'm describing. The original purpose was to support custom static strategies for allocation (e.g. i86 near/far pointers or one memory pool for one kind of object for the whole program), not dynamic memory pools and arenas. C++11 fixed that, which is why I mentioned it.
That said, by C++03, most STL implementations supported stateful allocators (and that's what I've used when I've had to use C++), but the standard took a while to catch up.
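For what that looks like today: C++17's <memory_resource> is one standardized descendant of the stateful-allocator idea, where containers carry a pointer to an allocator object rather than a stateless type. A minimal sketch:

```cpp
#include <cstddef>
#include <memory_resource>
#include <string>
#include <vector>

int main() {
    std::byte buffer[4096];
    // A stateful allocator backed by a local buffer; everything it hands out
    // is released in bulk when the resource is destroyed.
    std::pmr::monotonic_buffer_resource arena(buffer, sizeof(buffer));

    // Both the vector and its strings allocate from `arena`, not the global heap.
    std::pmr::vector<std::pmr::string> names(&arena);
    names.emplace_back("allocated from the arena");
}
```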
Sadly, though, I haven't been able to find any mention of if or when it will be realized. But the sentiment - from what I've gleaned from discussions on it - seems to be that something like that should/needs to happen.
As far as I know, ARC (like manual memory management) can lead to release cascades that can also cause application delays, e.g. in games. Something you have to watch for. malloc() and free() and their equivalents do quite a bit of work under the hood (check today's linked tcmalloc article, for example) that takes time.
Release cascades can be a big problem even with plain old RAII in C++. I once diagnosed a performance issue in a C++ network server where the server listening thread would sometimes temporarily hang when clients disconnected. The culprit turned out to be the destructor of a large std::map that directly and indirectly accounted for tens of millions of heap-allocated objects that had to be individually destructed and freed. It's very rare for destructors to have intentional global side effects, so this kind of work could almost always have been done asynchronously, at least in principle.
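A sketch of the asynchronous version of that cleanup, assuming (as noted above) that none of the destructors have side effects that must run on the listening thread; the BigMap alias and discard_async name are illustrative:

```cpp
#include <map>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Stand-in for the large map of heap-allocated objects described above.
using BigMap = std::map<std::string, std::unique_ptr<std::string>>;

void discard_async(BigMap&& victim) {
    // Move the doomed container into a background thread; the millions of
    // individual destructor calls and frees happen there, not on the
    // latency-sensitive thread that noticed the disconnect.
    std::thread([m = std::move(victim)]() mutable { m.clear(); }).detach();
}
```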
An unfortunately common symptom of large C++ applications is that they take forever to shut down because they insist on calling destructors on everything in the known universe. In reality, most applications only have a tiny handful of resources that you truly have to release yourself at shut down, and memory certainly is not one of them.
If the mass objects have trivial (ignorable) destructors and don't own any heap-allocated subobjects then it's not a problem. You can have top-level objects like levels or documents or sessions own everything directly and indirectly under them and allocate those things out of a private heap that can ideally be reused but otherwise be bulk freed in a few calls. Unfortunately, the standard C++ philosophy around RAII and value semantics works against this. Since there is something to be said for the elegance and upfront convenience of the value semantics approach, it comes down to a trade-off.
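A sketch of that "top-level object owns a private heap" shape in C++ terms (the Document type and the use of std::pmr here are illustrative assumptions, not a prescription):

```cpp
#include <memory_resource>
#include <string>
#include <vector>

class Document {
public:
    Document() : items_(&pool_) {}
    void add(const char* text) { items_.emplace_back(text); }
    // Destroying the Document releases pool_'s chunks in a few deallocations,
    // instead of walking every individually heap-allocated subobject.
private:
    std::pmr::unsynchronized_pool_resource pool_;  // the document's private heap
    std::pmr::vector<std::pmr::string> items_;     // everything it owns lives in pool_
};
```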
So, if I design a language, how do I design it well? I suspect it is easy to kill all plain values (ints, strs, etc.) and arrays/lists of them immediately. But how do I know which objects "don't own any heap-allocated subobjects"? (Sorry to ask if this is obvious; I don't have deep experience in these matters.)
Aren't circular references the real problem? And what about when an object references a resource like a file/handle/database/etc.?
So, could it be a good idea to mark objects like this (maybe in separate areas of memory?):
Instant kill: ints, strs, bools, arrays of any of these
Safe destructors: anything that holds a resource (file, handle)
But I don't know what to do for objects like Customer.Orders = List<Orders>[Order1.Customer]
This is not necessarily a language-level decision (except for GC: incremental GC helps in this case by spreading the deallocations out over time).
You just have to be aware of what happens in the destructor when you manually free an object. Libraries can make knowing this more difficult.
I guess you could build incremental alloc/release pools (even with reuse) or such things but it comes down to being aware of the problem as described and avoiding cascaded releases.
And to be honest this is not a super common problem but it can happen.
There is also this paper by Bacon et al that develops a framework that shows the relationship between refcount and tracing GC (duals of each other) plus the various optimizations of the two: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.143....
There are certainly classes of mobile apps that need the guarantees of lower/deterministic memory usage. In my experience, that is not the common app being written however. These concerns seem largely like fantasies not backed up by any concrete evidence for your everyday twitter client/mail app/weather/whatever. Generic list-based apps simply do not need to be acting as if you need to squeeze every ounce out of the processor/ram anymore. They would instead benefit much more from not having to worry about whether self in the closure you're creating should be weak or not. Especially when you consider that many (most?) of the apps on the App Store that are actually pushing things to the limit -- games -- are running in an (old) GC environment (C# in Unity).
I would be quite surprised if most App Store games were created with Unity - it's expensive and (comparatively) cumbersome to develop in, especially if you're just using 2D graphics (even with the fancy new Unity2D stuff). I can't find any stats on mobile game engine use though, which is a shame because I think it would be interesting to know.
When creating Unity3D games, you have to be reasonably careful about garbage collection - lots of nasty performance dips result if you assume that it just works. Generally you have to cobble together a mixture of object pools and statements attempting to force collection at a convenient point in the game flow amongst other things. It would be quite nice if you could turn it off for specific portions of code :)
Unity uses an older Mono (2.10 I believe) which has a poor GC that's not even deterministic, so you can easily suffer memory leaks. It is also not generational. Newer versions of Mono come with a much better GC.
Ref counting is not deterministic. First of all, because while looping on an atomic reference is non-blocking, it is not wait-free. Malloc/free are not deterministic either; they can be quite expensive, and while real-time implementations exist, that's not the malloc/free you end up using. With a generational GC, allocating new objects involves just incrementing a pointer, and deallocating short-lived objects happens in bulk, so the cost for short-lived objects is similar to stack allocation. With ref counting, memory gets fragmented; apps like Firefox suffered from fragmentation for years, and it took a sustained and very significant effort to fix it.
This can only be solved by either doing stack allocation or by building object pools. And this is for people that know what they are doing.
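A minimal object-pool sketch along those lines (illustrative only, not tied to any particular engine or runtime): all storage is acquired up front, so the hot path never touches malloc/free.

```cpp
#include <cstddef>
#include <vector>

template <typename T>
class ObjectPool {
public:
    // All slots are allocated once, here. T is assumed default-constructible.
    explicit ObjectPool(std::size_t count) : storage_(count) {
        free_.reserve(count);
        for (auto& slot : storage_) free_.push_back(&slot);
    }

    // O(1), no heap allocation; returns nullptr when the pool is exhausted.
    T* acquire() {
        if (free_.empty()) return nullptr;
        T* obj = free_.back();
        free_.pop_back();
        return obj;
    }

    // O(1), no heap free; the slot is simply made available again.
    void release(T* obj) { free_.push_back(obj); }

private:
    std::vector<T> storage_;  // pre-allocated slots
    std::vector<T*> free_;    // currently available slots
};
```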
Rust is the only language (that I know of) that tries to solve this with the ownership concept in the language, but then Rust will have problems implementing immutable data structures that can be shared amongst threads - data structures that do structural sharing - so people will expect reads to scale, except there will be non-obvious contention happening due to the use of reference counting.
The latency in state-of-the-art mainstream GCs is also NOT arbitrarily variable. Good garbage collectors allow you to control the max latency and frequency of STW pauses. That is not the issue for real-time requirements - the issue with real-time being that STW is unacceptable. But so is reference counting.
On the memory requirements, I don't buy it. My Android phone has more memory and more CPU capacity than my computer did 7 years ago. And I'm not seeing a difference in behavior compared to my iPad. Surely for games it's pretty bad to drop frames, but most apps are not games, and games are using highly optimized engines built in C++ anyway.
I do agree with this critique. Object allocation patterns that suck down space in a GC'ed system equally abuse a refcounting system; it's just that you're paying in CPU time rather than memory.
What algorithm are you referring to when you say "better throughput at the expense of significantly increased memory usage"?
Also, why is unpredictable latency acceptable for you on servers but not phones? Wouldn't a latency spike on remote requests from an application degrade user experience just as much as if that latency were localized to the phone?
> What algorithm are you referring to when you say "better throughput at the expense of significantly increased memory usage"?
Here's a discussion of the particular paper behind that statement: [1]
I'd also like to recommend the paper "A Unified Theory of Garbage Collection" [2] that breaks down the divide between GC and refcounting. There is a lot of gray area between tracing GC and refcounting. You can make different tradeoff decisions in different parts of your collection algorithm. But fundamentally it breaks down to time-space tradeoffs—you want to save time and get throughput, you're gonna eat some extra space.
> Also, why is unpredictable latency acceptable for you on servers but not phones? Wouldn't a latency spike on remote requests from an application degrade user experience just as much as if that latency were localized to the phone?
We as developers make UI efforts to mitigate network unreliability (fallacy #1 of distributed computing: the network is reliable) so it's ok if a server is being temporarily shitty. It's a lot harder to keep responsive, smooth UI behavior in the face of dropped frames and long GC pauses.
I'd love to try implementing C4 or Metronome or Staccato, and I have some interesting applications for them too. The problem is all this development is sponsored by Azul, Microsoft, and IBM. All the Bacon papers have corresponding IBM patents, all the Petrank papers have corresponding Microsoft patents. It sucks. Even if I come up with something new, it'll look like a slight variant on something someone else has done and I don't want to get sued!
One of the reasons why Lisp is fading into irrelevance is that GC is obsolete tech. You should be using RAII, smart pointers, or ARC for all dynamic-lifetime resource management depending on language and toolset for completely deterministic object lifetimes.
This is also a factor in why Android phones are orders of magnitude slower than equivalently-specced iPhones, even with the new ART.
> RAII, smart pointers and ARC aren't free lunches, which is why people had moved from them to GC in the first place.
Yet GC solves 10% of the problem that RAII solves. Tbh I would call them "similar". GC has the advantage of being much more tolerant of sloppy programming, but if you compare the complexity of GC with the complexity of RAII, I'd say RAII has a clear advantage.
If this was true, an Android phone would refresh less than once per second. How about "why Android phones are maybe 20% slower" or some other number actually within the bounds of reason?
It's funny that whenever Apple does something, it shifts fashion. Reference counting is suddenly hip, when garbage collectors have been invented to solve problems with reference counting and those problems are still there with ARC.
I'm waiting for the day when Apple will introduce an optional GC as an alternative to ARC on iOS, along with proclaiming red as the new purple.
Apple did introduce optional GC into its Mac OS X runtime. It received little uptake and was subsequently removed because Apple and its developer base realized ARC was the better solution.
If your code is leaking because of reference cycles, maybe that's a problem with your code, not ARC. (I.e., structure your data as an acyclic graph or tree and/or use weak references rather than expecting the runtime to compensate for your sloppiness.)
There are plenty of situations in which reference cycles would be perfectly sensible (and not at all "sloppy") ways to organize your data if it weren't for reference counting.
For example: suppose you are writing some code that analyses a strategy game. You have (let's say) an object representing a position in the game; each position object has references to the positions you can move to from there. If repeated positions are possible, then you have a reference cycle.
For example: you have a tree-like structure in which things contain other things. It's convenient for each thing to have a reference to its parent, and for each thing to have references to its children. Boom, reference cycles everywhere.
For example: you are implementing a programming language that has closures. So each closure object has a reference to its lexical environment, and some of the things in that environment may themselves be closures defined in that environment. Reference cycles. (The same happens if you have classes, and methods have references to the class where they're defined.)
Of course, all these things can be avoided. Often you can pick some subset of the references (e.g., the parent pointers) and make them weak references, or you can restructure your code to make some of the references go away (e.g., positions don't have references to their successors, they have methods/functions for generating them). But the only reason to do those things is that you need to avoid reference cycles. Refcounting isn't (in these cases) kindly helping you improve your code by forcing you to avoid sloppy constructs; it's taking what would otherwise be perfectly sensible code, making it sloppy, and then forcing you to do something else to avoid the resulting memory leaks.
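For the parent/child case specifically, the workaround looks roughly like this, using C++ smart pointers as a stand-in for ARC references (strong children, weak parent):

```cpp
#include <memory>
#include <vector>

struct Node {
    std::weak_ptr<Node> parent;                   // weak: does not keep the parent alive
    std::vector<std::shared_ptr<Node>> children;  // strong: keeps the children alive
};

int main() {
    auto root  = std::make_shared<Node>();
    auto child = std::make_shared<Node>();
    child->parent = root;              // no cycle in the strong-reference graph
    root->children.push_back(child);
}   // root goes away -> children go away; nothing leaks
```

Which is exactly the point above: the weak link exists to appease the reference counter, not because the data is more naturally modeled that way.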
> Apple did introduce optional GC into its Mac OS X runtime. It received little uptake and was subsequently removed because Apple and its developer base realized ARC was the better solution.
You are telling the story wrong.
The reality was that the GC never worked properly, especially when mixing frameworks compiled with and without GC, leading to core dumps.
Then there was a list of corner cases caused by having C as part of Objective-C.
Apple did not introduce ARC because the GC saw little uptake.
They introduced it because the GC never worked properly, so devs had better things to do than debug GC core dumps, and because ARC is a better fit for the Objective-C semantics.
This site could be improved for narrower viewports by changing the .site width to max-width. Combine with something like `padding: 0 0.5em;` for better results.
I feel like GC is an evolutionary dead end. Too much indeterminism and action-at-a-distance. Rust and virtual reality are two clear indicators to me that we're exiting the age of GC; the opposition isn't just Apple anymore, though they dealt the first blow.
The Rust developers will be the first to tell you that if you can get away with writing your application in a GC'd language, you probably should. Garbage collectors really are fantastic pieces of technology if you aren't heavily bound by memory or speed constraints.
Rust exists to fulfill a need in the domains where GCs are not acceptable, not to obviate GCs entirely. It merely offers the industry a viable alternative to the typical ravages of rampant manual memory mismanagement.
TIL Rust will be single-handedly used to rewrite every single GC'd program in existence, and furthermore only VR systems like Oculus will be relevant after that, so why bother dealing with any of the actually-hard problems of the real world and improving the existing technology we'll have today, and for years to come?
Oculus Rift and friends. VR needs extremely consistent latency and has very tight deadlines. Ideally, you render 90 FPS and never miss a frame.
GC pauses are out of the question, so if you're going to use GC, you need to be real-time without any degenerate stop-the-world cases. The linked article ostensibly delivers this, but there's no real-world evidence. Given your tight time budget, you also don't want to pay the "GC tax". GC just makes your life harder in this environment.
There also isn't really any good reason to write a 3D engine in GC'd languages given a) existing C++ engines and b) sightings of Rust on the horizon.
How do regions fit into this? IIRC regions have deterministic lifetimes, and all values/objects allocated in a region are freed in bulk when the region is freed (so it seems to depend only on how complex the underlying allocation is).
A typical implementation of free(3) does not give real-time guarantees; whether you use GC or not, you need to know things about your allocator to have any kind of latency guarantees. I get a little disturbed when people treat malloc() and free() as if they were constant-time operations.
More exactly, malloc and free typically give you shitty time guarantees.
The usual thing people do in real-time systems is to bail out of any dynamic allocation: they know malloc is not good, they are afraid of or don't want to deal with a real-time GC, and/or they know that if they do all the allocation up front, they have a grasp on the worst case of their loop (I think the real answer is that it's a tradition).
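That tradition, sketched (the names and the sample-count bound are illustrative assumptions): everything is sized up front, and the loop body itself never allocates.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kMaxSamples = 4096;  // assumed worst-case bound, chosen up front

struct ControlLoopState {
    std::array<float, kMaxSamples> input{};   // fixed-size buffers, no heap
    std::array<float, kMaxSamples> output{};
};

void control_step(ControlLoopState& s) {
    // Deterministic work only: no malloc/free, no GC, no unbounded loops.
    for (std::size_t i = 0; i < kMaxSamples; ++i) {
        s.output[i] = 0.5f * s.input[i];
    }
}
```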