> our running programs may exceed the amount of space we want them to take
The majority of real-time systems are the small embedded ones where both speed and size are usually highly constrained, so this doesn't look as useful as it may seem. It's well known that GC overhead decreases with increasing available memory, so the result shouldn't be so surprising. A relevant phrase I've heard is "garbage collection is free only if memory is worth nothing."
Limiting the amount of work the GC can do also means that some quite subtle bugs can arise from exceeding the "allocation rate" that it can handle, which is something that might not be as easy to determine in GC-using code.
> but not the amount of space that we estimate they could possibly take
I think it's a bad sign when the word "estimate" is used in talking about a real-time system... this is an area of guarantees and proofs, not educated guesses.
It is not only useful for embedded systems with hard real-time constraints but may also, for example, improve server systems with (very) soft real-time constraints by reducing response-time jitter due to garbage collection cycles.
> It turns out that RTGC is very real, and research on it is very active.
I remember being promised the same stuff at JavaOne in 1997. It's a solved problem, we're just waiting on an implementation.
We're still waiting.
That doesn't mean any of this stuff isn't interesting (though I gotta be honest it's getting less so for me every year), or that such a thing is impossible. But that title is dancing around a rather different definition for "real" than most of us use.
At least a commercial JVM with a nonblocking GC is available and people are using it for real time stuff in the finance industry - http://www.azulsystems.com/
I think more languages should adopt Swift and Objective-C's automatic reference counting. The only downside to ARC is retain cycles, which rarely happen in my experience. A deterministic object life cycle just feels right.
Automatic reference counting has its own issues:
1) allocating short-lived objects is not cheap, because they call down into an underlying malloc/free;
2) as a consequence of (1), throughput is lower;
3) getting acceptable overhead for manipulating references in the heap/stack requires compiler optimizations to elide those reference counts, which adds a layer of complexity to your mental model of the program;
4) implementing the reference-count operations in a multi-core system requires pretty heavy-weight atomic primitives, and race conditions can result in incorrect reference counts.[1]
One of the more interesting avenues of research, especially on mobile devices where you don't want to pay the power cost of a heap 2x larger than your live size, is the various hybrids of garbage collection and reference counting. A neat, easy-to-understand one is: http://users.cecs.anu.edu.au/~steveb/downloads/pdf/rcix-oops....
Also interesting is what you can do when you have more information about what pointers can point to (as in Rust). It's not so much that reference counting in Rust is cheap, but that the language offers a lot of tools to let you avoid reference counted pointers in the first place, in favor of references with static lifetime guarantees.
[1] What Swift or Obj-C does when you overwrite a pointer-valued field is an objc_release() for the old pointer and an objc_retain() for the new pointer. If two threads write to a field at the same time, you can corrupt the reference count (even if the objc_release()/objc_retain() operations are themselves atomic!) As far as I know, Apple's obj-c runtime does not attempt to handle this situation. See: http://clang.llvm.org/docs/AutomaticReferenceCounting.html#o....
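To make that failure mode concrete, here is a rough C++ analogue, using std::shared_ptr as a stand-in for an ARC reference (an illustrative assumption; this is not the Objective-C runtime). The control-block increments and decrements are atomic, but two threads assigning to the same pointer-valued field is still a data race, which is exactly the corrupted-count scenario described above:

```cpp
#include <memory>
#include <thread>

struct Widget { int value = 0; };

struct Holder {
    // Plays the role of the pointer-valued field; assignment does the moral
    // equivalent of "release the old pointer, retain the new one".
    std::shared_ptr<Widget> field;
};

int main() {
    Holder h;
    h.field = std::make_shared<Widget>();

    // Both writers race on h.field itself. Even though the reference-count
    // updates inside shared_ptr are atomic, the read-old/write-new sequence
    // is not, so the old object can be released twice or never.
    std::thread t1([&] { h.field = std::make_shared<Widget>(); });
    std::thread t2([&] { h.field = std::make_shared<Widget>(); });
    t1.join();
    t2.join();
}
```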
I prefer well known memory lifetimes, but I'm undecided about ARC (or somewhat equivalently: std::shared_ptr, etc.). It can be a little confusing and overwrought at times, and I don't think the lifetimes are as clear as they could be.
IMO, more languages shouldn't assume that there is a single memory allocator. That's one of the worst assumptions I see in systems languages -- even C++ (before C++11) got this entirely wrong. Swift gets this wrong, Go gets this wrong. Rust is probably a little better because it actually has a static idea of memory scope, but I haven't seen a way to swap out the memory allocator in various contexts.
Most projects I've been involved with have used region/arena allocators. Not only do you mostly avoid the non-determinism of your average GC, but you avoid the hassle of fine-grained reference counting (in most cases). This relies on you choosing the scopes for your regions appropriately, but there usually is a clear scope to attach things to (e.g. frames, iterations of an event loop, etc.).
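A minimal sketch of that region/arena idea, assuming a per-frame scope (the FrameArena name and sizes are illustrative, not from any particular project):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class FrameArena {
public:
    explicit FrameArena(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Bump-pointer allocation: O(1), no per-object header, no free list.
    // `align` is assumed to be a power of two.
    void* allocate(std::size_t size, std::size_t align = alignof(std::max_align_t)) {
        std::size_t aligned = (offset_ + align - 1) & ~(align - 1);
        if (aligned + size > buffer_.size()) return nullptr;  // arena exhausted
        offset_ = aligned + size;
        return buffer_.data() + aligned;
    }

    // "Free everything" in O(1): no destructor walk, no fragmentation.
    void reset() { offset_ = 0; }

private:
    std::vector<std::uint8_t> buffer_;
    std::size_t offset_;
};

// Usage: one arena per frame (or per event-loop iteration), reset at the end.
// FrameArena arena(1 << 20);
// while (running) { /* allocate scratch data from `arena` */ arena.reset(); }
```

The caveat, of course, is that anything placed in the arena must not need a non-trivial destructor, since reset() never runs them.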
Yes, but (according to the spec) it was stateless -- which made it worthless for the purposes I'm describing. The original purpose was to support custom static strategies for allocation (e.g. i86 near/far pointers or one memory pool for one kind of object for the whole program), not dynamic memory pools and arenas. C++11 fixed that, which is why I mentioned it.
That said, by C++03, most STL implementations supported stateful allocators (and that's what I've used when I've had to use C++), but the standard took a while to catch up.
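For what that looks like today: C++17's <memory_resource> is one standardized descendant of the stateful-allocator idea, where containers carry a pointer to an allocator object rather than a stateless type. A minimal sketch:

```cpp
#include <cstddef>
#include <memory_resource>
#include <string>
#include <vector>

int main() {
    std::byte buffer[4096];
    // A stateful allocator backed by a local buffer; everything it hands out
    // is released in bulk when the resource is destroyed.
    std::pmr::monotonic_buffer_resource arena(buffer, sizeof(buffer));

    // Both the vector and its strings allocate from `arena`, not the global heap.
    std::pmr::vector<std::pmr::string> names(&arena);
    names.emplace_back("allocated from the arena");
}
```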
Sadly, though, I haven't been able to find any mention of if or when it will be realized. But the sentiment - from what I've gleaned from discussions on it - seems to be that something like that should/needs to happen.
As far as I know, ARC (like manual memory management) can lead to release cascades that can also cause application delays, e.g. in games. Something you have to watch for. malloc() and free() and their equivalents do quite a bit of work under the hood (check today's linked tcmalloc article, for example) that takes time.
Release cascades can be a big problem even with plain old RAII in C++. I once diagnosed a performance issue in a C++ network server where the server listening thread would sometimes temporarily hang when clients disconnected. The culprit turned out to be the destructor of a large std::map that directly and indirectly accounted for tens of millions of heap-allocated objects that had to be individually destructed and freed. It's very rare for destructors to have intentional global side effects, so this kind of work could almost always have been done asynchronously, at least in principle.
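A sketch of the asynchronous version of that cleanup, assuming (as noted above) that none of the destructors have side effects that must run on the listening thread; the BigMap alias and discard_async name are illustrative:

```cpp
#include <map>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Stand-in for the large map of heap-allocated objects described above.
using BigMap = std::map<std::string, std::unique_ptr<std::string>>;

void discard_async(BigMap&& victim) {
    // Move the doomed container into a background thread; the millions of
    // individual destructor calls and frees happen there, not on the
    // latency-sensitive thread that noticed the disconnect.
    std::thread([m = std::move(victim)]() mutable { m.clear(); }).detach();
}
```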
An unfortunately common symptom of large C++ applications is that they take forever to shut down because they insist on calling destructors on everything in the known universe. In reality, most applications only have a tiny handful of resources that you truly have to release yourself at shut down, and memory certainly is not one of them.
If the mass objects have trivial (ignorable) destructors and don't own any heap-allocated subobjects then it's not a problem. You can have top-level objects like levels or documents or sessions own everything directly and indirectly under them and allocate those things out of a private heap that can ideally be reused but otherwise be bulk freed in a few calls. Unfortunately, the standard C++ philosophy around RAII and value semantics works against this. Since there is something to be said for the elegance and upfront convenience of the value semantics approach, it comes down to a trade-off.
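A sketch of that "top-level object owns a private heap" shape in C++ terms (the Document type and the use of std::pmr here are illustrative assumptions, not a prescription):

```cpp
#include <memory_resource>
#include <string>
#include <vector>

class Document {
public:
    Document() : items_(&pool_) {}
    void add(const char* text) { items_.emplace_back(text); }
    // Destroying the Document releases pool_'s chunks in a few deallocations,
    // instead of walking every individually heap-allocated subobject.
private:
    std::pmr::unsynchronized_pool_resource pool_;  // the document's private heap
    std::pmr::vector<std::pmr::string> items_;     // everything it owns lives in pool_
};
```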
So, if I design a language, how do I design it well? I suspect it is easy to kill all plain values (ints, strs, etc.) and arrays/lists of them immediately. But how do I know which objects "don't own any heap-allocated subobjects"? (Sorry to ask if this is obvious; I don't have deep experience in these matters.)
Aren't circular references the real problem? And what about when an object references a resource like a file/handle/database/etc.?
So, could it be a good idea to mark objects like this (maybe in separate areas of memory?):
Instant kill: ints, strs, bools, arrays of any of these
Safe destructors: anything that holds a resource (file, handle)
But I don't know what to do for objects like Customer.Orders = List<Orders>[Order1.Customer]
This is not necessarily a language-level decision (except for GC: incremental GC helps in this case by spreading the deallocations out over time).
You just have to be aware of what happens in the destructor when you manually free an object. Libraries can make knowing this more difficult.
I guess you could build incremental alloc/release pools (even with reuse) or such things but it comes down to being aware of the problem as described and avoiding cascaded releases.
And to be honest this is not a super common problem but it can happen.
There is also this paper by Bacon et al that develops a framework that shows the relationship between refcount and tracing GC (duals of each other) plus the various optimizations of the two: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.143....
There are certainly classes of mobile apps that need the guarantees of lower/deterministic memory usage. In my experience, that is not the common app being written however. These concerns seem largely like fantasies not backed up by any concrete evidence for your everyday twitter client/mail app/weather/whatever. Generic list-based apps simply do not need to be acting as if you need to squeeze every ounce out of the processor/ram anymore. They would instead benefit much more from not having to worry about whether self in the closure you're creating should be weak or not. Especially when you consider that many (most?) of the apps on the App Store that are actually pushing things to the limit -- games -- are running in an (old) GC environment (C# in Unity).
I would be quite surprised if most App Store games were created with Unity - it's expensive and (comparatively) cumbersome to develop in, especially if you're just using 2D graphics (even with the fancy new Unity2D stuff). I can't find any stats on mobile game engine use though, which is a shame because I think it would be interesting to know.
When creating Unity3D games, you have to be reasonably careful about garbage collection - lots of nasty performance dips result if you assume that it just works. Generally you have to cobble together a mixture of object pools and statements attempting to force collection at a convenient point in the game flow amongst other things. It would be quite nice if you could turn it off for specific portions of code :)
Unity uses an older Mono (2.10 I believe) which has a poor GC that's not even deterministic, so you can easily suffer memory leaks. It is also not generational. Newer versions of Mono come with a much better GC.
Ref counting is not deterministic. First of all, because while looping on an atomic reference is non-blocking, it is not wait-free. Malloc/free are not deterministic either; they can be quite expensive, and while real-time implementations exist, that's not the malloc/free you end up using. With a generational GC, allocating new objects involves just incrementing a pointer, and deallocating short-lived objects happens in bulk, so the cost for short-lived objects is similar to stack allocation. With ref counting, memory gets fragmented; apps like Firefox suffered from fragmentation for years, and it took a sustained and very significant effort to fix it.
This can only be solved by either doing stack allocation or by building object pools. And this is for people that know what they are doing.
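A minimal object-pool sketch along those lines (illustrative only, not tied to any particular engine or runtime): all storage is acquired up front, so the hot path never touches malloc/free.

```cpp
#include <cstddef>
#include <vector>

template <typename T>
class ObjectPool {
public:
    // All slots are allocated once, here. T is assumed default-constructible.
    explicit ObjectPool(std::size_t count) : storage_(count) {
        free_.reserve(count);
        for (auto& slot : storage_) free_.push_back(&slot);
    }

    // O(1), no heap allocation; returns nullptr when the pool is exhausted.
    T* acquire() {
        if (free_.empty()) return nullptr;
        T* obj = free_.back();
        free_.pop_back();
        return obj;
    }

    // O(1), no heap free; the slot is simply made available again.
    void release(T* obj) { free_.push_back(obj); }

private:
    std::vector<T> storage_;  // pre-allocated slots
    std::vector<T*> free_;    // currently available slots
};
```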
Rust is the only language (that I know of) that tries to solve this with the ownership concept in the language, but then Rust will have problems implementing immutable data structures that can be shared amongst threads - data structures that do structural sharing - so people will expect reads to scale, except there will be non-obvious contention happening due to the use of reference counting.
The latency in state-of-the-art mainstream GCs is also NOT arbitrarily variable. Good garbage collectors allow you to control the max latency and frequency of STW pauses. That is not the issue for real-time requirements - the issue with real-time being that STW is unacceptable. But so is reference counting.
On the memory requirements, I don't buy it. My Android phone has more memory and more CPU capacity than my computer did 7 years ago. And I'm not seeing a difference in behavior compared to my iPad. Surely for games it's pretty bad to drop frames, but most apps are not games, and games are using highly optimized engines built in C++ anyway.
I do agree with this critique. Object allocation patterns that suck down space in a GC'ed system equally abuse a refcounting system; it's just that you're paying in CPU time rather than memory.
What algorithm are you referring to when you say "better throughput at the expense of significantly increased memory usage"?
Also, why is unpredictable latency acceptable for you on servers but not phones? Wouldn't a latency spike on remote requests from an application degrade user experience just as much as if that latency were localized to the phone?
> What algorithm are you referring to when you say "better throughput at the expense of significantly increased memory usage"?
Here's a discussion of the particular paper behind that statement: [1]
I'd also like to recommend the paper "A Unified Theory of Garbage Collection" [2] that breaks down the divide between GC and refcounting. There is a lot of gray area between tracing GC and refcounting. You can make different tradeoff decisions in different parts of your collection algorithm. But fundamentally it breaks down to time-space tradeoffs—you want to save time and get throughput, you're gonna eat some extra space.
> Also, why is unpredictable latency acceptable for you on servers but not phones? Wouldn't a latency spike on remote requests from an application degrade user experience just as much as if that latency were localized to the phone?
We as developers make UI efforts to mitigate network unreliability (fallacy #1 of distributed computing: the network is reliable) so it's ok if a server is being temporarily shitty. It's a lot harder to keep responsive, smooth UI behavior in the face of dropped frames and long GC pauses.
I'd love to try implementing C4 or Metronome or Staccato, and I have some interesting applications for them too. The problem is all this development is sponsored by Azul, Microsoft, and IBM. All the Bacon papers have corresponding IBM patents, all the Petrank papers have corresponding Microsoft patents. It sucks. Even if I come up with something new, it'll look like a slight variant on something someone else has done and I don't want to get sued!
One of the reasons why Lisp is fading into irrelevance is that GC is obsolete tech. You should be using RAII, smart pointers, or ARC for all dynamic-lifetime resource management depending on language and toolset for completely deterministic object lifetimes.
This is also a factor in why Android phones are orders of magnitude slower than equivalently-specced iPhones, even with the new ART.
> RAII, smart pointers and ARC aren't free lunches, which is why people had moved from them to GC in the first place.
Yet GC solves 10% of the problem that RAII solves. Tbh I would call them "similar". GC has the advantage of being much more tolerant of sloppy programming, but if you compare the complexity of GC with the complexity of RAII, I'd say RAII has a clear advantage.
If this was true, an Android phone would refresh less than once per second. How about "why Android phones are maybe 20% slower" or some other number actually within the bounds of reason?
It's funny that whenever Apple does something, it shifts fashion. Reference counting is suddenly hip, when garbage collectors have been invented to solve problems with reference counting and those problems are still there with ARC.
I'm waiting for the day when Apple will introduce an optional GC as an alternative to ARC on iOS, along with proclaiming red as the new purple.
Apple did introduce optional GC into its Mac OS X runtime. It received little uptake and was subsequently removed because Apple and its developer base realized ARC was the better solution.
If your code is leaking because of reference cycles, maybe that's a problem with your code, not ARC. (I.e., structure your data as an acyclic graph or tree and/or use weak references rather than expecting the runtime to compensate for your sloppiness.)
There are plenty of situations in which reference cycles would be perfectly sensible (and not at all "sloppy") ways to organize your data if it weren't for reference counting.
For example: suppose you are writing some code that analyses a strategy game. You have (let's say) an object representing a position in the game; each position object has references to the positions you can move to from there. If repeated positions are possible, then you have a reference cycle.
For example: you have a tree-like structure in which things contain other things. It's convenient for each thing to have a reference to its parent, and for each thing to have references to its children. Boom, reference cycles everywhere.
For example: you are implementing a programming language that has closures. So each closure object has a reference to its lexical environment, and some of the things in that environment may themselves be closures defined in that environment. Reference cycles. (The same happens if you have classes, and methods have references to the class where they're defined.)
Of course, all these things can be avoided. Often you can pick some subset of the references (e.g., the parent pointers) and make them weak references, or you can restructure your code to make some of the references go away (e.g., positions don't have references to their successors, they have methods/functions for generating them). But the only reason to do those things is that you need to avoid reference cycles. Refcounting isn't (in these cases) kindly helping you improve your code by forcing you to avoid sloppy constructs; it's taking what would otherwise be perfectly sensible code, making it sloppy, and then forcing you to do something else to avoid the resulting memory leaks.
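For the parent/child case specifically, the workaround looks roughly like this, using C++ smart pointers as a stand-in for ARC references (strong children, weak parent):

```cpp
#include <memory>
#include <vector>

struct Node {
    std::weak_ptr<Node> parent;                   // weak: does not keep the parent alive
    std::vector<std::shared_ptr<Node>> children;  // strong: keeps the children alive
};

int main() {
    auto root  = std::make_shared<Node>();
    auto child = std::make_shared<Node>();
    child->parent = root;              // no cycle in the strong-reference graph
    root->children.push_back(child);
}   // root goes away -> children go away; nothing leaks
```

Which is exactly the point above: the weak link exists to appease the reference counter, not because the data is more naturally modeled that way.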
> Apple did introduce optional GC into its Mac OS X runtime. It received little uptake and was subsequently removed because Apple and its developer base realized ARC was the better solution.
You are telling the story wrong.
The reality was that the GC never worked properly, especially when mixing frameworks compiled with and without GC, leading to core dumps.
Then there was a list of corner cases caused by having C as part of Objective-C.
Apple did not introduce ARC because the GC saw little uptake.
They introduced it because the GC never worked properly, so devs had better things to do than debug GC core dumps, and because ARC is a better fit for the Objective-C semantics.
This site could be improved for narrower viewports by changing the .site width to max-width. Combine with something like `padding: 0 0.5em;` for better results.
I feel like GC is an evolutionary dead end. Too much indeterminism and action-at-a-distance. Rust and virtual reality are two clear indicators to me that we're exiting the age of GC; the opposition isn't just Apple anymore, though they dealt the first blow.
The Rust developers will be the first to tell you that if you can get away with writing your application in a GC'd language, you probably should. Garbage collectors really are fantastic pieces of technology if you aren't heavily bound by memory or speed constraints.
Rust exists to fulfill a need in the domains where GCs are not acceptable, not to obviate GCs entirely. It merely offers the industry a viable alternative to the typical ravages of rampant manual memory mismanagement.
TIL Rust will be single-handedly used to rewrite every single GC'd program in existence, and furthermore only VR systems like Oculus will be relevant after that, so why bother dealing with any of the actually-hard problems of the real world and improving the existing technology we'll have today, and for years to come?
Oculus Rift and friends. VR needs extremely consistent latency and has very tight deadlines. Ideally, you render 90 FPS and never miss a frame.
GC pauses are out of the question, so if you're going to use GC, you need to be real-time without any degenerate stop-the-world cases. The linked article ostensibly delivers this, but there's no real-world evidence. Given your tight time budget, you also don't want to pay the "GC tax". GC just makes your life harder in this environment.
There also isn't really any good reason to write a 3D engine in GC'd languages given a) existing C++ engines and b) sightings of Rust on the horizon.
How do regions fit into this? IIRC regions have deterministic lifetimes, and all values/objects allocated in a region are freed in bulk when the region is freed (so it seems to depend only on how complex the underlying allocation is).
A typical implementation of free(3) does not give real-time guarantees; whether you use GC or not, you need to know things about your allocator to have any kind of latency guarantees. I get a little disturbed when people treat malloc() and free() as if they were constant-time operations.
More exactly, malloc and free typically give you shitty time guarantees.
The usual thing people do in real-time systems is to bail out of any dynamic allocation: they know malloc is not good, they are afraid of or don't want to deal with a real-time GC, and/or they know that if they do all the allocation up front, they have a grasp on the worst case of their loop (I think the real answer is that it's a tradition).
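That tradition, sketched (the names and the sample-count bound are illustrative assumptions): everything is sized up front, and the loop body itself never allocates.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kMaxSamples = 4096;  // assumed worst-case bound, chosen up front

struct ControlLoopState {
    std::array<float, kMaxSamples> input{};   // fixed-size buffers, no heap
    std::array<float, kMaxSamples> output{};
};

void control_step(ControlLoopState& s) {
    // Deterministic work only: no malloc/free, no GC, no unbounded loops.
    for (std::size_t i = 0; i < kMaxSamples; ++i) {
        s.output[i] = 0.5f * s.input[i];
    }
}
```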