If you do know that -- like you're writing a GC for a higher-level programming language -- you can write a precise GC instead and avoid all of the (very interesting) muckiness this article goes through.
I wrote an article on that a while back:
"Local variables can also be stored in registers, but we won't worry about this because registers are usually dedicated to local variables, and by the time our function is called they'll probably be saved on the stack anyway"
Ignoring things that 'usually' or 'probably' do not happen is not a good idea when programming, doubly so when doing systems programming. It is one thing when your application crashes, another when the OS crashes or becomes exploitable due to a bug.
If you want to implement a robust conservative garbage collector you will have to study the calling conventions of the language you write it in, will have to study the source of the compiler, and will have to study the release notes of every OS or compiler upgrade. For example:
- Rust moved from segmented stacks to conventional ones. That probably broke any code walking the stack.
- on some systems, locals can get stored below the stack pointer (http://en.m.wikipedia.org/wiki/Red_zone_(computing))
(Yes, this example code appears to be broken on x64)
The other way to improve GC if you are using C is to make sure your local allocations are in fact local; use alloca() and friends (but only for sub page sizes, possibly less if you are doing lots of local allocations). This will reduce the amount of GC required.
As far as the red zone goes, if you are in the GC then you can't be in another local function; unless you are running your GC pre-emptively (signals etc.), in which case you need to be much, much more careful.
Heh. No wonder it's 32 bit! You can only hope to get away with this on a register-poor ISA like i386.
I wonder if you could use setjmp to portably get access to register values?
2. Especially if "simple" means you don't have any interrupts or threads.
3. Writing anything is a lot easier if you carefully exclude "debugging" from "writing".
When you have GC bugs, they don't always occur in the collector itself, but rather when the code which relies on GC breaks the conventions that allow GC to work correctly (because that code is generated by a C compiler which doesn't understand your conventions, and so they are ensured by hand).
Here is the scenario: a local variable in the C program holds the only pointer to a heap object. The object is no longer needed, so the program overwrites the variable with NULL. Now the object is garbage, right? Wrong! The C compiler is optimizing, and has no clue about garbage collection (unlike, say, a proper Lisp compiler). It performs a data-flow analysis and comes to the conclusion that "ptr = NULL" is a dead assignment, because the variable has no "next use" in the program graph. The assignment is optimized away, and so a memory location, or a register, continues pointing to the object. Sometimes this happens in some top-level function, and so the memory stays occupied throughout some long-lived execution of lower-level functions. Only if you're lucky does the memory location or register get reused for something else, overwriting the stale pointer.
If you're using infinite lazy lists, and are counting on the prefix of these lists to be garbage-collected, as you extend the tail, spurious retention of the list head is the kiss of death to your program.
This is wrong; you have to take care of registers. The good news is that there's an easy way to do this that will work on a broad range of platforms: store the machine context to a jmp_buf that is on the stack. (You never use this jmp_buf for a longjmp, obviously.)
What is happening is that the string object has been reduced to its C fundamentals (pointer to data, and length), and the one and only variable ("str") which holds the original object from which these fundamentals are derived has no next use after that.
The fix in the patch is not very good; the real fix is to get rid of the nasty macro zstream_append_input2, and make it a real function.
The function still has to solve the problem of pinning the object while the low-level pointer is used, but it's more localized: we don't have to think about the interaction of the macro with the site where it is expanded (and whether the hack will always work).
In other words, don't write macros that break apart garbage-collected objects into low-level C pointers to their internals that are invisible to the collector. The macros spread dangerous code everywhere they are used.
"Out of scope" as defined how? Not by the C code, it can be in-scope in C but optimized away by the compiler... which seems best illustrated by dissecting the compiler's output.
> The fix in the patch is not very good; the real fix is to get rid of the nasty macro zstream_append_input2, and make it a real function.
That "real fix" doesn't actually fix anything, though. Don't get me wrong, macros are ugly (especially lowercase ones) and that's a fine refactoring to suggest, but it's not a fix.
> In other words, don't write macros that break apart garbage-collected objects into low-level C pointers to their internals that are invisible to the collector. The macros spread dangerous code everywhere they are used.
In this context, there's extremely little difference between a macro and a function that the compiler decides to inline.
Also, while this is a nice explanation, does anyone actually use GCs while writing code in pure C? I never found the idea of calling free() that troublesome. On the other hand, writing a GC in C for another language is obviously a good use case.
And yes there are definitely uses for a pure C garbage collector. For example, one can use it to detect memory leaks, if for some reason valgrind isn't around.
The D programming language currently uses a conservative garbage collector. Before version 1.0, false pointers were a serious problem. The most common causes of false pointers were:
- ASCII text;
- floating-point values;
- high-entropy (compressed/encrypted) data, e.g. image files.
Just 1 MB of random-ish data is enough to pin down allocations of 16 KB or bigger. And of course, it cascades through any false pointers in any pinned memory blocks, and so on.
Since D 1.0, the garbage collector remembers whether each memory block may contain pointers. So now it can tell that a given memory block either definitely contains no pointers, or may contain some. Even so, memory leaks due to false pointers on 32-bit are not unheard of. For personal use, I've had to write a tool to diagnose these memory leaks: https://github.com/CyberShadow/Diamond
Rainer Schuetze is working on improving the precision of the GC from one heap allocation block to one machine word, but there's still work to be done on this.
- non-discriminated unions mixing pointers and non-pointers
- linking with C or other languages (as you can't scan their stack frames precisely)
- void*, memcpy, or other ways to move around memory which might have pointers.
A language without all of the above would be of limited utility. However, it's possible to dramatically improve the situation using:
- precise heap scanning - by knowing the type of each object allocated;
- precise stack scanning - by knowing the stack layout of all functions at all points throughout their execution;
- GC hooks for user-defined union types.
I made a comment to that effect higher up before noticing this.
When you're dealing with real-world data - timings, statistics, sizes, dates - and especially floating-point numbers whose bit representations look like big integers, you will hit this a lot and leak memory, because of how mark and sweep works.
It is absolutely critical to have a sure way of disambiguating pointers from data.
I've fixed bugs in a memory allocator that's running in an app deployed on several hundred million PC's. It had a really fancy allocator that put a header with a magic number around malloc output, in addition to what you do, in an attempt to identify specific kinds of allocations. We'd see several tens of thousands of crashes a day due to magic number collision with real world data.
Your intuition is wrong. You can never assume that most local integer variables tend to have small values. Assign -3.0 to a float, and map that to an int, or read the current epoch time into an int, and see what happens.
A C garbage collector would certainly help write some code more easily, but it has to work, and it has to be "precise". An imprecise GC is worthless in production, but fun as an academic exercise.
"Thirdly. Please don't use this code. I did not intend for it to be wholly correct and there may be subtle bugs I did not catch."
I've always thought a compiler was difficult to write until I wrote my first one. Sure, it is nothing compared to a production-level compiler, but I definitely no longer think that there's something magical about compilers, even the more advanced ones.
I mean, to be a little flippant, all you need is abstraction and application, and you can compute all computable problems, right?
More to the point: what I was trying to do was to point out the irony of saying "building a garbage collector is easy" and then, only a few sentences later, saying "this code may have subtle bugs." Saying "this code is not fast enough for production" would be one thing; saying "this code is not correct" seems to put the lie to the assertion that building it is "easy."
I personally have never built a garbage collector in anger. But I suspect that the problem of building a passable garbage collector (that is, one that somebody else would be willing to use) is quite a bit harder than it is made out to be in this article.
This is a pedagogical technique. Your points are valid, but they ignore that the author is trying to teach concepts through code. That often requires simplification.
So it is with GC: you start with a hook into your allocator, and you end up with coloring references, aging them, disabling compiler optimizations, and all sorts of other things that lead to really unexpected behavior in your environment.
Maybe you should mention weak refs and volatile also.
volatile to keep pointers on the stack, so that you don't have to inspect the registers as well. And some easy ways to make it precise, e.g. int tagging, float boxing, ...
It scans heap memory, and if a word looks like it points into allocated memory, it's assumed to point into allocated memory, and that memory is marked as used.
When dealing with numbers, such as times, file sizes, etc, you will inevitably store an integer value that looks like a pointer into a valid block.
The reason you can do mark and sweep in Java is that there is a distinct wrapper for all pointers, so you can disambiguate pointers from data. You can't do that in this algorithm; it's impossible. This is also the same reason that disassembly of executables is so hard: it's not always clear what's data and what's instructions.
Even libgc has this issue and it is impossible to resolve in C.
That's not why the JVM can do this - the JVM knows the location of pointers in objects on the stack, and it only scans them. They aren't wrapped in any way.
In some of Oracle’s implementations of the Java Virtual Machine, a reference to a class instance is a pointer to a handle that is itself a pair of pointers: one to a table containing the methods of the object and a pointer to the Class object that represents the type of the object, and the other to the memory allocated from the heap for the object data.
Presumably the major difference in Java with regard to GC is that there's enough introspection left in the RTTI after compilation for the GC to identify internal references.