
Writing a Simple Garbage Collector in C - mbroncano
http://maplant.com/gc.html
======
pcwalton
Issues I found at a glance:

1\. This uses unsigned int for the chunk size, so the allocator will overflow
on requests of 4GB or more despite taking a size_t. It seems that this is
32-bit only.

2\. Even on 32-bit, the num_units calculation will overflow if you request
(for example) 0xffffffff bytes of memory instead of returning an error.

3\. None of this is thread-safe. It needs a global mutex lock.

4\. EBP cannot be relied upon to yield anything sensible with
-fomit-frame-pointer, which is common on 32-bit x86 as it brings the number of
GPRs from 6 to 7.
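
For points 1 and 2, an overflow-checked version of the unit calculation might look like the sketch below. It assumes a K&R-style allocator that rounds each request up to whole header-sized units plus one unit for the header itself; `header_t` and `units_for` are illustrative names, not necessarily the article's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header layout: size is a size_t rather than unsigned int,
   so the unit count is not truncated on 64-bit targets. */
typedef struct header {
    size_t size;            /* chunk size, in header-sized units */
    struct header *next;
} header_t;

/* Round a byte request up to whole units, plus one unit for the header.
   Returns 0 if the rounding would overflow; valid results are always >= 1. */
static size_t units_for(size_t nbytes)
{
    const size_t unit = sizeof(header_t);
    if (nbytes > SIZE_MAX - (unit - 1))
        return 0;           /* nbytes + unit - 1 would wrap around */
    return (nbytes + unit - 1) / unit + 1;
}
```

The caller would treat a result of 0 as an allocation failure and return NULL instead of handing out an undersized chunk.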

~~~
earenndil
> None of this is thread-safe. It needs a global mutex lock.

Or have a separate allocation chain for each thread.

It would increase fragmentation, but not by a lot, and probably increase
performance by more than enough to make up for it.

~~~
pcwalton
The problem is that you would still need to lock around sbrk.

~~~
earenndil
Fair enough. You would probably switch to mmap, though, which works across
threads.
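
A rough sketch of that idea, assuming a POSIX system with anonymous mmap and a C11 compiler: each thread bumps through its own mmap'd region, so the fast path takes no lock. This is a per-thread bump arena rather than a full free list, and `ARENA_BYTES`, `fresh_pages` and `thread_alloc` are made-up names.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define ARENA_BYTES ((size_t)1 << 20)   /* hypothetical refill size: 1 MiB */

/* One bump region per thread; no other thread ever touches these. */
static _Thread_local unsigned char *arena_cur;
static _Thread_local unsigned char *arena_end;

/* Ask the kernel for fresh zeroed pages. Unlike moving the break with
   sbrk, concurrent mmap calls do not need a user-level lock. */
static void *fresh_pages(size_t nbytes)
{
    void *p = mmap(NULL, nbytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

static void *thread_alloc(size_t nbytes)
{
    const size_t align = _Alignof(max_align_t);
    if (nbytes == 0 || nbytes > SIZE_MAX - (align - 1))
        return NULL;
    nbytes = (nbytes + align - 1) & ~(align - 1);

    if (nbytes > ARENA_BYTES)
        return fresh_pages(nbytes);      /* large request: its own mapping */

    if (arena_cur == NULL || (size_t)(arena_end - arena_cur) < nbytes) {
        unsigned char *p = fresh_pages(ARENA_BYTES);
        if (p == NULL)
            return NULL;
        arena_cur = p;                   /* tail of the old arena is abandoned */
        arena_end = p + ARENA_BYTES;
    }
    void *out = arena_cur;
    arena_cur += nbytes;
    return out;
}
```

A collector built on this would still need some way to find every thread's arenas and stacks at scan time, which this sketch leaves out.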

------
stevefan1999
This really reminds me of a project called TinyGC/tgc[0], made by Daniel
Holden who is currently a Ubisoft researcher. I have also tried his Cello[1]
framework for C99 which also incorporated a garbage collector similar to tgc.
Cello is pretty fun to use but the syntax was still limiting.

[0]: [https://github.com/orangeduck/tgc](https://github.com/orangeduck/tgc)

[1]:
[https://github.com/orangeduck/Cello](https://github.com/orangeduck/Cello)

------
RodgerTheGreat
As another example, here's a simple GC I wrote quite a while ago in Forth:
[https://github.com/JohnEarnest/Mako/blob/master/lib/Algorith...](https://github.com/JohnEarnest/Mako/blob/master/lib/Algorithms/Garbage.fs)

This one is a precise, copying GC, with a reserved arena for persistent
references into garbage-collected objects (in addition to the stacks).
Pointers are identified by reserving a high bit in words.

~~~
eru
In C that's hard to do in general, because people do pointer arithmetic,
and they sometimes exploit the fact that pointers often come aligned (eg
4-byte aligned), so they re-use the two low bits for various flags and mask
them out before de-referencing.

But compare
[https://news.ycombinator.com/item?id=19182779](https://news.ycombinator.com/item?id=19182779)
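
The low-bit trick looks roughly like this. It is a sketch that assumes a flat address space where converting a pointer to `uintptr_t` yields its address and allocations are at least 4-byte aligned; `tag_ptr`, `tag_of` and `untag_ptr` are illustrative names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TAG_MASK ((uintptr_t)0x3)   /* two low bits, free when alignment >= 4 */

/* Pack a 2-bit flag into the low bits of an aligned pointer. */
static uintptr_t tag_ptr(void *p, unsigned tag)
{
    uintptr_t bits = (uintptr_t)p;
    assert((bits & TAG_MASK) == 0);  /* pointer must be 4-byte aligned */
    assert(tag <= TAG_MASK);
    return bits | tag;
}

/* Read the flag back out. */
static unsigned tag_of(uintptr_t tagged)
{
    return (unsigned)(tagged & TAG_MASK);
}

/* Mask the flag bits off before de-referencing. */
static void *untag_ptr(uintptr_t tagged)
{
    return (void *)(tagged & ~TAG_MASK);
}
```

While the flags are set, the stored word is not the address of the object's first byte, so a collector that reserves its own bits in each word cannot tell the program's flags from its own.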

------
giomasce
This GC probably does not survive pointer scrambling. I believe in C I can
validly do something like

    
    
        int *ptr = ...;
        intptr_t iptr = (intptr_t) ptr;
        ptr = NULL;
        iptr ^= MAGIC;
        // do something else
        ptr = (int*) (iptr ^ MAGIC);
    

At the end of this, ptr is again a valid pointer to the same thing it was
pointing at in the beginning. However, if a GC scan happens during the "do
something else" block, it won't see the actual pointer value and it might free
the pointed-to object.

I don't think it is possible to write a GC for C if the program is allowed to
do this kind of thing, because there is too little structure at runtime. And
in any case, this kind of GC is not a GC "for C", as it heavily relies on
knowing the compiler internals.

EDIT: Re-reading, I didn't mean to be harsh. This is still interesting to
read, I am just noting a weakness that is not mentioned in the article. BTW, I
know that glibc actually does some pointer scrambling like I said to mitigate
some types of attack.
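
To make the failure concrete, here is a small sketch of a conservative reachability test. It assumes the collector treats any root word that falls inside an allocated block as a pointer to it; `block_is_reachable` and `MAGIC` are made-up names, and the root array stands in for the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAGIC ((uintptr_t)0x5a5a5a5a)   /* arbitrary illustrative mask */

/* Conservative test: does any word in roots[] look like an address
   inside the block [base, base + len)? */
static int block_is_reachable(const uintptr_t *roots, size_t nroots,
                              const void *base, size_t len)
{
    uintptr_t lo = (uintptr_t)base;
    uintptr_t hi = lo + len;
    for (size_t i = 0; i < nroots; i++)
        if (roots[i] >= lo && roots[i] < hi)
            return 1;
    return 0;
}
```

With `roots[0] = (uintptr_t)obj` the block is found; after `roots[0] ^= MAGIC` the same call reports it as unreachable, even though a second XOR recovers the original pointer.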

~~~
beefhash
You're _allowed_ to do that, but the standard makes few guarantees. C11, §
6.3.2.3(6) says the result of converting a pointer to an integer is
_implementation-defined_, and § 6.3.2.3(5) says of the reverse direction: “An
integer may be converted to any pointer type. Except as previously specified,
the result is _implementation-defined, might not be correctly aligned, might
not point to an entity of the referenced type, and might be a trap
representation_.”

I'd consider pointer scrambling to be a pathological case because of that.

Similarly, C11, § 6.5.11(2) constrains XOR to be only valid on integer types,
but not on pointer types, further suggesting that you're not really supposed
to be doing this.

~~~
hvdijk
intptr_t is special though. intptr_t has the property that any void pointer
can be converted to it and back again to produce the original value. This is
in the specification for intptr_t (7.20.1.4), not the general language rules,
so it is easy to miss. (Edit: GP used an int pointer, but the example can be
trivially modified.)

~~~
giomasce
Also, I believe you can access the object representation of the pointer and
scramble it. If you fix it up later, I believe it will have to represent the
original pointer, by C11 6.2.6.1 (4).
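
Something like the following sketch, where the scrambled bytes live in a plain `unsigned char` array so they are never read as a pointer until restored. `hidden_ptr`, `hide` and `reveal` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Holds the scrambled bytes of a pointer's object representation. */
typedef struct {
    unsigned char bytes[sizeof(void *)];
} hidden_ptr;

/* Copy out the pointer's bytes and XOR each one with key. */
static hidden_ptr hide(void *p, unsigned char key)
{
    hidden_ptr h;
    memcpy(h.bytes, &p, sizeof p);
    for (size_t i = 0; i < sizeof h.bytes; i++)
        h.bytes[i] ^= key;
    return h;
}

/* Undo the XOR and copy the original bytes back into a pointer object. */
static void *reveal(hidden_ptr h, unsigned char key)
{
    void *p;
    for (size_t i = 0; i < sizeof h.bytes; i++)
        h.bytes[i] ^= key;
    memcpy(&p, h.bytes, sizeof p);
    return p;
}
```

Between `hide` and `reveal` with the same nonzero key, no word in memory holds the original address, so a scanning collector has nothing to find.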

------
INTPenis
I had a co-worker who was a sysadmin and was writing a GC on his own time.
Just for fun. The guy was seriously overqualified but I guess he chose to
work with simple stuff for his own sanity.

~~~
chousuke
Might be he enjoys coding, but not software projects. Personally I can program
adequately when needed, but I ended up working as a sysadmin because I
determined I don't have the right temperament to deal with software projects.
I prefer dealing with systems as a whole instead of focusing on individual
pieces of software.

------
stevedekorte
It's great to see tutorials like this. FWIW, here's a tiny incremental (Baker's
treadmill) collector[0] library I wrote. It was used in the implementation of
the Io programming language.

[0]
[https://github.com/stevedekorte/garbagecollector](https://github.com/stevedekorte/garbagecollector)

------
jimbob45
Would love to see a Windows-based version of this article. No sbrk() or mmap()
in Windows makes the implementation a bit different.

~~~
userbinator
Windows has VirtualAlloc which is like mmap, but you're right that it has no
concept of a "break"; in fact, the stack of the main thread is _below_ (most
of) the heap in Windows, below the executable itself. In pseudopictorial form,

    
    
        Linux/most other *nix:
        | executable | libs | heap---> <--- stack |
    
        Windows:
        | <--- stack | heap | executable | heap | libs | heap---> |
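
For the page-acquisition part, a portable wrapper could be sketched as below, using VirtualAlloc on Windows and anonymous mmap elsewhere. `os_pages_alloc` and `os_pages_free` are made-up names, and a real allocator would likely round the size up to the page or allocation granularity first.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>

/* Reserve and commit zeroed pages in one call; NULL on failure. */
static void *os_pages_alloc(size_t nbytes)
{
    return VirtualAlloc(NULL, nbytes, MEM_RESERVE | MEM_COMMIT,
                        PAGE_READWRITE);
}

/* Returns 0 on success, -1 on failure. */
static int os_pages_free(void *p, size_t nbytes)
{
    (void)nbytes;                /* MEM_RELEASE requires a size of 0 */
    return VirtualFree(p, 0, MEM_RELEASE) ? 0 : -1;
}
#else
#include <sys/mman.h>

/* Map zeroed anonymous pages; NULL on failure. */
static void *os_pages_alloc(size_t nbytes)
{
    void *p = mmap(NULL, nbytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Returns 0 on success, -1 on failure. */
static int os_pages_free(void *p, size_t nbytes)
{
    return munmap(p, nbytes);
}
#endif
```

Because neither call grows a single contiguous region, the allocator has to track each mapping separately instead of assuming the heap is everything below the break.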

~~~
marvy
Does this picture change much between the 32-bit vs 64-bit world? How about if
there are few vs many threads? (For instance, if a program on 32-bit windows
spawns hundreds of threads, surely you can't squeeze all those stacks below
the heap, can you?)

~~~
userbinator
The 64-bit address space is much bigger and even more unpredictable when
there's ASLR, but in my experience the main thread's stack still ends up below
the executable; they're just much farther apart. I believe other threads'
stacks also fit somewhere below, but with 32-bit it will start allocating them
in areas that would've otherwise been heap once the area below the executable
runs out.

~~~
marvy
thanks!

------
matheusmoreira
Another great article on the subject:

[http://journal.stuffwithstuff.com/2013/12/08/babys-first-gar...](http://journal.stuffwithstuff.com/2013/12/08/babys-first-garbage-collector/)

