
Golang proposal: Dense mark bits and sweep-free allocation - rjammala
https://github.com/golang/proposal/blob/master/design/12800-sweep-free-alloc.md
======
vessenes
This would be great, and is greatly needed. but I have a question for the
performance junkies here:

    
    
        If the cost of atomically setting bits in the mark bitmap turns out to be 
      high, we could instead dedicate a byte per object for the mark. This idea is 
      mentioned in GC literature [Garner, 2007]. Obviously, this involves an 8× 
      increase in memory overhead. It’s likely that on modern hardware, the 
      cost of the atomic bit operation is small, while the cost of increasing 
      the cache footprint of the mark structure is probably large.
    

I have been trained to think of an atomic operation at the CPU level flushing
the cache line, and possibly L1 and further impacting all cores sharing the
cache. This statement implies the opposite. Can anyone comment on this vis-a-
vis a modern intel chip?

~~~
pslam
This sounds wrong to me, too. For the single core case, there's no such thing
as an atomic byte store - the underlying mechanism reads a cache line and
modifies it. The same goes for a bit operation (possibly needing more
instructions).

For the multi-core case, there's always going to be cache line thrashing
regardless of size. Cache lines will evict/transfer between cores as each
accesses them.

Probably best to go with single bit, unless it can be shown that the overhead
of generating a bit mask and the r/m/w is far more expensive.

~~~
danieldk
The referenced paper is a bit vague about this:

 _An alternative to side bitmaps is a bytemap, a bytegrained data structure on
the side. Bytemaps trade an 8× space overhead to avoid synchronizing on each
set operation (if we assume support for atomic byte-grained stores)._

I think the idea is that, since a bytemap is less dense, two cores less
frequently access the same cache lines, leading to fewer invalidations.
Though, I wonder if it pays off, since it also gives less locality.

~~~
yxhuvud
No, it is referring to the fact that the least unit that can be written to
memory isn't a bit but a byte. To write a bit the whole byte must be read,
modified and then written back.

~~~
danieldk
Sure, but in both cases if a mark needs to be changed 'the whole byte needs to
be read modified and written back', bitmaps or bytemaps.

------
shawnps
If you liked this you might also find Rick Hudson's talk on GC improvements in
Go 1.5 interesting:

[https://www.youtube.com/watch?v=aiv1JOfMjm0](https://www.youtube.com/watch?v=aiv1JOfMjm0)

[https://talks.golang.org/2015/go-gc.pdf](https://talks.golang.org/2015/go-
gc.pdf)

------
nulltype
So it looks from the diagram the spans point to arbitrary points in the mark
bitmaps. Does that mean there's a lookup table to map from span to mark
bitmap/offset?

