Hi, thank you for the in depth response. I really needed to hear these things. I have gone ahead and addressed almost all of the issues that you pointed out. I updated the root to be the only point for compare and swap and on mutations I do a clone instead of mutating the shared pointer. I updated set bit from xor to set and have an explicit clearbit fn for deletions. I also updated k/v aliasing to copy slices on write so that old copies do not share ref to same slice. The node pool has been removed completely for the time being as well. I added in additional tests to test for these cases and added in more concurrent tests as well. this was huge feedback and I really appreciate it.
The point of enumerating all of those specific issues, wasn't to say "here are some bugs" and if you fix them you're good. It was to say "here are some examples of the much more fundamental problem", which seemed to be a fundamental misunderstanding of the language memory model and the guarantees offered by assignments, atomic.CompareAndSwap, etc., and those operations' interactions with package unsafe.
Looking at *only this code path* -- and there are *many more* --
---
Put
- Snapshots the current root pointer with atomic.LoadPointer
- Makes an updated root pointer via putRecursive, given the snapshotted root pointer
- Spins on a CAS of the root ptr and the updated ptr with runtime.Gosched() between attempts
---
atomic.LoadPointer isn't a real snapshot
- It's atomic only over the root ptr, not any interior field
- Those interior fields are mutated in-place via e.g. setBitmap, setChild, etc.
- Any goroutine can see partial data, violating the memory model, etc.
---
putRecursive is unsound
- copyNode performs a shallow copy, child pointers are shared, subsequent setChild, extendTable, etc. mutate nodes other goroutines can still hold -- this is a fundamental bug that seems to remain un-addressed from the previous review
- Those mutations use plain writes (no atomics/locks/etc.) -- data race, memory model violation, etc.
- Get later returns the internal []byte slice directly -- data race, memory model violation, etc.
- Newly created nodes are cast to unsafe.Pointer without an atomic store, bypassing the write barrier required by the GC
---
That compareAndSwap is unsound
- It compares only the root pointer, a shallow copy
- After a successful CAS other writers can still mutate any shared children (see above), so readers following any shared path see data races, memory model violations, etc.
- The retry loop can livelock, details elided
---
The implementation still seems to confuse "atomic pointer swap" with "atomic update of a complex, shared value", misunderstands the requirements of the Go memory model, and consistently mis-uses unsafe.Pointer.
tl;dr here is probably to just stop using package unsafe altogether, until you have some time to properly understand its semantics, requirements, and limitations...
Hey, this is going to sound crazy, but I have been looking for someone to critique my code with as much care as you have and give real genuine feedback. I am going to take your input as learning experience.
Understood, and again, thank you for picking apart my code. I will take some time to fully understand Go mem model and unsafe package before trying to tackle this problem again. In the meantime, do you have any resources I could take a look at to better my understanding?