
Intrusive lists in Doom 3 - liquidmetal
http://gpfault.net/posts/intrusive-lists-doom3.txt.html
======
Narishma
They moved away from intrusive lists to vectors in the BFG edition for
performance reasons. More about that here:
[http://fabiensanglard.net/doom3_documentation/DOOM-3-BFG-
Tec...](http://fabiensanglard.net/doom3_documentation/DOOM-3-BFG-Technical-
Note.pdf)

~~~
btown
Very interesting post - it seems that, like many things in computing,
intrusive lists make a lot of sense _until_ you start thinking about cache
locality. Perhaps on machines of the time, branch prediction was slow enough
that conditional code would be a limiting factor... but on modern
architectures, it's all about that cache, 'bout that cache, no thrashing.

~~~
pmr_
You can strike the intrusive. Your comment is valid for all kinds of linked
lists.

~~~
forrestthewoods
Hell it's valid for everything period. Even vectors! Splitting struct a into
hot/cold chunks is pretty common. Minimize the sise of data being iterated on
to maximize cache usage. Can result in big wins.

~~~
btown
Huh, hadn't heard of hot/cold struct splitting before. This paper seems to
indicate it has a nontrivial effect even in Java code:
[http://research.microsoft.com/en-
us/um/people/trishulc/paper...](http://research.microsoft.com/en-
us/um/people/trishulc/papers/definition_distr.pdf)

------
ars
Instead of allocating each element of the list individually you can do this
one better.

Allocate an array of elements. The prev and next don't point to memory but to
indexes in the array.

How do you handle empty elements? Have a special flag that means "this element
is blank" and use prev and next to point to the empty elements in the list!
(So you are essentially storing two interleaved lists as the same time.)

Also store the first empty (and first full) element separately of course.

You will need to initialize the memory before you use it, filling all the prev
and next pointers for the empty elements.

When you "allocate" an element you just update the prev and next pointers of
all 5 elements, as needed.

If you think about it - malloc also needs to store a list of unused memory
areas, so despite being a bit more complex you can speed things up quite a bit
by doing this.

An optimization: Since filling all the empties can allocate memory, even if
you might never use it, have an optimization that if the next pointer is 0 (or
-1) that implicitly means the next array index, but an uninitialized index.

By doing this you can allocate a 1GB array without worry, and allow the OS to
only fault in pages as needed (i.e. there is no need to ever resize the
array).

~~~
quotemstr
That's great until you start caring about address space use or total process
commit charge. All you've done is build a dubious special-purpose heap
allocator.

------
Todd
This approach is more common in lower level software where each allocation is
carefully managed. The Linux kernel uses them, for example:

[http://www.makelinux.net/ldd3/chp-11-sect-5](http://www.makelinux.net/ldd3/chp-11-sect-5)

as does the Windows kernel:

[http://msdn.microsoft.com/en-
us/library/windows/hardware/ff5...](http://msdn.microsoft.com/en-
us/library/windows/hardware/ff563802%28v=vs.85%29.aspx)

~~~
valleyer
And BSD / OS X. Check out queue(3).

~~~
flohofwoe
And on AmigaOS, pretty much everything in Exec (the OS kernel) was held
together by doubly-linked-lists (of the intrusive kind). Coming from C I'd say
that std::list is the 'strange kind' of lists, and intrusive lists are the
'common kind'. On the other hand I haven't been using lists for years, since
growable arrays (like std::vector) are almost always faster.

------
mahouse
I am somewhat confused by this article since to me it sounds like the writer
has wanted the reader to think intrusive lists are "trickier and weird", when
they have been my bread and butter during all my C work... And I guess I am
not alone.

~~~
simias
They _are_ trickier though because you either have to create a new type for
each list and all all accompanying functions to work on that type (the
PointListNode in the article) or you can make it more generic by discarding
type safety and using tricks like "container_of" (that's actually my preferred
approach when writing C, at least it lets me use the linux kernel's list.h and
I don't have to reinvent the wheel).

In C you don't have much of a choice to make a generic container but in C++
the templated std::list is much safer.

I'm a bit wary of that article to be honest, it's interesting but it oversells
the intrusive list a bit IMO and I can definitely imagine inexperienced C++
coders deciding to use intrusive lists instead of the STL because of it. But
at any rate that would be very premature optimization in my book. Even _if_
intrusive lists happen to be faster in a certain use case (and I'd like to see
benchmarks before I believe it) I'd feel much more comfortable using std::list
while developing/debugging.

~~~
dietrichepp
The problem with std::list is that it's almost always worse than the
alternatives. I would recommend using std::vector as the "default" container,
and then use another container if the performance of std::vector is not good
enough. Or, put another way, I think std::list is a premature optimization,
because you are saying that insertions and deletions are frequent enough that
you need them to be O(1), even at the expense of extra memory usage and tons
of cache misses for simple traversal.

The std::vector container will outperform std::list surprisingly often. Just
think of std::vector as your default container.

------
ambrop7
I really prefer intrusive data structures, for the exact reasons outlined in
the article.

I've myself written many implementations of intrusive structures. But now I'd
like to present the greatest of them, a super-generic implementation of an
intrusive AVL tree in C. See [1], and example [2].

\- It relies on a user-defined concept of a "link". In particular, this means
the structure can be in an array and linked with array indices, and can be
copied correctly with a single memcpy. Of course the user may still use
pointers as links.

\- It supports implementation of user-defined associative operations (e.g.
sum). This consists of operations: CAvl_AssocSum: Calculate the sum of all the
nodes. CAvl_ExclusiveAssocPrefixSum: Calculate the sum of elements from the
first one up to but not including node X.
CAvl_FindLastExclusiveAssocPrefixSumLesserEqual: Find the first node where the
sum of all preceding nodes is >=X. Of course these all have O(log n) time
complexity.

\- The implementation is made generic by include files which rely on macros a
lot. Other than the macro madness, I think this is better then C++ templates,
due to easy #if, where in templates one would need to jump through hoops due
to non-existence of static_if.

[1]
[https://github.com/ambrop72/badvpn/tree/master/structure](https://github.com/ambrop72/badvpn/tree/master/structure)
(see CAvl_* [2]
[https://github.com/ambrop72/badvpn/blob/master/examples/cavl...](https://github.com/ambrop72/badvpn/blob/master/examples/cavl_test.c)
(also see cavl_test_tree.h in the same dir defining some parameters for the
structure)

~~~
simias
> The implementation is made generic by include files which rely on macros a
> lot. Other than the macro madness, I think this is better then C++
> templates, due to easy #if, where in templates one would need to jump
> through hoops due to non-existence of static_if.

And either throw away type safety completely or end up with a huge macro soup
(which is even worse to debug than template soup). I also don't understand
your static_if remark.

I'm not saying intrusive lists are never useful but saying that intrusive
lists are generally better than the STL is just terrible advice.

I'm not a huge fan of C++ but if there's something I miss dearly when writing
C it's generic containers.

~~~
ambrop7
Well I don't want to get into a flame-war regarding intrusive/containing or
C/C++, but I will say that type safety and intrusive/container are orthogonal
things. One can have a type-unsafe container as well as a type-safe (generic!)
intrusive structure. See my code for proof :) I think we'll agree that it's
not too complicated and easy to use.

Implementation:
[https://github.com/ambrop72/aprinter/blob/master/aprinter/st...](https://github.com/ambrop72/aprinter/blob/master/aprinter/structure/DoubleEndedList.h)

Usage:
[https://github.com/ambrop72/aprinter/blob/master/aprinter/sy...](https://github.com/ambrop72/aprinter/blob/master/aprinter/system/BusyEventLoop.h)
(the BusyEventLoop keep track of QueuedEvent objects).

If you wonder what's the purpose of the Accessor, it's for when you can't use
the simple member-pointer based interface due to C++'s eager evaluation, and
can hack around it with forward declaration of a custom Accessor class,
followed by the complete definition later.

I think Boost intrusive structures are type safe too, but I'm not 100% sure.
But Boost is another matter altogether.

------
forrestthewoods
I like intrusive lists. Another consideration is that a std::vector<FooPtr> is
as fast to iterate over as intrusive list of Foos. The memory access pattern
is the same. (Note: typed FooPtr instead of Foo Asterisk because the single
asterisk italicized everything)

Another perk is that an element can remove itself from the list without
actually having to know anything about the list. This makes a lot of shutdown
code super easy and clean.

I do like intrusive lists to be inherited I think. That makes the interface
for adding or removing objects from lists super clean. It also eliminates the
extra obj pointer since the obj root address is known at compile time. Storing
one object in multiple intrusive lists gets a little hairy and you need a tag
parameter but that's a relatively rare need in my experience so it's fine.

One thing I _don 't_ like is lack of good debugging info in visual studio. For
c++ at least their customization tools are seemingly nOT powerful enough to
handle intrusive lists well. Not being able to clearly see all objects in the
watch window is wildly frustrating. Things that make code harder to debug make
me very, very cranky.

~~~
repsilat
> Another consideration is that a std::vector<FooPtr> is as fast to iterate
> over as intrusive list of Foos.

A "normal" linked-list of strings or vectors always makes me sad -- the
reasoning goes,

\- The list elements need to live on the heap because they might live forever,
\- The string object has to be fixed-size because objects are always (at least
officially) fixed-size. \- The string storage has to live on the heap, because
it can be arbitrarily large.

So printing out a linked-list of strings gets two pointer dereferences per
string instead of one. I guess that's the cost of abstraction, though.

(This actually fits into the C++ zero-cost-abstraction narrative pretty well,
though:

\- Linked-lists are not zero cost, \- C++ programmers don't like non-zero-cost
abstractions, \- C++ programmers don't like linked lists.

:)

~~~
sorbits
> _C++ programmers don 't like non-zero-cost abstractions_

I think you’re confusing zero cost with zero overhead.

What C++ programmers like is that the compiler does not add overhead, for
example calling `a[10]` will read the word at address `a + 10` and return
that. No bounds checks are inserted, no function call is done to lookup the
element in some implementation defined sparse data structure, etc.

But C++ programmers are generally not opposed to using things that has a cost
associated with it (basically everything has a cost) — for example `std::map`
is quite popular and certainly not “zero cost”, though it has its running time
defined, and the implementation is actually using a self-balancing search tree
rather than a hash table because we can give guarantees about the running time
of the former, not the latter.

~~~
nly
std::map is implemented using a tree and not a hash table because it's
specified as an ordered data structure.

~~~
sorbits
I am quite sure (based on interviews I’ve read) that Alexander Stepanov’s
primary concern was having a data structure with known (fixed) complexity
rather than a sorted container. The latter is just a nice side-effect.

~~~
dfkf
In the worst case scenario the time complexity of a hash table equals that of
its bucket, which doesn't have to be a list. So the only advantage of the map
over the unordered_map seems to be the ease of use, it only needs a comparer.

~~~
nly
unordered_map pretty much has to have linked buckets due to its interface and
complexity requirements.

------
failrate
"There can be a lot of projectiles, they appear and disappear quickly, and
every time that happens we incur the cost of allocating/deallocating memory
multiple times."

I write a lot of shoot-em-ups (both 2d and 3d). I never do this. I preallocate
all of the bullet entities on level initialization and never destruct them
during runtime. If the number is large enough, I do sort them by status (alive
or dead) and exit processing when I encounter a dead projectile, but usually
that's unnecessary.

~~~
enjoy-your-stay
Do you pre-allocate all bullets per shooter? Or have one big bucket that all
shooters use?

I take it that having a finite number of bullets in the game at any time is
obviously an acceptable trade-off.

------
eps
Just like that old joke -

    
    
      There are two ways to wash dishes -
      after a meal or before it.
    

Same with data containers, there are two mindsets. Either data is primary and
can be optionally organized into containers or containers are primary and they
essentially own the data that's put in them. STL model is of latter and
"intrusive" lists are of former. I leave it up to you to decide which one is
an ass-backward approach in the name of type safety :)

~~~
barrkel
It depends on whether you have generic code (e.g. map, filter, sort etc.) that
you want to apply to the list, vs efficiency concerns, or whether your code
needs to iterate through the list from an item.

There's no one true way here. I personally find having a library of operations
like map and filter more productive for most code that I write, rather than
needing to specialize them per structure.

------
yason
I didn't think these "non-intrusive lists" were actually used much in real
world programs. Even if allocations weren't a big issue performance-wise, the
sheer amount of work required manage the allocations themselves would put me
off using them.

~~~
pubby
Just use Boost.Intrusive:
[http://www.boost.org/doc/libs/1_57_0/doc/html/intrusive.html](http://www.boost.org/doc/libs/1_57_0/doc/html/intrusive.html)

It makes the job much easier.

~~~
fit2rule
Except: boost.

------
SamReidHughes
You can also make these using this structure:

    
    
        template <class T>
        struct node { node<T> *next, *prev; };
    
        template <class T>
        struct intrusive_list : private node<T> { };
    

Your type foo then subclasses the node<foo> type, and in some places
intrusive_list<foo> has to statically cast them back down.

This means you don't need that messy "T *owner;" pointer anywhere.

The downside is that if you want your type to be a member of multiple
intrusive lists, you have to make spurious parent classes and inherit from
those.

~~~
repsilat
One possible alternative is having your objects be members of "lists of
slightly different type". For example,

    
    
        template <ptrdiff_t offset, class T>
        struct node {
            node<offset> *next, *prev;
            T& operator*() { return *(this-offset); }
        };
    

Maybe? It has been a while since I've written C++. I guess you have a big
problem setting up your base class in a portable way, though, because the
types of its member nodes depend on their offsets, and I don't think you can
do that.

    
    
        class Foo {
            int data;
            node<offsetof(Foo, list1_elem), Foo> list1_elem;
            node<offsetof(Foo, list2_elem), Foo> list2_elem;
        };
    

:/

~~~
SamReidHughes
I'm not sure if that would work, because computing the offset of list1_elem
could depend on the template instantiation of list1_elem because that could
(ostensibly) affect the alignment. But you can drop the template altogether on
the list element type and put the offsetof expression in the list type
declaration. An example of that is linked by some other comment in this
thread: [http://www.codeofhonor.com/blog/avoiding-game-crashes-
relate...](http://www.codeofhonor.com/blog/avoiding-game-crashes-related-to-
linked-lists)

The base class option is, I think, best only for the case where you have only
one kind of list you want the object to be part of.

Also your pointer should be cast to char before you do offsetof arithmetic on
it.

------
jokoon
I don't understand. Is it really well explained ?

Aren't ListNode just a way to index a list ? Isn't this Point2D just "marked"
to be on a certain list ?

Why does he say "less dereferencing" ?

> Dereference the appropriate "next" pointer, get the next object from memory.

the prev/next pointer is in the ListNode, not in the object itself, so it's
obviously a dereference...

I did not understand it very well... I can understand that a linked list will
be faster thanks to branch prediction, but what does it have to do with an
intrusive list ?

~~~
pushrax
> the prev/next pointer is in the ListNode, not in the object itself, so it's
> obviously a dereference...

The ListNode is embedded in the Point, so their storage is actually combined.
The memory layout would be something like

    
    
      0x0  x
      0x4  y
      0x8  next
      0xc  prev
      0x10 owner
    

with the total structure being 20 bytes if pointers are 32-bit.

> Aren't ListNode just a way to index a list ? Isn't this Point2D just
> "marked" to be on a certain list ?

The list itself is formed entirely by the ListNodes themselves. There is no
backing array to index into; the storage for points (and nodes, which are
contained in points) is created with an individual malloc per point. The way
you reference a list is by holding a reference to the head, which is just a
regular ListNode.

~~~
ant6n
If you use a std::List<T>, doesn't it uses nodes like

    
    
        T data
        p next
        p previous
        p owner
    

anyway? It shouldn't create a pointer to the data, just like a std::vector<T>
isn't an array of pointers to T.

~~~
desdiv
Let's say I have a bunch of _Bullet_ objects, and each _Bullet_ belongs to
multiple lists. In that case, std::list<Bullet> doesn't work and I have to use
std::list<Bullet*> instead, thus the extra pointer dereference.

------
overgard
The thing about linked lists is their Big O time looks nice, in theory, but in
reality they do not work well with how most CPU's are designed. (Because it
makes the CPU cache mostly useless, and clock speeds aren't really improving,
so the CPU cache is pretty much the driver of better performance at this
point). I'd rather just use a vector/array and take the hit when I need to
resize it, most of the time.

------
gpfault
Oh, look, it's my post :)

------
mischanix
Here's another example [1] of the same. Blizzard games from WC2 to SC2 use
this code nearly verbatim (testing 1 instead of INT_MAX for 64-bit compat in
newer titles).

1\. [http://www.codeofhonor.com/blog/avoiding-game-crashes-
relate...](http://www.codeofhonor.com/blog/avoiding-game-crashes-related-to-
linked-lists)

------
noodly
Wayland also uses intrusive circular linked lists - but it doesn't need owner
pointer, since that pointer can be computed from pointer to link, and offset
of that link in owner structure. I wonder if doom 3 really needed it.

------
astrobe_
It's funny how this technique looks like something coming from a newbie
programmer. This reminds me parallel areas. Looks like done by a profane, but
hey, it's actually cache-efficient these days.

------
andrea_s
Does anybody know what would the performance gain be in this case?

~~~
nly
There isn't one. std::list<T> has the same space requirements, memory locality
and capabilities of an intrusive list. Period.

The naivety of an 'intrusive' list, where all your objects are always 'list
ready', is that it allows the insertion and removal of elements without
_copying_ them, which is beneficial if your lists experience a lot of churn.
The thing is, you can achieve this without intrusion anyway by keeping all
your 'not in a list' items... well, in a list. You then get memory management
for free and can use splice()[0] to efficiently move nodes around.

My philosophy on lists is: if you've never used or needed to splice then you
probably don't really need a linked list. Splicing is what linked lists are
all about.

[0]
[http://en.cppreference.com/w/cpp/container/list/splice](http://en.cppreference.com/w/cpp/container/list/splice)

~~~
sharpneli
What about modifying the elements while keeping them on multiple lists?

With intrusive list you need only modify any element on a single place (A
bullet location as an example).

How could you achieve this with std::list?

~~~
CrLf
If you're storing pointers to elements in the list, you only have to change
the pointed object.

~~~
latitude
OK, then.

I'll give you a pointer to an element and you remove it from all containers
that it is currently on, in O(1).

The only way to do that is to keep a bunch of iterators in the item itself,
which is virtually identical to the intrusive container model _except_ it
involves more pointers, more heap allocations and introduces an extra
invariant into the data model (the iterator stored in the item must point at a
list that actually contains said item). And the only benefit you get for all
this pain and suffering is that you don't have to do any "funny" casting via
container_of() contraptions.

Point being is that once data items sit in more than one container, intrusive
container model generally results in a simpler code that is also more
efficient.

------
sehugg
I think I remember Turbo Pascal's container library using a similar approach.

------
julbaxter
> " Every time a new projectile is created it has to be added to a bunch of
> lists. There can be a lot of projectiles, they appear and disappear quickly,
> and every time that happens we incur the cost of allocating/deallocating
> memory multiple times." Lists can be pre-allocated though

------
showHNshow
The article contains many false claims and its main statement that intrusive
lists are better than non-intrusive ones is simply wrong.

Outright wrong:

1) There is no additional dereferencing needed when using his 'ListNode'
implementation, since the object is stored directly in the ListNode.

2) If you want objects to be non-pointer members of ListNodes of several
lists, you can achieve that with both intrusive and non-intrusive lists in
exactly the same way (contrary to the authors claim).

The article is a disgusting example of someone trying to proof a point he
heard somewhere else. So it is not his original idea and he is failing hard.

Why are there no comments on HN (except for jokoon) addressing that? Instead
everybody uses the comment system as a platform to show off his own bad ass
C++ skills... stupid. You deserve each other

~~~
logicallee
Your contribution would be very welcome here and upvoted, if you kept it
factual (your points #1 and #2 are interesting and fine; your speculation that
the author heard this elsewhere is almost okay, if you had elaborated.)

But in your last two paragraphs you switch to using the language of trolls
('disgusting example', 'stupid', 'you deserve each other'). no need - you went
from being an obvious upvote to an obvious downvote and probably risk being
hellbanned. (Where you see your contributions but nobody else does - then
nobody can see you no matter how good your points are.)

see commenting guidelines:
[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

I hope you will read that and be a positively contributing part of this
community.

