

Avoiding game crashes related to linked lists - adj
http://www.codeofhonor.com/blog/avoiding-game-crashes-related-to-linked-lists

======
Negitivefrags
Before you consider intrusive lists, in fact, before you consider almost any
other data structure, try a vector.

People grossly overestimate the cost of vector relocation while
underestimating the benefits of cache locality. Vector relocation is so cheap
because all the bits are in the cache. You can relocate all the elements in a
reasonably sized vector faster than you can dereference a pointer to somewhere
in memory that isn't in cache.

If you want log(n) lookup like a set/map you can keep the vector sorted. If
you want fast erase (and don't care about order) you can move the last element
into the erased slot and shrink the vector by one (no memory allocation).

The <algorithm> header makes these operations trivial.
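A minimal sketch of the two tricks described above, assuming a vector of `int` for the sorted case. The helper names `sorted_insert`, `sorted_contains` and `unordered_erase` are made up for illustration; only the `<algorithm>` calls are standard.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Insert while keeping the vector sorted, so lookups can use binary search.
// The insert itself is O(n) because later elements are shifted.
inline void sorted_insert(std::vector<int>& v, int value) {
    v.insert(std::upper_bound(v.begin(), v.end(), value), value);
}

// O(log n) membership test on a sorted vector.
inline bool sorted_contains(const std::vector<int>& v, int value) {
    return std::binary_search(v.begin(), v.end(), value);
}

// O(1) erase that does not preserve order: move the last element into the
// vacated slot, then shrink by one. No memory is allocated or freed.
template <typename T>
void unordered_erase(std::vector<T>& v, std::size_t index) {
    assert(index < v.size());
    if (index != v.size() - 1) {
        v[index] = std::move(v.back());
    }
    v.pop_back();
}
```

Note that `unordered_erase` breaks the sort order, so a given vector would normally use one trick or the other, not both.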

C++11 move semantics mean that there are a lot of types you can keep by value
in a vector that you wouldn't have before and they stay fast.

~~~
codeka
The only problem with std::vector is that you can only belong to one (unless
you have a vector of pointers, but that almost negates the whole point).

The way I've typically done this is by having one global "all entities"
std::vector and then lists, maps or whatever for specific subsets.

Usually, my "entity" object is little more than a container for behaviours so
in reality it's a little more complicated than that...

~~~
nostrademons
Traversing a vector of pointers isn't much less efficient than traversing an
intrusive linked list, and is significantly _more_ efficient than traversing a
normal linked list. With a vector of pointers, you need to do an arithmetic
operation (usually on a register), a dereference to L1 cache (to fetch the
pointer), and a dereference to main memory (to fetch the object). With an
intrusive linked list, it's just a dereference to main memory. With a normal
linked list, it's 2 dereferences to main memory, one for the link and one for
the object. On most processors, accessing cache and performing address
arithmetic takes a tiny fraction of the time that accessing RAM does, so for
practical purposes your vector-of-pointers and intrusive linked list
implementations should be pretty similar in performance.

If you can own the objects in one by-value vector, that's even better, then
traversing over them involves _only_ address arithmetic on objects that are
already probably in the cache.

~~~
gsg
You forgot insertion and removal, which intrusive lists provide in constant
time and vectors do not.

If you can own the objects, a linked list (of any kind) is usually not
appropriate. The essence of the efficiency of an intrusive linked list is that
the same indirection that must point at an object that is not owned is reused
to give the list structure. Without this trick, linked lists are not much
good.

~~~
phaedrus
Again, you're not getting the point. Constant time != "fast", it just has to
do with how the operation scales with N. The point, though counter intuitive,
is that the constant time of the list operation is _larger than_ the linear
time of the vector operation _for many values of N_.

~~~
cygx
> The point, though counter intuitive, is that the constant time of the list
> operation is larger than the linear time of the vector operation for many
> values of N.

Citation needed. The kernel guys in particular would probably be interested in
a faster alternative to intrusive linked lists.

Keep in mind that we're talking about vectors of pointers due to the problem
domain (multiple lists of potentially large or polymorphic objects), so using
vectors won't really help locality.

------
beder
There's plenty of discussion relating to the article he references at
<http://news.ycombinator.com/item?id=4455225>, and much of the same applies
here.

1\. He's not comparing apples-to-apples between `std::list` and his intrusive
linked list; the proper analogue would be to unlink a node from its iterator,
for which the running time is still O(1).

2\. The main (only?) reason to use intrusive lists is for the single
indirection (which helps both memory and speed). In his example for how using
std::list would crash his server because of the O(N) node removal, he's just
not storing the right data for each connection (again, use an iterator).

3\. He looked at boost's intrusive list, but I'm guessing he didn't actually
try it out. The examples are pretty good, and it's much easier than it
"looks". (That is, boost libraries can look intimidating when you first look
at them because they're so template-heavy.)

4\. It may even be that a vector, plus an auxiliary data structure for lookup,
may be faster.
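To illustrate points 1 and 2, here is a rough sketch of a connection that remembers its own position in a `std::list`, so removal does not need an O(N) search. The `Connection` type and the `add_connection` / `remove_connection` helpers are hypothetical and are not taken from the article.

```cpp
#include <cassert>
#include <list>

struct Connection;  // hypothetical per-connection object
using ConnectionList = std::list<Connection*>;

struct Connection {
    int id = 0;
    ConnectionList::iterator self;  // this connection's position in the list
    bool linked = false;            // whether `self` is currently valid
};

// Append the connection and remember where it landed.
inline void add_connection(ConnectionList& list, Connection& c) {
    assert(!c.linked);
    c.self = list.insert(list.end(), &c);
    c.linked = true;
}

// O(1) removal through the stored iterator; calling it twice is harmless.
inline void remove_connection(ConnectionList& list, Connection& c) {
    if (!c.linked) return;
    list.erase(c.self);
    c.linked = false;
}
```

This works because `std::list` iterators stay valid until their own element is erased. It still pays for two indirections per element (list node, then object), which is the cost an intrusive list avoids.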

~~~
huhtenberg
THE reason to use "intrusive" containers is to let a piece of data sit in
multiple containers, none of which is primary. I'll give you an skbuff and you
show me how to put it on several linked lists and a couple of hashmaps with
STL-style containers.
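For readers who have not seen the pattern, a hand-rolled sketch of one object sitting on two lists at once might look like this. It is not the article's code and not the kernel's `list_head`; the `Link`, `Entity` and `list_size` names are invented, and an `owner` pointer is stored in each hook instead of using `offsetof`, purely to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>

// Circular doubly linked hook. A hook that points at itself is unlinked.
// A list head is just a hook whose owner is null.
template <typename T>
struct Link {
    T* owner;
    Link* prev = this;
    Link* next = this;

    explicit Link(T* o = nullptr) : owner(o) {}
    Link(const Link&) = delete;
    Link& operator=(const Link&) = delete;
    ~Link() { unlink(); }  // a dying object leaves its list automatically

    bool linked() const { return next != this; }

    // O(1) and safe to call repeatedly: afterwards the hook points at itself.
    void unlink() {
        prev->next = next;
        next->prev = prev;
        prev = next = this;
    }

    // Insert this hook just before `head`, i.e. at the tail of head's list.
    void push_back_onto(Link& head) {
        unlink();
        prev = head.prev;
        next = &head;
        head.prev->next = this;
        head.prev = this;
    }
};

// One hook per list the object may belong to; no list is "primary".
struct Entity {
    int id = 0;
    Link<Entity> all_link{this};      // membership in an "all entities" list
    Link<Entity> visible_link{this};  // membership in a "visible" list
};

// Count the elements on a list by walking from the head back to the head.
template <typename T>
std::size_t list_size(const Link<T>& head) {
    std::size_t n = 0;
    for (const Link<T>* p = head.next; p != &head; p = p->next) ++n;
    return n;
}
```

Destroying an `Entity` removes it from every list it is on without searching any of them, at the price of declaring each possible membership up front.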

~~~
morsch
Store multiple pointers to the same datum?

~~~
gsg
Then for deletion you need to traverse every data structure in which you store
said pointers.

------
eps
Ah, the potent mix of the offsetof and C++. It takes one junior dev to
sprinkle a bit of inheritance on top and wondrous things will start happening
in your crash-proof code. In other words, it takes much more discipline to use
embedded data containers instead of embedding ones, the C-style discipline,
which is clearly not for everyone.

------
kev009
What's funny is he mentions the ZMQ in C entries as an impetus but then
essentially writes the list the way any novice C programmer _would_ (vs.
expert/"leet" C++ programmers from the article). To me, this unwittingly plays
into the "why ZMQ would be better in C" meme far more than the other way
around :o)

See also <sys/queue.h>.

~~~
ChuckMcM
I think that is somewhat intentional. C++ obfuscation gives folks a false
sense of security about what they are doing. C programmers come out of school
realizing they have to be careful. Nothing quantifiable mind you, just my
experience in hiring them.

------
pubby
The reason this guy thinks std::list is buggy is because he's using it
incorrectly. There's no reason to write removal functions like delete_person
when they already exist with list::remove, list::erase, find, search, etc.
There's no reason to use std::list<foo*> either when std::list<foo> and
std::list<std::unique_ptr<foo>> work just as well.

His example code is very dubious as it looks like C-with-classes rather than
C++, mostly due to the lack of RAII.

Intrusive lists are still worth knowing and using, it's just that the author's
reasoning was terrible. I found the Boost.Intrusive page to be much more
knowledgeable:
[http://www.boost.org/doc/libs/1_35_0/doc/html/intrusive/intr...](http://www.boost.org/doc/libs/1_35_0/doc/html/intrusive/intrusive_vs_nontrusive.html)

~~~
jemfinch
The reason you think this guy is using std::list incorrectly is because you're
not thinking about his requirements. Instead of spending a few seconds to
understand why he would keep a std::list<foo*>, you immediately ran back to
HN to make a comment about it.

He's a game programmer. He's keeping pointers to instances because these
entities are part of an inheritance hierarchy[0]. That level of indirection is
essential to his problem and completely unavoidable. On the other hand,
std::list<std::unique_ptr<foo>> wouldn't change anything at all about the
particularly inefficient memory arrangement of std::list<foo*>;
std::unique_ptr is, after all, just a class with one pointer member.

When smart people write things you think are obviously dumb, you should invest
some additional time trying to understand the context of their statements
before writing them off. You'll learn more, and you won't come off as a
flippant junior developer.

[0] [http://www.codeofhonor.com/blog/tough-times-on-the-road-to-s...](http://www.codeofhonor.com/blog/tough-times-on-the-road-to-starcraft)

~~~
pubby
> The reason you think this guy is using std::list incorrectly is because
> you're not thinking about his requirements.

The reasoning and requirements for him using intrusive lists were quite clear
to me. The remaining article that starts at "Using intrusive lists" I found
quite interesting and insightful.

My entire argument is against his evidence backing up statements like, "If
you’re a programmer who uses the C++ STL std::list, you’re doing it wrong."
It's unfair to improperly use std::list, compare it to a better solution, and
then make a sweeping generalization. That's just outrageous.

~~~
jemfinch
It's clear from the subtitle of the blog ("Game design, game programming,
deployment automation and more"), the title of the specific blog post,
("Avoiding game crashes related to linked lists"), and the very next paragraph
of your quotation ("Based on watching the successes and failures of
programmers writing games like Starcraft and Guild Wars...") that the author
is talking about a very _specific_ technique applied to a very _specific_ sort
of programming, and yet you're still complaining that he is making "a sweeping
generalization"?

Stop wasting your time and ours arguing about an interpretation of his words
the author never intended. Such literalistic argument-making serves only to
mislead and misdirect, and it contributes nothing to the conversation.

~~~
pubby
Shot him a quick email for clarification:

> "Is "std::list considered harmful" in the general field of C++ or are you
> only talking about the very specific game programming field you used it
> for?"

> "I think it's harmful for all programming, but if you read the comments
> you'll find almost as many different opinions on the subject as comments!
>
> The problem is that manually managing lists is error-prone, which is why I use
> a solution that automatically handles them; I think all programs, not just
> games, would benefit from their use."

------
fusiongyro
I'm not a game programmer and I seldom use C or C++, but I don't find the
article particularly convincing. If the motivation is to reduce bugs caused by
additional allocations, he wouldn't be avoiding boost or suggesting a copy-
and-paste job with his off-the-cuff locking regime. If the motivation is
really performance, it seems like one should consider other data structures
such as std::vector. Part of the reason to use std::list et al. is to benefit
from the STL functions—hand-writing remove, find, sort, etc. is additional
code which will need to be made correct and performant and maintained
internally.

I understand game programming is a very different enterprise with very
different tradeoffs and motivations, but I don't feel well-sold for
"pedestrian" application development.

------
ggchappell
This points out a real problem, but I think it seems a bit confused about
what the problem actually is.

In particular, this isn't something "wrong" with std::list. He has a situation
where he wants an object to manage its own membership in a mutable container.
He says you can't do this efficiently with a simple non-intrusive linked list.
He is right. You also can't do it efficiently with an array (std::vector, in
this context).

You _can_ do it efficiently with an intrusive linked list, as he points out.
You can also use a non-intrusive linked list in which each object holds an
iterator to itself. Or you can use an associative structure (std::map,
std::unordered_map), in which each object holds its own key.
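The associative variant can be sketched roughly as follows: the object stores its own key plus a pointer to the map it is registered in, so it can remove itself in average O(1). The `Player` type and its `join` / `leave` methods are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <unordered_map>

struct Player {  // hypothetical object that manages its own membership
    using Registry = std::unordered_map<int, Player*>;

    int key = 0;                   // the object holds its own key
    Registry* registry = nullptr;  // map it is currently registered in, if any

    // Register in `r` under this object's key, leaving any previous map first.
    void join(Registry& r) {
        leave();
        assert(r.find(key) == r.end());  // keys are assumed to be unique
        r.emplace(key, this);
        registry = &r;
    }

    // Average O(1) removal by key; calling it twice is harmless.
    void leave() {
        if (registry != nullptr) {
            registry->erase(key);
            registry = nullptr;
        }
    }

    ~Player() { leave(); }
};
```

The sketch assumes the map outlives every `Player` registered in it and that the key is not changed while the object is registered.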

The intrusive linked list solution is going to have the fastest container
insert & delete operations of all of these. But that doesn't mean it is the
best solution for every circumstance.

Another point to be made, which he kinda-sorta gets at, is that it is a good
idea to know how to code a linked list. The bulk of data structure decisions
are just figuring out what already written package to use. But there is
definitely still a place for a custom, application-specific linked list, and
these are not difficult to write.

~~~
Evbn
And yet the classic old "code a linked list" problem is now a reviled and
banned interview question at enlightened tech firms....

------
xtdx
One thing about intrusive lists, you have to know and specify every list the
item may be on. Maybe you like that, maybe you don't.

Also, it doesn't appear the provided code gracefully handles removing an item
from a list twice.

~~~
kevingadd
He mentions that it's fine to call Unlink twice. Presumably the first Unlink
zeroes out the prev/next fields.

~~~
xtdx
Indeed, I may have read the code wrong. But look at line 188.

    m_prevLink->m_nextNode = m_nextNode;

I don't see where m_prevLink is changed. If the previous link has gone away,
and you call RemoveFromList on this node a second time, it's going to chase
that pointer.

~~~
Ogre
RemoveFromList is a private method. All the public methods, other than the
destructor, that call it also modify m_prevLink.

------
ericbb
See also: <http://lwn.net/Articles/336255/>

(A discussion of Linux data structures, including lists).

------
ioquatix
While valid and useful in specific cases, I think that this approach is short-
sighted in general. In-place algorithms can lead to significant problems as it
is typically hard to enforce strong invariants during the game update loop. In
many cases, you wouldn't be erasing elements except as a final step in your
game update loop anyway, and you can normally do this as part of a loop where
you'd have access to the non-intrusive iterator which can then be removed
O(1). I see little benefit to using intrusive linked-lists in this context.
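As a rough illustration of that pattern, the loop below erases elements through the iterator it already holds while walking a `std::list`. The `Unit` type and the `update` function are invented for the example; a real update loop would of course do more.

```cpp
#include <cassert>
#include <list>

struct Unit {  // hypothetical game object owned by the list
    int hp = 0;
};

// One update pass: apply damage, then drop dead units using the iterator
// already in hand, so no search is needed for the removal.
inline void update(std::list<Unit>& units, int damage) {
    assert(damage >= 0);
    for (auto it = units.begin(); it != units.end();) {
        it->hp -= damage;
        if (it->hp <= 0) {
            it = units.erase(it);  // O(1); returns the next valid iterator
        } else {
            ++it;
        }
    }
}
```

This only covers removal from inside the traversal. An object that must remove itself from elsewhere, for example in its destructor, still needs a stored iterator or an intrusive link.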

------
btmorex
I think there is a subtle bug in his second two person::~person examples.
Specifically, from the perspective of other threads an object is no longer
valid once its destructor has been called. So, imagine two person objects
side-by-side in a linked list. If they are both deleted simultaneously and
both destructors get called at the same time, they can no longer safely unlink
from each other even with locking because they are technically no longer
valid.

The first two examples don't have this problem though.

------
jheriko
'best defence, no be there'

linked lists are seldom the right answer, contiguous blocks of memory are
cache - and therefore algorithm - friendly.

the stl performance is usually quite easy to beat in special cases as well
(i.e. all game code). some stls are terrible as well - the ms one is riddled
with locks and all sorts of sledgehammer thread safety measures, which you
just don't need if you know which bits of code are threaded and which aren't.

~~~
gsg
Linked lists are appropriate when you need (usually multiple) sequences of
pointers to objects: thus, the real alternative is not vector of t but vector
of pointer to t.

A vector of pointers gives no contiguity advantage over an intrusive list
during traversal (or any other operation).

------
cpeterso
EA open-sourced their EASTL game-optimized container library back in 2007,
including intrusive lists. Here is a detailed introduction:

[http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n227...](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html)

And a github repository that is still maintained:

<https://github.com/paulhodge/EASTL>

------
Luyt
_Reliability is more important than speed, and if you’re reduced to using
those hacks for speed-gains your program needs help. Remember Y2K bugs!_

I think the reason for dropping centuries from dates was not to gain speed,
but to save two bytes. In the 1970s, two bytes of
storage would cost a lot more than today, and also space on punch cards was
limited.

