

The Dwarfs and The Fast Marking Algorithm - jaspervdj
http://jaspervdj.be/posts/2010-07-11-the-dwarfs-and-the-fast-marking-algorithm.html

======
tptacek
Short summary: instead of relying on initialization to a known value, ignore
the initial values of the array and rely instead on a consistency constraint
(data may be random, but won't be random and satisfy the constraint) that you
can set and check. A neat trick you can probably use in other places as well.

~~~
shasta
Anyone know the name of this algorithm? There is a similarly solved classic
problem in complexity theory: Given an O(m) list of O(n)-bounded integers,
determine if there are duplicates in O(m) time and O(n) space.

~~~
ori_b
I don't know if it has a name (it seems to simply be referred to as "an
efficient sparse set representation"), but I think I initially remember seeing
it introduced in "Compilers: Principles, Techniques, and Tools" in an exercise
at some point in the past.

A good description of it is here: [http://research.swtch.com/2008/03/using-
uninitialized-memory...](http://research.swtch.com/2008/03/using-
uninitialized-memory-for-fun-and.html)

------
duskwuff
This seems trivially incorrect. If both arrays are randomized at the outset,
there is nothing to prevent them from containing valid data indicating a state
having objects marked. In fact, if you free one set of arrays used by this
structure and immediately allocate another, there's a good chance that your
"random" initial state will, in fact, be the exact one you just freed.

~~~
jaspervdj
The trick is in the counter variable. Read the update section, and the source
code (it's not much, really).

~~~
sesqu
Right, you neglected to mention the use for the counter in your prose.

~~~
RiderOfGiraffes
I'm often left confused by people who downvote me without leaving a correction
or counter-argument, so I guess I ought to explain why I've downvoted you
here.

Who is the "you" in your sentence? The author of the comment to which you
reply? They never mention the counter. The author of the original article?
You're not replying to them.

And the counter isn't mentioned in the original prose, but it's clearly
mentioned in the description of the algorithm. Surely that's reasonable. It's
a single variable, hence constant to allocate in both space and time.

So your comment seems to make an unreasonable criticism that doesn't add
value, and that's why I've downvoted it.

~~~
sesqu
I was referring to jaspervdj, who is both the author of the comment to which I
reply and the author of the original article (well, I presume).

The article states: " _an element in_ array _can, by chance, point to an
element in_ marks _, but this won’t matter since the element in_ marks _won’t
point back, and so we can determine it’s fake._ " This sentence is missing the
part where the counter is made use of, and as such is erroneous. The mentioned
element _can_ point back, and its index must be compared with the counter to
ascertain whether it is valid.

Although that is a small omission, it is in no way trivial, and probably the
source of all the confused comments the posting received.

~~~
RiderOfGiraffes
OK, I missed the coincidence (by which I mean they are identical, not that
it's by chance) or the names of the authors. That clears up my confusion on
that point.

Regarding the lack of mention of the counter in the section of the description
you mention, I thought it was obvious. In the section before he says: _we keep
the number of marks set._ That means, to me, that if the element in _array_
accidentally/incidentally points to an element in _marks_ that
accidentally/incidentally points back, it's not counted as pointing back
because it's known (by using the counter) not to be valid.

It is subtle, I agree, but for me the explanation is clear enough already.
Adding more detail could simply overwhelm the reader with words, without
letting the reader see the wood for the trees.

I agree, now that you've expanded it, that your point is valid, but I,
personally, prefer the version as it is.

In this I expect we should simply agree to disagree. One person's "necessary
detail" is another's "pointless nit-picking."

