Section 2.3, "Analysis of p," is incomplete. The ideal value is 1/e. In most practical applications you want to start by setting it to 1/e, which gives a good speed/memory tradeoff. I tried for years to get Redis and some widely-used libraries to update their default values for everyone's benefit, as this is one of my favorite data structures.
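For concreteness, here's a minimal sketch (class and method names are mine) of where p enters the picture: a node's height is geometric in p, so the expected number of forward pointers per node is 1/(1-p) (2.0 at p=1/2, about 1.58 at p=1/e), while expected search cost is roughly log_{1/p}(n)/p, which is minimized at p = 1/e.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of skip-list level generation parameterized by p.
class LevelGen {
    // Promote a node to the next level with probability p, so heights
    // follow a geometric distribution with mean 1/(1-p).
    static int randomLevel(double p, int maxLevel) {
        int level = 1;
        while (level < maxLevel && ThreadLocalRandom.current().nextDouble() < p) {
            level++;
        }
        return level;
    }
}
```

With p = 0 every node stays at level 1; raising p trades memory (more pointers per node) for shorter searches, up to the 1/e sweet spot.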
Zip trees are novel, but their performance (and therefore also skip lists', since the two are isomorphic) lags behind other linked structures like Treaps and especially LBSTs. [1] I personally find skip lists to be overhyped binary search trees in disguise.
Absolutely trees in disguise; in fact, there are deterministic versions of skip lists that are equivalent to B-trees. A 1-2 skip list (where each forward pointer can skip over 1 to 2 pointers on the level below) is isomorphic to a 2-3 tree.
I always felt like Pugh was pointedly ignoring the cost of inconsistently sized nodes, and that more transparency there was necessary. It's been a really long time since I dug into them, though, so I could be out of date.
I'd also like to see better investigations into Treaps balanced not for fairness but for average access time, so that more frequently used values are faster to look up than uncommon values.
I don't know anything about zip trees, but I'd think in a common scenario the memory addresses are reasonably consecutive. Would that be a problem? If not, why not just assign consecutive ranks in the first place?
First sentence of the abstract: "Skiplists have become prevalent in systems."
They have? I've always viewed them as a curiosity. They're a ton easier to understand than balanced trees and have the same performance behavior, which sounds great. But the allocation mess[1] makes them lose to RB or AVL trees in basically every system I can think of. Is any major software using skiplists as a standard ordered container or map?
[1] You either need to pay for Log2(N) pointers per item, or allocate them from a heap with variable header sizes. Both of those choices are really pessimal when compared with fixed-size metadata. Skiplists pretty much can't be intrusive, for example.
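To illustrate the second option described above, a hypothetical node layout (names are mine): each node carries a forward-pointer array sized to its own level, so allocations vary in size from node to node, unlike a binary tree node's fixed two child pointers.

```java
// Sketch of a variable-size skip-list node: the `next` array has one
// slot per level the node appears on, so node footprints differ.
final class SkipNode<K, V> {
    final K key;
    V value;
    final SkipNode<K, V>[] next;

    @SuppressWarnings("unchecked")
    SkipNode(K key, V value, int level) {
        this.key = key;
        this.value = value;
        // Variable-size allocation: a level-1 node holds 1 pointer,
        // a level-k node holds k.
        this.next = (SkipNode<K, V>[]) new SkipNode[level];
    }
}
```

The alternative (a fixed array of maxLevel pointers in every node) makes allocation uniform but wastes space, which is the tradeoff the comment above is pointing at.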
> Balancing binary search trees suffer from lock contention more than skip lists.
Hm... I guess the argument would be that the various list insertions can be independently synchronized? Certainly lookup is going to be an r/w lock or whatever and basically a wash. I vaguely buy that but would want to see numbers.
But that said, the hash that the variable-size nodes and lack of intrusiveness make of the heap is going to have exactly the opposite effect for any high-performance implementation. Maybe Java doesn't play in that sandbox, I guess.
Garbage collectors are really nice for concurrent data structures, making lock-free algorithms very practical. Java's skip list does not require any locks for writes, and requires essentially no synchronisation at all for reads (just suitable platform-dependent memory barriers). I believe reads are also wait free.
There is a locality penalty for lookups, although I don't think this is core to skip-lists, just an impracticality of the Java language and how you can use its standard libraries. The variable size of the nodes is not a problem for the Java heap, due to how compacting garbage collectors work.
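Presumably the Java skip list being discussed is `java.util.concurrent.ConcurrentSkipListMap`, the JDK's standard lock-free ordered map. A minimal usage example:

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListDemo {
    public static void main(String[] args) {
        // Lock-free concurrent map that keeps keys in sorted order.
        ConcurrentNavigableMap<Integer, String> m = new ConcurrentSkipListMap<>();
        m.put(3, "c");
        m.put(1, "a");
        m.put(2, "b");
        System.out.println(m.firstKey());    // smallest key: 1
        System.out.println(m.ceilingKey(2)); // smallest key >= 2: 2
        System.out.println(m.headMap(3));    // keys strictly below 3
    }
}
```

All of these navigation operations (`firstKey`, `ceilingKey`, range views) are what an ordered container buys you over a hash map, and they run concurrently without external locking.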
Pretty sure I learned about skip lists as a lock-free data structure that is relatively easy to reason about (but it was a while ago and I’m not finding confirmation in a quick search).
the title makes this review sound a lot less comprehensive than it really is. it covers basically every variant of pugh's skip list ever published, so you can in fact skip most of them
but this is the paper you want if you need to look up what variants exist of, say, the interval skip list, and how they compare. as it happens, that's exactly what i needed today
Ref: http://www.sciencedirect.com/science/article/pii/03043975940...