"SkipLists can give you log time operations on a list"
So can balanced binary trees and a number of other structures. I'd demand benchmarks, because Python is not C (and I don't mean that just trivially; it really is very much not C). As just one example: are you hitting a pessimal case in the GC with this algorithm? You never know.
I'm not actually demanding them, since it didn't seem the point of this exercise. My point is that you can't just assume.
Interesting and valid points. My point is that the SkipList algorithms are, using the assumed primitives, logarithmic. Your point is that in Python the assumptions about the primitives may not be valid and need to be tested.
It's especially interesting to me that the standard large body of work on algorithmic complexity is assumed by many to be getting less and less relevant as other issues come into play. Issues such as memory access time, where it now matters if something is in cache or not, and if so, which cache. Issues such as whether your machine will stop completely at some inconvenient moment to perform a GC when you least expect or desire it. Issues such as whether your language is in fact caching stuff in hash tables, and so what appears in your code to be linear, isn't.
And so on.
More and more it seems necessary to resort to experiment and tests to see what happens, rather than being able to determine things for sure and definite via analysis.
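As a rough illustration of the "measure, don't assume" point, here is a minimal timing sketch. It is not a benchmark of skip lists. It times `bisect.insort` on a plain Python list, which shifts elements on every insert (linear in the worst case) but does the shifting in C. The helper names `build_sorted` and `time_build`, the input sizes, and the seed are all assumptions made for the example, and the numbers will vary by machine and interpreter.

```python
import bisect
import random
import timeit


def build_sorted(n, seed=0):
    """Insert n pseudo-random ints into a list, keeping it sorted via bisect.insort."""
    rng = random.Random(seed)  # fixed seed so every run inserts the same keys
    out = []
    for _ in range(n):
        bisect.insort(out, rng.randrange(10 * n))
    return out


def time_build(n, repeat=3):
    """Best-of-`repeat` wall-clock time, in seconds, for one build_sorted(n) call."""
    return min(timeit.repeat(lambda: build_sorted(n), number=1, repeat=repeat))


if __name__ == "__main__":
    # Sizes are arbitrary; the point is to see how the time actually scales.
    for n in (1_000, 5_000, 20_000):
        print(f"n={n:>6}: {time_build(n):.4f} s")
```

Comparing the measured growth against what the asymptotic analysis predicts is the kind of experiment being asked for; the same harness could wrap any other ordered structure.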
You're right to be skeptical about the value of skip lists in comparison to other O(log n) data structures such as balanced trees. Skip lists accept probabilistic rather than worst-case bounds in exchange for ease of implementation, and they have some minor cache advantages over balanced binary trees, because the structure generally doesn't change shape all that much. It's hard to say whether these differences would materialize in Python.
However, if I had no O(log n) structures at all and needed to implement some, I'd have to be a maniac to choose balanced binary trees, which have notoriously delicate balancing algorithms, over skip lists, which are very sweet and simple.
Exactly. Skip lists are great for two reasons (and as far as I know, only these two reasons):
1. They're really cool and clever. This is a valid reason.
2. They offer easier concurrency than most alternatives. It's straightforward (though not quite easy, unless you have transactional memory) to make a lock-free concurrent skip list. Try that with a heap or a red-black tree, and you'll quickly run into all sorts of memory conflicts and crazy-complicated locking. The fact that a skip list only provides probabilistic logarithmic time bounds really makes coordination between threads easier.
(Note that it's possible to do some similar stuff with modified versions of other data structures. For example, you can make a good concurrent dictionary by taking a red-black tree, relaxing the invariants, and adding a periodic rebalancing thread to run in the background. But that's a topic too long to fit into this post.)
I think the biggest advantage of skip lists is that if a man held a gun to my head and forced me to implement a logarithmic-time data structure, I could implement skip lists in an hour or so from memory. Any of the others... maybe given a week and a couple of good books.
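To give a sense of how little code is involved, here is a minimal skip list sketch in Python with insertion, membership testing, and in-order iteration. The class and method names, the level cap of 16, and the promotion probability of 0.5 are assumptions chosen for the example, not anything from the thread. Deletion is omitted and duplicate keys are simply stored more than once.

```python
import random


class _Node:
    __slots__ = ("key", "forward")

    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level  # forward[i] is the next node on level i


class SkipList:
    MAX_LEVEL = 16  # assumed cap on tower height
    P = 0.5         # assumed probability of promoting a node one level up

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = _Node(None, self.MAX_LEVEL)  # sentinel; its key is never compared
        self._level = 1                           # number of levels currently in use

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < self.P:
            level += 1
        return level

    def _predecessors(self, key):
        """For each level, return the last node whose key is strictly less than `key`."""
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for i in range(self._level - 1, -1, -1):  # walk from the top level down
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def insert(self, key):
        update = self._predecessors(key)
        level = self._random_level()
        self._level = max(self._level, level)
        node = _Node(key, level)
        for i in range(level):  # splice the new node into each level it occupies
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def __contains__(self, key):
        candidate = self._predecessors(key)[0].forward[0]
        return candidate is not None and candidate.key == key

    def __iter__(self):
        node = self._head.forward[0]  # level 0 links every node in sorted order
        while node is not None:
            yield node.key
            node = node.forward[0]
```

A quick usage example: after `sl = SkipList(seed=1)` and inserting 5, 1, 9, 3, 7, `list(sl)` yields the keys in sorted order, `7 in sl` is true and `4 in sl` is false. There is no rebalancing step anywhere; the random tower heights do that job on average, which is where the probabilistic bound comes from.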