
The speediness you get is algorithmic. SkipLists can give you log time operations on a list; Python's built-in list can largely only give you linear time.

(The above is a serious simplification, but contains enough truth to be worth stating. No doubt others more knowledgeable than I can expand and enhance.)

"SkipLists can give you log time operations on a list"

So can balanced binary trees, and a number of other structures. I'd demand benchmarks, because Python is not C (and I don't mean just trivially; I mean it is very much not just C). Just as one for-instance: are you hitting a pessimal case with the GC with this algorithm? You never know.

I'm not actually demanding them, since it didn't seem the point of this exercise. My point is that you can't just assume.
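For anyone who does want numbers, here is a minimal sketch of how one might measure instead of assume. It times the plain-Python baseline only: keeping a built-in list sorted with `bisect.insort`, where the search is logarithmic but each insertion still shifts elements, so it is linear per insert. The function names `time_sorted_inserts` and `growth_table` are made up for illustration, and the sizes are arbitrary.

```python
import bisect
import random
import timeit


def time_sorted_inserts(n, repeats=3, seed=0):
    """Best-of-`repeats` wall time (seconds) to build a sorted list of
    n random floats, one bisect.insort call at a time."""
    rng = random.Random(seed)
    values = [rng.random() for _ in range(n)]

    def build():
        out = []
        for v in values:
            # Binary search to find the slot, then a linear shift to insert.
            bisect.insort(out, v)
        return out

    # timeit.repeat accepts a callable; number=1 runs build() once per repeat.
    return min(timeit.repeat(build, number=1, repeat=repeats))


def growth_table(sizes=(1000, 2000, 4000, 8000)):
    """Return [(n, seconds), ...] so the growth rate can be eyeballed."""
    return [(n, time_sorted_inserts(n)) for n in sizes]


if __name__ == "__main__":
    for n, seconds in growth_table():
        print(f"n={n:>6}  {seconds:.6f}s")
```

Swapping `build` for a loop over some other structure's insert gives a like-for-like comparison on your own machine, which is the only place the answer really counts. Wrapping the run in `gc.disable()` / `gc.enable()` is one way to check whether the collector is part of the story.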

Interesting and valid points. My point is that the SkipList algorithms are, using the assumed primitives, logarithmic. Your point is that in Python the assumptions about the primitives may not be valid and need to be tested.

It's especially interesting to me that the standard large body of work on algorithmic complexity is assumed by many to be getting less and less relevant as other issues come into play. Issues such as memory access time, where it now matters if something is in cache or not, and if so, which cache. Issues such as whether your machine will stop completely at some inconvenient moment to perform a GC when you least expect or desire it. Issues such as whether your language is in fact caching stuff in hash tables, and so what appears in your code to be linear, isn't.

And so on.

More and more it seems necessary to resort to experiment and tests to see what happens, rather than being able to determine things for sure and definite via analysis.

Exciting times, but slightly disappointing.

You're right to be skeptical about the value of skip lists in comparison to other O(log n) data structures like balanced trees. Skip lists buy ease of implementation at the price of merely probabilistic bounds, and have some minor cache advantages over balanced binary trees, owing to the data structure generally not changing shape all that much. It's hard to say whether these differences would materialize in Python.

However, if I had no O(log n) structures at all and needed to implement some, I'd have to be a maniac to want to do balanced binary trees, which have notoriously delicate balancing algorithms, over skip lists, which are very sweet and simple.
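To give a sense of how little code that is, here is a rough sketch of a skip list in Python. It is one textbook-style variant, not anybody's production implementation: the class name `SkipList`, the `MAX_LEVEL` cap of 16, the promotion probability of 0.5, and the optional `seed` argument are all choices made for this example. Search, insert, and remove are expected O(log n), not guaranteed.

```python
import random


class _Node:
    __slots__ = ("key", "forward")

    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level  # forward[i] = next node at level i


class SkipList:
    MAX_LEVEL = 16  # arbitrary cap chosen for this sketch
    P = 0.5         # chance of promoting a node one level higher

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = _Node(None, self.MAX_LEVEL)  # sentinel, key never compared
        self._level = 1
        self._size = 0

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < self.P:
            level += 1
        return level

    def _predecessors(self, key):
        """For each level, the last node whose key is strictly less than key."""
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for i in range(self._level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def insert(self, key):
        update = self._predecessors(key)
        level = self._random_level()
        self._level = max(self._level, level)
        node = _Node(key, level)
        for i in range(level):
            # Splice the new node in after its predecessor at each level.
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node
        self._size += 1

    def remove(self, key):
        """Remove one occurrence of key; return True if something was removed."""
        update = self._predecessors(key)
        target = update[0].forward[0]
        if target is None or target.key != key:
            return False
        for i in range(len(target.forward)):
            if update[i].forward[i] is target:
                update[i].forward[i] = target.forward[i]
        # Drop any levels that are now empty.
        while self._level > 1 and self._head.forward[self._level - 1] is None:
            self._level -= 1
        self._size -= 1
        return True

    def __contains__(self, key):
        node = self._predecessors(key)[0].forward[0]
        return node is not None and node.key == key

    def __iter__(self):
        node = self._head.forward[0]
        while node is not None:
            yield node.key
            node = node.forward[0]

    def __len__(self):
        return self._size
```

Usage is what you'd expect: `sl = SkipList(); sl.insert(3); 3 in sl; list(sl)`. There are no rotations and no recolouring, just linked-list splicing at a random number of levels. Whether it beats anything in CPython is a separate question that only a benchmark can answer.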

Besides, if I remember correctly, the most convincing argument for skip lists is that they provide better concurrency. In an environment where Python still has the GIL, I would be skeptical about that.

Exactly. Skip lists are great for two reasons (and as far as I know, only these two reasons):

1. They're really cool and clever. This is a valid reason.

2. They offer easier concurrency than most alternatives. It's straightforward (though not quite easy, unless you have transactional memory) to make a lock-free concurrent skip list. Try that with a heap or a red-black tree, and you'll quickly run into all sorts of memory conflicts and crazy-complicated locking. The fact that a skip list only provides probabilistic logarithmic time bounds really makes coordination between threads easier.

(Note that it's possible to do some similar stuff with modified versions of other data structures. For example, you can make a good concurrent dictionary by taking a red-black tree, relaxing the invariants, and adding a periodic rebalancing thread to run in the background. But that's a topic too long to fit into this post.)

I think the biggest advantage of skip lists is that if a man held a gun to my head and forced me to implement a logarithmic time algorithm I could implement skip lists in an hour or so from memory. Any of the others...maybe given a week and a couple of good books.

  ... if a man held a gun to my head and forced me
  to implement a logarithmic time algorithm I could
  implement skip lists in an hour or so ...

Does this happen to you often?

> The speediness you get is algorithmic.

Don't you mean asymptotic?

No, I mean that the speed increase you get is from a change in the algorithm.
