On the other hand, cache-oblivious algorithms (and upon first skim, the presente...

pmjordan · on April 12, 2011

As danvet has hinted at, on-disk data structures don't exist in isolation. You've generally got a CPU and memory system attached to the disk that is orders of magnitude faster[1], so if you can reduce the number of dependent disk I/O operations by increasing the required processing power, that's usually a trade-off you want to take. As an example, linear searches through each 4kiB disk block you read are absolutely acceptable, whereas doing that for an associative data structure in memory is madness.

In any case, having inspected the 2 papers a little more in depth, this data structure is very different from Clojure's in that it holds multiple versions directly.[2] Additionally, it does indeed take all the trade-offs to minimise disk I/O in favour of more computation. I find it unlikely that much insight can be gained from this for in-memory data structures.

[1] you can expect ~10ms latency for spinning disks, ~10-100µs for solid-state drives, ~10ns for memory and ~1ns for the CPU. spinning disk to memory is 6 orders of magnitude, 3-4 OOM for solid-state drives.

[2] In Clojure the MVCC STM is implemented at the ref level, not the collection level. Doing it at the collection level may be possible, but probably not desireable as complexity will go through the roof.

andytwigg · on April 12, 2011

In general, I agree - as you'd expect, practical implementations of this structure will naturally deviate quite far from the paper's presentation, and indeed we do make many optimizations for SSDs and rotating disks. But, in theory, disk vs RAM is no different to RAM vs L2. A central trick is the 'stratification' procedure that operates on arrays - that could be helpful elsewhere.

crux_ · on April 12, 2011

I agree with everything you said (hence the winky hanging off of "theory"), except for this:

> I find it unlikely that much insight can be gained from this for in-memory data structures.

Reasons for my skepticism:

First, better big O results usually come with some valuable insights that can be reapplied.

Second, the tradeoffs in constant factors are really hard to predict. E.g. the extra CPU cycles may pay for themselves by increased cache coherency or better behaviour under multi-core contention. Hard things to even measure, let alone predict. ;)

Third, besides caring about absolute speed, degradation is important in some cases -- an in-memory-only implementation that the OS decides to swap out to disk won't hit (as much of) a brick wall if/when that happens.

Fourth -- although the extra complexity is prohibitive, MVCC tuned specifically for a specific data structure will almost certainly do way better on all kinds of metrics than a language/runtime-wide implementation. Appropriate for clojure? Not my call. Useful in general? Probably so.

danvet · on April 12, 2011

The problem is that the amount of raw cpu power you can burn for a storage miss is _much_ larger than for a cache miss. So the tuning is invariable completely different, leading sometimes to completely different algorithms.

pjscott · on April 12, 2011

The theory guarantees optimality to within a constant factor. I've seen some pretty big constants, though, so in practice of course you have to actually measure how it performs on a representative workload.