Finger Trees: A Simple General-purpose Data Structure (city.ac.uk)
65 points by llambda on March 22, 2013 | 16 comments



"This is a very nice theoretically clean and powerful sequential data structure, but in practice it is very very very very very very very very very very very slow... very very slow. No one uses finger trees because they're so slow." -Daniel Spiewak

From his 2011 NEScala talk on functional data structures, which covers practical considerations like locality of reference and caching. Available on Vimeo: http://vimeo.com/20262239


Haskell's Data.Sequence is based on finger trees. I'd be surprised if the Real World Haskell guys would recommend it so strongly if it performed "very"^N badly. They're well-known for being performance-obsessed.

"Both Haskell's built-in list type and the DList type that we defined above have poor performance characteristics under some circumstances. The Data.Sequence module defines a Seq container type that gives good performance for a wider variety of operations." -- http://book.realworldhaskell.org/read/data-structures.html#d...


Indeed. As one of those performance-obsessed Haskellers, a golden rule is "choose your data structures based upon your workload". That means not just the time/space complexity of the operations, but also whether you want persistence vs. shared state, etc. Finger trees are nice for workloads that are heavy on adding/removing values at the ends, with some random access; sequential scans are OK too.
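For anyone who wants to see what that workload looks like against a real API, here's a minimal Data.Sequence example (finger trees under the hood, from the containers package):

    import Data.Sequence ((<|), (|>), viewl, viewr, ViewL(..), ViewR(..))
    import qualified Data.Sequence as Seq

    main :: IO ()
    main = do
      let s  = Seq.fromList [1 .. 10 :: Int]
          -- amortized O(1) push onto either end
          s' = 0 <| (s |> 11)
      -- amortized O(1) pop from either end
      case viewl s' of
        x :< _ -> print x          -- 0
        EmptyL -> return ()
      case viewr s' of
        _ :> y -> print y          -- 11
        EmptyR -> return ()
      -- O(log n) random access
      print (Seq.index s' 5)       -- 5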

I may wind up using this or a similar data structure in a number of current commercial and open-source projects.


The data structure has great asymptotic complexity, and it is closely related to hash array mapped tries and ropes, which have good implementations.

Individual implementations might suck. Others are just fine (http://hackage.haskell.org/packages/archive/fingertree/0.0/d...) - though even this could be optimized (for constant factors).


Slow compared to what? You'd certainly be wasting your time to use a finger tree as a simple queue, but if you need an immutable / persistent container with fast access to the ends and not-terrible random lookup, I can't think what else you would use.


Compared to a Bitmapped Vector Trie.

If you don't want to watch the video, a version of this talk was also given at Strange Loop; the slides for that one are available online: http://www.infoq.com/presentations/Functional-Data-Structure...


Thanks, never heard of a "bitmapped vector trie" before. I'll have to look it up.


A more common name is Array Mapped Trie. See the papers by Phil Bagwell (RIP). DJB's crit-bit trie page is also good. Many people forget that it's rather easy to pack tries for locality, since there is no rebalancing to fragment nodes.

AMTs are one of Clojure's most heavily used container types.
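For anyone looking it up: the core trick is radix indexing, where each level of the tree consumes a few bits of the index. A stripped-down sketch in Haskell (real implementations, e.g. Clojure's PersistentVector, use flat arrays, and Bagwell's AMT adds a bitmap to compress sparse nodes; the lists here are purely illustrative):

    import Data.Bits (shiftR, (.&.))

    -- 32-way branching: 5 bits of the index select a slot at each
    -- level, so a lookup costs about log_32(n) pointer hops.
    data Trie a = Leaf  [a]        -- up to 32 elements
                | Inner [Trie a]   -- up to 32 children

    lookupTrie :: Int    -- bit offset of the current level (e.g. 10, 5, 0)
               -> Int    -- index being looked up
               -> Trie a -> a
    lookupTrie _   i (Leaf xs)    = xs !! (i .&. 31)
    lookupTrie off i (Inner kids) =
      lookupTrie (off - 5) i (kids !! ((i `shiftR` off) .&. 31))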


The Haskell unordered-containers package also uses HAMTs.

This technique of increasing the node size in order to speed up traversals on conventional architectures is actually quite general; you could do the same thing to finger trees, for example. One disadvantage is that many different versions of the same structure take up more space than they would otherwise.

An advantage of applying tree-broadening to a finger tree rather than to a generic balanced tree is that, depending on the ordering, it may make in-order traversal faster, because elements that are near each other in the structure are also (in some sense) close together in memory.
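To make that concrete, a hypothetical sketch of a broadened finger tree; none of these names come from a real library, it's just the shape of the idea with a branching factor of 16:

    import qualified Data.Vector as V

    -- Same spine as a 2-3 finger tree, but nodes and digits hold up
    -- to 16 elements in flat arrays: each dereference covers more
    -- elements (better locality) at the cost of more copying per update.
    newtype Node16  a = Node16  (V.Vector a)   -- 2..16 children
    newtype Digit16 a = Digit16 (V.Vector a)   -- 1..16 end elements

    data WideTree a
      = WEmpty
      | WSingle a
      | WDeep (Digit16 a) (WideTree (Node16 a)) (Digit16 a)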


You mean to tell me that even in the Functional Programming world, there are still tradeoffs between memory and performance to consider?


I'm not sure where you would have gotten the idea that there wouldn't be. By and large, we tend to be more interested in correctness than performance, but that doesn't mean we aren't interested in performance.


Tries are basically prefix trees, and you're right, hash array mapped tries form the basis of Clojure's standard maps and sets. However, Clojure also has https://github.com/clojure/data.finger-tree for cases where tries don't make sense: namely, double-ended queues (amortized constant-time push and pop at both ends) and sorted sets with O(log n) index access.


NEScala -- was he implementing finger trees as Java objects? Java objects have extraordinary overhead, and finger trees are only efficient when the per-entry overhead is low. So naturally he prefers array mapped tries, because arrays have substantially less per-object overhead in Java.

See this overview of the difficulty of building efficient data structures in Java: http://www.cs.virginia.edu/kim/publicity/pldi09tutorials/mem...


Some further exposition.

Some takeaways from that slide deck: a TreeMap<Double, Double> has a minimum of 82% overhead. If you allocate 100 megabytes of memory to that tree map, only about one fifth of it is your data; the rest is object overhead. A plain array of doubles, on the other hand, carries only about 16 bytes of overhead (the array header), and thus stores close to five times as much data per megabyte (1 / 0.18 ≈ 5.5). Even if you implement some strange accounting scheme on top of your array to make it act like a tree, you're probably already reaping performance benefits thanks to greater data locality, the likelihood of having data in cache lines, etc.

I'm reminded of when I worked with libraries for Judy arrays in 2006 or so. Judy arrays are a radix-tree-like data structure whose performance is tuned around cache-line sizes and the number of dereferences at each node of the structure. Pointers are tagged in their lowest 3 bits to indicate which of 8 node types they point to. A very full node will usually be allocated as a plain array, while a node with fewer entries than roughly 3/4 of a cache line's worth of pointers instead holds two adjacent arrays: an array of key bytes and an array of pointers to child nodes. The byte array is searched first, and on a hit, that index is used to find the pointer to the child in the second array.
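A rough sketch of that two-array node in Haskell, purely illustrative (real Judy arrays are C, with tagged pointers and several more node types than this):

    import Data.Word (Word8)
    import qualified Data.Vector as V
    import qualified Data.Vector.Unboxed as U

    -- Sparse Judy-style node: a small array of the key bytes that are
    -- present (small enough to sit in a cache line), plus a parallel
    -- array of child pointers.
    data SparseNode a = SparseNode
      { presentBytes :: U.Vector Word8   -- which key bytes exist here
      , childSlots   :: V.Vector a       -- one child per present byte
      }

    -- Search the byte array; a hit yields the index of the child.
    findChild :: Word8 -> SparseNode a -> Maybe a
    findChild b (SparseNode ks cs) = (cs V.!) <$> U.elemIndex b ks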

All of these optimizations are really fantastic. Judy arrays not only have more "features" than hash tables, such as very efficiently stored arbitrary-length keys and very efficient access to the first, last, and random elements of the tree, but a Judy array can also be traversed in lexicographic order, which cannot be done with a conventional hash table. On top of this, in practice they outperform naive hash tables outright, because they are optimized for real-world processors' cache-line sizes.

All of this is fantastic, except if you implemented this in Java it would be, ahem, "in practice it is very very very very very very very very very very very slow... very very slow".


Yes, this was on the JVM. No, that was not the issue.


If you're unfamiliar with finger trees, I recommend staring at this [1] for a while.

[1] http://commons.wikimedia.org/wiki/File:2-3_finger_tree,_samp...
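The picture corresponds to the datatype from Hinze and Paterson's paper (minus the measure annotations that make it general-purpose):

    -- 2-3 finger tree. The spine is polymorphically recursive: each
    -- deeper level stores Nodes of the level above, so the digits at
    -- depth d hold 2-3 trees of depth d.
    data FingerTree a
      = Empty
      | Single a
      | Deep (Digit a) (FingerTree (Node a)) (Digit a)

    data Digit a = One a | Two a a | Three a a a | Four a a a a
    data Node  a = Node2 a a | Node3 a a a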



