Treaps: A Simple Balanced Binary Tree
73 points by pavpanchekha on Sept 11, 2011 | 36 comments

 I don't think the left-rotation and right-rotation functions in this implementation look much different from the "special cases" in red-black trees. People hate red-black trees because they learned them in Java or C, where they do suck. But if you implement them in Haskell, you'll see how beautiful they are. Here's what my implementation from some time ago looks like:

``````
data Color = Red | Black deriving (Show, Eq)

data Tree a = Nil | Tree Color (Tree a) a (Tree a) deriving Show

insert :: Ord a => a -> Tree a -> Tree a
insert x t = case ins x t of
  Tree _ l x r -> Tree Black l x r

ins :: Ord a => a -> Tree a -> Tree a
ins x Nil = Tree Red Nil x Nil
ins x t@(Tree c l y r) = case compare x y of
  EQ -> t
  LT -> balance $ Tree c (ins x l) y r
  GT -> balance $ Tree c l y (ins x r)

balance (Tree Black (Tree Red (Tree Red a x b) y c) z d) = balanced a b c d x y z
balance (Tree Black a x (Tree Red b y (Tree Red c z d))) = balanced a b c d x y z
balance (Tree Black a x (Tree Red (Tree Red b y c) z d)) = balanced a b c d x y z
balance (Tree Black (Tree Red a x (Tree Red b y c)) z d) = balanced a b c d x y z
balance x = x

balanced a b c d x y z = Tree Red (Tree Black a x b) y (Tree Black c z d)
``````

It's very simple; insert just ensures that the tree we get from ins is Black. ins inserts a node into the correct side of the tree, then balances it. balance takes the four cases of unbalanced subtrees and balances them into their most-balanced representation. It's quite a beautiful data structure.
 Here's an interesting partial implementation (no delete): https://github.com/yairchu/red-black-tree/blob/master/RedBla...

The AVL tree has a fuller implementation: https://github.com/yairchu/red-black-tree/blob/master/AvlTre...

It's significantly more complex, but it gives a type-level/compile-time guarantee that the RB tree invariants are kept, which is nice.
 Your implementation is indeed elegant and simple. However, there are several things I don't like about it.

1) There are still specific cases that require memorization. The cases for treaps are always the obvious ones --- comparing priorities and comparing keys.

2) It obscures the connection to B-trees, which is what leads one to derive all of the special cases after all.

3) Maybe I'm dumb, but the running time guarantees aren't obvious.

4) If you're doing rbtrees in Haskell, you might as well enforce your invariants with the type system.
 > 2) It obscures the connection to B-trees, which is what leads one to derive all of the special cases after all.

You mean 2-3-4 trees rather than general B-trees. Anyway, that is true of all implementations of red-black trees I have seen, and intentionally so. When you consider the number of possible parent/child configurations when splitting and merging 2-3-4 nodes, there's a whole lot of them. By expanding a 2-3-4 node as a cluster of red-black nodes, the local combinatorics are more manageable.

Even the presentations of red-black trees in textbooks (e.g. CLRS) usually obscure the connection with 2-3-4 trees, with one exception being Sedgewick's own book. That's too bad.

> 4) If you're doing rbtrees in Haskell, you might as well enforce your invariants with the type system.

If you're going to do that, it's probably easier to use the isomorphism with 2-3-4 trees and work with that. There's a standard trick for representing trees with the 2-3-4 invariant using non-regular data types:

``````
data Tree a = Zero a | Succ (Tree (Node a))

data Node a = Node2 a a | Node3 a a a | Node4 a a a a
``````

Alternatively, you could use left-leaning red-black trees, which have an isomorphism with 2-3 trees. But it's not clear to me that there's much of a practical win in encoding the balance invariant in the type system here.
 > Even the presentations of red-black trees in textbooks (e.g. CLRS) usually obscure the connection with 2-3-4 trees, with one exception being Sedgewick's own book. That's too bad.

This continues to make me sad. My original (ugrad) algorithms class was taught from Sedgewick's book, and I always found RB trees to be elegant and conceptually simple (including the insertion and deletion cases); when I got to grad school, where CLR was the textbook of choice (and RB trees were actually taught in an earlier course anyway), man oh man was there a lot of wailing and gnashing of teeth over having to memorise the complicated deletion cases. Really? I just rederived them from scratch each time, and I was faster, and I was right.

It's a serious disservice to teach someone the red-black tree algorithms without first teaching 2-3-4 trees and then teaching the isomorphism, and it's a shame that so many textbooks haven't figured this out.
 I think that Haskell's standard library implementation of a binary search tree (Data.Map) long had a balancing bug... It probably doesn't anymore, but I guess that makes it a practical win to get type-level guarantees.

Also, guarantees about correctness are easier to get from tools like QuickCheck, while balance errors will be harder to spot.
 > There's a standard trick for representing trees with the 2-3-4 invariant using non-regular data types:

It should be noted that this puts a Theta(lg n) singly-linked list of Succ nodes at the top of the tree, thus doubling the effective height of the tree.
 Deletion is trickier: http://matt.might.net/articles/red-black-delete/

I suspect treap delete is simpler.
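For what it's worth, treap deletion really does avoid case analysis: rotate the doomed node downward, always lifting whichever child has the higher priority, until it has at most one child, then splice it out. Here's a minimal illustrative Python sketch (my own code, not from the linked article; max-heap on priorities, names made up):

```python
import random

class Node:
    def __init__(self, key):
        self.key = key
        self.prio = random.random()  # heap priority, independent of the key
        self.left = self.right = None

def rotate_right(t):
    l = t.left
    t.left, l.right = l.right, t
    return l

def rotate_left(t):
    r = t.right
    t.right, r.left = r.left, t
    return r

def insert(t, key):
    # Plain BST insert, then rotate up while the heap order is violated.
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
        if t.left.prio > t.prio:
            t = rotate_right(t)
    elif key > t.key:
        t.right = insert(t.right, key)
        if t.right.prio > t.prio:
            t = rotate_left(t)
    return t

def delete(t, key):
    if t is None:
        return None
    if key < t.key:
        t.left = delete(t.left, key)
    elif key > t.key:
        t.right = delete(t.right, key)
    else:
        # Rotate the target down, lifting the higher-priority child,
        # until it has at most one child; then splice it out.
        if t.left is None:
            return t.right
        if t.right is None:
            return t.left
        if t.left.prio > t.right.prio:
            t = rotate_right(t)
            t.right = delete(t.right, key)
        else:
            t = rotate_left(t)
            t.left = delete(t.left, key)
    return t
```

Both operations are the same two rotations plus the two obvious comparisons, with no color cases to memorize.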
 Writing tree structures is always fun. Writing them in an optimized way is a lot of work, though. And, what's worse, one needs to reimplement quite a few things if the problem solved by the particular tree changes.

To avoid that issue, Tarjan and Werneck designed self-adjusting top trees, so the user doesn't have to understand implementation details while still having access to an effective data structure. There is a free implementation (with the relevant papers linked), so if you don't want to reinvent the wheel, have a look :)
 I remember a paper that compared fast implementations of treaps, AVL, and red-black trees, and found that they are approximately equally well suited. rbtrees are best at insertion, AVL at selection, treaps at overall performance (if I recall correctly). I don't have the reference though...
 That result surprises me. Red-black trees do their work at insertion time, and AVL trees split the work between insertion and selection. This should make red-black trees faster at selection and AVL trees faster at insertion, and both should do a set of insertions and selections in the same time.(A red-black tree is a plain-old binary search tree for everything but modifications.)
 AVL trees give shorter path lengths. So the resulting structure ends up being better, depending on your exact balance of insertion and selection.
 Whaaa? AVL trees don't split the work between insertion and selection. Selections are plain old read-only BST lookups.
 I read a paper in the spring that purported to compare, among others, red-black trees and treaps. It found that treaps were better for ordered operations but worse for random operations.edit: For the actual comparison between the three: treaps give no guarantees. Red-black trees sacrifice some lookup speed in favor of insertions.
 I thought top trees were a data structure for implementing "dynamic trees" or "link-cut trees". How can they be used for implementing search trees?Do the logarithmic update times in top trees take into account the time to locate the node to update?
 Interesting, though it's balanced only on average. I suppose it's possible that you both get elements inserted in order and have the random priorities selected in decreasing order. The chance of ending in a worst case for a tree of non-trivial size is exceedingly low, so I don't see it causing much issue in practice. It is random, though, and sometimes you might prefer systems that are deterministic (say, if you absolutely can't afford bad cases).

Anyway, when it comes to simplicity I will admit it is much more appealing than red-black trees and their 5 billion cases for insert and delete - none of them unreasonable, true, but I didn't like debugging that thing at all the first time I attempted to implement it. So, yeah, it's something you can reliably implement when you don't have time / can't be bothered to look up the ridiculously complicated stuff and you just want to dish out a fast data structure that you know works (binary heap - 1, suffix tree - 0, treap - 1, RBTree - 0).

As for the problem, it should be quite simple if you just keep tabs on the element count of the left and right subtrees, which is a small price to pay (under a few million elements; even if you're working with hundreds of millions - suck it up and buy an extra gig of RAM). Something like this pseudocode (hopefully with added error checking):

``````
pos n root =
  if root.right.count >= n
    then pos n root.right
  else if n - root.right.count == 1
    then (root.key, root.value)
  else pos (n - root.right.count - 1) root.left
``````

Overall, quite an interesting mix of ordered (search) trees and heaps. I like it.
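A runnable Python version of that rank-selection idea, under the same convention that `pos(n, root)` returns the n-th largest entry (1-indexed) and that every node caches its subtree size (all names illustrative):

```python
class Node:
    def __init__(self, key, value, left=None, right=None):
        self.key, self.value = key, value
        self.left, self.right = left, right
        # Cached subtree size; a real tree would maintain this on updates.
        self.count = 1 + size(left) + size(right)

def size(t):
    return t.count if t is not None else 0

def pos(n, root):
    """Return (key, value) of the n-th largest entry, 1-indexed."""
    r = size(root.right)
    if r >= n:                        # answer lies in the right subtree
        return pos(n, root.right)
    if n == r + 1:                    # the root itself is the n-th largest
        return (root.key, root.value)
    return pos(n - r - 1, root.left)  # skip the right subtree and the root
```

Each step discards one subtree, so the cost is proportional to the height — O(log n) expected in a treap.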
 Keeping the element counts gives you something called weight-balanced binary trees. Here's a paper on them by Stephen Adams: http://groups.csail.mit.edu/mac/users/adams/BB/

I used a variation on these for my FSet functional collections library for Common Lisp.
 I wrote a binary tree that used node counts in a similar way. However, the insertion rules included checking whether a rotation on the inserted node would result in a subtree that could again be rotated by the same rule. This gave a cutoff ratio of left-to-right counts, below which it didn't make sense to rotate-- and hence kept every subtree pretty nicely balanced.Another benefit to having the node counts-- and this, I think, was the original motivation-- was it made the tree "binary" not only for looking up a particular key, but also for looking up the Nth key-ordered entry (which was important in the application).
 I've never understood the allure of treaps. As long as it's only probabilistically balanced, the only thing it offers beyond random binary search trees is unpredictability in the face of ordered input.In essence, the treap is the same approach to balancing as the skip list. What's wrong with AVL trees?edit: that said, if you do have a genuine need to prioritize some nodes, the observation that the BST property does not preclude the heap property is a very nice one.
 > I've never understood the allure of treaps.

Ignoring the simplicity argument, treaps can perform some other operations (local updates and nearby searches) in optimal expected time (see theorems 3.1 and 3.2): http://sims.berkeley.edu/~aragon/pubs/rst96.pdf

> As long as it's only probabilistically balanced, the only thing it offers beyond random binary search trees is unpredictability in the face of ordered input.

The randomness treaps use should be independent of the input keys, so ordered input should produce a treap with a shape drawn from the same distribution as the one for unordered input.
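That independence is easy to check empirically: feed a treap fully sorted keys (the worst case for an unbalanced BST) and the height stays logarithmic, because the priorities rather than the keys decide the shape. A quick illustrative sketch (my own code; seeded for reproducibility):

```python
import random

class Node:
    def __init__(self, key, prio):
        self.key, self.prio = key, prio
        self.left = self.right = None

def insert(t, key, prio):
    # BST insert, then a rotation if the (max-)heap order on prio is violated.
    if t is None:
        return Node(key, prio)
    if key < t.key:
        t.left = insert(t.left, key, prio)
        if t.left.prio > t.prio:
            l = t.left
            t.left, l.right = l.right, t
            t = l
    else:
        t.right = insert(t.right, key, prio)
        if t.right.prio > t.prio:
            r = t.right
            t.right, r.left = r.left, t
            t = r
    return t

def height(t):
    return 0 if t is None else 1 + max(height(t.left), height(t.right))

random.seed(0)
t = None
for k in range(1024):        # fully sorted input: a plain BST would degenerate
    t = insert(t, k, random.random())
print(height(t))             # stays near log2(1024) = 10, not anywhere near 1024
```

The expected height is roughly 3 ln n, so for 1024 sorted insertions you see a height in the twenties rather than a 1024-deep linked list.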
 Thanks for that paper; it seems to show that treaps are better on average than I presumed. I'll have to take some time and read the whole thing.

That said, their definition of optimal is big-O, which is unimpressive and I believe also reached without any balancing, provided the input has no duplicates.
 > That said, their definition of optimal is big-O,

There are balanced search trees with shorter maximum height, but they often take asymptotically longer for local operations (finger update and finger search).

> which is unimpressive

What do you mean?

> I believe also reached without any balancing

I don't know what this means in context. Treaps are randomly balanced, making them have height O(lg n) with high probability.

> provided the input has no duplicates.

Duplicate keys are handled properly. The problem of duplicate priorities is discussed in the paper.
 By unimpressive, I meant precisely that all balanced search trees have shorter maximum height - they are "more optimal". I expected other performance indicators.I believe, but haven't verified, that expected logarithmic height is achieved just by randomly permuting the input sequence and not doing any subsequent balancing - so the treap approach is simply deferring the permuting to insertion time.I can't think of a search tree with worse finger search performance than the treap - can you expand on that?
 > I believe, but haven't verified, that expected logarithmic height is achieved just by randomly permuting the input sequence and not doing any subsequent balancing - so the treap approach is simply deferring the permuting to insertion time.

That is false. (edit: What I mean is that your belief about the mechanism of randomization is false. If you look at the paper I linked to above, you will see that the authors explain that the randomness must be independent of the input key values.)

> I can't think of a search tree with worse finger search performance than the treap - can you expand on that?

I'm not sure anymore, but I stand by my "asymptotically worse performance on finger updates" claim.
 > What I mean is that your belief about the mechanism of randomization is false.

Sorry, I'm not following. Does the treap balancing mechanism not result in a tree equivalent to what one would get permuting the input sequence with "order by rand()"? I'm pretty sure it does. That has nothing to do with the input key values.

> I stand by my "asymptotically worse performance on finger updates" claim.

Excellent. Can you give me an example of that, then? I've never read an analysis of finger updates, but I'm willing to tentatively accept better expected-case performance. I'm not able to handwave my way to anything asymptotically better, though.
 > Does the treap balancing mechanism not result in a tree equivalent to what one would get permuting the input sequence with "order by rand()"? I'm pretty sure it does.

Yes, it does, but I was responding to your comment that "expected logarithmic height is achieved just by randomly permuting the input sequence". It achieves the effect of having randomly permuted the input sequence, but it does not actually perform input sequence permutation.

> Can you give me an example of that, then?

Any tree with nodes that store the number of descendants (weight-balanced search trees, for instance) takes Omega(lg n) for finger updates. See also "Heterogeneous Decomposition of Degree-Balanced Search Trees and Its Applications" by Shan Leung ("Maverick") Woo, section 1.1.4, and "Robust balancing in B-trees" by Scott Huddleston and Kurt Mehlhorn. They both mention that the original flavors of B-tree have sequences of k insertion and deletion operations where the update costs alone are omega(k).

http://reports-archive.adm.cs.cmu.edu/anon/2009/CMU-CS-09-13...
http://www.mpi-inf.mpg.de/~mehlhorn/ftp/Huddleston-Mehlhorn-...

Another example is repeated insertion and then deletion of a single element in a full binary tree viewed as an AVL tree. Node height (or even single-bit height-balance information like "which child is heavier") propagates all the way to the root for both operations.
 Simplicity of code, and the existence of hooks to add orthogonal functionality. Not all of programming is picking an optimal data structure and then implementing it; sometimes, it involves planning for maintenance and change.
 Fair enough, but I personally believe the AVL structure is sufficiently simple.Let me compare treaps and AVL trees: Both insert at the same point, then proceed to rebalance. The treap will rotate upwards until p_new>p_parent. The AVL will update balance information upwards until bal_parent%2==1, then update and possibly rotate once.That's a couple of conditional operations extra, and field operations instead of tree operations. The final step in particular is less elegant, but straggling cleanup operations are everyday things and well worth the better performance, to me.
 > What I'm calling a "binary tree" here is normally called a binary search tree.

This is because it associates keys with values. Standard binary trees usually skip the values.

I wonder why BSTs are not implemented as a standard binary tree plus a hash table. I suppose the trade-off depends on the size of the values.
 >> What I'm calling a "binary tree" here is normally called a binary search tree.

> This is because it associates keys with values. Standard binary trees usually skip the values.

Don't think so. A Binary Search Tree is a Binary Tree in which an inorder traversal goes through the data -- keys, in this case -- in sorted order. Of course, usefulness dictates that keys typically have associated values, but they do not have to be there for it to be a BST. (Why have keys with no associated values? That's one way to represent a set.)
 Well put.
 I've worked a lot on binary search trees. Note that balancing a tree is only good when you will seek only one leaf in each request. If you may seek from zero to n leaves in one request, the fact that the tree is balanced is not necessarily a good thing. In this case, here is my theory:

A binary tree is good when the probability that a node cuts its set into a sought set and a not-sought set is the highest on average. And here is a draft of mine of such a binary tree, but I don't know in fact if it is right :p
 The problem that a binary search tree optimizes for is the lookup or sequence of lookups of data in a set with a total ordering. If you want to do range lookups instead, the binary search tree is still excellent, but possibly not optimal, though I'm under the impression that k-trees are usually just trivial extensions of a BST.As for your heuristic, I'm not clear on your meaning. It's true that the "best" greedy top-down heuristic for a BST is to cut at the conditional median - but whether this is the same as a datum partitioning the sought from the unsought depends on more detail, especially since we're now talking about range searches.Also, provided you have the distribution at hand and are willing to write a lot of hairy code, you can reduce the average path length by less than 1 (yay).
 Thanks for the reply. The difference between BSTs and k-trees is that balancing the tree is a huge misconception for k-trees.

Also, in my paper, I pose some basic hypotheses, which are: the data are points, without any indication of distribution, and the search range is a convex region in the point space. Something like a textbook case.

With these hypotheses, which are general, balancing the tree (e.g., an octree) is a serious misconception coming from BSTs. It will be difficult for me to explain here why, but it's not really hard to see it.

Take this example under my hypotheses: you have a logarithmic distribution of points in 1D (but you don't particularly know it) and you are searching for points in random continuous ranges; then the best tree is a totally unbalanced tree. And note that my heuristic will give you a totally unbalanced tree. If you have regularly distributed points, my heuristic will give you an octree.

One day, I hope, I will find a damn good mathematician to prove that the K-Means Tree is the best tree under these general hypotheses. :p or maybe I'm wrong =)
 Without a distribution, the algorithms generally match what one would get with a uniform distribution, and convexity is a very small relaxation from a unit cube. These by themselves don't change much, provided the shape of the region is nice. Edit: actually, the convexity might - but then to actually use the convexity property, you'd have to move to theoretical distributions and maybe something like veb-trees.What do you mean by total unbalanced tree? For an exponential distribution, a complete tree would naturally be suboptimal. The real trouble, as I understand it, with k-trees using the median heuristic is that there is no total ordering in more than one dimension, barring special cases, and so binary partitionings can no longer keep all sought nodes inside them.If your larger point is that assuming a uniform distribution in octrees (instead of using a balanced tree structure) is vulnerable, then yes, I agree fully. If not, I'll conclude that you're talking about some sort of clustering, where the partitioning is spatial but the sought points are not necessarily described by a range. That's interesting, too.
