
Treaps: A Simple Balanced Binary Tree - pavpanchekha
http://pavpanchekha.com/programming/treap.html
======
jrockway
I don't think the left-rotation and right-rotation functions in this
implementation look much different from the "special cases" in red-black
trees. People hate red-black trees because they learned them in Java or C,
where they do suck. But if you implement them in Haskell, you'll see how
beautiful they are.

Here's what my implementation from some time ago looks like:

    
    
        data Color = Red | Black deriving (Show, Eq)
        data Tree a = Nil | Tree Color (Tree a) a (Tree a) deriving Show
    
        insert :: Ord a => a -> Tree a -> Tree a
        insert x t = case ins x t of
          Tree _ l x r -> Tree Black l x r
    
        ins :: Ord a => a -> Tree a -> Tree a
        ins x Nil = Tree Red Nil x Nil
        ins x t@(Tree c l y r) = case compare x y of
          EQ -> t
          LT -> balance $ Tree c (ins x l) y r
          GT -> balance $ Tree c l y (ins x r)
    
        -- The four configurations of a red node with a red child under
        -- a black parent (left-left, right-right, right-left, left-right);
        -- each is rebuilt into the same balanced shape.
        balance (Tree Black (Tree Red (Tree Red a x b) y c) z d) = balanced a b c d x y z
        balance (Tree Black a x (Tree Red b y (Tree Red c z d))) = balanced a b c d x y z
        balance (Tree Black a x (Tree Red (Tree Red b y c) z d)) = balanced a b c d x y z
        balance (Tree Black (Tree Red a x (Tree Red b y c)) z d) = balanced a b c d x y z
        balance x = x
    
        balanced a b c d x y z = Tree Red (Tree Black a x b) y (Tree Black c z d)
    

It's very simple; insert just ensures that the root of the tree we get from
ins is Black. ins inserts a node into the correct side of the tree, then
rebalances it. balance takes the four cases of unbalanced subtrees and
rewrites them into their most-balanced representation. It's quite a beautiful
data structure.
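
For example, a tiny usage sketch (fromList and inorder are additions of mine,
not part of the code above):

    
    
        -- Build a tree by repeated insertion; an inorder traversal
        -- recovers the elements in sorted order.
        fromList :: Ord a => [a] -> Tree a
        fromList = foldr insert Nil
    
        inorder :: Tree a -> [a]
        inorder Nil = []
        inorder (Tree _ l x r) = inorder l ++ [x] ++ inorder r
    
        -- inorder (fromList [5,1,4,2,3]) == [1,2,3,4,5]
    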

~~~
pavpanchekha
Your implementation is indeed elegant and simple.

However, there are several things I don't like about it.

1) There are still special cases that have to be memorized. The cases for
treaps are always the obvious ones --- comparing priorities and comparing keys
(see the sketch after this list).

2) It obscures the connection to B-trees, which is what leads one to derive
all of the special cases after all.

3) Maybe I'm dumb, but the running time guarantees aren't obvious.

4) If you're doing rbtrees in Haskell, you might as well enforce your
invariants with the type system.
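
To illustrate point 1, here is a minimal treap-insert sketch (just an
illustration, not the article's exact code): the only decisions are a key
comparison to descend and a priority comparison to decide whether to rotate.

    
    
        -- Priorities form a min-heap: the smallest priority sits at the root.
        data Treap k = Tip | Branch Int (Treap k) k (Treap k)
    
        -- The caller supplies a freshly drawn random priority p for the key.
        insertT :: Ord k => Int -> k -> Treap k -> Treap k
        insertT p k Tip = Branch p Tip k Tip
        insertT p k t@(Branch q l y r) = case compare k y of
          EQ -> t
          LT -> rotR (Branch q (insertT p k l) y r)
          GT -> rotL (Branch q l y (insertT p k r))
    
        -- Rotate only when a child's priority beats its parent's.
        rotR, rotL :: Treap k -> Treap k
        rotR (Branch q (Branch p a x b) y c) | p < q = Branch p a x (Branch q b y c)
        rotR t = t
        rotL (Branch q a x (Branch p b y c)) | p < q = Branch p (Branch q a x b) y c
        rotL t = t
    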

~~~
psykotic
> 2) It obscures the connection to B-trees, which is what leads one to derive
> all of the special cases after all.

You mean 2-3-4 trees rather than general B-trees. Anyway, that is true of all
implementations of red-black trees I have seen, and intentionally so. When you
consider the number of possible parent/child configurations when splitting and
merging 2-3-4 nodes, there's a whole lot of them. By expanding a 2-3-4 node as
a cluster of red-black nodes, the local combinatorics are more manageable.

Even the presentations of red-black trees in textbooks (e.g. CLRS) usually
obscure the connection with 2-3-4 trees, with one exception being Sedgewick's
own book. That's too bad.

> 4) If you're doing rbtrees in Haskell, you might as well enforce your
> invariants with the type system.

If you're going to do that, it's probably easier to use the isomorphism with
2-3-4 trees and work with that. There's a standard trick for representing
trees with the 2-3-4 invariant using non-regular data types:

    
    
        data Tree a = Zero a | Succ (Tree (Node a))
        data Node a = Node2 a a | Node3 a a a | Node4 a a a a
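    
        -- For example (an illustration of mine): two Succ constructors
        -- force every element to sit exactly two levels deep, so a
        -- lopsided tree is a type error.
        example :: Tree Int
        example = Succ (Succ (Zero (Node2 (Node2 1 2) (Node3 3 4 5))))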
    

Alternatively, you could use left-leaning red-black trees, which have an
isomorphism with 2-3 trees.

But it's not clear to me that there's much of a practical win in encoding the
balance invariant in the type system here.

~~~
blahedo
> _Even the presentations of red-black trees in textbooks (e.g. CLRS) usually
> obscure the connection with 2-3-4 trees, with one exception being
> Sedgewick's own book. That's too bad._

This continues to make me sad. My original (ugrad) algorithms class was taught
from Sedgewick's book, and I always found RB trees to be elegant and
conceptually simple (_including_ the insertion and deletion cases); when I
got to grad school, where CLR was the textbook of choice (and RB trees were
actually taught in an earlier course anyway), man oh man was there a lot of
wailing and gnashing of teeth over having to memorise the complicated deletion
cases. Really? I just rederived them from scratch each time, and I was faster,
and I was right.

It's a serious disservice to teach someone the red-black tree algorithms
without first teaching 2-3-4 trees and then teaching the isomorphism, and it's
a shame that so many textbooks haven't figured this out.

------
katox
Writing tree structures is always fun. Writing one in an optimized way is a
lot of work, though. And what's worse, one needs to reimplement quite a few
things if the problem solved by the particular tree changes.

To avoid that issue, Tarjan and Werneck designed self-adjusting top trees, so
the user doesn't have to understand implementation details while still having
access to an efficient data structure. There is a free implementation (with
the relevant papers linked), so if you don't want to reinvent the wheel, have
a look :)

<https://github.com/katox/toptree>

~~~
pavpanchekha
I remember a paper that compared fast implementations of treaps, AVL trees,
and red-black trees, and found that they are all approximately equally well
suited: red-black trees are best at insertion, AVL trees at selection, and
treaps at overall performance (if I recall correctly). I don't have the
reference, though...

~~~
jrockway
That result surprises me. Red-black trees do their work at insertion time, and
AVL trees split the work between insertion and selection. This should make
red-black trees faster at selection and AVL trees faster at insertion, and
both should do a set of insertions and selections in the same time.

(A red-black tree is a plain-old binary search tree for everything but
modifications.)

~~~
pavpanchekha
AVL trees keep a stricter balance, so they give shorter path lengths (worst-
case height about 1.44 lg n, versus 2 lg n for red-black trees). The resulting
structure ends up being better for lookups, so which wins depends on your
exact mix of insertions and selections.

------
IvoDankolov
Interesting, though it's balanced only on average. I suppose it's possible to
both get elements inserted in order and have the random priorities come out in
decreasing order. The chance of ending up in a worst case for a tree of non-
trivial size is exceedingly low, though, so I don't see it causing much of an
issue in practice. It is random, and sometimes you might prefer a
deterministic structure (say, if you absolutely can't afford bad cases).

Anyway, when it comes to simplicity, I will admit it is much more appealing
than red-black trees and their 5 billion cases for insert and delete - none of
them unreasonable, true, but I didn't like debugging that thing at all the
first time I attempted to implement it. So, yeah, it's something you can
reliably implement when you don't have time for / can't be bothered to look up
the ridiculously complicated stuff and you just want to dish out a fast data
structure that you know works (binary heap - 1, suffix tree - 0, treap - 1,
RBTree - 0).

As for the problem, it should be quite simple if you just keep tabs on the
element counts of the left and right subtrees, which is a small price to pay
(under a few million elements; even if you're working with hundreds of
millions, suck it up and buy an extra gig of RAM). Something like this
pseudocode (hopefully with added error checking):

    
    
      -- nth largest element; 'count' is the size of a node's subtree
      pos n root =
        if root.right.count >= n then
          pos n root.right
        else if n - root.right.count == 1 then
          (root.key, root.value)
        else
          pos (n - root.right.count - 1) root.left
    

Overall, quite an interesting mix of ordered (search) trees and heaps, I like
it.
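
For the record, here's a runnable version of that pseudocode (the names are
mine): each node caches its subtree size, so a rank query walks a single
root-to-leaf path.

    
    
      -- Size-annotated BST; keeping the cached size current is O(1)
      -- extra work per node touched by an insert or delete.
      data Sized k v = Leaf | Node Int (Sized k v) k v (Sized k v)
    
      size :: Sized k v -> Int
      size Leaf = 0
      size (Node s _ _ _ _) = s
    
      -- nth largest entry (1-based); Nothing if n is out of range.
      select :: Int -> Sized k v -> Maybe (k, v)
      select _ Leaf = Nothing
      select n (Node _ l k v r)
        | n <= rc     = select n r             -- among the larger keys
        | n == rc + 1 = Just (k, v)            -- the root has rank rc + 1
        | otherwise   = select (n - rc - 1) l  -- skip right subtree and root
        where rc = size r
    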

~~~
ScottBurson
Keeping the element counts gives you something called weight-balanced binary
trees. Here's a paper on them by Stephen Adams:

<http://groups.csail.mit.edu/mac/users/adams/BB/>

I used a variation on these for my FSet functional collections library for
Common Lisp.

------
tantalor
See also: <http://en.wikipedia.org/wiki/Treap>

------
sesqu
I've never understood the allure of treaps. As long as it's only
probabilistically balanced, the only thing it offers beyond random binary
search trees is unpredictability in the face of ordered input.

In essence, the treap is the same approach to balancing as the skip list.
What's wrong with AVL trees?

edit: that said, if you do have a genuine need to prioritize some nodes, the
observation that the BST property does not preclude the heap property is a
very nice one.

~~~
jbapple
> I've never understood the allure of treaps.

Ignoring the simplicity argument, treaps can perform some other operations
(local updates and nearby searches) in optimal expected time (see theorems 3.1
and 3.2):

<http://sims.berkeley.edu/~aragon/pubs/rst96.pdf>

> As long as it's only probabilistically balanced, the only thing it offers
> beyond random binary search trees is unpredictability in the face of ordered
> input.

The randomness treaps use should be independent of the input keys, so ordered
input should produce a treap with a shape drawn from the same distribution as
the one for unordered input.

~~~
sesqu
Thanks for that paper; it seems to show that treaps are better on average than
I presumed. I'll have to take some time and read the whole thing.

That said, their definition of optimal is big-O, which is unimpressive and, I
believe, also reached without any balancing, provided the input has no
duplicates.

~~~
jbapple
> That said, their definition of optimal is big-O,

There are balanced search trees with shorter maximum height, but they often
take asymptotically longer for local operations (finger update and finger
search).

> which is unimpressive

What do you mean?

> I believe also reached without any balancing

I don't know what this means in context. Treaps are randomly balanced, making
them have height O(lg n) with high probability.

> provided the input has no duplicates.

Duplicate keys are handled properly. The problem of duplicate priorities is
discussed in the paper.

~~~
sesqu
By unimpressive, I meant precisely that all balanced search trees have shorter
maximum height - they are "more optimal". I expected other performance
indicators.

I believe, but haven't verified, that expected logarithmic height is achieved
just by randomly permuting the input sequence and not doing any subsequent
balancing - so the treap approach is simply deferring the permuting to
insertion time.

I can't think of a search tree with worse finger search performance than the
treap - can you expand on that?

~~~
jbapple
> I believe, but haven't verified, that expected logarithmic height is
> achieved just by randomly permuting the input sequence and not doing any
> subsequent balancing - so the treap approach is simply deferring the
> permuting to insertion time.

That is false. (edit: What I mean is that your belief about the mechanism of
randomization is false. If you look at the paper I linked to above, you will
see that the authors explain that the randomness must be independent of the
input key values.)

> I can't think of a search tree with worse finger search performance than the
> treap - can you expand on that?

I'm not sure anymore, but I stand by my "asymptotically worse performance on
finger updates" claim.

~~~
sesqu
> What I mean is that your belief about the mechanism of randomization is
> false.

Sorry, I'm not following. Does the treap balancing mechanism not result in a
tree equivalent to what one would get permuting the input sequence with "order
by rand()"? I'm pretty sure it does. That has nothing to do with the input key
values.

> I stand by my "asymptotically worse performance on finger updates" claim.

Excellent. Can you give me an example of that, then? I've never read an
analysis of finger updates, but I'm willing to tentatively accept better
expected-case performance. I'm not able to handwave my way to anything
asymptotically better, though.

~~~
jbapple
> Does the treap balancing mechanism not result in a tree equivalent to what
> one would get permuting the input sequence with "order by rand()"? I'm
> pretty sure it does.

Yes, it does, but I was responding to your comment that "expected logarithmic
height is achieved just by randomly permuting the input sequence". It achieves
the _effect_ of having randomly permuted the input sequence, but it does not
_actually_ permute the input sequence.

> Can you give me an example of that, then?

Any tree whose nodes store the number of descendants (weight-balanced search
trees, for instance) takes Omega(lg n) for finger updates. See also
"Heterogeneous Decomposition of Degree-Balanced Search Trees and Its
Applications" by Shan Leung ("Maverick") Woo, section 1.1.4, and "Robust
balancing in B-trees" by Scott Huddleston and Kurt Mehlhorn. They both mention
that the original flavors of B-tree have sequences of k insertion and deletion
operations where the update costs alone are omega(k).

<http://reports-archive.adm.cs.cmu.edu/anon/2009/CMU-CS-09-133.pdf>

<http://www.mpi-inf.mpg.de/~mehlhorn/ftp/Huddleston-Mehlhorn-RobustBalancing.pdf>

Another example is repeated insertion and then deletion of a single element in
a full binary tree viewed as an AVL tree. Node height (or even single bit
height-balance information like "which child is heavier") propagates all the
way to the root for both operations.

------
tantalor
> What I'm calling a "binary tree" here is normally called a binary search
> tree.

This is because it associates keys with values. Standard binary trees usually
skip the values.

I wonder why BSTs are not implemented as a standard binary tree plus a hash
table. I suppose the trade-off depends on the size of the values.

~~~
ggchappell
>> What I'm calling a "binary tree" here is normally called a binary search
tree.

> This is because it associates keys with values. Standard binary trees
> usually skip the values.

Don't think so. A Binary Search Tree is a Binary Tree in which an inorder
traversal goes through the data -- keys, in this case -- in sorted order. Of
course, usefulness dictates that keys typically have associated values, but
they do not have to be there for it to be a BST. (Why have keys with no
associated values? That's one way to represent a _set_.)
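
For instance, a minimal sketch (my names) of a key-only BST used as a set:

    
    
        -- No values, just keys: an inorder traversal yields the set in
        -- sorted order, which is all the BST property requires.
        data BST a = Empty | Fork (BST a) a (BST a)
    
        member :: Ord a => a -> BST a -> Bool
        member _ Empty = False
        member x (Fork l y r) = case compare x y of
          EQ -> True
          LT -> member x l
          GT -> member x r
    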

~~~
tantalor
Well put.

------
FredBrach
I've worked a lot on binary search trees. Note that balancing a tree is only
good when each request seeks a single leaf. If a request may seek anywhere
from zero to n leaves, the fact that the tree is balanced is not particularly
a good thing. For that case, here is my theory:

A binary tree is good when, on average, each node has the highest probability
of cutting its set into a sought part and a not-sought part. Here is a draft
of mine of such a binary tree, though I honestly don't know whether it is
right :p

<http://www.gamedev.net/page/resources/_//fsweet-snippet/k-means-heuristic-for-general-purpose-binary-search-trees-and-animated-data-sets-r2607>

~~~
sesqu
The problem that a binary search tree optimizes for is the lookup or sequence
of lookups of data in a set with a total ordering. If you want to do range
lookups instead, the binary search tree is still excellent, but possibly not
optimal, though I'm under the impression that k-trees are usually just trivial
extensions of a BST.

As for your heuristic, I'm not clear on your meaning. It's true that the
"best" greedy top-down heuristic for a BST is to cut at the conditional median
- but whether this is the same as a datum partitioning the sought from the
unsought depends on more detail, especially since we're now talking about
range searches.

Also, provided you have the distribution at hand and are willing to write a
lot of hairy code, you can reduce the average path length by less than 1
(yay).

~~~
FredBrach
Thanks for the reply. The difference between BSTs and k-trees is that
balancing the tree is a huge misconception for k-trees.

Also, in my paper I pose some basic hypotheses: the data are points, with no
indication of their distribution, and the search range is a convex region in
the point space. Something like a textbook case.

Under these hypotheses, which are general, balancing the tree (as an octree
does) is a serious misconception carried over from BSTs. It would be difficult
for me to explain why here, but it's not really hard to see.

Take this example under my hypotheses: you have a logarithmic distribution of
points in 1D (but you don't know that in advance) and you are searching for
points in random continuous ranges; then the best tree is a totally unbalanced
tree. And note that my heuristic will give you a totally unbalanced tree. If
you have regularly distributed points, my heuristic will give you an octree.

One day, I hope, I will find a damn good mathematician to prove that the
K-Means Tree is the best tree under these general hypotheses :p or maybe I'm
wrong =)

~~~
sesqu
Without a distribution, the algorithms generally match what one would get with
a uniform distribution, and convexity is a very small relaxation from a unit
cube. These by themselves don't change much, provided the shape of the region
is nice. Edit: actually, the convexity might - but then, to actually use the
convexity property, you'd have to move to theoretical distributions and maybe
something like van Emde Boas trees.

What do you mean by _total_ unbalanced tree? For an exponential distribution,
a _complete_ tree would naturally be suboptimal. The real trouble, as I
understand it, with k-trees using the median heuristic is that there is no
total ordering in more than one dimension, barring special cases, and so
binary partitionings can no longer keep all sought nodes inside them.

If your larger point is that assuming a uniform distribution in octrees
(instead of using a balanced tree structure) is vulnerable, then yes, I agree
fully. If not, I'll conclude that you're talking about some sort of
clustering, where the partitioning is spatial but the sought points are not
necessarily described by a range. That's interesting, too.

