
Left-leaning red-black trees are hard to implement (2008) - luu
http://t-t-travails.blogspot.com/2008/04/left-leaning-red-black-trees-are-hard.html
======
userbinator
The link to the implementation is 404 and I can't find it in that directory
either. A quick search for LLRB trees suggests that their only advantage is
supposedly being "easier to implement", but that doesn't appear to be the case
in practice. In fact, one of the links I found is
[http://www.read.seas.harvard.edu/~kohler/notes/llrb.html](http://www.read.seas.harvard.edu/~kohler/notes/llrb.html)

It almost feels like LLRB trees are a bit of a scam.

------
kcl
This is a valuable article that I read thoroughly when I was re-reviewing
search trees. Nobody wants to spend 60 hours learning a new tree only to have
it fail to live up to its promise. I'll summarize what I know (or think I
know) about trees here, since it may be of use to anyone who had enough of an
interest to check out the above article.

Sedgewick claims that LLRB trees are simpler, but this is hotly contested.
Most of the sources I read said that LLRB trees were more complicated, often a
lot more, than regular RB trees. All of the timings I found suggested that
LLRB trees were slower than RB trees. Looking at the code, and at the detailed
timings, I tend to agree that LLRB trees are slower and more complicated. If
there is an explanation/implementation which realizes Sedgewick's promise of
simplicity I have not seen it, and it isn't in the published LLRB material.

If you need a tree that is slower and simpler than an RB tree, with a similar
implementation, you could look at AA trees instead.

My opinion however is that, in terms of practical applications, there is no
reason to learn RB trees today instead of AVL trees. If you do research in
data structures, RB trees may be worth knowing. I think the prevalence of RB
trees is due to several factors:

1\. RB trees are featured in the common Cormen algorithms book in lieu of AVL
trees (a mistake). RB trees are also featured in the Sedgewick book.

2\. The Cormen book shows how to implement delete for RB trees. Knuth's AVL
description, and indeed most AVL descriptions, omit the delete method, which
is a serious oversight. This has caused many tree implementations to avoid
deletes or avoid AVL altogether. You can verify this via Google or GitHub.
While Knuth initially had good reason, ultimately omitting delete was a
mistake.

3\. Plenty of theoretical conjecture predicts RB trees are faster. Actual
benchmarks show that AVL trees perform better, or at worst, about the same.
(If anyone has data that shows scenarios where this is not the case, I would
be interested to see it.)

This is an unfortunate state of affairs, as the AVL tree appears to be a
strictly better tree than the RB tree. The height is bounded at about 1.44x
optimal instead of 2x optimal. You could argue the AVL tree is roughly as
difficult to understand and code, though I would say it is simpler and more
natural. The AVL tree performs generally better in actual benchmarks, and
substantially better in certain important cases (sorted inserts).
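
For comparison, the AVL insert path is short enough to show in full. This is a
minimal sketch in Python (my own illustration, not taken from any of the
sources above); the entire rebalance is the four rotation cases at the end of
insert:

```python
# Minimal AVL insert: height-tracked nodes plus four rotation cases.
class Node:
    __slots__ = ("key", "left", "right", "height")
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 1

def h(n):
    return n.height if n else 0

def fix(n):
    n.height = 1 + max(h(n.left), h(n.right))

def rot_right(y):
    x = y.left
    y.left, x.right = x.right, y
    fix(y); fix(x)
    return x

def rot_left(x):
    y = x.right
    x.right, y.left = y.left, x
    fix(x); fix(y)
    return y

def insert(n, key):
    if n is None:
        return Node(key)
    if key < n.key:
        n.left = insert(n.left, key)
    elif key > n.key:
        n.right = insert(n.right, key)
    else:
        return n  # no duplicates
    fix(n)
    b = h(n.left) - h(n.right)
    if b > 1:                          # left-heavy
        if key > n.left.key:           # left-right case
            n.left = rot_left(n.left)
        return rot_right(n)
    if b < -1:                         # right-heavy
        if key < n.right.key:          # right-left case
            n.right = rot_right(n.right)
        return rot_left(n)
    return n

def inorder(n):
    return inorder(n.left) + [n.key] + inorder(n.right) if n else []
```

Sorted inserts only ever trigger the single-rotation cases here, which is part
of why that workload benchmarks well for AVL.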

~~~
nz
If you're interested in this stuff, you might want to check out the draft
paper linked below. It benchmarks performance and memory efficiency of various
logarithmic time sorting data structures. It's a work in progress, and
feedback is always welcome :)

[http://nickziv.files.wordpress.com/2014/02/vis_ds.pdf](http://nickziv.files.wordpress.com/2014/02/vis_ds.pdf)

~~~
adrianN
This seems to confirm my hunch that there is no reason to use anything but
b-trees on modern computers, at least for single threaded code. Storing more
values in the nodes and increasing the branching factor is better because you
do more of your work inside the cache and avoid random memory access.
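
To make the locality argument concrete, here's a toy search-only sketch (my own
illustration, not a production B-tree; the `order` branching factor is
arbitrary). Each node packs many sorted keys into one array, so a lookup does a
few in-cache array searches instead of one pointer chase per comparison:

```python
from bisect import bisect_left

class BNode:
    def __init__(self, keys, children=None):
        self.keys = keys          # sorted keys, many per node
        self.children = children  # None for leaves, else len(keys) + 1 subtrees

def build(keys, order=8):
    """Bulk-load a search-only tree from sorted keys (illustrative only)."""
    if len(keys) < order:
        return BNode(list(keys))
    step = max(order - 1, len(keys) // order)
    seps, kids, i = [], [], 0
    while True:
        kids.append(build(keys[i:i + step], order))
        i += step
        if i >= len(keys):
            break
        seps.append(keys[i])  # promote one key as a separator
        i += 1
    return BNode(seps, kids)

def search(node, key):
    while node is not None:
        i = bisect_left(node.keys, key)  # one array search per node visited
        if i < len(node.keys) and node.keys[i] == key:
            return True
        node = node.children[i] if node.children else None
    return False
```

In a real B-tree the node size is chosen to fill one or a few cache lines (or a
disk page), which is exactly the effect described above.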

I also find b-trees much simpler to implement than either RB or AVL trees.

~~~
kcl
I stop short of recommending B-trees as the only tree structure you need to
know. I think if it matters at all what kind of tree you're using, you're deep
enough in the guts that it pays to know the differences between B-trees and
basic trees.

To me, this is especially relevant today, since if you need a "B-tree", you
probably need a standard relational database (MySQL/PostgreSQL/SQLite), and if
that doesn't work for you, you'd be advised to know what the tradeoffs are
with a regular tree. There are scenarios where B-trees are suboptimal.

Cache-obliviousness, which is still new, is also changing how we look at
trees, since in one sense a B-tree is a hack on the basic tree structure to
exploit locality. Representation matters more today as well. This is a longer
conversation. Suffice it to say, I think this is an area of active and useful
research, and wouldn't rule out either class of tree.

------
anyfoo
Here's a functional discussion and implementation of (primarily) deletion that
I once did as a project thesis:
[http://www.reinference.net/llrb-delete-julien-oster.pdf](http://www.reinference.net/llrb-delete-julien-oster.pdf)

The reason I mention that: "There already is an algorithm for deletion
available, which Sedgewick [2008] presented alongside his introduction to
Left-Leaning Red-Black trees. This algorithm, however, while being pretty
concise, follows a very imperative approach, containing a lot of nested if-
else-blocks and breaking several invariants during its run by design, later
fixing everything up using several auxiliary functions. [...]

Instead, we follow an approach which is closer to what Okasaki [1999] did with
his seminal, purely functional implementation of insertion in Red-Black trees.
That is, our algorithm relies purely on pattern matching and simple recursive
descent. Moreover, of the three functions it consists of, the two which
implement the algorithm’s bulk are completely self-contained, i.e. do not rely
on auxiliary functions."

~~~
taeric
I didn't see it in the linked paper, but how do the runtimes compare? (Did I
just miss this?)

------
gregfjohnson
It can be difficult to develop good intuition for the ideas behind the insert
and delete algorithms for even standard red black trees. Here is an
interactive web site that presents the essence of these algorithms in what I
hope is "as simple as possible but no simpler" pseudo-code, and supports
interactive play. (Insert, delete, single-step forward and backward through
the algorithms, etc.) I would be delighted to hear if people find that this
site makes the algorithms more intuitive, and any thoughts or suggestions for
improvement.

[http://www.gregfjohnson.com/redblackbuilder.html](http://www.gregfjohnson.com/redblackbuilder.html)

------
Cieplak
After reading the following implementation in python, I must agree:

[https://github.com/peterhil/leftrb/blob/master/leftrb/llrb.py#L56](https://github.com/peterhil/leftrb/blob/master/leftrb/llrb.py#L56)

~~~
seri
I may as well throw in my implementation in Go here:
[https://github.com/seri/goalgo/blob/master/st/llrb.go](https://github.com/seri/goalgo/blob/master/st/llrb.go)

LLRB definitely isn't as sweet as advertised, but saying it's a scam is a bit
too much.

------
duckingtest
The author's argument against using C++ isn't true. New/delete can easily be
redefined to use malloc/free [0]. You can compile (somewhat limited) C++ code
without any special C++ runtime dependencies.

[0] [http://ptspts.blogspot.com/2010/12/how-to-write-c-program-without-libstdc.html](http://ptspts.blogspot.com/2010/12/how-to-write-c-program-without-libstdc.html)

~~~
dsp1234
The author's only argument against C++ is located in the article comments and
is "_Regarding the use of C++, that's a non-starter; malloc goes in libc!_".
As the author was replacing the malloc[0][1] implementation, that seems a
reasonable argument.

[0] -
[http://www.freebsd.org/cgi/man.cgi?query=jemalloc&sektion=3](http://www.freebsd.org/cgi/man.cgi?query=jemalloc&sektion=3)

[1] - [http://www.canonware.com/jemalloc/](http://www.canonware.com/jemalloc/)

