
Efficient AVL Tree in C# - NicoJuicy
https://bitlush.com/blog/efficient-avl-tree-in-c-sharp
======
EdSchouten
An interesting thing about AVL trees is that almost all implementations out
there either use recursive algorithms or require nodes to have parent
pointers. This is because most literature on AVL trees describes top-down
insertion/deletion and bottom-up rebalancing.

Now the interesting thing is that rebalancing can also be done using a lesser-
known top-down algorithm, allowing for a non-recursive algorithm that doesn't
rely on parent pointers.

[https://web.archive.org/web/20180315093634/http://neil.brown...](https://web.archive.org/web/20180315093634/http://neil.brown.name/blog/20041124101820)
[https://web.archive.org/web/20180320115617/http://neil.brown...](https://web.archive.org/web/20180320115617/http://neil.brown.name/blog/20041124141849)

The downside of such an approach of using two successive top-down passes is
that a naïve implementation would require twice as many object comparisons:
unlike a bottom-up traversal, the rebalancing pass has to choose a direction
at every node. As both passes (insertion/deletion and rebalancing) follow the
identical path, this can easily be avoided by recording the path during the
first pass. As AVL trees are strongly balanced and only a finite number of
nodes fit in an address space, storing a path takes very little space (a
bitmask would use less than two uintptr_t's).
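The path-recording idea can be sketched as follows. This is a minimal
illustration with hypothetical names, not the actual FreeBSD code (which uses
a two-word bitmask and interleaves the replay with the rebalancing pass):

```c
#include <assert.h>
#include <stdint.h>

/* Record the comparison outcome at each step of the first (search) pass
 * as one bit: 0 = went left, 1 = went right.  The second (rebalancing)
 * pass can then replay the path without calling the comparator again. */
struct path {
    uint64_t bits;   /* direction taken at each depth */
    unsigned depth;  /* number of steps recorded */
};

static void path_record(struct path *p, int went_right) {
    p->bits |= (uint64_t)(went_right != 0) << p->depth;
    p->depth++;
}

static int path_replay(const struct path *p, unsigned step) {
    return (int)((p->bits >> step) & 1); /* 1 = right, 0 = left */
}
```

Since an AVL tree of height 64 already holds more nodes than fit in any
realistic address space, a fixed-size bitmask like this suffices.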

I implemented such an algorithm for FreeBSD's copy of the POSIX tsearch()
and tdelete() functions a couple of years ago:

[https://svnweb.freebsd.org/base/head/lib/libc/stdlib/tsearch...](https://svnweb.freebsd.org/base/head/lib/libc/stdlib/tsearch.c?view=markup)
[https://svnweb.freebsd.org/base/head/lib/libc/stdlib/tdelete...](https://svnweb.freebsd.org/base/head/lib/libc/stdlib/tdelete.c?view=markup)

~~~
ufo
Do you happen to remember what improvements you could get from the new
algorithm when it comes to memory usage and execution time? Or was this more
about getting rid of `void *`?

~~~
EdSchouten
This was already a couple of years ago, so I forgot some of the details.

It made a lot of sense in this case to not have a parent pointer in the tree
nodes, because that would bump the size of a node just over 32 bytes (left,
right, parent and data pointers, and a balance count). This would cause
libc's allocator to allocate 64 bytes per node instead.
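The size argument can be made concrete with two illustrative struct layouts
(hypothetical names; the exact FreeBSD node layout may differ):

```c
#include <assert.h>
#include <stddef.h>

/* Roughly what a tsearch()-style AVL node looks like without a parent
 * pointer: three pointers plus a small balance field.  On a typical
 * 64-bit (LP64) platform this pads out to exactly 32 bytes. */
struct avl_node_lean {
    struct avl_node_lean *left, *right;
    void *data;     /* user key */
    int balance;    /* -1, 0 or +1 */
};

/* The same node with a parent pointer added: four pointers plus the
 * balance field, i.e. 40 bytes on LP64.  A power-of-two size-class
 * allocator then hands out 64 bytes per node. */
struct avl_node_parent {
    struct avl_node_parent *left, *right, *parent;
    void *data;
    int balance;
};
```

So dropping the parent pointer roughly halves the per-node allocation on
such an allocator, independent of any speed difference.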

I also seem to remember that the performance of this implementation was pretty
much on par with a conventional recursive implementation. The only differences
were that use of stack space was a lot less (i.e., constant) and that the
machine code generated by Clang was a tiny bit smaller. I suspect that's
because the optimizer didn't have to deal with recursive code.

------
huhtenberg
> _a complete AVL tree that doesn't use recursion and is a whole lot faster
> for it!_

A simple benchmark would've gone a long way here.

Edit - also this:

    
    if (_comparer.Compare(key, node.Key) < 0)
    {
        node = node.Left;
    }
    else if (_comparer.Compare(key, node.Key) > 0)
    {
        node = node.Right;
    }
    

needlessly doubles the number of Compare() calls in Delete() and Search().
Insert() however does it right.
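The fix implied here is to call the comparator once per node, cache the
result, and branch on it. A minimal sketch in C (the article's code is C#,
so these names are illustrative only):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct bst_node {
    struct bst_node *left, *right;
    const char *key;
};

/* One comparator call per node: store the result in c and branch on it,
 * instead of comparing twice as in the quoted snippet. */
static struct bst_node *bst_search(struct bst_node *node, const char *key) {
    while (node != NULL) {
        int c = strcmp(key, node->key);  /* single comparison */
        if (c < 0)
            node = node->left;
        else if (c > 0)
            node = node->right;
        else
            return node;                 /* found */
    }
    return NULL;
}
```

For an IComparer-style interface where each comparison may be an arbitrary
virtual call, halving the number of calls on every step of the descent is a
meaningful saving.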

------
twtw
Lots of statements about performance, with no benchmarks or numbers. If I've
learned one thing about performance optimization, it's that what's obviously
faster is frequently slower (unless cache behavior, ILP, etc is obvious to
you) - always measure.

I enjoyed the article and applaud the author's work, but I don't feel good
about the claim to beat the standard library without discussing the benchmark
that indicates this. Insertion pattern matters a lot for this kind of thing.

~~~
loeg
AVL (and other binary search) trees in general are just a poor choice for most
situations given the realities of cache behavior.

~~~
MaxBarraclough
Hence the development of the 'Judy array' - actually a high-arity tree
optimised for good cache behaviour - right?

See also the 'masstree' which has a similar premise -
[https://news.ycombinator.com/item?id=18199086](https://news.ycombinator.com/item?id=18199086)

~~~
loeg
I was thinking of classic B-trees, but yeah something with high arity is the
right way to go.

------
wmccullough
> "It out performs Microsoft’s generic SortedDictionary<TKey, TValue> (which
> is actually a red-black tree) by a factor of 2 for inserts and a factor of 4
> for searches."

I may have missed it, but I'd love to know which versions of the .NET
Framework this was benchmarked against. I saw a really interesting post on
MSDN about how they've been working on performance improvements for the
underlying data structures. I would love to see this up against .NET Core
2.1 or later. For reference, this was the post:

[https://blogs.msdn.microsoft.com/dotnet/2018/04/18/performan...](https://blogs.msdn.microsoft.com/dotnet/2018/04/18/performance-improvements-in-net-core-2-1/)

~~~
NicoJuicy
The algorithm is faster; it's not about the underlying framework.

E.g., if they test this against SortedDictionary in dotnet core, they should
also implement the mentioned algorithm in dotnet core.

I think what they mean is that their algorithm is faster than the red-black
tree algorithm.

~~~
lozenge
But there's nothing preventing Microsoft from changing SortedDictionary to use
an AVL tree. The asymptotic complexity of all the operations would be the
same.

~~~
wmccullough
I'm actually tempted to do a PR against dotnet to see if they'd accept this
improvement. My guess is that just like with Roslyn, customers have come to
count on the implementation, bugs and all, and it may not be accepted.

------
chusk3
The author mentions that the recursive way is harder to code in .NET, but I
wonder whether that caveat would disappear if this were reimplemented in F#,
which supports tail recursion. It would also be interesting to see how this
implementation compares/contrasts with the data structures in
[https://github.com/fsprojects/FSharpx.Collections](https://github.com/fsprojects/FSharpx.Collections)

~~~
adyavanapalli
The author says the opposite:

"The non-recursive way is more efficient as the CLR does not have to keep
pushing and popping its call stack (which is quite slow). The non-recursive
way is unfortunately harder to code. So the challenge was on!"

~~~
Rizz
That's what tail call optimization is for: it turns the recursive function
into a loop.
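The point about tail calls can be illustrated with a tree lookup whose
recursive call is in tail position. This sketch is in C, where the loop
conversion is a compiler optimization; F# can guarantee it via the CLR's
tail-call support:

```c
#include <assert.h>
#include <stddef.h>

struct tnode {
    struct tnode *left, *right;
    int key;
};

/* The recursive call is the very last thing the function does (tail
 * position): nothing runs after it returns, so an optimizing compiler
 * can replace the call with a jump, giving constant stack usage -
 * exactly as if it had been written as a while-loop. */
static const struct tnode *find(const struct tnode *n, int key) {
    if (n == NULL || key == n->key)
        return n;
    return find(key < n->key ? n->left : n->right, key);
}
```

An insert or delete that rebalances on the way back up is *not* a tail
call, though, which is why the bottom-up rebalancing algorithms can't be
rescued this way without extra work.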

------
ignasl
Here is a Java comparison of various self-balancing trees
([https://intelligentjava.wordpress.com/2015/04/09/self-balanc...](https://intelligentjava.wordpress.com/2015/04/09/self-balancing-binary-search-trees-comparison/)).
It seems that the AVL tree is doing quite well.

~~~
loeg
This article only evaluates self-balancing _binary_ trees. Binary trees have
terrible cache properties, and something like a B-tree (or an LSM-tree for
write-heavy workloads, or a Bε-tree) will have much better performance than
an AVL tree.

------
nayuki
I'd like to make a bunch of disconnected comments:

* I have a different take on writing "efficient" AVL trees. Optimizing for human comprehension, I have an implementation strictly under 100 lines of code: [https://www.nayuki.io/res/aa-tree-set/BasicAvlTreeSet.java](https://www.nayuki.io/res/aa-tree-set/BasicAvlTreeSet.java) . (Unlike bitlush, mine uses recursion, explicit height tracking, and no parent pointers.)

* Java's TreeMap<K,V> is similar to C#'s SortedDictionary<TKey, TValue>. Java also uses a red-black tree.

* bitlush claims a 2× to 4× speed-up over the standard library's SortedDictionary. Red-black trees are asymptotically optimal (O(log n)), so they can't improve by more than a constant factor. I get the vague sense that bitlush was able to make a constant speed-up by simplifying some unused code or offering less functionality than the standard library. I doubt that the standard library is being intentionally slow.

* If they wanted a faster sorted dictionary, they should use a B-tree. Rust does this, and it is much more cache-friendly. For example, see [http://dtrace.org/blogs/bmc/2018/09/28/the-relative-performa...](http://dtrace.org/blogs/bmc/2018/09/28/the-relative-performance-of-c-and-rust/) .

* I think having parent pointers is a more expensive tradeoff than recursion. A binary tree node must have 2 child pointers, and adding a parent pointer brings the total to 3; this costs O(n) space. This cost is borne by all nodes all the time. Meanwhile, using recursion for insert/delete/iterator costs O(log n) space, and only temporarily during the operation.

* The author claims, "I also wrote a lot of tests to ensure the code is rock solid." The test suite is at [https://github.com/bitlush/avl-tree-c-sharp/tree/master/Bitl...](https://github.com/bitlush/avl-tree-c-sharp/tree/master/Bitlush.AvlTree.Tests) . The suite looks a bit short to me, and I don't see more than ten elements being tested. I would suggest the author implement one of my favorite ways of testing data structures: performing thousands of random operations on the data structure and comparing the results against a naive or standard-library implementation.
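That randomized differential-testing idea looks roughly like this. The
sketch below drives POSIX tsearch()/tdelete() (standing in for whatever
tree is under test) and checks every answer against a trivially correct
boolean array as the reference:

```c
#include <assert.h>
#include <search.h>   /* POSIX tsearch, tfind, tdelete */
#include <stdlib.h>

static int cmp(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

/* Perform random inserts, deletes and membership queries on the tree
 * under test, mirroring each mutation in a naive reference array.
 * Returns the number of disagreements (0 = the tree behaved correctly).
 * The tree is deliberately leaked; this is a test sketch. */
static int run_differential_test(unsigned iterations, unsigned seed) {
    enum { RANGE = 64 };
    static int keys[RANGE];       /* stable storage for tsearch keys */
    int reference[RANGE] = {0};   /* naive "is key present" reference */
    void *root = NULL;
    int mismatches = 0;
    unsigned i;

    for (i = 0; i < RANGE; i++)
        keys[i] = (int)i;

    srand(seed);
    for (i = 0; i < iterations; i++) {
        int op = rand() % 3;
        int k = rand() % RANGE;
        if (op == 0) {                       /* insert (no-op if present) */
            tsearch(&keys[k], &root, cmp);
            reference[k] = 1;
        } else if (op == 1) {                /* delete (no-op if absent) */
            tdelete(&keys[k], &root, cmp);
            reference[k] = 0;
        } else {                             /* membership query */
            int present = tfind(&keys[k], &root, cmp) != NULL;
            if (present != reference[k])
                mismatches++;
        }
    }
    return mismatches;
}
```

A few thousand iterations of this exercises far more insert/delete/rebalance
interleavings than a handful of hand-written ten-element cases.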

~~~
lozenge
Or let a property testing framework do it for you.

------
danbruc
_I like the various .NET dictionaries but have been unimpressed by their
performance._

So how does it compare to Dictionary<TKey, TValue>? Given that operations on
a balanced search tree are O(log n) while they are O(1) amortized - and O(n)
worst case - on a hash-based dictionary, it doesn't seem plausible that even
a fast tree implementation could score a win. Not to mention the much worse
locality of a tree. In some rare scenarios the O(n) worst-case performance
might be an issue, but unless you actually need an ordered dictionary, I
have a hard time believing that a tree-based implementation can outperform a
hash-based one.

------
fjfaase
I recently worked (just for fun) on a persistent set in C++ using an AVL
tree. See:
[http://www.iwriteiam.nl/D1807.html#31](http://www.iwriteiam.nl/D1807.html#31)
. However, I did use recursion and also stored the height.

I also worked (as a mental challenge) on an interval set using a balanced
tree. I was less successful in keeping the tree balanced in this case. See:
[http://www.iwriteiam.nl/D1808.html#4](http://www.iwriteiam.nl/D1808.html#4)

------
xmichael999
Methods look cool and the whole project very interesting, but why no usage
example?

------
partycoder
1) [2012]

2) How is it efficient? Compared to what, exactly?

~~~
zamalek
Back in 2012, "efficient" meant a whole different thing in .NET - it mostly
came down to "this thing is probably fast."

