
Modern B-Tree Techniques (2011) - pointy_hat
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.219.7269&rep=rep1&type=pdf
======
makmanalp
For those who don't know, Goetz Graefe is sort of the patron saint of btrees.
He recently won the Sigmod Codd innovations award for his contributions:

[https://sigmod.org/sigmod-awards/people/goetz-
graefe-2017-si...](https://sigmod.org/sigmod-awards/people/goetz-
graefe-2017-sigmod-edgar-f-codd-innovations-award/)

He also has a lot to say about locking in B-trees:

[http://15721.courses.cs.cmu.edu/spring2016/papers/a16-graefe...](http://15721.courses.cs.cmu.edu/spring2016/papers/a16-graefe.pdf)

------
cryptonector
> Probably the strongest arguments for B-trees over hash indexes pertain to
> multi-field indexes and to nonuniform distributions of key values. A hash
> index on multiple fields requires search keys for all those fields such that
> a hash value can be calculated. A B-tree index, on the other hand, can
> efficiently support exact-match queries for a prefix of the index key, i.e.,
> any number of leading index fields

Yes, well, if none of the columns/fields in the index are sufficiently
selective, then the ability to search a B-tree by prefix may not do you much
good. Don't get me wrong though: I fully agree with all of the given reasons
for B-tree superiority to hash tables.

Although the newest storage technologies (particularly very fast byte-
addressable NVMs) -- too new for TFA -- might well change things. When you
have fast byte-addressable storage it will pay to not read pages, and hash
tables may yet involve fewer accesses than B-trees in that case, though I
suspect B-trees will successfully adapt anyways.

~~~
makmanalp
One thing that doesn't make the two interchangeable is total ordering and
range queries, which are only supported by B-trees and not by hash indexes. So
the question is whether you're willing to trade that away for faster insertion
and lookup.

> When you have fast byte-addressable storage it will pay to not read pages

It always pays to not read pages :-) An alternative way to look at it is that
faster storage is more forgiving of more accesses, meaning the benefits of
hash indexes might be less prominent.

~~~
narag
_So the question is whether you 're willing to trade that away for faster
insertion and lookup._

I wouldn't characterize the decision as a trade off. They're different tools
for different situations.

------
hyc_symas
Well, it's an interesting survey. I'm surprised at how much space it devoted
to illustrating B-trees in the context of RDBMSs as a lot of the concepts
there really had no specific connection to B-trees.

So much space devoted to logging, locking and latches. All heavily used in the
industry up till the last decade. All obsolete now. Not a single mention of
immutable/persistent B-trees. Were there really no academic papers on the
topic written in a time covered by this survey?

------
mistercow
This is from 2011, in case you missed that. I was confused by sentences like:

>Flash memory, flash devices, and other solid state storage technology are
about to change the memory hierarchy in computer systems in general and in
data management in particular.

until I looked up at the publication date.

~~~
sctb
Thanks, we've updated the title!

------
bootcat
Really good read ! Must read for people who like algorithms and competitive
programming !

