Hacker News new | past | comments | ask | show | jobs | submit login
Discovering the Computer Science Behind Postgres Indexes (codeship.com)
105 points by kilimchoi on June 22, 2015 | hide | past | web | favorite | 21 comments

If this interests you, I highly recommend exploring the documentation in the postgres source:



One of the best classes I took in college CS was the database class, where we dove into the internals of Postgres and learned how the core CS concepts we had learned the year prior applied in real life. It was a great lightbulb moment that solidified the education unexpectedly well, in hindsight. And of course, gave me great respect for the code quality and rigor of the Postgres system.

In my undergrad CS class we had to write a small RDBMS using the techniques learned in class and then had to dive into the internals of Postgres. It remains my favorite class taken during my undergrad, followed closely by the graduate level computer graphics course I was able to take.

Bravo for the amazingly detailed writeup. I know how long these things can take, and give kudos to the author.

I've wanted something like this for years! A great real world example of a data structure that really made an impression on me during my computer science education, connecting theory to implementation

Am I correct in thinking that if it weren't for the order-by statement in the first query, it would have stopped after the first occurrence?

It seems that way to me, certainly.

No. Without any indication that the value in that column is unique the query engine must read the entire table in order to insure that it has found every row that meets the search criteria.

But the original query includes "limit 1". Why is it necessary to find every row meeting the search criterion? Only one will be returned, and without an ORDER BY clause it doesn't matter which one.

You appear to be assuming that the limit is applied while fetching tuples, rather than while filtering the tuple set after fetch.

It occurs to me that if you handle LIMIT during fetch, you'll add complexity the fetch, and might only see run time gains in the cases where the number of desired rows is small.

If a column contains unique data it should be marked as such in whatever way your RDBMS requires (e.g. UNIQUE constraint and/or UNIQUE index)

Yes, without the limit it will stream only the first matching row.

"""B-Link-Trees: Lehman and Yao’s paper actually discusses an innovation they researched related to concurrency and locking when multiple threads are using the same B-Tree. Remember, Postgres’s code and algorithms need to be multithreaded because many clients could be searching or modifying the same index at the same time. By adding another pointer from each B-Tree node to the next sibling node — the so-called “right arrow” — one thread can search a tree even while a second thread is splitting a node without locking the entire index"""

The big innovation of Lehman and Yao was that their right link technique obviates the need for "latch coupling", or in Postgres terms the crabbing of buffer locks. In other words, the Postgres implementation need only lock one page/buffer at a time as an index scan descends the B-Tree, rather than alternately locking a single page, then its child, then unlocking the parent, and once again locking the child (the grandchild of the original page). Certain other B-Tree implementations must do this until their descent reaches a leaf page, and locking multiple pages at once just to service index scans can hurt concurrency a lot (I think that they might need to be exclusive locks for writes, which is particularly poor).

These low-level locks are not to be confused with lock manager locks that have deadlock detection and so on. They're low-level Readers–writer locks, and are far more lightweight (hence the Postgres term "LWLock").

The basic idea of Lehman and Yao is that index scans of all types may detect and recover from a concurrent page split my following a right-link (having found that the page high key is logically less than the scankey value). It won't matter that the parent page was observed to not have the new downlink that is inserted after a page split, except that we need to look right.

I'm always at a loss.

"indices is generally preferred in mathematical, financial, and technical contexts, while indexes is relatively common in general usage"

I think of "indices" as the plural of index and "indexes" as a present verb meaning to index (e.g. the "the background process indexes the tables").

This isn't hard to explain -- most people producing "general usage" don't know Latin and nearly never refer to even a singular index.

Different languages have different plurals. Both are accepted because of the history of using latin suffixes (i.e. indices) and because it is the correct way to pluralize regular noun (i.e. indexes). This is the same reason "iris" has three accepted plurals: iris (genitive, from common use and genus name), irises (english plural), and irides (latin plural).

Something's gone wrong here. If iris is the genitive, no plural form can be irides. (Also, I tend to think of iris as being from Greek, although apparently the English word does come from a Latin borrowing.)

Yes, sorry, you're correct, I had the nominative and genitive cases swapped when I checked the dictionary.

"The term B-Tree actually stands for “balanced tree.”"

B-tree means "binary tree" and are not inherently balanced. There are plenty of specific types of self balancing tree algorithms, but I believe the default implementation of b-trees by definition are not self-balanced.

Edit: this comment is incorrect. There is a difference between b-trees and binary trees.

The wikipedia article on b-trees suggests a variety of possibilities for the etymology, "binary" not really being one of them:

After a talk at CPM 2013 (24th Annual Symposium on Combinatorial Pattern Matching, Bad Herrenalb, Germany, June 17–19, 2013), Ed McCreight answered a question on B-tree's name by Martin Farach-Colton saying: "Bayer and I were in a lunch time where we get to think a name. And we were, so, B, we were thinking… B is, you know… We were working for Boeing at the time, we couldn't use the name without talking to lawyers. So, there is a B. It has to do with balance, another B. Bayer was the senior author, who did have several years older than I am and had many more publications than I did. So there is another B. And so, at the lunch table we never did resolve whether there was one of those that made more sense than the rest. What really lives to say is: the more you think about what the B in B-trees means, the better you understand B-trees."[3]


I was confused. The wiki article says at the top "Not to be confused with Binary tree." Then in the first paragraph:

"The B-tree is a generalization of a binary search tree in that a node can have more than two children (Comer 1979, p. 123). Unlike self-balancing binary search trees, the B-tree is optimized for systems that read and write large blocks of data."

Long ago in one of my CS classes, I wrote binary tree, 2-3, 2-3-4, red black, and AVL tree algorithms. So I guess I learned about binary vs self-balancing. I did not know about b-trees. I made the mistake of thinking that b-tree meant binary tree.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact