Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

An important detail which this guide leaves out: The "actual location of the row" is usually the leaf node of another B-Tree. That is the primary B-Tree and it's indexed by the primary key.

The major consequence is every non-primary index query involves dereferencing N pointers, which could mean loading N pages from disk/SSD to get leaf nodes spread out across the primary tree. Whereas if you query a range in the primary B-Tree directly, the rows are consecutive so you would only load N/M consecutive pages based on the ratio of rowsize to pagesize.

That's why some people use composite keys for primary key, to get better data locality in the primary B-Tree index.

See "How We've Scaled Dropbox"[1] to hear the same thing.

At 48:36 they explain why they changed from PRIMARY KEY (id) to PRIMARY KEY (ns_id, latest, id). ns_id is kinda like user_id. So that change groups journal entries for users together on disk

Specifically. PRIMARY KEY (id) orders things by creation date whereas PRIMARY KEY (ns_id, latest, id) orders things by ns_id primarily.

[1]: https://youtu.be/PE4gwstWhmc?t=2914



This is true of MySQL (which the guide uses), but not necessarily of other databases such as Postgres.


You're right. Postgres doesn't give any control over primary data locality. That might cause querying 1 row a bit faster in Postgres (no log(N) traversal of the primary B-tree index) but picking out N rows could be a lot slower.

https://www.postgresql.org/docs/13/indexes-index-only-scans....




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: