
PostgreSQL Index Internals - snaga
https://www.pgcon.org/2016/schedule/events/934.en.html
======
ivoras
The thing is, there's not much documentation on where each index type would be
applicable, what their upsides and downsides are in particular situations.

For example, the hash index is (currently) not WAL-logged, which means it's
not crash safe and not replicated. It also may have some concurrency issues,
the resources are sparse on this one.

The BRIN index _should_ be ideal for situations where there is a huge storage
of data (TBs) inserted in sequential order and rarely queried - a BRIN index
for such a thing would be only a few MB in size. I say _should_ because when I
tried it on 9.5.0, the query planner kept ignoring the index.

Both GIN and GIST indexes can be used for full text search (tsearch2), but GIN
is faster for queries while GIST is smaller and faster for insertion.

The SP-GIST index is so underdocumented that, when I wanted to try it out in a
project, I could not even understand how it works and does it work correctly
as a radix tree for strings.

What's also somewhat underdocumented are the "operator classes" and which are
and aren't implemented for different index types. I.e. a table of which index
type can be used with which data type (and what exactly does such combination
produce) would be great.

~~~
malisper
> The BRIN index should be ideal for situations where there is a huge storage
> of data (TBs) inserted in sequential order and rarely queried - a BRIN index
> for such a thing would be only a few MB in size. I say should because when I
> tried it on 9.5.0, the query planner kept ignoring the index.

A BRIN index only keeps summary information for a range of pages. Because of
this, a BRIN can only be used for a bitmap scan and cannot be used for a
normal index scan. This makes the set of queries where the planner will use
the BRIN index really small. It is also possible, since it is such a common
mistake, that you forgot to run ANALYZE and so the planner had no clue the
data was in sequential order.

------
koolba
To all the upvoters, did you actually read the linked slides or are you just
upvoting because it has "PostgreSQL" in the name[1]? Interesting to see this
many votes with zero comments.

It is a great set of slides by the way. The btree piece would be old news to
anyone familiar with database related data structures but the detail on GIN,
GIST, and particularly BRIN make for a good read. Seeing the physical layout
for each of those gives you a much better understanding of the cost of
maintenance and what situations would they'd be useful.

Also, if you happened to follow the recent Postgres/Uber thread, this gives a
lot more context as to how updates to a row would cascade to many writes for
each index.

[1]: _... which isn 't a bad idea anyway because PostgreSQL is awesome!_

~~~
HCIdivision17
At least for me, I have two different voting style:

\- Yeah, this is pretty cool. Upvote!

\- I've got to save this for later when I can really read it

PostgreSQL stuff tends to be pretty reliably good, and usually with an
educational bent. So I'll tend to use the upvote as a bookmark to go back to
later.

I gotta say, I don't really like slides, but I've yet to be disappointed by a
PostgreSQL post. And usually, a high vote/comment ratio is a nice indicator of
quality.

EDIT: formatting

~~~
jlgaddis
You might be interested in the "favorite" link under each submission title.

~~~
elevenE
But I think the "favorites" are publicly view-able, isn't it? Is there any way
to make them private?

~~~
jlgaddis
From viewing my profile it appears that they are public. I've never actually
used it, just noticed the feature a while back. I've been using Instapaper for
ages for my "read later" links.

------
kbradero
is there a book/documents about DB internals ? I have found some books about
theory but not about an actual implementation.

~~~
pgaddict
Because there are many different implementations, with different trade-offs
and design decisions. Also, the implementation-specific documentation is
typically kept fairly close to the code, as the docs need to be kept in sync.

So for example if you need to know how indexes in PostgreSQL work, look into
the READMEs in the proper directory:

b-tree indexes:
[https://github.com/postgres/postgres/blob/master/src/backend...](https://github.com/postgres/postgres/blob/master/src/backend/access/nbtree/README)

gin:
[https://github.com/postgres/postgres/blob/master/src/backend...](https://github.com/postgres/postgres/blob/master/src/backend/access/gin/README)

brin:
[https://github.com/postgres/postgres/blob/master/src/backend...](https://github.com/postgres/postgres/blob/master/src/backend/access/brin/README)

etc. The READMEs also include links to related papers etc.

~~~
kbradero
hey thanks!, tho I agree with this, I was thinking, if there are books about
kernel internals there may be some about db internals.

but Thanks for your links !

