
Pluggable Storage Committed in Postgres - craigkerstiens
https://www.postgresql.org/docs/devel/tableam.html
======
pella
context:

"PLUGGABLE STORAGE IN POSTGRESQL" (2018 PGCONF.EU ; Andres Freund)

pdf: https://anarazel.de/talks/2018-10-25-pgconfeu-pluggable-storage/pluggable.pdf

 _"Why?

● ZHeap – UNDO based storage, address bloat and write amplification problems

● Columnar Storage

● Experiments

"_

 _" PostgreSQL has long prided itself in being extensible. But for a number of
years people have wished for the ability to not only introduce new datatypes
and functions, but also to add new forms of storing data. The introduction of
foreign data wrappers (FDWs) allowed to satisfy some of those use-cases. They
however are not fully suitable for native data storage, quite fundamentally
they don't allow for index creation, foreign keys, etc.

Over the last years people have on and off worked on making table storage
pluggable. It looks like PostgreSQL 12 might finally get builtin support for
that, based on work of Haribabu Kommi, myself and others.

This talk will go over the reasons why it is useful to make storage pluggable
(my personal reason is to allow the introduction of zheap, a new undo based
table storage that is nearly free of table bloat) and how the new APIs work,
and what further use-cases the pluggable storage APIs have."_

~~~
michelpp
Thanks for the context, this is great! I'm looking into this now to see if I
can use it with my project pggraphblas. I was hoping to make an FDW where each
column is backed by a matrix, making essentially a "property graph" that can
be queried with straight SQL, but this looks much more interesting and also
easier to use.

------
anarazel
FWIW, in hindsight, this never should have been called pluggable storage, but
rather pluggable table access methods, or at least pluggable table storage.
But it was in the subject on the first thread this patchset originated in
(by Alvaro Herrera), and stuck...

At least the code doesn't call it pluggable storage...
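
For reference, the user-facing side of table access methods in PostgreSQL 12+ is just two SQL statements; a minimal sketch, where the extension, access method, and handler names below are hypothetical (only `heap`, the built-in AM, exists out of the box):

```sql
-- An extension provides a handler function, then registers it as a
-- table access method (names here are hypothetical):
CREATE EXTENSION some_tableam_extension;
CREATE ACCESS METHOD myam TYPE TABLE HANDLER myam_handler;

-- Tables opt in per-table via the USING clause; omitting it gives
-- the default built-in "heap" access method:
CREATE TABLE t (id int, payload text) USING myam;
```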

------
nn3
Is that a good thing?

Instead of a single Postgres, there will be a lot of subtly or very
different variants, all needing completely different tuning, like what happened
with MySQL. It may well fragment the ecosystem significantly.

~~~
anarazel
I think it's a _significant_ danger.

But on the other hand our current table storage has some architectural issues
that are hard to fix in an upward-compatible manner (causing a number of
issues around vacuum and write amplification due to hint bits etc). And even
if there were an easy-ish incremental upgrade path, it's hard to e.g. move to
an UNDO based MVCC implementation without regressing some workload (and
introducing bugs) - allowing that development to happen in parallel, with
adoption by choice, is much more realistic.

Additionally, the best type of table storage is also very workload dependent.
E.g. for some analytics (or even just long term storage) it's quite useful to
have a columnar store, but for lots of transactional workloads that's not
appropriate. (Note that the current tableam interface would allow for some
simple columnar store, but that'd still need a good bit of planner and
executor smarts to be useful for anything but higher storage density.)

So I think on balance it's _probably_ worth it.

~~~
greggyb
From my perspective, columnar storage is an enormous win. One of the best
things I've been able to do for clients on MSSQL has been to tell them simply
to create a clustered columnstore index on their large tables for analytical
workloads.

At one point we had a need to perform a diff between a yesterday-snapshot and
a today-snapshot of a billion row table. If I recall correctly, this took 1 or
2 minutes with clustered columnstore vs nearly an hour for rowstore. The order
of magnitude is certainly correct, even if my specific numbers are wrong.
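
A snapshot diff like that can be expressed with set operations; a sketch with hypothetical table names (`EXCEPT` works in both SQL Server and Postgres, and full-table comparisons like this are exactly the kind of scan that benefits from columnstore's batch-mode execution):

```sql
-- Rows added or changed since yesterday (table names hypothetical):
SELECT * FROM snapshot_today
EXCEPT
SELECT * FROM snapshot_yesterday;

-- Rows removed or changed:
SELECT * FROM snapshot_yesterday
EXCEPT
SELECT * FROM snapshot_today;
```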

If you're curious _why_ we had to do a full diff between two billion-row
tables, that is a story for a different time, but there were good reasons and
we also removed those good reasons through other design decisions.

------
coleifer
SQLite has a concept called virtual tables, which sit somewhere between
"pluggable storage" and Postgres foreign data wrappers. They are fairly easy to
implement and can be incrementally improved by providing more sophisticated
hints to the planner. I think the folks at Tarantool have done some work
hooking deeply into SQLite's pager/WAL, presumably others (the folks who
connected SQLite to LMDB?). And of course Oracle, who ripped out the SQLite
btree and replaced it with BerkeleyDB.

I think sqlite4 (defunct) was going to have pluggable storage, but they ended
up scrapping it. Presumably over performance concerns, but please correct me
if I'm wrong.

Anyway, interested to see how this feature is adopted and to learn more.

------
jerrysievert
huh. on one hand, this is awesome! something like this should be fantastic for
things like citus’s columnar storage - a pluggable storage system can be a
huge boon. on the other hand there have been some bugs popping up on pg-bugs
like the parallel queries that look a little questionable. hopefully no
negative impact for anyone upstream while everything gets hammered out.
context: i’m the maintainer of a fairly popular pg extension

editing to note that the bugs popping up have nothing to do with this, just
the hope that it doesn’t become ... overwhelming

~~~
macdice
Hi Jerry, Yeah, parallelism is hard... but speaking as someone involved in
several bug fixes in that area, I can tell you that we put a _lot_ of effort
into chasing those bugs down, and even made changes to allow extensions that
were doing illegal things in _PG_init() to keep working under parallelism.
Bugs are inevitable, and in complicated code it can take someone else's
workload to uncover them. IMHO the important question is how well you deal
with them, and I think we do a good job at providing work-arounds (primarily
by providing GUCs so you can turn new stuff off if it's causing problems), and
getting fixes into the tree ASAP. I think the worst recent case was the "DSM
handle collision" one, which was hiding in code committed years ago but only
discoverable with very particular timing and newer query plans. All currently
known problems fixed in that department, but unfortunately those patches
missed the February cut-off for 11.2 and will have to wait until May.
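
The GUC escape hatch mentioned above can be as simple as a single session-level setting; a sketch using the real knob for disabling parallel query:

```sql
-- Turn off parallel query for the current session if a parallel plan
-- is misbehaving; 0 means "no parallel workers per Gather node".
SET max_parallel_workers_per_gather = 0;

-- Go back to the server default later:
RESET max_parallel_workers_per_gather;
```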

~~~
jerrysievert
> but speaking as someone involved in several bug fixes in that area, I can
> tell you that we put a lot of effort into chasing those bugs down,

yup! i've definitely seen the efforts, and really appreciate the work that
everyone is doing to solve them - it's great seeing that kind of commitment.

> IMHO the important question is how well you deal with them

you are absolutely correct! and i can't fault how quickly bugs in parallelism
have been triaged and fixed. but that wasn't what i was getting at - just that
there have been a lot of fairly large changes (i guess you could call it a big
increase in velocity?) that still seem to have (or uncover, as you said) a
lot more to do. parallelism was just an example of a big change that seems to
have some pretty big impacts on stability and planning, and adding another big
change makes me wary since that has direct impact on code that i maintain. it
wasn't an attack, and i hope that it didn't come across that way, just an
observation ... the more major changes, the more balls in the air (to borrow a
juggling metaphor), and pg has a ton of them in the air right now.

> I think we do a good job at providing work-arounds (primarily by providing
> GUCs so you can turn new stuff off it it's causing problems)

GUCs are helpful, but there are times when direct access to postgresql.conf
isn't really feasible ... and setting GUCs at the beginning of every session
can become cumbersome/untenable.

also, GUCs aren't always feasible when some major changes have negative
impact across whole stacks (the major changes in CTEs and how they impacted
planning is one that i can think of off the top of my head).
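
For what it's worth, GUCs don't have to live in postgresql.conf or be set per session; Postgres can also persist them per database or per role. A sketch, reusing the parallelism knob as the example (the database and role names are hypothetical):

```sql
-- Persist a GUC for every new connection to one database,
-- without touching postgresql.conf or per-session SET:
ALTER DATABASE mydb SET max_parallel_workers_per_gather = 0;

-- Or scope it to a single role:
ALTER ROLE app_user SET max_parallel_workers_per_gather = 0;
```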

> illegal things in _PG_init()

just curious, can you point to some of these? i want to do my own extension
audits and considering that plv8 does/did a lot of things based on
postgres/src/pl/* it would be good to make sure it's not doing any of those as
well.

regardless, thanks for the response - it is appreciated.

~~~
macdice
>> illegal things in _PG_init()

> just curious, can you point to some of these? i want to do my own extension
> audits and considering that plv8 does/did a lot of things based on
> postgres/src/pl/* it would be good to make sure it's not doing any of those
> as well.

[https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=6c3c9d418918373a9535ad3d3bd357f652a367e3)

------
joshberkus
Wow, congrats PG devs! This has been a dream for Postgres for like 8 years now.

