
“Column Tetris” – Calculating and Saving Space in PostgreSQL - nreece
https://stackoverflow.com/questions/2966524/calculating-and-saving-space-in-postgresql/7431468#7431468
======
ralusek
It's really unfortunate how much of database technology requires manual
tweaking, what is sensible and optimized should be the default behavior.
Obviously for ALTER statements reordering of columns could mean a huge table
rewrite, but there'd still be ways to handle it. On table create, though, you
really should have to specify a manual override in order to prevent the DB
from automatically performing these kind of obvious optimizations, IMO.

On a side note, I've started to hear more buzz around this, but when we can
have obvious indexing, sharding, caching, and query planning handled by
ML...we will see a real revolution in the way we handle data. The models could
be generalized to some degree across data sets and query patterns, but pretty
easily progressively trained in production for a particular data set. The
outputs we optimize for are basically read time, write time, and storage
space...which all strike me as relatively easy targets for training. How great
would it be to just move some sliders for those 3 around based off of
preference and call it a day?

~~~
beefhash
To me, this just seems to be struct packing[1] on the database layer. I would
find the opposite behavior of automatic reordering of columns strange.

[1] [http://www.catb.org/esr/structure-
packing/](http://www.catb.org/esr/structure-packing/)

~~~
vertex-four
As a counter-argument: databases do ~interesting~ things with your queries and
the layout of your data anyway. SQL is meant to be declarative - it's not
meant to define exactly how an operation is done, only that it's done in a
method that results in something that _logically_ looks like what you've
declared it to look like. So as long as the row is ordered correctly when you
do a `SELECT *`, it'd be perfectly reasonable in my opinion to store it
physically in any order the database engine prefers.

------
smmnyc
I run into answers by Erwin Brandstetter on StackOverflow just about every
time I search for something related to Postgres. I’m curious about his
background and how he became such an expert in this area... it’s inspiring and
extremely helpful!

------
postila
This is one of the interesting optimizations that DBMS should do automatically
but it doesn't.

[https://github.com/NikolayS/postgres_dba](https://github.com/NikolayS/postgres_dba)
\-- this toolset includes a version of report (see report a1) which analyses
all tables in your database and shows opportunities to optimize.

------
tejasmanohar
This was one of the first "non-obvious" optimizations I learned about w/ SQL
DBs. Obviously, if you understand how the database works (and allows you to
ALTER it quickly), you'll know padding always matters off the bat, but man,
wouldn't it be nice if Postgres just optimized your CREATE TABLEs for you?

