
Implement table partitioning - rachbelaid
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63
======
samcheng
Any support for "rolling" partitions? E.g. a partition for data updated less
than a day ago, another for data from 2-7 days ago, etc.

I miss this from Oracle; it allows nice index optimizations as the query
patterns are different for recent data vs. historical data.

I think it could be set up with a mess of triggers and a cron job... but it
would be nice to have a canonical way to do this.
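
The trigger-and-cron approach on pre-10 Postgres can be sketched roughly like
this (all table, column, and function names here are hypothetical):

```sql
-- Parent table plus one child per day, each with a fixed date-range CHECK.
CREATE TABLE events (created_at timestamptz NOT NULL, payload text);

CREATE TABLE events_2017_01_01 (
    CHECK (created_at >= '2017-01-01' AND created_at < '2017-01-02')
) INHERITS (events);

-- Routing trigger: redirect each insert to the child for its day.
CREATE OR REPLACE FUNCTION events_insert_router() RETURNS trigger AS $$
BEGIN
    EXECUTE format('INSERT INTO events_%s SELECT ($1).*',
                   to_char(NEW.created_at, 'YYYY_MM_DD')) USING NEW;
    RETURN NULL;  -- the row has already been stored in the child
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER events_route BEFORE INSERT ON events
    FOR EACH ROW EXECUTE PROCEDURE events_insert_router();
```

The cron job would then create tomorrow's child table ahead of time and drop
(or merge) children that have aged out of the "recent" window.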

~~~
lobster_johnson
How does this work in Oracle? Seeing as the partitioning constraint would be
time-dependent, wouldn't it need to re-evaluate it at regular intervals in
order to shuffle data around? Is the feature explicitly time-oriented?

~~~
mulmen
I don't think Oracle can do this exactly, but the query planner does
understand time-based partitions, so if you do something like:

    SELECT * FROM partitioned_table WHERE partition_date_key > SYSDATE - 1;

The query planner will only use the most recent partition. Combine this with
Oracle's ability to merge partitions and you get "daily" partitions that
become "weekly" partitions when the new week starts. Alternatively, you could
wait a month and combine all the days of last month into a single partition,
and then even combine months into years.

The partition intervals are based on specific dates/times, not on the relative
time from query execution.

Oracle also supports row movement which is the biggest missing feature here I
believe.
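
The merge step described above can be sketched in Oracle syntax (partition
names are hypothetical; repeat as needed to roll a whole week up):

```sql
-- Oracle sketch: collapse two daily partitions into one.
ALTER TABLE partitioned_table
    MERGE PARTITIONS p_day_01, p_day_02 INTO PARTITION p_days_01_02;
```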

------
rachbelaid
The conversations on the patches are really interesting:
[https://www.postgresql.org/message-
id/flat/55D3093C.5010800@...](https://www.postgresql.org/message-
id/flat/55D3093C.5010800@lab.ntt.co.jp#55D3093C.5010800@lab.ntt.co.jp)
[https://www.postgresql.org/message-
id/flat/ad16e2f5-fc7c-cc2...](https://www.postgresql.org/message-
id/flat/ad16e2f5-fc7c-cc2d-333a-88d4aa446f96@lab.ntt.co.jp#ad16e2f5-fc7c-cc2d-333a-88d4aa446f96@lab.ntt.co.jp)

~~~
aidos
I'm always amazed by the PG community - it seems like such a constructive
place.

Those patches are absolutely insane. Makes you remember how much hard work
goes into building the software you use on a day to day basis.

[https://www.postgresql.org/message-
id/attachment/45478/0001-...](https://www.postgresql.org/message-
id/attachment/45478/0001-Catalog-and-DDL-for-partitioned-tables.patch)

~~~
MarHoff
I've been professionally focused on PostgreSQL-based work for the last 5
years. At the height of the Big Data hype I sometimes felt a little bit
off-track, because I never got the time to investigate NoSQL solutions...

Only recently did I realize that focusing on actual data and how to process
it inside PostgreSQL was maybe the best way I could have spent my working
time. I really can't say what the best part of PostgreSQL is: the hyperactive
community, the rock-solid and clear documentation, or the constant roll-out of
efficient, non-disruptive, user-focused features...

------
egeozcan
If you also didn't know what exactly partitioned tables are, here's a nice
introduction from Microsoft:

[https://technet.microsoft.com/en-
us/library/ms190787(v=sql.1...](https://technet.microsoft.com/en-
us/library/ms190787\(v=sql.105\).aspx)

It is for SQL Server, but I assume it is mostly relevant. Please correct me
if I'm wrong.

~~~
SideburnsOfDoom
So this is all about partitioning data into different storage files on the
same server? What is the main benefit of that?

~~~
Lozzer
Metadata operations on partitions can be very fast. One simple example is
date-based housekeeping: deleting a month of data will be quite intensive on
most databases, whereas dropping a partition from the table is effectively
instant.

Partition switching is also fast. Say you have a summary table that is rolled
up by month, but you want to recalculate the summaries every so often. You can
build a month into a new table and then switch the new table in for a
partition of the summary table.
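
In pre-10 Postgres (inheritance-based) terms, both operations are
metadata-only; a sketch with hypothetical table names:

```sql
-- Housekeeping: dropping a month is a catalog operation, not a
-- row-by-row DELETE, so it is effectively instant.
DROP TABLE measurements_2016_11;

-- "Switching": rebuild a month into a staging table, then swap it in.
BEGIN;
ALTER TABLE measurements_2016_12 NO INHERIT measurements;
ALTER TABLE measurements_2016_12 RENAME TO measurements_2016_12_old;
ALTER TABLE measurements_2016_12_new RENAME TO measurements_2016_12;
ALTER TABLE measurements_2016_12 INHERIT measurements;
COMMIT;
```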

~~~
CWuestefeld
When circumstances allow it ('cause there are some limitations on where it
can be used), this pattern is HUGE. We've got a few places in our system that
do this, and the optimization achieved an improvement of a couple of orders
of magnitude.

------
ktopaz
I don't get it. Table partitioning is already supported in PostgreSQL and has
been for a long time (at least since 8.1); where I work, we use table
partitioning with PostgreSQL 9.4 on the product we're developing.

[https://www.postgresql.org/docs/current/static/ddl-
partition...](https://www.postgresql.org/docs/current/static/ddl-
partitioning.html)
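
The approach from that doc page relies on inheritance plus CHECK constraints;
constraint exclusion then prunes children at plan time (table names are
hypothetical):

```sql
-- Classic 9.x-style setup: children carry CHECK constraints, and the
-- planner excludes children whose constraint contradicts the query.
CREATE TABLE measurements (logdate date NOT NULL, value numeric);

CREATE TABLE measurements_2016_12 (
    CHECK (logdate >= '2016-12-01' AND logdate < '2017-01-01')
) INHERITS (measurements);

SET constraint_exclusion = partition;  -- the default
EXPLAIN SELECT * FROM measurements WHERE logdate >= DATE '2016-12-01';
-- the plan should scan only children overlapping the requested range
```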

~~~
sapling
It sounds like this is column-level partitioning. Each column or set of
columns (based on a partitioning expression) is stored as a different
subtable (or something similar) on disk. If only a few columns are frequently
accessed, they can be put on cache/faster disk, or used for other neat
optimizations in join processing.

~~~
MarHoff
Maybe I don't get you, but I don't think so; PostgreSQL is not a columnar
database.

If I got this patch right, each partition will have the same data structure
and store whole rows (it's even more restrictive than the previous
inheritance mechanism, which allowed extending a child table with additional
columns).

A column or expression only defines which table an inserted row is stored in.
A single row will never be torn apart. Still, it looks like a foundation that
facilitates sharding big datasets between multiple servers when used in
conjunction with foreign data wrappers. However, a lot of performance
improvements will still be needed to compete against solid NoSQL projects
(where you really have a big-data use case).

But looking a bit further forward, developing performance improvements on top
of an ACID-compliant distributed database seems less difficult than
developing a NoSQL project until it becomes ACID.
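
For reference, the declarative syntax introduced by this patch looks like the
following (PostgreSQL 10; names hypothetical). The partition key only decides
which partition stores each whole row:

```sql
CREATE TABLE measurements (
    logdate date NOT NULL,
    value   numeric
) PARTITION BY RANGE (logdate);

-- Each partition holds whole rows for its slice of the key space.
CREATE TABLE measurements_2017_01 PARTITION OF measurements
    FOR VALUES FROM ('2017-01-01') TO ('2017-02-01');
```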

~~~
spacemanmatt
Generally PostgreSQL is a row-oriented database, but Citus has released a
columnar store that makes it perform similarly to other columnar databases.

This patch, though, is about horizontal partitions.

------
tajen
About donations: I believe PostgreSQL now deserves more advertising and
marketing to grow its adoption in major companies and, hence, get more
funding. If I donate on the website, it says the money will go to
conferences. Where should I donate?

------
bigato
Supposing all partitions are on the same disk, and that you index your data
well enough for your usage that Postgres never needs to do full table scans:
are there any additional performance benefits to partitioning?

~~~
Jweb_Guru
Yes. Less latch contention for nodes of a single btree index, for instance.

~~~
bigato
I didn't know what a latch is, so I googled it and found a nice explanation:

[https://oracle2amar.wordpress.com/2010/07/09/what-are-
latche...](https://oracle2amar.wordpress.com/2010/07/09/what-are-latches-and-
what-causes-latch-contention/)

"A latch is a type of a lock that can be very quickly acquired and freed."

That brings me a couple more questions:

1. May I infer, then, that the only benefit of partitioning a table (fully
located on the same disk) that cannot be achieved by indexes is that queries
will wait less time for this kind of lock to be released?

2. May I assume that while a table is only being read and not changed,
there's no performance gain from partitioning it (fully located on the same
disk) that cannot be achieved by indexes?

~~~
dhd415
There are other possibilities as well. For example, if your partitioning
strategy improves the selectivity of an index, it can improve the plans of
queries that previously relied on a less selective index. As an example, I
once had a table with over a billion rows distributed among ~100 tenants, on
which queries were typically run by tenant and date range. Partitioning that
table by tenant dramatically improved the performance of those queries,
because they no longer had to scan through rows of which only ~1% were for
the tenant of interest.
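
In the new declarative syntax, the layout described above might be sketched
like this (all names hypothetical):

```sql
-- One partition per tenant; each gets its own highly selective date index.
CREATE TABLE readings (
    tenant_id int  NOT NULL,
    taken_at  date NOT NULL,
    value     numeric
) PARTITION BY LIST (tenant_id);

CREATE TABLE readings_t42 PARTITION OF readings FOR VALUES IN (42);
CREATE INDEX ON readings_t42 (taken_at);

-- A tenant + date-range query touches only that tenant's partition:
SELECT * FROM readings
WHERE tenant_id = 42 AND taken_at BETWEEN '2016-01-01' AND '2016-01-31';
```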

~~~
bigato
If you had built a composite index instead, wouldn't it work just as well for
improving performance? I can see that a composite index would occupy more
space while partitioning wouldn't, and that should be taken into
consideration. But performance-wise, wouldn't it be the same?

~~~
dhd415
No. For example, a composite index on (tenant, date) would be highly
non-selective, whereas an index on date in each individual tenant partition
is highly selective and therefore higher-performance (in my case, much
higher).

~~~
bigato
What if you made an index on (date, tenant) instead?

~~~
Jweb_Guru
The issue he or she is referring to is that the underlying rows are
fragmented across the fetched pages (so you might need to fetch 40 pages for
40 index entries, even if they would all fit on one page in a partitioned
table). Fiddling with the index order isn't going to change that (the current
index order is already optimal for that type of query). There's another
solution, which is to expand the index to include all the queried columns so
you can use index-only scans and not hit the main rows at all, but that
trades away a large amount of index size, which you usually want to avoid
unless you have no other choice.

------
gdulli
This message was confusing to me because I've been using/abusing Postgres
inheritance for partitioning for so long that I forgot Postgres didn't
technically have a feature called "partitioning".

What I'm looking forward to finding out is if I can take an arbitrary
expression on a column and have it derive all the same benefits of range
partitioning like constraint exclusion.
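
Per the commit message, the partition key can be an expression, so the setup
being asked about might be sketched like this (names hypothetical; whether
pruning covers every such expression is exactly the open question):

```sql
-- The partition key can be an arbitrary expression (note the extra
-- parentheses required around it):
CREATE TABLE logs (
    ts  timestamptz NOT NULL,
    msg text
) PARTITION BY RANGE ((date_trunc('month', ts)));

CREATE TABLE logs_2017_01 PARTITION OF logs
    FOR VALUES FROM ('2017-01-01 00:00:00+00') TO ('2017-02-01 00:00:00+00');
```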

------
vincentdm
I really like this addition. We store a lot of data for different customers,
and most of our queries only touch data from a single customer. If I
understand it correctly, once the query planner is able to take advantage of
this new feature, partitioning by customer_id will make such queries much
faster, as they won't have to wade through rows of data from other customers.

Another common use case is that we want to know an average number for
all/some customers. To do this, we run a subquery grouped by customer, and
then calculate the average in a surrounding query. I hope that the query
planner will eventually become smart enough to use the GROUP BY clause to
distribute this subquery to the different partitions.

------
tda
I just tried to implement table partitioning in PostgreSQL 9.6 this week.
With some triggers and check constraints this seems to work quite nicely, but
I was a bit disappointed that hash-based partitioning is currently not
possible (at least not without extensions).

Will hash-based partitioning be included in PostgreSQL 10? The post notes

    A partitioning "column" can be an expression.

so can I assume it will be supported?

~~~
amitlan
Not natively, as in there is no PARTITION BY HASH (<list-of-columns>). What
limitations do you face when trying to roll your own hash partitioning using
check constraints (in 9.6)?

~~~
tda
I wanted to partition a table by its foreign key, as the table receives a few
hundred rows per foreign key per hour (it is a time-series DB).

So I figured partitioning the table by foreign key would group all related
data together in a way that allows faster access (the typical access pattern
is select * where foreign_key = x). However, as the number of keys in the
foreign table is unbounded and can be quite large, I wanted to partition the
data into a limited number of tables, with

    mod(foreign_key, number_of_partitions)

If I understood correctly, check constraints can't operate on a calculated
value.
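
For context, a roll-your-own sketch of that setup (hypothetical names): the
CHECK expressions themselves are legal, but as the reply below points out,
the planner can't use them to prune for a plain `WHERE foreign_key = x`:

```sql
CREATE TABLE samples (foreign_key int NOT NULL, value numeric);

CREATE TABLE samples_p0 (CHECK (mod(foreign_key, 4) = 0)) INHERITS (samples);
CREATE TABLE samples_p1 (CHECK (mod(foreign_key, 4) = 1)) INHERITS (samples);
CREATE TABLE samples_p2 (CHECK (mod(foreign_key, 4) = 2)) INHERITS (samples);
CREATE TABLE samples_p3 (CHECK (mod(foreign_key, 4) = 3)) INHERITS (samples);

-- Constraint exclusion only kicks in if the query repeats the hash
-- expression against a constant:
SELECT * FROM samples
WHERE foreign_key = 123 AND mod(foreign_key, 4) = 3;
```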

~~~
amitlan
Yes, it is not possible to optimize (i.e., prune useless partitions for
quicker access) the query select * from tab where key = x. You'd need real
hash partitioning for that. The mechanism Postgres uses to perform
partition pruning (constraint exclusion) does not work for the hashing case.

------
amitlan
...this is the beginning, not the end... [https://www.postgresql.org/message-
id/CA%2BTgmobTxn2%2B0x96h...](https://www.postgresql.org/message-
id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com)

------
vemv
While seemingly extensive, I don't quite like the commit message.

It doesn't say what table partitioning _is_ , or what its use cases are.
That's the first thing you should say; otherwise, how am I going to
understand, or keep interest in, the rest of the text?

~~~
pilif
The commit message is written by Postgres developers for Postgres developers.
I would say that 90% of its intended audience doesn't need an explanation of
what table partitioning does.

For them this would be needless clutter that's not at all relevant to the
commit.

Once we reach the 10.0 release, human-friendly release notes, additional
manual chapters, and sample code will be written for users to understand (in
fact, the commit linked by this submission already contains quite a bit of
additional documentation to be added to the manual).

