
Agg: Parallel aggregations for PostgreSQL - pella
http://www.cybertec.at/en/products/agg-parallel-aggregations-postgresql/
======
pella
other alternatives:

now:

* Pivotal Greenplum Database is now open source: [https://github.com/greenplum-db/gpdb](https://github.com/greenplum-db/gpdb)

later:

* CitusDB 5.0 will be open source (2016Q1-Q2) :

( "An extension of PostgreSQL to be distributed/parallel query engine" )

[https://twitter.com/frsyuki/status/667031289506062336](https://twitter.com/frsyuki/status/667031289506062336)

[http://info.citusdata.com/CitusDB-5-Pre-Beta.html](http://info.citusdata.com/CitusDB-5-Pre-Beta.html)

* PostgreSQL 9.6 will have "Parallel Sequential Scans":

[http://amitkapila16.blogspot.hu/2015/11/parallel-sequential-...](http://amitkapila16.blogspot.hu/2015/11/parallel-sequential-scans-in-play.html)

~~~
no1youknowz
Wow, I didn't know about the CitusDB 5.0 open source version.

Does anyone know if this is a response to the 9.6 PSS?

Either way, us PG users win.

~~~
rachbelaid
There is also a patch to bring columnar storage into 9.6 (coming from the Axle
project, EU funded). The gap is definitely getting smaller, but CitusDB also
has a lot of the glue built already, with easy sharding (pg_shard).
It's an exciting time for PostgreSQL!

------
endymi0n
Would be a nice enhancement if it was in vanilla Postgres; otherwise I wouldn't
put too much trust in this. Few real-world queries are this simple, and I
don't think there are many gains left when executed against a more real-world
TPC benchmark suite. But even if there were, it's probably better to
use something better suited to this use case, like Redshift, BigQuery or
Hive.

On top of that, my gut feeling is that this company seems a little shady, and
their business model is unclear to me. Are they for real?

~~~
quizotic
While not 'vanilla' Postgres, this appears to be packaged reasonably as a PG
extension. Since no vanilla PG code was changed, you should be able to upgrade
PG without worry.

PG 9.5 introduces some support for parallelization along with custom scans. So
this capability seems like a natural fit.

As for utility, big data queries tend to do heavy aggregations, and
parallelization is a nice performance enhancement where it could be sorely
needed.

Redshift has good name recognition, but under the covers, it's ParAccel, which
started with an early (8.2?) version of Postgres and modified it.

~~~
joevandyk
I think the parallel stuff is in 9.6, not 9.5.
[http://rhaas.blogspot.com/2015/11/parallel-sequential-scan-i...](http://rhaas.blogspot.com/2015/11/parallel-sequential-scan-is-committed.html)

------
pella
PostgreSQL Announce - mail list ( 2015-12-01 ) :

[http://www.postgresql.org/message-id/0B9D5042-8862-413F-AE1E...](http://www.postgresql.org/message-id/0B9D5042-8862-413F-AE1E-C1B2CBF0528A@cybertec.at)

------
craigds
Isn't this in postgres 9.6 anyway, due to the parallel sequential scans?

i.e. if your aggregate function is defined with `PARALLEL SAFE`, then this
should magically work anyway. Yes?

(I'm aware this solution works with postgres 9.5 too)
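
For anyone wondering what "parallel aggregation" boils down to, here is a rough Python sketch of the general idea: each worker reduces its own chunk of rows to a small partial state, and the partial states are merged at the end. This is only an illustration, not how agg or PostgreSQL implements it; the names `partial_avg`, `combine`, `finalize` and `parallel_avg` are made up for the example, and Python threads are used just to show the structure (they will not give a real CPU speedup).

```python
from concurrent.futures import ThreadPoolExecutor


def partial_avg(chunk):
    """Per-worker step: reduce one chunk of rows to a (sum, count) state."""
    return (sum(chunk), len(chunk))


def combine(a, b):
    """Merge two partial states; this is what makes the aggregate parallelizable."""
    return (a[0] + b[0], a[1] + b[1])


def finalize(state):
    """Turn the merged state into the final result (None for empty input)."""
    total, count = state
    return total / count if count else None


def parallel_avg(values, workers=4):
    """Hypothetical helper: split the input, aggregate chunks, merge the states."""
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_avg, chunks))
    state = (0, 0)
    for p in partials:
        state = combine(state, p)
    return finalize(state)


if __name__ == "__main__":
    print(parallel_avg(list(range(1, 101))))
```

The point is that an average cannot be merged directly from per-worker averages, so the workers carry a (sum, count) state instead and only the final step divides.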

~~~
rachbelaid
From what I've been reading on the development of 9.6
([https://commitfest.postgresql.org/](https://commitfest.postgresql.org/)), we
may only see parallel seq scan and ordering coming to 9.6. They added the
foundation to make more things parallel, but I don't think we can hope to
see parallel aggregation coming into 9.6.

