
PostgreSQL 9.6 Beta 1 Released - linuxhiker
http://www.postgresql.org/about/news/1668/
======
koolba
Parallel query will probably win for sexiest new feature ( _rightfully so!_ )
but remote apply is very useful in the real world. It will greatly simplify
reading newly written data off replicas:
[http://michael.otacoo.com/postgresql-2/postgres-9-6-feature-...](http://michael.otacoo.com/postgresql-2/postgres-9-6-feature-
highlight-remote-apply/)

~~~
dtech
Could you elaborate on exactly what it is? Will it mean the named elements of
queries will automatically be parallelized?

~~~
koolba
Postgres currently uses a single process per connection. That means at most
one CPU core is used per query executing a connection. Parallel query allows
breaking up query plans to allow multiple processes (and by extension cpu
cores) to process a part of a query plan.

The classic example is a sequential scan, say for a COUNT(*) or SUM(foo). A
single process would scan through all the blocks in order. A parallel scan
would split the blocks into N buckets, scan them via N separate processes, and
then combine the results.

~~~
Someone1234
There's nothing to complain about with this. But aren't most large database
queries ultimately IO capped, rather than CPU capped? Wouldn't this just put
more load on the IO and it would just bottleneck there?

~~~
gshulegaard
Yes and no. Take the following with grain of salt as it is just my personal
opinion/understanding. Only 1 or 2 points are actually from challenges I have
run into.

Restricting every query to a single core places an artificial CPU cap on a
system that would otherwise be IO capped.

Disks/storage is also getting faster ([http://www.geek.com/chips/new-intel-
storage-is-1000-times-fa...](http://www.geek.com/chips/new-intel-storage-
is-1000-times-faster-than-your-ssd-1629656/)).

As analytics keeps gaining popularity, the need/desire to move more and more
CPU bound processing to the data storage layer is increasing.

And finally, PostgreSQL's advanced query planning/partitioning support would
immensely benefit from being able to split partition queries into multiple
process buckets.

~~~
TylerE
And also RAM keeps getting cheaper. 64GB is absolutely nothing these days and
can fit a LOT of data.

------
melor
From the Release Notes:

Major enhancements in PostgreSQL 9.6 include:

Parallel sequential scans, joins and aggregates

Elimination of repetitive scanning of old data by autovacuum

Synchronous replication now allows multiple standby servers for increased
reliability

Full-text search for phrases

Support for remote joins, sorts, and updates in postgres_fdw

Substantial performance improvements, especially in the area of improving
scalability on many-CPU servers

------
utternerd
This is excellent, the read-balancing consistency is a huge feature, and one
we've been waiting for as trying to do read balancing without it is jumping
through way too many hoops (unless data consistency just isn't your thing).

------
pella
and 9.7 have started planning

* [ "asynchronous and vectorized execution" ]

[http://www.postgresql.org/message-
id/CA+Tgmobx8su_bYtAa3Dgrq...](http://www.postgresql.org/message-
id/CA+Tgmobx8su_bYtAa3DgrqB+R7xZG6kHRj0ccMUUshKAQVftww@mail.gmail.com)

------
ormod_
The features that I'll personally probably enjoy the most are really the FDW
join pushdown work, parallel query. (though it's still early days for the
parallel stuff)

..and the thing that's a killer feature for many larger databases, the freeze
map. Not having to FREEZE things all the time is a godsend.

~~~
mistermann
Could you possibly quickly explain what a freeze map is?

~~~
anarazel
Postgres uses a specific brand of MVCC
([https://en.wikipedia.org/wiki/Multiversion_concurrency_contr...](https://en.wikipedia.org/wiki/Multiversion_concurrency_control))
where multiple versions of a row are stored in the heap. That's why there's
vacuuming to remove those.

To avoid having to vacuum whole relations all the time there's the "visibility
map", which keeps track of which blocks have no dirty rows; so they can be
skipped during vacuum.

To understand which versions are visible postgres stores transaction-ids in
the row headers; these are four byte wide. On systems with significant
throughput, those don't last very long. So every now and then there's anti-
wraparound vacuums; which replaces xids which are about to become "too old"
with a special marker ("frozen").

The problem < 9.6 is that these anti-wraparound vacuums have to scan the whole
table, thereby are a lot more expensive. These happen every 200 million xids
by default (write transactions and/or savepoints), configurable up to ~2
billion. With the freeze map, that doesn't have to happen, only non-frozen
pages have to be re-scanned.

------
desireco42
I love new features in Postgres, but I hate upgrading. Transition, especially
for dev environment is always way too complex and not obvious at all.

If there is one feature I wish PG had, it is recognizing old format db and
offering one command that can in place upgrade damn db, without me googling it
every time.

~~~
nunwuo
> If there is one feature I wish PG had, it is recognizing old format db and
> offering one command that can in place upgrade damn db, without me googling
> it every time.

So your having to Google for five minutes when doing a major version upgrade
is the biggest flaw in modern postgres? Oh come on now.

P.S. pg_upgradecluster in Debian would probably do this for you without
googling

~~~
jeffdavis
It's not an unreasonable request, and certainly worth some effort to improve.

That being said, if upgrade is the biggest problem, that's great news!

------
agopaul
'Full-text search for phrases': what's the difference between the current full
text search implementation and the one for phrases?

~~~
elchief
currently searching for 'new york' would match 'york new', as the parser
doesn't understand phrases, but new version would let you match '"new york"'
to 'new york' but not 'york new'

~~~
agopaul
That is a great addition then

------
pella
full Release Notes :

[http://www.postgresql.org/docs/9.6/static/release-9-6.html](http://www.postgresql.org/docs/9.6/static/release-9-6.html)

------
rpedela
Does anyone know if parallel query execution just happens by default or do you
have to tell PG that you want it?

~~~
jeltz
You have to enable it by configuring the maximum number of workers per query,
and then the query planner tries to figure out if the query would benefit from
a parallel execution. This is the first release with parallelism so I would
not be surprised if the query planner gets this wrong in some cases.

------
qaq
UPSERT from previous release plus Parallel sequential scans, joins and
aggregates and Full text search for phrases once 9.6 is ready will make PG
almost perfect :)

------
kbenson
Is there something specifically interesting in the 9.6 release that makes it
more noteworthy than other releases? I enjoy reading PostgreSQL release
announcements, as I myself haven't used it but have plans to do some testing
with is when some time free up, but I'm a bit confused as to why a _beta_
release announcement is getting so much attention.

~~~
anarazel
I guess because beta usually determines which features are going to be in the
final release. So if you're looking what release to choose for a new version
of your software, it might be more worthwhile to look at beta than the final
release.

------
oskari
PG 9.6 Beta is available for testing at our Aiven.io Cloud Database service:
[http://blog.aiven.io/2016/05/help-test-postgresql-96-via-
aiv...](http://blog.aiven.io/2016/05/help-test-postgresql-96-via-aiven.html)

------
forkandwait
I think that great parallel infrastructure would be worthy of a Postgres 10.0.

------
elchief
Isn't that like 8 months ahead of schedule?

~~~
pilif
The Postgres beta periods are usually really, really, really long. Mid/End may
is traditionally when the first beta comes out for a release in late autumn.

~~~
makmanalp
As they should be! Enough time to know whether stuff breaks in real life or
not, and to deal with fixing any issues that come up.

~~~
snuxoll
Especially for software so mission critical as a RDBMS. If your web framework
has a bug users get errors or things act wonky, if your database has a bug you
can get data corruption and other horrible, horrible things.

