
New in PostgreSQL 10 - okket
https://wiki.postgresql.org/wiki/New_in_postgres_10
======
alceta
If anyone even remotely involved with the maintenance and development of pg
reads this thread - Thank you! - for all your efforts in building and
improving a first class product that keeps me amazed at the strides it takes
with each major release.

~~~
irrational
Truly. We've recently moved from Oracle (after using it for 15 years) to
Postgresql. It's like a breath of fresh air. The documentation for Postgres is
unbelievably superior to Oracle's. So far its performance is equal to or better
than Oracle. We had to go through and rewrite thousands of queries, but the
sql syntax of Postgres was always simpler and more logical than the equivalent
in Oracle (I think Oracle has too much baggage from being around too long).
All in all I'm so impressed by Postgres. I'm sure there are features in Oracle
that Postgres doesn't have that keep people on Oracle, but I would imagine
that the vast majority of Oracle installs could be moved to Postgres.

~~~
patates
Do you plan to write more about the migration, like a blog post? That would be
very interesting to read.

~~~
unixhero
This needs to become a thing where people produce writeups on Oracle-to-
Postgres migrations and how much better it is, under something like
#RunsMuchBetterWithPostgres ...

It is strange that there aren't more writeups on Postgresql migrations.

Yandex had a good one. Posted here a long while ago:
[https://news.ycombinator.com/item?id=12489055](https://news.ycombinator.com/item?id=12489055)

~~~
KingMob
> It is strange that there aren't more writeups on Postgresql migrations.

It's probably because switching databases is very painful and rare.

~~~
unixhero
I would think that not having to pay the Oracle/MS/Sybase/Teradata licenses
is a good enough incentive...

Speaking of economic incentives.

That could be a thesis topic for my hereto unwritten PhD in Behavioral
Economics:

"Choosing to pay: What are the behavioral aspects of top leadership, and why
do people still choose proprietary? A quantitative case study."

~~~
Ygg2
If I had to wager, it's support. Oracle/MS generally provide much better
support than open source software.

~~~
jeltz
Oracle is definitely not known for their good support, and there are plenty of
companies which provide really good PostgreSQL support.

------
gshulegaard
Just...wow.

* Native partitioning

* Parallel query

Honestly, some very welcome quality of life improvements for use cases even
outside of what I would consider "Big Data".

> This means that users no longer need to create triggers for routing data;
> it's all handled by the system.

Trigger routing has always been a performance foot gun...to the point that
it's sometimes better to handle table routing in application logic. I wonder
how the performance of this new "native" partitioning compares to existing
methods (triggers and rules), and whether it makes application-side routing
always inferior.
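For reference, the new declarative syntax looks roughly like this (table and
column names invented for illustration); the server routes each INSERT to the
matching partition with no triggers involved:

```sql
-- Parent table only declares the partitioning scheme; it stores no rows.
CREATE TABLE measurements (
    city_id  int  NOT NULL,
    logdate  date NOT NULL,
    peaktemp int
) PARTITION BY RANGE (logdate);

-- Each partition declares the slice of rows it accepts.
CREATE TABLE measurements_2017_q3 PARTITION OF measurements
    FOR VALUES FROM ('2017-07-01') TO ('2017-10-01');

-- Tuple routing happens in the server; no trigger needed.
INSERT INTO measurements VALUES (1, '2017-08-15', 31);
```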

~~~
adrianratnapala
I worry that having things "all handled by the system" can be even more of a
performance footgun. I mean that users will be able to do things that are
allowed by the docs, but which result in expensive and hard-to-see things
happening under the hood.

~~~
flukus
This is a valid concern, I've seen simple automatic optimizations backfire
before. An example I remember from MS SQL Server (older versions): the very
simple optimization of using indexes can backfire with the dynamic filter
pattern:

where ((@foo is null) or (foo = @foo))
and ((@bar is null) or (bar = @bar))

Depending on statistics, order of execution, indexes in place and phase of the
moon this could produce fast results or have to revert to a full table scan
(because it uses a cached query plan with the wrong index).

Devs still need some awareness of how queries translate to actual running
instructions, and to measure that the results are in line with what is
expected.

~~~
mst
Modern pg libraries cache the plan keyed on the query params involved as
well, so this generally doesn't happen.

(I made it happen with an older version of things and ... yes, argh)

------
gfodor
I noticed hash indexes are now crash-safe and replicated -- this seems to
make them actually usable in production. In other words, this release
effectively "adds" a new index type. That seems like a much bigger deal than
is being talked about. Is there any reason to believe that new databases
shouldn't be using hash indexes for columns that won't be supporting range
queries (in other words, pretty much all keys)? I've seen mixed reports about
hash indexes being beneficial.

~~~
anarazel
I think it's a neat feature. But note that orderedness isn't just useful for
range queries, it also helps to satisfy ORDER BY and to allow for merge joins
without a sort step. There's also no support yet for index-only scans
(probably never), no constraints, and no multi-column index support.

There are also still some performance kinks to work out with current hash
indexes - largely when growing the index quickly or under very high
concurrency.
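For anyone wanting to try them, the syntax hasn't changed - only the
WAL-logging did. A small sketch, with an invented table:

```sql
-- As of PostgreSQL 10, hash indexes are WAL-logged, so they survive
-- crashes and are streamed to replicas like any other index.
CREATE INDEX users_api_key_hash ON users USING hash (api_key);

-- Only equality predicates can use it; no ORDER BY, no range scans:
SELECT * FROM users WHERE api_key = 'abc123';
```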

~~~
brlewis
Stop it! You're giving away all possible answers to one of my favorite
interview questions.

 _Edit: not seriously asking you to stop, but the interview question part is
true_

~~~
Illniyar
I hope those are interview questions for DBAs, because it's way outside
common knowledge for backend developers.

~~~
tatersolid
Why are back-end developers not supposed to have a deep understanding of their
tools? I hear this sort of statement all the time, it’s infuriating. I once
even had a dev team lead claim that it was not a dev’s responsibility at all
if the app generates bad SQL; the DBA is supposed to make the queries fast
(this person no longer works for me, but I consistently hear similar claims
from consultants and others in the dev community).

DBAs are expected to understand the crappy inefficient row-at-a-time ORM in
use by the devs, which implies understanding a particular programming
language, but not the other way around? There’s nothing magical about SQL
databases that a decent developer can’t understand.

~~~
marcosdumay
> Why are back-end developers not supposed to have a deep understanding of
> their tools?

You mean all their tools? Like how modern processors handle branch
prediction in the face of out-of-order execution, or how differences in
server cooling will affect throughput or peak performance?

Do you have a deep understanding of all the tools you use? Because I honestly
doubt so.

------
developer2
> ICU Collation Support

This tiny little mention is a _massive_ win! We now have access to the "und-x-
icu" collation. As the postgres docs[1] put it: "ICU root collation. Use this
to get a reasonable language-agnostic sort order."

I've been waiting on this _forever_. Columns containing strings from any/all
languages can now be sorted well enough to use postgres in truly multi-
language applications. If your application has users writing in English,
German, Russian, Japanese, et al, using the "und-x-icu" collation should make
your application much more user-friendly (as opposed to using the en_US or C
locale).

The ICU root collation is far from a perfect system, but it's the best option
available today. I've been tied to MySQL purely for its similarly-purposed
utf8mb4_unicode_ci collation. Now I can give postgres another very serious
look.

[1]
[https://www.postgresql.org/docs/10/static/collation.html](https://www.postgresql.org/docs/10/static/collation.html)
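To make this concrete, a sketch of how the collation can be attached per
column or per query (assuming a PostgreSQL build with ICU support; table name
invented):

```sql
-- Attach the ICU root collation to a column at creation time...
CREATE TABLE posts (
    title text COLLATE "und-x-icu"
);

-- ...or apply it ad hoc in a single query.
SELECT title FROM posts ORDER BY title COLLATE "und-x-icu";
```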

------
IMTDb
PgSQL more and more looks like the "redis of databases". Whatever the problem,
you can almost always find a good reason to use it :)

You get great performance for the simple "dumb" use case (relational data in
DB terms, key/value store in redis's case), and lots of awesome additional
features on top of it for more complex situations.

I love it.

~~~
skrebbel
> the "redis of databases"

That's like calling Swift "the Ruby of programming languages".

~~~
nat
and it says more about redis and ruby than postgres and swift :p

------
qume
Floating point timestamps - the only item which affects my work is also one of
the few in this wiki page with no description. These were removed outright,
with no compatibility option.

For anyone else using them, I just did a bit of reading [1] and it looks like
they are inferior to default 8 byte timestamps. FP timestamps have microsecond
precision around the year 2000, but it drops off as it moves away from there.

I was using them to be the same as some GPS timing code, but it seems this
isn't needed or desirable.

[1] [https://www.postgresql.org/docs/9.0/static/datatype-
datetime...](https://www.postgresql.org/docs/9.0/static/datatype-
datetime.html#DATATYPE-DATETIME-TABLE)

~~~
chowells
Yeah... You should never use floating point when your use case wants uniform
precision across the full range. That's the opposite of what floating point
gives you.

------
pbnjay
> Cross-column Statistics

Holy crap, I didn't even know this was feasible, much less in development!
Can't wait to test this out, as a significant amount of my data sets have
these kinds of relationships.

Parallel query will also be great for certain queries I do regularly. Been
looking forward to this!

Thank you, thank you to all the devs that had a part in this!

~~~
bonesmoses
Cross-column statistics are one of my favorites. The way Postgres 10 handles
it is really a _huge_ advancement that addresses a painful edge case that has
been around for ages.

[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-
col...](https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-
crusade/)
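The new syntax is small; a hedged sketch with an invented table, using the
`dependencies` statistics kind to tell the planner the columns are
functionally related:

```sql
-- Without this, the planner multiplies the selectivities of city and
-- zip_code as if they were independent, badly underestimating row counts.
CREATE STATISTICS addr_city_zip (dependencies)
    ON city, zip_code FROM addresses;

-- Extended statistics are gathered by the next ANALYZE.
ANALYZE addresses;
```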

~~~
ComodoHacker
To be honest, the skewed data problem can be addressed with partitioning in
many cases.

Cross-column statistics are a great feature nevertheless.

~~~
pgaddict
Yes, but partitioning also has a number of limitations that may be quite
undesirable. Like inability to create UNIQUE constraints or primary keys, for
example.

~~~
anarazel
More importantly, it doesn't help at all with cross-column correlations. It
can help with skew between various rows though.

------
TCM
Full text search in JSON & JSONB looks exciting.
[https://wiki.postgresql.org/wiki/New_in_postgres_10#Full_Tex...](https://wiki.postgresql.org/wiki/New_in_postgres_10#Full_Text_Search_support_for_JSON_and_JSONB)
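A minimal sketch of what this enables - `to_tsvector` now accepts json/jsonb
directly and indexes only the string values, not the keys:

```sql
-- Matches because 'search' appears in a JSON string value.
SELECT to_tsvector('english',
                   '{"title": "Postgres 10", "body": "full text search"}'::jsonb)
       @@ to_tsquery('english', 'search');
```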

~~~
godzillabrennus
As someone working with json output from the gmail api I’m curious to see how
people smarter than me take advantage of this new functionality so I can adopt
it as well.

------
TCM
SCRAM authentication looks like a nice security improvement also. No
description on the site but
[http://paquier.xyz/postgresql-2/postgres-10-scram-
authentica...](http://paquier.xyz/postgresql-2/postgres-10-scram-
authentication/) gives a good overview.
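Per that overview, enabling it is roughly two config changes (a sketch; file
locations and the pg_hba line depend on your installation, and existing
passwords must be reset to be re-hashed):

```
# postgresql.conf - hash newly set passwords with SCRAM-SHA-256
password_encryption = scram-sha-256

# pg_hba.conf - require SCRAM instead of md5 for remote connections
host  all  all  0.0.0.0/0  scram-sha-256
```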

~~~
xyzzy_plugh
It's an improvement for sure, but I am curious -- does anyone situate a
Postgres instance where it is publicly accessible? Who was asking for this
feature?

~~~
dalyons
Lots of cloud managed DBs are/can be publicly accessible, e.g. all Heroku
hosted Postgres instances.

~~~
merb
are they? even without ssl? by default?

~~~
pdimitar
Not sure about SSL, but in the past customers of mine have copy-pasted full
Heroku PG URLs to me and I was able to get in via `psql` immediately.

So yes they're public but their addresses are basically impossible to guess.

~~~
cbowal
Known as "Security through Obscurity" [0]

[0]
[https://en.wikipedia.org/wiki/Security_through_obscurity](https://en.wikipedia.org/wiki/Security_through_obscurity)

~~~
wongarsu
Debatable. If the address is really unguessable, the address acts like a
regular key.

It's still not good practice, since most systems treat addresses with far less
care than passwords and often save and/or transmit them unencrypted.

------
wooptoo
For anyone using postgres on a daily basis I would highly recommend pgcli. It
has very good auto-complete, and can even suggest JOIN statements (parts of
them, actually).

~~~
hobolord
do you find that it's significantly slower than just using psql?

~~~
Manozco
Yes, in many cases (until I stopped using it), it messed up my benchmarks. I
was testing some queries, and often it was:

- pgserver fast, returns 10000 rows, pgcli takes time to parse -> result:
total time 10s

- pgserver slow, returns 1 row, pgcli instant -> result: total time 10s

As we are working in efficient C/C++, our internal time to parse the result
set is closer to psql than pgcli, so we tend to prefer the fast pgserver
query, but the benchmark using pgcli did not clearly show the best one.

Also, when you are testing a query on one machine, you have to carefully
monitor the CPU usage to differentiate the CPU pegging by postgres from the
one by pgcli.

All those issues were gone when I got back to standard psql.

~~~
KayEss
For benchmarking and optimising queries you almost certainly want `EXPLAIN
ANALYZE`. That will show you both what Postgres thinks and the time actually
taken for the query.
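For example (invented table), the ANALYZE option actually executes the query
and reports per-node timings, independent of any client-side parsing
overhead:

```sql
-- Planned vs. actual row counts and time spent in each plan node;
-- BUFFERS additionally shows cache hits vs. disk reads.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;
```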

~~~
KingMob
`EXPLAIN ANALYZE` - I read that in a Dalek's voice.

------
maxpert
Postgresql keeps getting better and better every time! It's amazing the kind
of features you can pull off once you have a solid foundation :) (P.S. Having
used MySQL in the past, it now feels like a stone-age database.)

------
dantiberian
What is the current best option for Postgres failover? I looked at this at the
start of the year, and found lots of options, but all of them seemed to have
various drawbacks, and none were natively supported or built-in to Postgres.

~~~
takeda
I would recommend repmgr[1] over solutions like the suggested DRBD. The
simple reason is that DRBD replicates at the block level and doesn't
understand Postgres data structures, while repmgr relies on mechanisms
provided by Postgres.

So if there is a failure, you're far less likely to learn that all your data
is gone because some important block on disk was not replicated and the
database is not readable. With repmgr the worst that could happen is that you
might lose the few latest transactions, but your data should always be
consistent and in working state.

[1] [http://repmgr.org/](http://repmgr.org/)

~~~
ahoka
Repmgr is relatively easy to set up and supports automatic failover with
repmgrd. Resyncing a failed node is also just a few commands. The only
limitation I have encountered was that I had to run it behind pgbouncer so the
clients can always connect to the primary instance. I think that will change
with 10, as it will support client-side discovery of the primary node.

------
rb808
Shouldn't that be PostgreSQL X?

Seriously good work. I'm not sure why my firm still buys Oracle licenses.

~~~
nine_k
Well, this Postgres release introduces automatic partitioning; in Oracle, it
has been available for like 20 years.

Also, some shops are heavily invested in PL/SQL code.

The price of migration may far outweigh the license savings.

~~~
takeda
My previous company migrated to Postgres by using EnterpriseDB[1]. This
allowed them to keep running their existing Oracle data and PL/SQL while they
worked on migrating it to plain PostgreSQL. So where there's a will there's a
way.

[1] [https://www.enterprisedb.com](https://www.enterprisedb.com)

------
ianamartin
Postgres isn't just my favorite database. Postgres is an example for all
people about how a project should be run. There are lots of stakeholders who
want different things. The dev team looks at what's being asked for and what
is reasonable within a certain timeline. And then the team delivers features
and supports them. And then on top of that there is best-in-class
documentation.

Thank you to all of the team members who put your time and effort into this
project. It's not only a wonderful tool that I use every day, it's also a
model of how to manage a project.

~~~
rc_bhg
"best-in-class documentation"

?? What am I missing? I dread having to look up how to do things in postgres
because their documentation is so horrible to find things in.

~~~
grzm
Can you provide an example? In my experience I find the index[0] particularly
useful, as well as the SQL command reference[1], the datatype reference[2],
the function reference[3], and the server configuration section[4]. I know
other people have different usage patterns and different amounts of
familiarity with the docs, so I'm aware that my experience may not match
yours.

[0]:
[https://www.postgresql.org/docs/current/static/bookindex.htm...](https://www.postgresql.org/docs/current/static/bookindex.html)

[1]: [https://www.postgresql.org/docs/current/static/sql-
commands....](https://www.postgresql.org/docs/current/static/sql-
commands.html)

[2]:
[https://www.postgresql.org/docs/current/static/datatype.html](https://www.postgresql.org/docs/current/static/datatype.html)

[3]:
[https://www.postgresql.org/docs/current/static/functions.htm...](https://www.postgresql.org/docs/current/static/functions.html)

[4]: [https://www.postgresql.org/docs/current/static/runtime-
confi...](https://www.postgresql.org/docs/current/static/runtime-config.html)

------
gigatexal
SQL Server admin and dev here: these features look really compelling. The
best part, though, is that a team of devs could run these in production, do
some self-support, and basically pay nothing. Not so in SQL Server land.

------
hardwaresofton
Does anyone have any use cases where PostgreSQL falls down/loses to other DB
systems? I know sharding/replication has long been a sticking point, but what
else is there?

Why do people still choose MySQL/MariaDB/Oracle over PostgreSQL at all?

~~~
adventured
> Why do people still choose MySQL/MariaDB/Oracle over PostgreSQL at all?

Oracle: because the enterprise is its own unique beast. Companies frequently
stick to what they know and have invested into, so long as it still works.
Enterprise customers buy/adopt software in a very different way than everyone
else.

MySQL: if you've got a history with the product and it works well for what
you're doing, there is absolutely no necessity-based reason to switch to
PostgreSQL. You should switch from MySQL to PostgreSQL only if there's a very
good reason to - a meaningful gain to be had; it's that simple. Otherwise it's
unnecessary optimization. There are a vast number of considerations when
operating a business, and upgrading to the latest & greatest (speaking very
broadly) just to do it is not a good reason most of the time. There should be
a specific prompt to action based on a gain or capability that PostgreSQL
provides for your use case over, e.g., MySQL.

~~~
linux2647
Aside from those reasons, are there technical reasons for Oracle over
Postgres? My coworkers and I want to move our data warehouse over to Postgres
and my boss has been warming up to the idea. (I know switching databases with
that much data can be complex and troublesome, but those concerns are outside
the scope of this question)

~~~
adventured
Oracle is still a _very_ good option for data warehousing. When well tuned
it's extremely performant and reliable for that purpose. If I were operating
an EDW today, I'd choose Oracle over Postgres 10 times out of 10 when it comes
to very large scale operations. You'll still see plenty of credible arguments
that Oracle can be optimized to be faster than every major alternative, and
you'll still see legitimate arguments around replication.

Obviously I don't know what your cost or performance considerations are. If
your organization can save a meaningful amount of money and operational
headache by switching to Postgres, that can be a big deal for any business. If
cost isn't much of a consideration in your operation (that is, what sticking
with Oracle is costing you), stick to Oracle for data warehousing over
Postgres - at least for now imo.

With Oracle it's the same story it has been for a long time (and probably will
remain), it's expensive and a pain in the ass to get up and running the way
you want it. Postgres has everything to gain vs Oracle from here forward, it
seems very likely to bite into some of Oracle's well guarded high value
territory in the coming years.

------
no1youknowz
Does anyone know whatever happened to the columnar storage work that 2nd
quadrant posted on their blog:
[https://blog.2ndquadrant.com/postgresql-10-roadmap/](https://blog.2ndquadrant.com/postgresql-10-roadmap/)

Is this essentially a pipe dream?

~~~
petereisentraut
Not a priority for 2Q at the moment. Still general interest in the community,
but hard to predict right now.

------
davedx
How does PostgreSQL 10 compare with Cassandra for BigData requirements?

~~~
SEJeff
Define "big data"? You can buy normal tier 1 server vendor hardware with 6T of
memory.

If it fits into 1 machine or a few, it isn't "big data".

~~~
pulam
I've heard this 6TB number thrown around a bit recently, but no one can
actually provide me with links to hardware I can buy - has anyone successfully
set up a machine with this much RAM who can post links to what they used?

~~~
okaram
Don't know about hardware you can buy, but AWS will rent you a machine with
3TB Ram by the hour :) [https://aws.amazon.com/ec2/instance-
types/](https://aws.amazon.com/ec2/instance-types/)

------
jhoechtl
Very exciting. Still badly miss (bi-)temporal tables and native query support
[1] without using extensions [2].

[1][https://docs.microsoft.com/en-us/sql/relational-
databases/ta...](https://docs.microsoft.com/en-us/sql/relational-
databases/tables/temporal-tables)

[2][https://pgxn.org/dist/temporal_tables/](https://pgxn.org/dist/temporal_tables/)

------
odammit
Does anyone know if the partitioning can enforce a uniqueness constraint
across child tables?

I couldn't tell from the write-up if this is one of the things addressed with
native partitioning.

~~~
Tostino
No, it can't currently. There has been some discussion on how to do that for
pg 11 though.

------
dzonga
To those working on and contributing to the GOAT DB PostgreSQL: thanks a lot.

------
collyw
Every time I see stuff about Postgres it sounds so much better than MySQL.

Every time I get a new job, the existing system is using MySQL. (Every time
it causes problems of some sort or other - it's not a bad database, but like
JavaScript it has too many quirks that will end up catching you out at some
point.)

Last time I got to build something new the time pressure was so great I had to
use MySQL, because it was there, set up and working and I knew it. I wanted to
use Postgres, but at the end of the day the extra time setting it up wouldn't
really have been much benefit as it was a fairly standard Django app, and
didn't need any of Postgres's extra features. (I quit that job shortly
afterwards).

------
xfalcox
What happens if I try to insert into a master partition table, and there are
no child tables that can store my data?

For example, data is partitioned by month, and I have no table for the month
in the data I'm trying to insert.

~~~
jeltz
It will throw an error. Partitioning support in PostgreSQL 10 only solves a
few simple use cases well, but the next version should get much more generally
useful partitioning. This is how PostgreSQL usually releases major features.

------
transposed
I recently started using PostgreSQL for a project and have been pleasantly
surprised on more than one occasion. I also have been hearing more and more
people speak highly of it, especially for its security.

------
mcdoh
Can anyone recommend a good resource for learning PostgreSQL? I have a decent
understanding of SQL, I've used MySQL in the past, but every time I see the
latest features of Postgres I'm left thinking I'll need to buy a book and take
a pretty deep dive to figure out how to apply the new features.

~~~
grzm
You'll likely find the PostgreSQL docs[0] very useful. They're organized
differently than the MySQL reference, making them easier (in my experience) to
read stand-alone.† They're quite well-written and up-to-date.

[0]: [https://www.postgresql.org/docs/](https://www.postgresql.org/docs/)

† The MySQL reference is very useful particularly when you're looking to
compare feature differences between versions.

------
swsieber
Completely off topic... but did anybody notice how the column names in the
partitioning example were in Spanish?

------
bratao
What I miss the most from Postgres is compression on the same level as
RocksDB or TokuDB. The current compression is very bad compared to them.

Does anyone know if they have something planned in this department?

~~~
pritambaral
Advanced compression of data is really something that should be looked at
separately from database design. Look into using ZFS beneath Postgres for some
pretty good compression.

------
JunkDNA
Would love to know more about FDW push-down. That could be really good for
specialized applications where you want to use a relational DB and wrap a
specialized data store outside postgres with an FDW.

------
jadbox
The only thing I'm upset about is the fact I'll need to wait 3-6 months after
it's released for AWS RDS to support it...

------
mrfusion
Can the xml table accept any xml document? What if it's a bunch of unrelated
elements?

Can you get back to xml from the table?

~~~
petereisentraut
Any XML document that you can address via XPath in a sensible way. If your XML
document is a mess, you'll only be able to get out a mess. :)

To get XML from a table, functions already existed in previous releases
(XMLELEMENT etc.).
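A small sketch of the reading direction with the new XMLTABLE function
(document invented): the row expression is an XPath that selects one node per
output row, and each column gets its own relative path:

```sql
SELECT t.*
FROM XMLTABLE('/catalog/book'
              PASSING '<catalog>
                         <book><title>A</title><price>10</price></book>
                         <book><title>B</title><price>12</price></book>
                       </catalog>'::xml
              COLUMNS title text    PATH 'title',
                      price numeric PATH 'price') AS t;
-- Two rows: (A, 10) and (B, 12).
```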

------
mrfusion
What does native partitioning do?

~~~
jeltz
It basically just adds a convenient syntax for creating partitions plus
automatic tuple routing (routing inserts on the partitioned table to the
right partition). For most users this is not a huge improvement over using
PostgreSQL's table inheritance to implement partitioning, but it is a very
important stepping stone for implementing competitive partitioning in
PostgreSQL. There are lots of promising patches in the PostgreSQL 11 release
cycle which improve on the native partitioning, making it useful for more
users.

~~~
mrfusion
Thanks. I guess I meant to ask what a partition is used for? When is it useful
to use one and why?

~~~
jeltz
Oh, you meant partitioning in general. It is useful when you have huge tables
with many rows. It works by splitting a table into multiple smaller tables
which act like one big table to the outside. Some advantages:

- You can partition by, for example, month and cheaply delete or archive old
data by dropping the entire partition instead of deleting all its rows, which
is much cheaper.

- You can have different indexes on different partitions. This is commonly
used when some indexes are only used by queries which operate on recent data.

- You can put different partitions and their indexes in different table
spaces. E.g. you can put your older and less frequently accessed data on
cheaper and slower disks.

- Random inserts into B-tree indexes are expensive, and when you have, for
example, time series data in the format (device_id, timestamp, v1, v2, ...)
it can be useful to partition on the device id so all rows are inserted in
strictly increasing timestamp order in each partition. This way you can get
more write performance out of your hardware.

- Most databases have a maximum table size (PostgreSQL's maximum is 32 TB)
and you can get around that by splitting your huge table into multiple
smaller partitions which all fit in the maximum table size.

- The partitioning condition can be used as a very coarse index to entirely
skip looking at some of the partitions (even at plan time for some queries).
This does not always give you much over just having one big table with one
index over all of it, but in some cases this can be a big win.

- For PostgreSQL specifically, some of your partitions can be foreign tables,
meaning some of your data can be on an entirely different server - not
necessarily PostgreSQL or even a relational database system at all. I am not
sure how useful this will be in practice.

Partitioning has a maintenance overhead and adds a bit of extra work to query
execution and planning, so it is not always worth it.
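The first advantage listed is the classic win; a sketch against a
hypothetical monthly-partitioned table:

```sql
-- Stop serving September's data: a quick catalog change, not a huge DELETE.
-- After DETACH, the partition lives on as an ordinary standalone table...
ALTER TABLE measurements DETACH PARTITION measurements_2017_09;

-- ...which can be kept as an archive, or dropped outright.
DROP TABLE measurements_2017_09;
```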

------
nevhan
How does the native partitioning compare to something like Timescale's
hypertables?

------
GoodbyeEarl
Hey, HN! Could you recommend a practical introduction to data modelling with
Postgres?

~~~
mst
Learn normalisation from wikipedia, then read "The Art of SQL" to get a
feeling for the shape of the weirdnesses, then read the postgresql
documentation from end to end.

(disclaimer: worked for me, may not for anybody else)

~~~
3131s
I intend to read the PostgreSQL docs end to end at some point, but last time I
checked the full PDF comes out to something like 3600 pages! I am astounded by
the quality of everything about PostgreSQL, the community included.

------
dream42
Nice!

