Hacker News
PostgreSQL 13 Beta 1 Released (postgresql.org)
172 points by jkatz05 13 days ago | 53 comments

The only thing I want for PostgreSQL is a better backup/restore story. SQL Server is so easy: you back up a database, restore it, and never have any issues. You connect the users in the instance to the users in the database and away you go.

PostgreSQL will allow a backup and then epic fail on the restore if you don’t have users, roles, etc. predefined.

It’s the hardest thing to achieve in PostgreSQL.

Yeah, that's because you're using the wrong tool for the job. You are using pg_dump on a single database, or set of databases. Use pg_dumpall: https://www.postgresql.org/docs/current/app-pg-dumpall.html

For an actual backup solution, not one off stuff, use a tool like pgbackrest. It's just way better.
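To make the pg_dump vs. pg_dumpall distinction concrete, here is a rough sketch of a roles-safe dump/restore (assuming superuser access as `postgres`; database names and paths are illustrative):

```sh
# Option 1: dump the whole cluster, roles and all, as plain SQL.
pg_dumpall -U postgres > cluster.sql
psql -U postgres -f cluster.sql          # run on the clean instance

# Option 2: dump the global objects (roles, tablespaces) separately,
# then dump just the one database in custom format.
pg_dumpall -U postgres --globals-only > globals.sql
pg_dump -U postgres -Fc mydb > mydb.dump

# On the clean instance: restore globals FIRST, then the database,
# so the role references in the dump resolve.
psql -U postgres -f globals.sql
createdb -U postgres mydb
pg_restore -U postgres -d mydb mydb.dump
```

Restoring `globals.sql` before the per-database dump is exactly what avoids the "role does not exist" failures described above.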

I’m pretty sure I’ve tried dumpall before, but honestly I can’t remember; you’re right that I do use dump.

I’ll give dumpall a go today. I do still feel this could be improved in PostgreSQL, because there are sooo many blog posts that suggest dump, and none of them work.

> epic fail on the restore if you don’t have users and roles

?? In my lonely research world, the PG backup defaults and restore behavior have always been gentle and predictable. I generally back up per database, never by cluster.

I’ve never successfully backed up a single database and restored it on a clean instance of PostgreSQL. It always fails on roles not existing.

If I create all the roles up front then it works. But it’s a hassle compared to SQL Server, where I can backup, restore, create a user and link him to the database user. And it’s done.

Are you using pg_dump or pg_dumpall? The latter will backup the global objects (like roles) as well.


An alternative, which I think I borrowed from how Heroku does/did dumps, is 'pg_dump --no-acl --no-owner'.

There is an option to omit ownership commands in the dump, `--no-owner`: https://www.postgresql.org/docs/current/app-pgdump.html
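For moving a single database to an instance where the original roles don't exist, the Heroku-style flags mentioned above can be sketched like this (database and user names are made up):

```sh
# Portable single-database dump: skip ownership (--no-owner) and
# privilege/GRANT statements (--no-acl) so the restore doesn't
# depend on the source instance's roles existing.
pg_dump --no-acl --no-owner -Fc -U postgres mydb > mydb.dump

# Restore as whichever user is convenient; all restored objects
# end up owned by that user.
pg_restore --no-acl --no-owner -U newuser -d mydb mydb.dump
```

The trade-off is that you lose the original ownership and permission structure and have to re-grant as needed on the target.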

Super excited for the improvements to btree indexes, that is some interesting work and very impressive results. The improvements to pg_stat_statements will be very welcome too!

As someone still on Postgres 9.6, does anyone have advice on how to make the business case for upgrading to 10, 11, or 12? I want to upgrade but I have a hard time articulating the benefits against the costs of downtime and risk

With the upgrade from 9.6 to 10 we improved performance by 20% without doing any other changes. From 10 to 11 it was pretty similar, saw a 15-20% performance increase.

We haven't made the jump to 12 yet, we expect overall performance to be lower but disk I/O to be improved.

If performance doesn't matter and you don't need any of the new features you'll have a hard time finding reasons to upgrade. I consider 9.6 to be very stable.

While it may be very stable, the business case for upgrading sooner rather than later is not being behind the 8-ball when 9.6 EOL comes in a year.

1. There is a significant amount of improvement with regard to performance.

2. The partitioning improvements are tremendous and well worth the change if you're using trigger+constraint based partitioning.

3. Parallelization and the addition of concurrent reindexing

4. Built-In Logical Replication

5. JIT query optimization
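As a taste of points 2 and 3, here is what declarative partitioning (v10+) and concurrent reindexing (v12) look like; table and index names are illustrative:

```sql
-- Declarative partitioning replaces the old trigger+constraint setup:
CREATE TABLE events (
    id         bigint GENERATED ALWAYS AS IDENTITY,
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2020_05 PARTITION OF events
    FOR VALUES FROM ('2020-05-01') TO ('2020-06-01');

CREATE INDEX events_2020_05_ts_idx ON events_2020_05 (created_at);

-- PostgreSQL 12: rebuild an index without blocking writes.
REINDEX INDEX CONCURRENTLY events_2020_05_ts_idx;
```

Rows inserted into `events` are routed to the matching partition automatically, and the planner prunes partitions that a query's WHERE clause can't touch.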

At some point 9.6 won't get security patches and bugfixes anymore as far as I understand. That alone is a reason not to fall too far behind with upgrades.

There is lots of new stuff regarding declarative partitioning and parallel queries. If any of that seems useful to you, those are big new features.

There have been lots of small performance improvements that add up, and also the long-requested change to stop treating CTEs as optimization fences.

I'd probably just scan the release notes for 10-13 and see if anything particular hits a pain point you're having.

No real risks, to be honest. There are many new stable features in every major version. V10 was rock solid for me and my team. However, I needed to go to v11 to get stored procedures, a feature our data scientist required; given his Oracle SQL background, this was a necessity.

9.6 is old now. I'd upgrade. Get pgadmin4 while you're at it.

depesz has a great tool that highlights the changes.

Even going from latest 9.6 to latest 12 gives a huge number of fixes and changes and features.

Can't recommend this tool enough :)


Well, I am on 9.6 too; it's still supported and I have absolutely no problems with it, but maybe you'll find some "must-haves" here: https://www.postgresql.org/about/featurematrix/

Indeed, GIS improvements, index operations improvements, statistics improvements, stored generated columns, parallelized hash joins all could be compelling reasons to upgrade. They improve operations efficiency and DB performance (nearly) without any changes to the working code.

Version 12's inlining of WITH CTEs has been a big win for us; we have seen some dramatic improvements in performance. Parallel query, which takes advantage of multi-core processors, has also improved performance in general since 9.6. I also like LATERAL joins, especially with set-returning functions; they are literally like magic.
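The CTE change is easy to demonstrate. Before v12, every `WITH` query was an optimization fence; from v12 on, simple CTEs are inlined into the outer query by default, and the old behavior needs an explicit keyword (table and column names here are made up):

```sql
-- v12+: inlined by default, so the planner can push the outer
-- WHERE clause down and use an index on customer_id.
WITH recent AS (
    SELECT * FROM orders
    WHERE created_at > now() - interval '7 days'
)
SELECT * FROM recent WHERE customer_id = 42;

-- Opt back into the pre-12 fencing behavior explicitly:
WITH recent AS MATERIALIZED (
    SELECT * FROM orders
    WHERE created_at > now() - interval '7 days'
)
SELECT * FROM recent WHERE customer_id = 42;
```

Comparing `EXPLAIN` output for the two forms shows the difference: the inlined version can use indexes on the underlying table, while the materialized one scans the CTE result.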

There are no risks other than the chance that you somehow mess up the upgrade, delete the wrong files, etc. Planning and practicing the upgrade process helps against these risks.

What I most waited for was improved replication and table partitioning support. These were must haves for us.

The other thing that helps with these risks is having a properly managed backup system.

If you can, always just practice the upgrade on a new machine, or make sure you can easily roll back if the upgrade does fail.

> make the business case

On the assumption that right now there may be fewer people using the services, this may be the least disruptive time if anything goes down. Upgrading now means you have less hassle in the future.

I am in the same boat. It appears 9.6 is still supported until Nov 2021 [0], but I plan to look at moving up to 12 either this summer or next spring. I know partitioning is drastically improved, but we currently don't use it in our data (though we need to start looking at it).


If you're on AWS you can use Performance Insights [1] from PostgreSQL 10 and up.

[1] https://aws.amazon.com/rds/performance-insights/

Test before upgrade.

There is one risk for this migration - versioning has changed a bit between 9.6 and 10 and some libraries have an issue with that. (They need to be updated.)

Am I the only one eagerly waiting for Postgres to transition away from the process-per-connection model? It seems like that is the only glaring con for Postgres compared to SQL Server. This would really put the nail in the coffin for MySQL and other alternatives.

There is a proposed change that improves connection scaling significantly:


This should be either in Postgres 13 or 14, not sure how they decided in the end.

The earliest this will be available is PostgreSQL 14; it was not added to PostgreSQL 13.

Strange, they really wanted it for 13 and allowed it to be committed after the feature freeze, but then all discussion stopped after April. And I couldn't find anything else on the mailing list.

There was a long discussion about multiple patches that "almost made it" on an internal release list. Ultimately, the conclusion was not to push them after the code freeze date (which was already extended by a week).

Obviously, it'd be great to get those patches in, but there's always the risk of pushing something that may not be quite ready yet, did not go through sufficient review etc. Which might disrupt the whole release schedule, make the stabilization phase longer, etc. And all the lockdowns around the world just amplify the risk. There's also the question whether it's worth breaking the rules the project adopted over the years. Considering all this, it probably makes sense to leave this for v14 ...

But yeah, it'd be great to have this in v13, had it been ready before the code freeze.

Thanks. At least we know it should be coming in PG 14. Hopefully Zheap too.

I expect there will be other changes to improve connection scaling first.

What's so wrong with that model?

It doesn’t scale and makes some abstractions more difficult.

How far does it need to scale? The times of two-tier apps are gone; the middle tier has its own pooling.

Ideally thousands (and maybe even tens of thousands) of connections. At the moment we're quite far from that, as the benchmarks in the thread [1] demonstrate (low hundreds, I'd say).

There's an awful lot of applications designed with the idea that connections are cheap. And on some databases it's actually true.

But it's also about hardware - machines with hundreds of cores are getting quite common, applications have "think time" so you need to use a multiple of core count to saturate the CPU, etc.

Currently the limit is probably "low hundreds" depending on the workload (the shorter/cheaper the transactions, the worse). Connection pools are great, but unfortunately not really a universal solution.

[1] https://www.postgresql.org/message-id/flat/20200301083601.ew...

How about pgBouncer for connection pooling? I know it's more operational overhead, but is that still not good enough for your use case?
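For reference, the core of a pgBouncer setup is a short `pgbouncer.ini`; a minimal transaction-pooling sketch (all values illustrative) looks something like:

```ini
; Minimal pgbouncer.ini sketch: many client connections share a
; small pool of real server connections.
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction        ; a server conn is held only per transaction
default_pool_size = 20         ; real backend connections per database
max_client_conn = 1000         ; clients pgBouncer itself will accept
```

The catch, and part of why pooling isn't a universal fix: in transaction mode, session-level features (named prepared statements, advisory locks, `SET` state) don't survive across transactions.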

I love Postgres; it’s an incredible piece of software. I have to say though, this is the least exciting update I’ve seen in the 10 years that I’ve been using it. Probably because it’s now so capable. The only major missing feature for me is automatic incremental updates to materialized views, one of the few important remaining advantages Oracle has over Postgres.

This release had quite a few patches that just weren't quite ready to go out yet, so were not included.

There was some work done which reduces the per-connection overhead and allows many more direct PG connections before performance degrades, while increasing overall performance at high connection counts by a ton, but it just wasn't quite ready and missed the deadline by a little.

The memory based stats collector is another patch in the same vein, where it wasn't quite ready for 13.

That said, many of these features are quite impressive and will offer nice wins for many workloads, incremental sorting is very cool, as is the memory bound hash aggregation.

I'm with you on automatically updated materialized views; it's something I've needed and been waiting on for a long while now, and I'm hoping it gets more focus. The current patch set being worked on for it seems to have quite a few limitations.

> I have to say though, this is the least exciting update I’ve seen

I agree. For me, it is a sign of maturity. It's not the only sign. It could be a sign of stagnation! But from what I know, having used it for over a decade, I am happy to be in the midst of "boring software", at least for my database.

> PostgreSQL 13 continues to improve operability on Windows, as users who run PostgreSQL on Windows now have the option to connect over UNIX domain sockets.

I'm curious about this!

Windows 10 brought in support for AF_UNIX: https://devblogs.microsoft.com/commandline/af_unix-comes-to-...

Nice! Thanks for the link!

Zheap didn't make it?

Not yet. They have done much work on the pluggable storage API, but I haven't seen much discussion about zheap on the mailing list in a while, though work seems to continue on it.

Pluggable storage would potentially be useful for YugaByte as well. PostgreSQL will see even more innovation in the years to come. Well done PostgreSQL community.

Still hoping for a transaction manager so I can implement 2PC using postgres_fdw.

I don't know your use case exactly, but every distributed-systems engineer should know that 2PC is extremely dangerous.

It is a blocking protocol: if the coordinator crashes, some participants will never resolve their transactions.

After a participant has sent an agreement message to the coordinator, it blocks completely until a commit or rollback is received from the coordinator, which may never arrive.
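Postgres already exposes the participant-side building blocks of 2PC via prepared transactions (requires `max_prepared_transactions > 0` in the config; the table here is made up). The blocking hazard is visible right in the protocol:

```sql
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- Phase 1: the participant votes "yes" and persists the transaction.
-- From this point its locks are held until phase 2 resolves it.
PREPARE TRANSACTION 'xfer-42';

-- Phase 2: the coordinator's decision. If the coordinator dies
-- before issuing one of these, 'xfer-42' sits in
-- pg_prepared_xacts holding its locks until resolved by hand.
COMMIT PREPARED 'xfer-42';
-- (or: ROLLBACK PREPARED 'xfer-42';)
```

What's missing, and what the grandparent is asking for, is the coordinator side that drives this across multiple servers, e.g. through postgres_fdw.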

Thanks for the warning - I agree that there are dragons. I have some well-constrained use-cases, and would love to obviate the need for "yet another microservice."

I find it frustrating that stored procedures can't start and stop transactions. Without that ability, you can't really move application logic into the database.

[edit: Doh, I'm wrong. Sorry. I'll leave the comment here.]

You're a release behind with that, as the other person who replied said. I've made use of them to help manage ETL flows between my DW and OLTP databases.
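Since PostgreSQL 11, procedures (as opposed to functions) can commit and roll back inside their own body, which is what makes batch-style logic in the database practical; a sketch, with a hypothetical `jobs` table:

```sql
CREATE PROCEDURE process_jobs()
LANGUAGE plpgsql
AS $$
BEGIN
    FOR i IN 1..10 LOOP
        UPDATE jobs SET done = true
         WHERE id IN (SELECT id FROM jobs
                       WHERE NOT done LIMIT 100);
        COMMIT;   -- each batch commits independently; a failure
                  -- later doesn't roll back earlier batches
    END LOOP;
END;
$$;

CALL process_jobs();  -- procedures are invoked with CALL, not SELECT
```

A `CREATE FUNCTION` body still cannot do this; `COMMIT` there raises an error, since a function always runs inside its caller's transaction.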

Slightly off-topic, but how comfortable are people with using stored procedures nowadays? I had developers in the past swear by them and I know that the early version of CloudFlare was almost entirely built using stored procs. On the other hand, I've had others who view them as something to be avoided at all costs.
