PostgreSQL 16 (postgresql.org)
586 points by pella on Sept 14, 2023 | 84 comments



anytime a huge multi-decades-old FOSS project lands a milestone, I can't help but equate it to something like a moon landing.

So much (unpaid) work and thought goes into stewarding open software. Kudos to the whole team. Software infra is just as important as bridges and roads -- here's hoping we can fund it at least as well, for humanity's sake. [1]

[1]: https://www.fordfoundation.org/work/learning/research-report...


PostgreSQL is also one of the most impressive projects out there in terms of being community maintained, as opposed to many large FOSS projects which have some kind of corporate backing employing the majority of the core team.


Isn't a good part of the core team at EnterpriseDB?


You can see a break-down of the core team and major contributors here, as well as their current company affiliation: https://www.postgresql.org/community/contributors/

(and as noted in the other comment, whilst EDB certainly makes important contributions, they are one of many)


TIL Julian Assange was a contributor to PostgreSQL


3 of 7 work at EDB, and the core team doesn’t drive the project roadmap. And EDB hackers fail to get patches in all the time, just like everyone else :)


Isn't the work in large projects often done by those paid by their employers on "company time"? For instance Bruce Momjian by EDB.


Yes, and Andres (literally the “main hat” he wears at Microsoft is working on PostgreSQL), and Tom, etc. I don’t understand why pointing out what should be obvious makes people lose their minds here and is such a controversy. PostgreSQL, like Linux, is a massively commercially sponsored project - that there are some volunteers doesn’t change that.

I used to be paid full time to mostly work on a Linux kernel subsystem. This discussion is silly.


Agreed - just think about the billions, possibly trillions, in economic value (jobs, shareholder value, utility to society, etc.) that a project like Postgres or Ruby on Rails has created.


They significantly increased the frequency of major versions: the transition from 9 to 10 took 7 years, and now they release a major version every year.


They just changed the versioning scheme; it used to be that e.g. 9.3 -> 9.4 was a major version (i.e. can't be upgraded in place). Starting with PG 10, major versions are now 10 -> 11, etc. I don't believe the major release cadence itself changed that much.


It seems like they fell more in line with Semantic Versioning when that format came into vogue. Semantic Versioning is what most devs expect now; it makes sense to communicate the version in a format that has a broadly understood meaning for devs.


I think there is no semantic versioning anymore; they just release one new major version number every year, regardless of how large or backward-incompatible the changes are.


16 contains paid work


[flagged]


schools either never worked or stopped working. after all, look, almost all the people who went to school don't give a fuck about underperforming, underfunded school systems. (or healthcare or ... or if they do care, they are ignorant and clueless about what to do with the problem, and easily fall prey to the political dogma of some group.)

it's simply time to stop worrying about it.


School systems are not underfunded. They're underperforming despite being overly funded. We're throwing good money after bad.


Teachers are underfunded in the US. In many states, if you're lucky, teachers' wages top out at the median income. And they often have little to no funding for classroom decorations or much beyond very basic materials.

If we don't pay our teachers decently, how can we expect them to put in the effort to do the very difficult task of raising the next generation? (And yes, teachers raise their students just as much as parents do.)


well, the real value of teachers' salaries is plummeting. of course the same thing happens in other hard-to-scale, demand-for-quality service sectors (like healthcare, but also construction work, where standards have simply increased and carry a lot of associated labor, which now costs a lot more due to multiple factors, e.g. less immigration, a shrinking active population - the end of the positive effects of the baby-boom demographic boost - and of course the inescapable Baumol effect).


The people underfunding our schools had their private schooling paid for by virtue of being unconscionably wealthy.

The more you realise what has happened the more maddeningly upset you will get, so it's best not to think about it.

Suffice it to say: you're wrong; additionally: dead wrong. And it's not relevant for a topic about databases.


nah, there's a broad consensus among the US (and alas most other countries') population of not wanting to pay for "externalities"; they want cheap gas and cheap labor, cultural homogeneity, and so on.

if a few thousand private school darlings can bamboozle hundreds of millions for decades, then the problem is not just with them.

of course it's cultural and it's due to the very strong biases of the last who-knows-how-many centuries.

> and it's not relevant for a topic about databases

it's very relevant for the whole concept of FOSS "moon landings" and the commons


[flagged]


if they're getting more than $200 then it's already way more money than I think.

It's true that there is some money (after all, there is pgcon); and it's true that some people are paid to work on postgresql.

However: it's additionally true that there are many volunteers, and that it's free for us to use, modify, hack on and so forth.

coordinating that effort must be absolutely herculean.


> It's true that there is some money

This is such a misleading understatement. There is a ton of money. Many of the core Postgres contributors have fully remunerated employment where much of their time is dedicated to PostgreSQL development. This is fully deserved and appropriate - but Tom Lane et al. are not working on this for free (whether they still would if they had to is another matter); these are all highly compensated contributors. The total payroll spend on PostgreSQL is distributed and decentralized, but it is clearly in the multimillions.

People should not be misled about true costs, regardless of how they're meted out.



Whoa, TIL there is release art (did OpenBSD start this trend?)



Previous discussion from Beta 1 announcement: https://news.ycombinator.com/item?id=36070261 (154 points, 60 comments, 3 months ago)


> Add SQL/JSON constructors and identity functions

This will be a nice quality of life addition!


> SQL/JSON constructors

Like these?

> Adds SQL/JSON constructors, including JSON_ARRAY(), JSON_ARRAYAGG(), JSON_OBJECT(), and JSON_OBJECTAGG().

Not sure what SQL/JSON identity functions relate to
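Roughly how those constructors read in practice, if I understand the docs right (table and column names made up):

    SELECT JSON_OBJECT('id': id, 'name': name) FROM customers;  -- one object per row
    SELECT JSON_ARRAYAGG(id) FROM customers;                    -- aggregate ids into one array
    SELECT JSON_ARRAY(1, 2, 'three');                           -- literal array constructor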


identity probably refers to the ’IS’:

    SELECT js,
           js IS JSON OBJECT                    "object?",
           js IS JSON ARRAY                     "array?",
           js IS JSON ARRAY WITH UNIQUE KEYS    "array w. UK?",
           js IS JSON ARRAY WITHOUT UNIQUE KEYS "array w/o UK?"
    FROM (VALUES ('[{"a":"1"}, {"b":"2","b":"3"}]')) foo(js);


> bidirectional logical replication

Just to make sure, this is what we used to call multi-master, right?

(This is not a "why did they change it" post. Do not make it into a "why did they change it" post.)



Yes
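If I'm reading the PG 16 release notes right, the key new piece is the origin option on subscriptions, so each node only replays locally originated changes. A minimal sketch, with made-up connection details and names (a real setup needs more care around conflicts and initial sync):

    -- on node A
    CREATE PUBLICATION pub_a FOR ALL TABLES;

    -- on node B
    CREATE SUBSCRIPTION sub_from_a
        CONNECTION 'host=node_a dbname=app'
        PUBLICATION pub_a
        WITH (origin = none, copy_data = off);

    -- then the mirror-image publication/subscription in the other direction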


I am so glad psql got \bind.

What good does EXPLAIN do if you're not running the same (parameterized) queries that your app does? Very cool.
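If I read the psql 16 docs right, it looks roughly like this (table and value made up):

    EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = $1 \bind 42 \g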


Explain for parameterized queries (which is indeed very important due to the way the planner works) used to work well with prepare+execute.
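For reference, a sketch of that older approach (table name made up):

    PREPARE q (int) AS SELECT * FROM orders WHERE customer_id = $1;
    EXPLAIN ANALYZE EXECUTE q(42);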


I can't wait for direct I/O (now behind debug_io_direct setting).


Curious what your use case is for wanting direct IO? With every DBA I've ever worked with, the first thing they want to do when setting up a new database is enable direct IO. My worst experience was with IBM DB2 mounted over NFS talking to a netapp. Performance complaints would come from customers and land on the CEO's desk. He'd go to the software team and tell them to fix it. They'd say the DB is slow. Then he'd go to the DBAs and tell them to tune the DB. They'd say there's nothing more to do, we need faster disks. So he'd end up in front of me on the sysop team asking if we had any faster disks laying around (we didn't, and buying more wasn't in the budget).

Since it was NFS, you could just use tcpdump and watch what DB2 was doing on the wire. It was happily poking away, sending and receiving packets all 1K in size (the currently configured DB block size), with peak read and write speeds of about 11MB/s. Since the DBAs didn't want to change settings on a production DB, I set up a testing environment and begged them to play with the direct IO and block size settings on this new instance to figure out the best performance. When I checked back days later, it was set up exactly the same: "we follow best practices, use 1K block size and force direct io".

I ended up creating a VM under the guise of "we need a data warehouse" with 1/4 the CPUs and RAM of the DB2 machines and installed PostgreSQL 9.2. Did a minimal amount of tuning, mostly just turning off fsync for WAL writes, then spent a week filling it up with 5TB of data and 15 billion rows from the production DB. Ran one of our analytic queries that had grown to taking 30 hours on DB2; it ran in 6 hours. The packet sizes over NFS were 32-64MB, with peak speeds of 180-220MB/s on the wire.


The 1K packets you saw probably correspond to the default block size being used for the DB, which is a vestige of spinning disks. That you were using NFS, or any kind of networked filesystem, is what I'd call a performance-hostile environment. Did no one think of just not using NFS?


This was before 2012, AWS did not exist. The company had to find rackspace in a data center. Which we couldn't. One of the funding customers "loaned" us a couple of slots in their on premise data center which fit only a bladecenter and single netapp. NFS had advantages, you could dynamically resize live mount points, etc. Plus as I mentioned, using postgresql did away with our performance issues.


> Did a minimum amount of tuning, mostly just turning off fsync for WAL writes

That is not something I would suggest to people on production systems, as it gives you a good chance of data loss when the system halts. So, out of interest, were there any circumstances that made turning off WAL fsync a good choice in your situation?


I probably meant to say "synchronous_commit", which controls whether a commit waits for its WAL to be flushed to disk. If you want full data guarantees, with regular hard drives, you'd be looking at less than 200 transactions per second. You set synchronous_commit to off, and suddenly you can do 10k transactions per second. You can tune when the WAL gets flushed to disk based on time and/or size, so you can set the amount of recent data loss you are comfortable with.

From the docs: "setting this parameter to off does not create any risk of database inconsistency: an operating system or database crash might result in some recent allegedly-committed transactions being lost, but the database state will be just the same as if those transactions had been aborted cleanly. So, turning synchronous_commit off can be a useful alternative when performance is more important than exact certainty"
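A rough sketch of the knobs involved, assuming I'm remembering them right (values illustrative):

    -- per session or per transaction: stop waiting for the WAL flush at commit
    SET synchronous_commit = off;

    -- postgresql.conf: how often the WAL writer flushes in the background
    -- wal_writer_delay = 200ms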


> If you want full data guarantees, with regular hard drives, you'd be looking at less than 200 transactions per second.

That sounds like an anecdote from the time before SSDs.

> "setting this parameter to off does not create any risk of database inconsistency: an operating system or database crash might result in some recent allegedly-committed transactions being lost, but the database state will be just the same as if those transactions had been aborted cleanly. "

If your application is told that the transactions were committed, but your DB actually hasn't committed them, you've got a problem, methinks.


Synchronous replication exists to further reduce the risk of losing committed transactions.

Thing is DB work is all about tradeoffs, so you'll never be free of them.

As the parent explained, PostgreSQL actually lets you make a very good trade: you get full cached level performance but lose only a couple of transactions at most, instead of data corruption. If that's not ok for you, the leash can be tightened to lose nothing, but then your hardware had better keep up with your platform.


> That sounds like an anecdote from the time before SSDs.

Yes, I stated "regular hard drives" :)


Re-read their post. It was a secondary system, set up just to run these analytics, which were loaded from the production system. No issues if the whole machine had to be rebuilt.


I'm thinking lower resource usage (no double caching of data), a shorter path to the data (so I'd expect fewer bad things to happen during a commit), and better performance in terms of transactions per second. Otherwise, I can't explain it any better than one of the lead developers himself: https://www.postgresql.org/message-id/20210223100344.llw5an2...


The new debug_io_direct flag only triggers direct IO in very limited cases, and is only tangentially related to the AIO patchset discussed in that thread.

Note that the documentation on the config flag explicitly warns about not using it in production:

> Currently this feature reduces performance, and is intended for developer testing only.

Also note that very few things will actually do IO during commit - the only IO that I can think of is 1) the WAL-logging of the commit (often small, a few hundred bytes at most), and 2) replying to the COMMIT command (tens of bytes at most). It is quite unlikely that this will see much performance benefit from IO_DIRECT without further infrastructure inside PostgreSQL around io_uring and other async kernel IO APIs.
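For completeness, and going by my reading of the PG 16 docs, the setting takes a comma-separated list and is meant for developer testing only:

    # postgresql.conf
    debug_io_direct = 'data, wal'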


I know, but it's still nice to see progress in this area. Even PG17 would probably be too early to expect this work to be finished.


Yes, when direct IO is brought up, it is usually followed by the "double buffering" argument. That is valid, but only if your disk speeds are well above 1,000MB/s. Outside of hardware like that, you are always going to be waiting on disk.

From Linus himself[0]

"The thing that has always disturbed me about O_DIRECT is that the whole interface is just stupid, and was probably designed by a deranged monkey on some serious mind-controlling substances"

[0] https://lkml.org/lkml/2002/5/11/58


Have you read the entire thread you linked? People explained to Linus why direct IO is important to them. Besides, it's 20 years old, and there weren't even SSDs back then. The lack of direct IO in PG was one of (one of) the reasons why Uber moved to MySQL (https://www.uber.com/en-PL/blog/postgres-to-mysql-migration/, "The Buffer Pool" section). With buffered IO you will likely store a lot of the same data in memory twice - once in the DB's memory and again in the page cache. With direct IO you can just give the memory used by the page cache directly to the DB, because it knows better what it needs and when.


The thread is large - maybe I did read the whole thing at one time - but the point of it is literally someone asking why using direct IO is slower. That is the point I'm making: you can't just enable direct IO (which bypasses all the logic that speeds up reads and writes) and expect increased performance without a lot of extra up-front work.

But back to my first question, I am still curious what your workload is that you feel will benefit from direct io :)


As others have written, this work entails implementing asynchronous IO. One system I worked with that was more heavily loaded than the usual intranet CRUD was an SMS lottery run by a radio station. They had "bonus rounds" that lasted a few minutes. When the host said "now send!" we were receiving a huge amount of data at the same time. However, it worked for years on Postgres 7.4 and regular hard drives :)


I suspect Linus' position has radically changed with io-uring and proliferation of NVMe storage devices.

I use io-uring with O_DIRECT at work and the performance graphs of TLB pressure are beautiful.


Can anyone point to the COPY FROM improvements mentioned that can result in up to 300% performance improvements? Is it the line in the release notes about "ASCII string detection"?


These are the two relevant patches that I know of (there might be more):

1. https://github.com/postgres/postgres/commit/3838fa269c15706d...

2. https://github.com/postgres/postgres/commit/121d2d3d70ecdb21...

It causes much less CPU overhead on the receiving side of a copy when receiving big JSON blobs.


I think the 300% item is "Allow more efficient addition of heap and index pages". The source of the improvement is a number of related improvements around relation extension, see https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.a...


Is native transparent encryption somewhere on the radar?


Anyone know more about the "vacuum" improvements?

To make my database fast, I often have to do a VACUUM FULL on some key tables, which basically freezes all access to the table and copies it byte by byte to a new physical file. So as your data size doubles, the VACUUM FULL time doubles. I have a table that is so big I basically can't VACUUM FULL it anymore (in an acceptable amount of downtime).


Maybe try pg_repack?


As the other commenter mentioned, you can use pg_repack. We also had a similar issue; we configured it to run every month and it works flawlessly.
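Roughly what we run, with made-up database/table names (check the pg_repack docs for the flags your version supports):

    pg_repack -d mydb -t big_table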


Oh interesting. In my case I am on Google-hosted Postgres, so I'll need to double-check it works there.


I’m not familiar with the VACUUM changes but the situation you’re describing suggests something is wrong with the table definition or database configuration.


How do you figure?

The Postgres docs are quite clear: table space is not reclaimed without a VACUUM FULL. So delete a column in a big table? You are storing that data forever.


Not quite true. Regular VACUUM marks dead tuples as available for re-use, so the system can overwrite the dead tuples with fresh ones after a VACUUM. VACUUM FULL completely rewrites and repacks the table.

https://www.postgresql.org/docs/current/sql-vacuum.html
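In command form, roughly (table name made up):

    VACUUM (VERBOSE) orders;  -- marks dead tuples reusable; the file usually stays the same size
    VACUUM FULL orders;       -- rewrites the table and returns space to the OS, but takes an exclusive lock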


Postgres has a free space map, which allows it to reuse the space of deleted tuples for new or updated tuples created in the same table. If no new/updated tuples are created in the table after the DELETE, you have fragmentation, which means the file remains large so the space can't be used for other tables (or other unrelated data residing in the same filesystem).

The nature of fragmentation means that you need to move a lot of data around to actually make that file smaller. In Postgres, that's typically done with VACUUM FULL. The problem of fragmentation is not unique to Postgres, it's a fundamental issue; but perhaps other systems are able to move the data around in a less disruptive way.

If you just delete a column, that creates a different type of fragmentation within the tuples themselves (e.g. you delete the middle column, and the tuple itself doesn't shrink, it just ignores that middle column). You are right that can be a problem. Postgres could be improved to rewrite tuples to eliminate the wasted space from deleted columns in the middle, which would probably be (computationally) worth it to do if it's already performing cleanup on the page.


...okay? What's the use case where data shrinks?

The space that is "not removed" will just be reused by new data.

The only real use case is "we've loaded way too much data, removed it, and want to recover that space because we will never need it", and that is not common enough to matter: databases usually have their own filesystem, and most filesystems can't be shrunk online, so any shrinking needs downtime anyway.


> What's the use case where data shrinks ?

Parent literally provided one: deleting a column.

Another is that it's very easy in PostgreSQL to bloat indexes. Load a bunch of data. Update (or delete) that data and now your index is bloated.

The only resolution is to REINDEX (or VACUUM FULL).


The nice thing about index bloat is REINDEX has a CONCURRENTLY option, no need to block writes.
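E.g., something like this (index and table names made up):

    REINDEX INDEX CONCURRENTLY orders_customer_id_idx;
    -- or all indexes on a table:
    REINDEX TABLE CONCURRENTLY orders;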


True, good point.


If rewriting the table is part of a regular workflow, then database configuration can play a significant role in its performance. However, I still contend that rewriting tables regularly is an anti-pattern.

Are you able to partition any of your tables so that you can VACUUM FULL the partitions individually?
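A minimal sketch of what that could look like with declarative partitioning (names and ranges made up):

    CREATE TABLE events (
        id         bigint      NOT NULL,
        created_at timestamptz NOT NULL
    ) PARTITION BY RANGE (created_at);

    CREATE TABLE events_2023_09 PARTITION OF events
        FOR VALUES FROM ('2023-09-01') TO ('2023-10-01');

    -- later, repack one partition at a time
    VACUUM FULL events_2023_09;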


Yeah, partitioning is my eventual end goal. It's getting less and less awkward to partition with each PG version. It's still a little awkward in Django. But heck, I haven't tried it in a year; maybe it's worth another shot.


Have you tried pg_repack?


No but someone else suggested it. If it works on google cloud I might be in business.


Why are you regularly doing vacuum full instead of just vacuuming more aggressively?
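For example, per-table autovacuum settings can be tightened so bloat never grows to the point of needing a full rewrite (numbers illustrative, table name made up):

    ALTER TABLE big_table SET (
        autovacuum_vacuum_scale_factor = 0.02,
        autovacuum_vacuum_cost_delay   = 1
    );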


Do I still have to use pg_upgrade when I'm just upgrading from RC-1?


git diff REL_16_RC1..REL_16_0 doesn't show any changes that would require a pg_upgrade (at least from my read), so you should be able to upgrade without it.


I just tested, I had to use pg_upgrade.

Error message: The database cluster was initialized with CATALOG_VERSION_NO 202306141, but the server was compiled with CATALOG_VERSION_NO 202307071.
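For reference, the invocation is roughly as below; the paths are made up for a typical local install, so adjust to your layout:

    pg_upgrade \
      --old-bindir  /usr/local/pgsql-16rc1/bin \
      --new-bindir  /usr/local/pgsql-16/bin \
      --old-datadir /var/lib/pgsql/16rc1/data \
      --new-datadir /var/lib/pgsql/16/data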


You might just need to run:

    alter database mystuff refresh collation version;


Congratulations to the whole PG community!


I'm really excited about the COPY FROM improvements!

Can't wait to test them with some big data.


This is great!

But I just installed the latest Debian with Postgres 15, haha. I don't even think I'm using any features past 11 (websearch_to_tsquery), so I'll need to research anything new that might be useful to me.


Our infra team decided to deploy v12 in the last year... But only for GitLab. We are still deploying with v9.6 for some other new projects.


9.6 for new projects is insane, no?


Inertia is a helluva drug.


Performance



