Hacker News

Congratulations to the team. The replication/partition improvements are significant and much appreciated.

My favorite improvement is full-text search over JSON and JSONB; this makes pg a full replacement for Mongo for my use cases.
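For anyone who hasn't tried it, a sketch of what that looks like (the table and column names here are made up; `to_tsvector` over `jsonb` is what landed in 10):

```sql
-- Hypothetical table; to_tsvector(regconfig, jsonb) is new in Postgres 10
CREATE TABLE docs (id serial PRIMARY KEY, body jsonb);

INSERT INTO docs (body) VALUES
  ('{"title": "Beta notes", "text": "logical replication lands in core"}');

-- Searches every string value inside the document
SELECT id
FROM docs
WHERE to_tsvector('english', body) @@ to_tsquery('replication');

-- An expression index keeps the search fast
CREATE INDEX docs_fts_idx ON docs USING gin (to_tsvector('english', body));
```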

I feel like this feature replaces almost all of Mongo's use cases!

As soon as Postgres adds the "random data loss" feature they will have a full superset of Mongo.

You can always turn off fsyncs…

Then you might as well use pg.

I think he meant that you can turn off fsyncs in pg in order to add random data loss :-)
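For context, these are the settings in question; a postgresql.conf fragment (emphatically not for production, since a crash can then lose or corrupt committed data, which is the joke):

```
fsync = off                # skip flushing WAL to stable storage
synchronous_commit = off   # report commit before WAL is flushed
full_page_writes = off     # skip full-page images after checkpoints
```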

You might be interested in this paper where the author touches on fsync.


I'm interested. When you say almost, can you elaborate on any remaining use cases when you'd use Mongo?

At my previous company we made heavy use of its lossy compression feature.

What lossy compression? Were you guys throwing bits into /dev/null?

Is that related to the cool "hash compression" technology I hear about? Apparently it can compress an arbitrarily large file into just a few bytes, amazing!

You would get compression with Postgres running on ZFS.

PostgreSQL has compression on by default for all large text and other large fields that benefit greatly from compression. From the documentation: "The technique is affectionately known as TOAST (or "the best thing since sliced bread")." https://www.postgresql.org/docs/8.0/static/storage-toast.htm...
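A quick way to see TOAST at work (table and data are made up; exact byte counts will vary by version and settings):

```sql
CREATE TABLE toast_demo (t text);
INSERT INTO toast_demo SELECT repeat('postgres ', 100000);

-- Compare on-disk datum size with the raw string length;
-- the highly repetitive value compresses dramatically
SELECT pg_column_size(t) AS stored_bytes,
       octet_length(t)  AS raw_bytes
FROM toast_demo;
```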

Lossy compression, not lossless.

A database plus a copy-on-write file system sounds like a bad idea. I am imagining a modest 100 GB database being rewritten for every change.

I am sure there is some way to work around this, but wouldn't this be the default behavior with a typical database and typical COW file system?

You might be interested in reading this paper: https://people.freebsd.org/~seanc/postgresql/scale15x-2017-p...

I am interested, and I appreciate the thought and the link; however, these appear to be the slides for a talk without the talk itself. If so, they are of limited use: slides never carry all the information the presenter has, and are often just a summary of it.

Is this the correct talk: https://youtu.be/dwMQXLOXUco?t=5380 ?

Yes, sorry about that, that is the corresponding talk to the slides. Thanks for pointing that out.

First, you have to make sure the page sizes for the FS and DB match; that's a critical requirement. But yes, you'll get write amplification twice (once in the database, once in the file system).
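On ZFS that typically means something like the following (the dataset name is hypothetical; Postgres's default page size is 8 kB):

```shell
# Match the dataset recordsize to Postgres's 8 kB page and enable lz4
zfs set recordsize=8k tank/pgdata
zfs set compression=lz4 tank/pgdata
```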

With a random cloud provider, you may not get a file system with compression on your machine.

I know it's pretty popular to hate on MongoDB now (even more so than it was to love on MongoDB 4 years ago), but there are still areas where it's better than a relational db. In game development, it's extremely helpful (especially as an "indie") to be able to change the structure on a whim so easily. Also, based on the design of the game I'm working on, I believe the document structure captures the structure of the data much better than if I were forced to make a bunch of tables. This aids in understanding the representation of our game's data, and I believe (but haven't tested) it will be faster than a relational db for my use case, though that's an ancillary benefit anyway.

But the posters above you said that JSON and JSONB types in Postgres, and functionality around them, eliminated the need to use other databases for document type data.

What you are describing can be done with PostgreSQL. One thing that is missing is better client libraries that make use of those data types. Morphia wins for now in that regard.

.NET has great support with Marten:


My claim wasn't that it didn't fulfill their needs, it was that it doesn't fulfill all needs (gamedev is one example that I'm familiar with).

Postgres storing JSON types != All mongo functionality

I'm sure I could achieve everything I'm doing in mongo by some roundabout way in Postgres, but if you're doing a large amount of reading/modifying partial fields within JSON structure, it's the exact use case for mongo.


That said, I actually use partial JSONB updates on a regular basis, but I tend to use PLV8 to do the heavy lifting.
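For anyone curious, plain SQL can already cover the common partial-update cases without PLV8 (the table and column names here are invented):

```sql
-- jsonb_set (available since 9.5) replaces a single path inside the document
UPDATE players
SET state = jsonb_set(state, '{inventory,gold}', '250'::jsonb)
WHERE id = 42;

-- The || operator merges a partial document at the top level
UPDATE players
SET state = state || '{"level": 7}'::jsonb
WHERE id = 42;
```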

What is the point of posting that ?

(1) It's a proof of concept, (2) it hasn't been updated in 3 years, and (3) it still isn't the same syntax as MongoDB.

The point still remains that PostgreSQL isn't just a 1-1 replacement for MongoDB which is pretty common sense to me anyway.

Wow, that's a pretty hostile response.

1) it's a proof of concept that others have improved upon to show that you can indeed replace the functionality of mongo that most developers tend to rely on.

2) neither has mongodb at that level

3) that's correct, you need to actually use the term SELECT when you use the functions

and you are correct, pg is missing the random data loss that comes with mongo. it will never be 1-1 in regards to that.


The important part is the note (the bulk of the comment) that I'm updating partial JSONB data on a very regular basis, and doing it using PLV8, getting rid of the need for an unreliable database and instead using exactly what this news story is about.

Marten for .net

A small independent game is where I've used it before as well. And currently I'm working on an app/game sort of thing where it makes a lot of sense because we're iterating often and fast. I like that it kinda gets out of my way and just works, though perhaps I would not so much had I experienced the data loss others speak of.

Yeah, I've read some horror stories about that too. I think the improvements to concurrency (à la WiredTiger) will help with that, as will making sure to think about possible concurrency issues from the outset, as I've tried to do.

Have you heard of ToroDB? FWIW they have benchmarks that claim it's running faster on top of Postgres than MongoDB does natively.

"ToroDB Server

It is a MongoDB-compatible server that speaks the MongoDB Wire Protocol (and therefore can be used with the same drivers used to connect to any standard MongoDB server) but stores your data into a reliable and trusted ACID database."


After using MySQL/Mongo for some years, and after major problems with MongoDB Cloud Manager (several hours of site downtime due to Cloud Manager removing the mongo binary), I've been using pg for new projects for some time now and think it's great.

Two gripes:

1. Client libraries in Mongo work more nicely with documents than Postgres libraries do with JSONB (e.g. Scala Option[] mapping to non-existing/existing fields)

2. Why is the Postgres JSON syntax so different? Why not just support SELECT document.field.field instead of the (inconsistent) document->'field'? Imho the pg JSON syntax is hard to read and yet another thing to learn.

Re 2: That syntax is already in use by `SELECT schema.column`. I do agree that the syntax is a bit cumbersome and harder to learn, but I'm not sure if they could have done much better while being consistent with SQL.
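To illustrate the operators being discussed (the table here is hypothetical):

```sql
-- -> returns jsonb, ->> returns text, #>> takes a path and returns text
SELECT body -> 'author'           AS author_jsonb,
       body ->> 'title'           AS title_text,
       body #>> '{meta,created}'  AS created_text
FROM docs;
```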

Why can't pg use the same syntax? I'm sure it could detect whether it's a table column or a document field, or am I missing something? Why not handle documents and columns the same, with documents as a kind of hierarchical column?

Except that it doesn't scale like MongoDB does. How does sharding / clustering work? By default, isn't Postgres a single master?

I would say that it isn't configured to scale like Mongo out of the box...but that doesn't mean it can't.

You can go outside of Postgres core to get multi-master solutions with easy sharding and clustering...with the open sourcing of CitusDB and 2nd Quadrant's pglogical and BDR extensions there are options out there.

You can also roll your own (if you really want)...and it is relatively approachable to do so using built-in features like partitioning.
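As a sketch of the built-in primitive mentioned above, this is declarative partitioning as it appears in the 10 beta (the table is invented):

```sql
CREATE TABLE events (
    id      bigserial,
    ts      timestamptz NOT NULL,
    payload jsonb
) PARTITION BY RANGE (ts);

-- Each partition covers a half-open range of timestamps
CREATE TABLE events_2017_q2 PARTITION OF events
    FOR VALUES FROM ('2017-04-01') TO ('2017-07-01');
```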

And, of course, with the 10 Beta it would seem that logical replication is being brought into core which sets the foundation for future replication features such as BDR to also be brought into core.

I would also point out that since Mongo's BI connector fiasco, it would seem more and more Mongo users are finding reasons to just use Postgres (where relational interfaces are desirable for BI): https://www.linkedin.com/pulse/mongodb-32-now-powered-postgr...

None of what you posted is built in and thus supported by the vendor.

That may not matter to you but it's a deal breaker for those of us in enterprises. We can't just be rolling our own versions of PostgreSQL and we can't use CitusDB when it is not supported by other vendors for use with their products.

The point still remains that after all these years, PostgreSQL's scalability story is still a mess.

I think you might be confused somewhat...PostgreSQL doesn't have "a vendor". The history of Postgres starts at UC Berkeley and now has the PostgreSQL Global Development Group which is a mixture of contributors both community and corporate sponsored:


Of that group, both 2nd Quadrant and CitusDB are represented...so in a way you could say their support is "by the vendor". Not to mention EnterpriseDB which also has support options.

> we can't use CitusDB when it is not supported by other vendors for use with their products.

CitusDB is no longer a fork, it's an extension...this means that so long as your other vendors' products support Postgres, they support CitusDB. Moreover, CitusDB itself will sell you an Enterprise package.

> The point still remains that after all these year PostgreSQL's scalability story is still a mess.

If by mess you mean specifically there is no knob and dial arrangement in the core of Postgres I would agree.

But within the core of PostgreSQL there are primitives which make scaling approachable (I myself am working on a data ingestion process that utilizes table partitioning and hand rolled sharding for PostgreSQL).

And there are now a plethora of extensions and tools provided by core contributors such as 2nd Quadrant and CitusDB that offer somewhat out-of-the-box solutions, and they even come with support.

PostgreSQL is not the right tool for every job, and the replication/clustering area has been a sore spot for Postgres in the past. But it certainly isn't bereft of options now...and the inclusion of logical replication in this beta is only the first step in bringing these options closer to, or into, the core.

> PostgreSQL's scalability story is still a mess

This is absolutely true. I don't get the irrational downvotes when it comes to postgres.

It's a great database, but it is sorely behind on scalability features and is only now getting single-node parallelism and logical replication. It's still hampered by the reliance on 3rd-party tools for connection scaling, decent backups, HA, and distributed clustering. The future looks interesting, but the other databases aren't sitting around idly either.

