This benchmark misses the entire point of MongoDB: that you can atomically update individual fields in the document.
That has not been possible with Postgres's json storage type. Instead, the entire JSON blob must be read out, modified, and written back in.
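To make that concrete, here's a sketch (table and column names are made up) of what the two sides look like. With 9.3's json type there is no operator to modify one field, so the application rewrites the whole value:

```sql
-- Postgres 9.3 json type: no partial-update operator, so the
-- application reads the value, modifies it client-side, and
-- writes the entire blob back.
UPDATE docs
SET body = '{"name": "bob", "visits": 6}'::json
WHERE id = 1;
```

whereas in MongoDB `db.docs.update({_id: 1}, {$set: {visits: 6}})` sends only the one field that changed.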
This reality is well known to those who understand Postgres, which is why hstore exists. hstore is limited, though (particularly in the size of the store), so there is work underway to make it more competitive with MongoDB.
So now they are also releasing a jsonb (b for binary) storage format, which looks promising, but I can't find any information on exactly what its features are. I would love to actually see a benchmark comparing field updates, but this benchmark is not it.
MongoDB is a database with trade-offs, downsides, and more crappy edge cases than MySQL, but it exists because at its core it allows data modeling that traditional SQL databases lack.
MongoDB has first-class arrays rather than forcing you to do joins. It supports schema-less data, which is rarely needed but can be very useful when you do need it. It can do inserts and counter increments very quickly (yes, the write lock means you eventually have to put collections in separate databases), which also matters for certain use cases.
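For concreteness, a mongo-shell sketch of the two features just mentioned (collection and field names are hypothetical):

```js
// First-class arrays: push an element into an embedded array
// without rewriting the rest of the document.
db.posts.update({_id: 1}, {$push: {tags: "databases"}})

// Fast counter increments: bump a single numeric field atomically.
db.posts.update({_id: 1}, {$inc: {views: 1}})
```

`$push` and `$inc` are standard MongoDB update operators; both modify the document server-side rather than round-tripping it through the client.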
Note that on the Github repo for this benchmark, the README mentions that:
"... later versions will include a complete range of workloads (including deleting, updating, appending, and complex select operations) and they will also evaluate multi-server configurations."
Any update in PostgreSQL will result in a new tuple being inserted, whether it contains json, hstore or anything else; that's the basis of multi-version concurrency control. It'll be the same deal with jsonb.
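You can watch this happen from psql (a sketch against a throwaway table): `ctid` is a tuple's physical location, and it changes on every update because a new tuple version is written, even for a one-field change.

```sql
-- Any UPDATE writes a new tuple version under MVCC, so the
-- row's physical location (ctid) changes.
CREATE TABLE t (id int PRIMARY KEY, doc json);
INSERT INTO t VALUES (1, '{"n": 1}');
SELECT ctid FROM t WHERE id = 1;  -- e.g. (0,1)
UPDATE t SET doc = '{"n": 2}' WHERE id = 1;
SELECT ctid FROM t WHERE id = 1;  -- e.g. (0,2): a new tuple
```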
Not only that: the delta needed to bring the page up to date with the new tuple, plus a copy of the full page the tuple is being written to (if it's the first change to that page in the current checkpoint cycle), are written to the write-ahead log.
This is an excellent post. You've answered a question I've had for a while - are there any circumstances where MongoDB is the right tool for the job? You seem to be on point that Mongo is currently the best for frequent updates of JSON blobs.
Thank you.
I'll take issue with a couple of points, though. Postgres has arrays:
It does have arrays, but they aren't first class to the same extent as MongoDB.
* A Postgres Array can end up being stored elsewhere, whereas in MongoDB an array will be contained within the document
* I am also not clear on what exactly can be stuck inside an array (In MongoDB it can be an object that contains more arrays) while maintaining first-class access and updates.
I would love to find more detailed information on these points, but with jsonb they may be moot now.
I will look into the increment issue in PostgreSQL more carefully. I just know that it wasn't feasible in MySQL when I was using it.
"stored elsewhere" doesn't make sense to me. Arrays are part of the tuple as much as any other attribute type.
You can store anything in an array that you can store as an attribute of a row. There may be some edge cases I don't know about, but generally anything that can be a column can be an array element.
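For example (a sketch, with made-up type and table names): composite types can be array elements, which gets you something close to nested documents, though updating one element still rewrites the whole tuple under MVCC.

```sql
-- Arrays can hold composite types, so nesting is possible.
CREATE TYPE comment AS (author text, body text);
CREATE TABLE posts (id int PRIMARY KEY, comments comment[]);
INSERT INTO posts VALUES
  (1, ARRAY[ROW('alice', 'nice post')::comment]);
-- Replace a single element by subscript; the row as a whole
-- is still rewritten as a new tuple version.
UPDATE posts
SET comments[1] = ROW('bob', 'edited')::comment
WHERE id = 1;
```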
Table groups look rather like hierarchical databases [1], which were the norm before relational databases came along.
Do bear in mind that you'll never get MongoDB-style in-place updates in PostgreSQL, due to MVCC. You may save a round-trip with the entire JSON object once they implement update operators, though.
Could you confirm that you can atomically update a single item inside a jsonb field without reading/writing all of the jsonb data? I know this is possible with hstore (but not 9.3's json type), but I cannot find a clear answer for jsonb in the documentation. Any pertinent links would be much appreciated!