
Ask HN: PostgreSQL JSON or MongoDB? - ssijak
I am building an app where part of the data will be pretty much unstructured and changing, which I would like to dump as JSON. As my primary DB I am using PostgreSQL. I have never used the JSON field type, but I have used MongoDB before. I don't need anything fancy: 99% of the time it will just be storing and retrieving data, and querying by some fields of the JSON, at a scale that one PostgreSQL instance could handle, nothing epic.

My question is: is the PostgreSQL JSON field comparable to using MongoDB? Are there any limits or stumbling blocks that I should know about before choosing PostgreSQL for this? And is there an easy way to view and edit that JSON field by hand, like I can with the Studio 3T app for MongoDB? (I know PostgreSQL DB apps are not the greatest, for some reason.)
======
colanderman
Make sure you're using at least PG 9.5, which includes operators for modifying
JSON values: [https://www.postgresql.org/docs/9.5/static/functions-json.ht...](https://www.postgresql.org/docs/9.5/static/functions-json.html#FUNCTIONS-JSONB-OP-TABLE)

I know nothing about Mongo, but one thing you won't get with Postgres is
indexing inequality operations on arbitrary JSON fields. (You can accelerate
equality and membership on arbitrary JSON fields using GIN indexes, and you
can index inequality on fields known _a priori_ with expression indexes, but
not both.)
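
A sketch of that distinction (table and field names are made up for illustration):

```sql
-- Hypothetical table with a JSONB payload.
CREATE TABLE events (id serial PRIMARY KEY, doc jsonb);

-- GIN index: accelerates equality/containment (@>, ?, etc.)
-- on arbitrary fields, but not <, >, <=, >= comparisons.
CREATE INDEX events_doc_gin ON events USING GIN (doc);

-- Expression index: accelerates range queries, but only on a
-- field chosen ahead of time (here, a numeric "price" key).
CREATE INDEX events_doc_price ON events (((doc->>'price')::numeric));

-- Can use the GIN index:
SELECT * FROM events WHERE doc @> '{"status": "open"}';
-- Can use the expression index:
SELECT * FROM events WHERE (doc->>'price')::numeric > 100;
```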

Also note that locking is at a per-row granularity at minimum; there is no way
to lock only _part_ of a JSON structure. Again, not sure what Mongo does
here.

Likewise, updates are at a per-row granularity. This means that – assuming
your JSON is indexed – you lose the benefit of HOT updates
([https://github.com/postgres/postgres/blob/master/src/backend...](https://github.com/postgres/postgres/blob/master/src/backend/access/heap/README.HOT)),
which can result in a lot of vacuuming and indexing overhead if you have an
update-heavy workload. Best practice here would be to place indexed and
unindexed parts of the JSON objects in different columns, to regain HOT update
optimization for updates to unindexed values.
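
A minimal sketch of that split (column names are illustrative):

```sql
-- Indexed, queryable fields go in one JSONB column; the
-- frequently-updated, unindexed blob goes in another.  Updates
-- that touch only "payload" can stay HOT, because no index
-- references that column.
CREATE TABLE docs (
    id      serial PRIMARY KEY,
    indexed jsonb,   -- covered by the GIN index below
    payload jsonb    -- unindexed, update-heavy data
);
CREATE INDEX docs_indexed_gin ON docs USING GIN (indexed);

-- This update avoids touching the index on "indexed":
UPDATE docs SET payload = payload || '{"views": 42}'::jsonb WHERE id = 1;
```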

~~~
maerF0x0
Mongo seems to lock at various granularities. Collection-level ("table")
locking sounds common, and with the WiredTiger engine (the default, IIRC) it
goes down to document-level locking ("row").

[https://docs.mongodb.com/manual/faq/concurrency/](https://docs.mongodb.com/manual/faq/concurrency/)

~~~
weishan
WT is the default storage engine, and it locks at the document level ("row").

------
angersock
Postgres JSONB is strictly better performing on a single node than MongoDB.
You can also create triggers and things to properly validate your data and do
other cool stuff you can't with Mongo.
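
As one sketch of that kind of in-database validation (the table and required fields here are made up), a CHECK constraint, or a trigger for more complex rules, can reject documents that don't match the shape you expect:

```sql
-- Reject any document without a string "email" key.
CREATE TABLE profiles (
    id  serial PRIMARY KEY,
    doc jsonb NOT NULL
        CHECK (doc ? 'email' AND jsonb_typeof(doc->'email') = 'string')
);

-- This would fail the constraint (no "email" key):
-- INSERT INTO profiles (doc) VALUES ('{"name": "bob"}');
```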

There is literally no reason to ever use Mongo for a small-medium project
anymore.

If you need tooling, check out the new pgAdmin 4:
[https://www.pgadmin.org/](https://www.pgadmin.org/)

~~~
maerF0x0
What about high availability?

~~~
angersock
Just use Amazon RDS and let them deal with it.

None of the theoretical scaling benefits of Mongo outweigh the clusterfuck of
maintenance, data inconsistency, and other general problems that shitty,
shitty piece of software has.

~~~
maerF0x0
Name checks out.

I didn't mind Mongo in a couple of my previous jobs. We never noticed data
issues, and we did some crazy things like generating mongologs on one machine
and moving them to another manually.

------
jlawer
Postgres is amazingly powerful if you want to get into it, and its near-
obsession with ensuring that everything is correct gives a level of
reliability that is typically not present in the MongoDB / MySQL world.

However, that does mean much of the tooling can't support all of the
features, unless it simply drops you down to raw SQL commands. Most of
this is due to the MASSIVE feature set that Postgres supports. I use JetBrains
DataGrip for Postgres work (which is SQL-command based), whereas for MySQL I
do most work with Sequel Pro (a typical MySQL GUI).

I am working on a reporting data warehouse right now using Postgres, and we
are storing all data as JSONB objects, as abstracted "Entities" which link
to other "Entities". I am trying to get some of the advantages of triple
stores / graph databases while keeping compatibility with the SQL / RDBMS
world that our developers and systems are used to integrating with. I think
Postgres is the best starting point for this project, simply due to the
ability to manage DB consistency inside the DB using triggers and stored
procedures. While I am dealing with many different structures, I still need a
certain "sub-schema" of properties in order to link the data to other data,
which I can enforce using PL/pgSQL.

However, if you're literally looking for something to stash and retrieve a JSON
object by primary key... you're not going to get that much value from Postgres.
There is some benefit if you already have Postgres infrastructure, as MongoDB
can have some caveats around reliability (at least with the older versions).
It is not that it is difficult to set up in a mostly reliable fashion; it just
requires some attention. Ultimately it will come down to your requirements,
knowledge of the tools, and use case.

------
borplk
Use Postgres 9.6 JSONB. Trust me, no one has ever looked back and said "I wish
I had used MongoDB instead of PostgreSQL".

~~~
memracom
You said "pretty much". So use PostgreSQL 9.6, but use JSONB only for the
schemaless data and normal typed columns for the rest. And remember that JSONB
is a column type, so don't try to store too many different things in a one-
column table. You can have as many tables as you need to sensibly organize
your data, and then as many non-JSONB columns as you need for things like
metadata: for instance, date created, date changed, who created it, who
changed it, whether there is an audit log for this table, when it was last
replicated or backed up, etc. And then think about whether you should stuff
all the data into one JSONB column or several.
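
A sketch of that layout (the column names are just examples):

```sql
-- Typed columns carry the metadata; only the genuinely
-- schemaless part of each record goes into JSONB.
CREATE TABLE documents (
    id         bigserial PRIMARY KEY,
    created_at timestamptz NOT NULL DEFAULT now(),
    updated_at timestamptz,
    created_by text NOT NULL,
    data       jsonb NOT NULL  -- the unstructured payload
);
```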

For that matter, you could have both a JSON and a JSONB column. Note, though,
that both types reject input that is not well-formed JSON at INSERT time; the
difference is that JSON stores the exact input text (preserving key order,
whitespace, and duplicate keys) while JSONB normalizes it. If you need to
accept possibly malformed input and deal with the well-formedness problem
later, stage it in a plain TEXT column first.

And do make use of the rich selection of JSON and JSONB functions to create
indexes on your tables, because nothing speeds up querying like an index that
lets you filter your data and process only the relevant subset.
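
For example (assuming a hypothetical `orders` table with a `data` JSONB column), an expression index on a nested field lets the planner filter rows without scanning the documents themselves:

```sql
CREATE TABLE orders (id bigserial PRIMARY KEY, data jsonb NOT NULL);

-- Expression index on a nested field, extracted as text.
CREATE INDEX orders_customer_idx ON orders ((data->'customer'->>'id'));

-- A query matching the indexed expression can use the index:
SELECT * FROM orders WHERE data->'customer'->>'id' = '42';
```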

PostgreSQL has been undergoing some heavy development in recent years as more
and more companies shift away from proprietary commercial databases. There are
several companies offering full commercial support for PostgreSQL if you
need/want that.

This development work is leading to a constant stream of improvements, both in
performance and in new features. This alone is a good reason to choose
PostgreSQL. It now embraces both the SQL schema world and the schemaless NoSQL
world in one database system.

------
smoyer
I've been using CouchDB since 2009 or so and am pretty impressed by the jsonb
functionality built into PostgreSQL (the RDBMS I generally use for
everything). I'm building an application with an abstracted "store" and will
be running benchmarks against both in the next few weeks but so far, it looks
like PostgreSQL might be a bit slower when writing jsonb fields. For this
application, the ease of using just PostgreSQL (versus a hybrid
PostgreSQL/CouchDB back-end) is well worth that slight penalty.

------
weishan
This might be useful for you.

[https://www.mongodb.com/compare/mongodb-postgresql](https://www.mongodb.com/compare/mongodb-postgresql)

------
dyeje
If you're already using Postgres then just use that. If it doesn't work out
switch to Mongo. No sense in adding another layer of infrastructure and
complexity if you don't need it.

------
tracker1
If you already have PostgreSQL and only need limited indexing against the
JSON, I'd use that... If you're looking for a document datastore for HA and/or
read/write scaling, then I'd look at RethinkDB or MongoDB. I like RethinkDB
better (as do many); it's currently pretty stable, and I'm hoping to see
development pick up. There was some scare late last year as the backing
company shuttered, but it's now under the Linux Foundation, so it should see
continued support.

------
harrisreynolds
I'm working on a tool for visualizing data that works with MongoDB [1]. If
you are working with JSON data anyway, Mongo feels like a natural choice.

[1] [http://www.visualabs.com/](http://www.visualabs.com/)

------
docteurklein
Be careful: UPDATE operations are nightmarish compared to Mongo, at least for
some use cases, such as this one:
[http://stackoverflow.com/a/42434665/244058](http://stackoverflow.com/a/42434665/244058)
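
To illustrate the gap (the `accounts` table and its fields here are hypothetical): setting a single key with `jsonb_set` (9.5+) is easy, but updating a key inside every element of a nested array means exploding the array, transforming each element, and re-aggregating it:

```sql
-- Easy case: overwrite one nested key.
UPDATE accounts
SET doc = jsonb_set(doc, '{address,city}', '"Berlin"')
WHERE id = 1;

-- Harder case: set "verified" on every element of a "contacts" array.
UPDATE accounts
SET doc = jsonb_set(doc, '{contacts}', (
    SELECT jsonb_agg(jsonb_set(c, '{verified}', 'true'::jsonb))
    FROM jsonb_array_elements(doc->'contacts') AS c
))
WHERE id = 1;
```

In Mongo the second case is a one-liner with a positional/array-filter update, which is the kind of asymmetry the linked answer describes.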

------
maerF0x0
I'm also curious if anyone has tried JSON types in Amazon PG compatible Aurora
or RDS?

~~~
e1g
We have been using JSON on RDS extensively since 9.5. Today, probably 80%+ of
PG's JSON capabilities are used by the app in some way, for some pretty complex
use-cases (fintech). We have never had any issues with this on RDS; everything
just works (Multi-AZ, read replicas, performance profiling, etc.) and we love
it. Haven't tested Aurora yet.

~~~
maerF0x0
OK, so you've been using Postgres RDS? I'm mostly interested in Aurora.

It seems CockroachDB, Aurora, and Cosmos are targeting similar use cases.

~~~
e1g
We use Aurora for MySQL apps and love it. The PG-compatible variety is not
available in our regions, so we can't put it through real testing by ghosting
requests yet.

IMO RDS is perfect for most use-cases. I'd strongly encourage you to try it
out; it's highly performant and very reliable (uhhh... well...). Aurora has a
significant edge under very heavy loads (i.e. >300 concurrent clients), but
can be slower under typical scenarios (0-100 concurrency).

------
vinum_sabbathi
In 2017, any DB without native horizontal scaling is not really the best
selection for large-scale apps.

