
Ask HN: Have you ever chosen Postgres over Mongo and regretted it? - chatmasta
The past 2 years seem to have been the death knell for Mongo and the "NoSQL" hype, with Mongo moving toward SQL parity and SQL databases supporting JSON storage, like Postgres with JSONB.

At this point it seems that there are very few workloads for which Mongo is the best option compared to Postgres+JSONB. I'm wondering...

1) Have you ever chosen Postgres over Mongo, and _regretted it_? Why? Did you end up switching to Mongo? Was it better?

2) Can anyone name a use case where Mongo is clearly a better choice than Postgres+JSONB?
======
kevinoid
My example is a bit dated, but may be illustrative: I chose PostgreSQL over
MongoDB for a MEAN webapp (making it PEAN?) project started in 2013 and
regretted it. I was constantly struggling with the poor support for SQL in
Node at the time and spent way too much time fixing bugs or adding workarounds
for deficiencies in the SQL abstraction libraries and ORMs, or writing bespoke
SQL and object mapping. After switching to MongoDB, I spent significantly less
time on storage issues and development speed increased noticeably.

Later we did run into several issues caused by MongoDB which could have been
avoided by using SQL (mostly related to consistency, constraint enforcement,
and problems with MongoDB's aggregation capabilities). MongoDB got us to
market faster, but had higher costs in the long-term.

I think there are two lessons that I'd take away: 1) There are costs to
choosing a less popular technology combination, even if the individual
technologies might be better (for your particular needs) in isolation. 2)
Sometimes it's better to choose the more costly technology in the long-run to
avoid costs early (i.e. incurring some technical debt can be the right
choice). MongoDB vs SQL for us was an example of both of these.

Note: I assume SQL libraries in Node have matured significantly since 2013, so
I wouldn't necessarily recommend MongoDB over SQL for new projects starting
today.

~~~
slezyr
> fixing bugs or adding workarounds for deficiencies in the SQL abstraction
> libraries and ORMs

TBH I had the same experience with all the ORMs I have used: sqlpp (C++), odb
(C++), diesel (Rust).

~~~
lucio
This article could be titled "ORM considered harmful" (2006):

[http://blogs.tedneward.com/post/the-vietnam-of-computer-scie...](http://blogs.tedneward.com/post/the-vietnam-of-computer-science/)

~~~
ris
There's a lot of hatred for ORMs floating around, but most of the ire seems to
revolve around cases where the ORM isn't flexible enough to properly expose
the power of the underlying database. People often fail to note how few and
far between these cases are, and how the vast majority of an app's queries
tend to be dumb and boring and are handled extremely competently by the ORM.
Contrast this with the number of dumb typos and SQL mistakes born of verbosity
that the ORM has prevented. Don't get distracted by the 1 or 2 cases per app.

------
Cieplak
I have never regretted it. We dropped Mongo completely about a year after
JSONB support was released for Postgres. JSONB indexes perform very well, and
can be tuned with different index types (e.g. btree, gin, hash). You get the
best of both worlds, i.e. relational models with joins and aggregates, as well
as flexible and dynamic document storage. If you want a pure JSON interface to
your database, use PostgREST:

[https://postgrest.org/en/v5.0/api.html](https://postgrest.org/en/v5.0/api.html)
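
A rough sketch of what that index tuning can look like (table and field names made up):

    -- A documents table with a JSONB payload
    CREATE TABLE docs (
        id   bigserial PRIMARY KEY,
        data jsonb NOT NULL
    );

    -- GIN index: speeds up containment/existence queries (@>, ?, ?|, ?&)
    CREATE INDEX docs_data_gin ON docs USING gin (data);

    -- btree expression index: equality and range queries on one field
    CREATE INDEX docs_email_btree ON docs ((data->>'email'));

    -- hash expression index: equality-only lookups on a single field
    CREATE INDEX docs_sku_hash ON docs USING hash ((data->>'sku'));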

~~~
ganonm
My experience with JSONB is that you should think very carefully before
choosing to use it as it can become a real footgun if used inappropriately.

An example of a (possibly) good use case would be user preferences/settings -
there are often a lot of them and the relationship between them is often
hierarchical. You could use e.g. a junction table that maps users to settings,
but this can be unwieldy. A JSONB value can just be sent directly to the
client. It is possible to be reasonably certain that you won't want to do
aggregate SQL operations on the entire table, and thus just being able to do
CRUD operations on _single_ rows of the table avoids the pain of JSONB
aggregate queries. For the record, I would probably still use a SQL solution
in this case.

An example of a bad use case (which I saw in the wild) would be lumping entire
user conversations (think Facebook comments) into a single JSONB value. This
might naively seem like a time saver since you don't need a 'conversations'
table, a 'conversation_messages' table nor the JOIN to aggregate the required
data to reconstruct an entire conversation. However, as soon as you want to do
_aggregate_ operations across several/all conversations it becomes a real
hassle. Say for example you wanted to gather all messages that had been sent
in the last hour - using a traditional SQL solution, this would be trivial and
highly performant. With a JSONB solution you have to hope you have a JSONB
index on the timestamp and then you also have to work out the correct JSONB
syntax to do the query. Constructing non-trivial JSONB queries can get really
gnarly - trust me, I've been there.
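
To make that concrete, the two approaches look roughly like this (schemas invented for illustration):

    -- Relational design: trivial and index-friendly
    SELECT *
    FROM conversation_messages
    WHERE sent_at > now() - interval '1 hour';

    -- One JSONB blob per conversation: unnest the messages array,
    -- then filter on a timestamp extracted as text and cast back
    SELECT msg
    FROM conversations,
         jsonb_array_elements(data->'messages') AS msg
    WHERE (msg->>'sent_at')::timestamptz > now() - interval '1 hour';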

From my (somewhat) limited experience with JSONB I would now generally avoid
it except for the cases where either the user wants to quite literally store
JSON (and now we get some nice JSON validation for free) _or_ for the case
where I am absolutely 100% sure I will never need to do any non-trivial
aggregate operations on the JSONB and that the upfront implementation cost of
a SQL solution would be stupendously high compared to JSONB. An example that
comes to mind would be an Evernote style application where the user data is
semi-structured and we don't generally speaking want to run aggregate
operations on it (it's their private data).

~~~
emidln
We abstracted JSONB operations on top of HoneySQL, the Clojure query-AST
library, and never really looked back. It was something one person on the team
learned thoroughly for half a day and promptly forgot about. Now we say things
like:

    
    
        (jsonb/get-in :jsonb-column [:field :sub-field])
        ;; jsonb_column#>>'{field,sub-field}'

        (jsonb/get-in :jsonb-column [:field :sub-field] "default")
        ;; COALESCE(jsonb_column#>>'{field,sub-field}', 'default')

        (jsonb/get-in :jsonb-column [:field])
        ;; jsonb_column->>'field'
    

Each of these generates the appropriate data structures for HoneySQL to emit
some combination of #>>, ->> and COALESCE as needed. I can't imagine this is
too difficult in any language. Even if you are writing static SQL, I don't
think learning the syntax that goes with a feature is too high an ask.

~~~
ganonm
Nice library - I agree with you: learning the syntax shouldn't be considered a
downside of an otherwise novel feature. My concern is mainly around the
appropriateness of the feature in the context of problems that already have
well-established relational solutions.

------
tonious
It's funny to see these things come around and go around.

~1998, there were quite a few megabytes burnt on discussing Postgres vs.
MySQL. At the time, MySQL was not ACID compliant, and was (rightly) derided as
a database-shaped object, but not an actual database. It was faster, provided
you didn't need to get back the same data you put into it.

MongoDB seems to be in a similar situation. Under the right conditions, it can
be ACID compliant. But it's a database-shaped object. As a developer, you have
to actively manage aspects of your data that should be handled by the database
itself.

Every MongoDB project I've touched has run into a schema versioning issue.
Sure, you can put freeform data into Mongo. But when you're querying your
database, will a record made yesterday have the same shape as one made six
months ago? How do you ensure that? Either you write model-level code to
normalize your data at run time, or you write migrations.

Anyhow, to the matter at hand:

With a well-behaved ORM you can store certain properties in dedicated columns
and shove everything else into a jsonb blob. This gets you the best of both
worlds: indexed, relational entities, and effectively schemaless data.

As your needs evolve, you can migrate properties out of the jsonb column and
up to the table schema. This can be entirely transparent to your
repo/service/controller level code. Now you can apply SQL level constraints to
required fields.

This has worked for me on a few different projects.
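
A minimal sketch of that path, with invented names (and assuming a users table to reference):

    -- Start: a couple of real columns plus a schemaless JSONB blob
    CREATE TABLE events (
        id         bigserial PRIMARY KEY,
        created_at timestamptz NOT NULL DEFAULT now(),
        properties jsonb NOT NULL DEFAULT '{}'
    );

    -- Later: promote a property that now needs constraints to a column
    ALTER TABLE events ADD COLUMN user_id bigint;
    UPDATE events SET user_id = (properties->>'user_id')::bigint;
    ALTER TABLE events
        ALTER COLUMN user_id SET NOT NULL,
        ADD FOREIGN KEY (user_id) REFERENCES users (id);

    -- Remove the promoted key from the blob
    UPDATE events SET properties = properties - 'user_id';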

1) I've never regretted going with Postgres over MongoDB.

2) I can't think of a real application where I'd prefer Mongo. And in the
cases that get close, I'd probably jump to a stream processing system like
Kafka and avoid database semantics entirely.

A fully schemaless database is great, providing that all of your database
applications are schemaless, too. Otherwise, you now have to manage that by
hand.

I don't begrudge Mongo's existence. It's a worthwhile experiment to move
features in and out of the database semantic model. We go through cycles of
decomposing and recomposing functionality. This is how we recombine tools to
become other tools. But I do not see Mongo as the future of databases.

~~~
jacquesm
> ~1998, there were quite a few megabytes burnt on discussing Postgres vs.
> MySQL. At the time, MySQL was not ACID compliant, and was (rightly) derided
> as a database-shaped object, but not an actual database. It was faster,
> provided you didn't need to get back the same data you put into it.

MySQL has come a long way since then though.

~~~
tonious
Oh, absolutely it has! Even if its query planner is... suboptimal[1]. It's
even held up as an exemplar of stability[2] :D

[1]: Yes, I'm nitpicking.

[2]:
[https://www.youtube.com/watch?v=b2F-DItXtZs](https://www.youtube.com/watch?v=b2F-DItXtZs)

------
tnolet
Practical example from the last ~8 months. I started building my somewhat
non-trivial SaaS [0] with MongoDB + NodeJS. Getting started was quick.

Soon, however, things started playing up:

- transactions
- aggregations (avg, percentiles), rollups, etc.
- simply having a relational model to express the business domain

I switched to Postgres and I've been more than happy with it. Heroku hosting
is top notch. Libraries are pretty mature. I don't even use the JSONB and
array datatypes heavily. Turns out an "exact" data model is quite useful on
anything non-trivial. As always, YMMV

[0] [https://checklyhq.com](https://checklyhq.com)

~~~
nautilus12
Think they asked about the opposite case.

~~~
rhencke
I think OP's comment is a pretty clear answer to question 1: "Have you ever
chosen Postgres over Mongo, and regretted it?"

OP chose Postgres over Mongo, and did not regret it.

------
WA
I’d like to repeat my comment from another recent thread:

I mostly used MySQL. But for my app, I now use CouchDB. It has its
limitations, but the automatic sync between devices is pure gold. I haven't
written a single line of sync code and it just works. It baffles me a bit that
CouchDB isn't more popular and that people would rather use MongoDB for NoSQL,
which doesn't come with the same syncing capabilities.

Adding to this: if you want sync out of the box and like NoSQL (and your app
isn’t super complicated), CouchDB is clearly better than Postgres (and Mongo
of course).

~~~
brightball
Sync is the primary use case for CouchDB (and PouchDB). It really deserves
more attention for that IMO.

It's not ideal for a lot of other use cases, but where it fits it shines.

~~~
WA
I just wonder if there is a single reason why anyone would use MongoDB over
CouchDB/PouchDB.

~~~
brightball
If you remember when Mongo first came out, it made a splash with a lot of
write-focused benchmarks. Those were achieved by buffering the writes in RAM
and then persisting them to disk periodically.

Couch never led in benchmarks and always focused on consistency, of both
performance and data. That's not as flashy with the "OMG Benchmarks!" crowd,
in my experience at least.

Couch operates entirely over REST, last time I checked, which makes it a bit
clunky for a lot of use cases. Couchbase attempts to be the fast version of
Couch and seems to do pretty well.

------
tanin
This is the opposite of what OP asked, but I hope it provides some useful
insight.

GIVE.asia recently migrated from MongoDB to PostgreSQL. We didn't really use
any special feature of MongoDB, so it was basically a poor decision.

One of the main problems we had with MongoDB was that there was no schema.
When we changed the schema (e.g. added or deleted a column), we needed to
write a migration script to update all the data inside that schema. It was
burdensome, so sometimes we didn't do it. This led to a situation where
different rows might have different columns.

PostgreSQL (or MySQL and others), on the other hand, forces us to handle a
schema change immediately, because the change (e.g. adding or removing a
column) applies to all rows at once. That leads to cleaner data.
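
For example (table and column invented here), one statement updates every existing row at once, and the constraint keeps future rows uniform:

    ALTER TABLE donations
        ADD COLUMN currency text NOT NULL DEFAULT 'USD';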

Two situations where you might still need MongoDB (or another NoSQL database):
1. the data is so extremely large that it's impossible to modify all of it
when changing the schema, or 2. you actually don't care about the schema that
much (e.g. logs)

~~~
kthielen
> 2\. you actually don't care about the schema that much (e.g. logs)

I would argue that this is one of the most overlooked cases _for_ schemas,
because it's where schemas are the most complex and also where traditionally
we've had few options for really representing that complexity.

This is something I've tried to change with the hobbes project, where we have
a small header-only lib to efficiently record "log data" with inferred schemas
into shared memory:

[https://github.com/Morgan-Stanley/hobbes#storage](https://github.com/Morgan-Stanley/hobbes#storage)

Basically, in this view, the type for recorded log data is a variant over all
possible log statements in an application. This is a very large and complex
type, but in memory or on disk it's a simple representation -- a number for
the constructor ID of the variant, and then the payload type that corresponds
to that constructor.

Then there's quite a lot that we build on top of that, but as a basic
primitive I think that logging is actually a great case for "extreme schemas"
rather than throwing out schemas entirely.

------
lmm
When you need to dynamically scale the "cluster membership" of a distributed
datastore. When you need more data than a single instance can handle.

I've worked on products that used Mongo and could not have worked with
Postgres. Mongo has plenty of issues, but Postgres barely has support for true
master-master replication even with a static set of nodes, let alone dynamic
scaling.

I tend towards Cassandra rather than Mongo, but if we're talking about
advantages for NoSQL more generally I'd say: when you need to do reporting
queries without them interfering with live queries. When you want index
updates to happen behind "live" changes for performance reasons. When you want
a query language that makes sense as a programmer and lets you understand
query performance from reading the query. When you want a nonblocking client
library in the JVM ecosystem. (Hell, when you want your client library and the
ecosystem on top of it to have first-class support for your database's
features - Postgresql may have first-class support for JSON but that doesn't
mean your ORM does). All these things are possible with SQL, but harder.

~~~
chaostheory
CitusDB?

[https://www.citusdata.com/](https://www.citusdata.com/)

There are a few other solutions for postgres as well.

~~~
lmm
There are various efforts, but none of them is mature/established enough that
I'd trust it (and I've heard of specific issues with some of them, though I
think Citus is newer than when I last looked). Maybe this one works, but
without an established community around it I'd rather stick with Cassandra.

~~~
chaostheory
Citus is almost a decade old now. Others like Greenplum are even older. I like
Cassandra, but as with most NoSQL, if you want to use it as a general-purpose
datastore, there's a lot of work for devs to do.

~~~
lmm
> if you want to use it as a general-purpose datastore, there's a lot of work
> for devs to do

I'd say that's equally true for postgres. Adapting your data to a relational
model does not come naturally, and to use postgresql as a document store
requires learning a whole lot of novel postgresql-only functionality that the
community will always treat as second-class.

------
squirrelicus
Mongo wins when two things are true: 1) Your use cases are get doc, put doc,
delete doc, with no aggregations - OR you have a very healthy sharding key. 2)
You need to go to the 1 terabyte - 100 terabyte scale at a very small cost in
humans and machines.

For those of us who have dealt with (or dealt with the human/machine cost of)
a TB+ Postgres or MySQL database, Mongo can be a blessing if your data can be
organized in a proper NoSQL fashion.

It's true, you can scale RDBMSs into the terabytes range at modest cost with
excellent performance. BUT the singular guiding principle to achieve this is
integers, integers, integers. In general, RDBMSs start shedding performance
the more non-integer columns you have in your relevant tables.

If you have less than 100GB of data, it will never matter what database you
choose -- it's all about table structure and queries. MySQL, Postgres, Mongo,
etc. Unless you choose sqlite of course :P

~~~
Jedd
> In general, RDBMSs start shedding performance the more non-integer columns
> you have in your relevant tables.

The more non-integer indexes, I'd suggest.

TimescaleDB under Postgres goes a long way toward improving the scaling of
certain large databases (of NoSQL-type content).

~~~
squirrelicus
I think we're both right for sure, but even just having varchars in your
critical join tables is a big perf killer, given there'll be fewer rows read
per page fault (unless you're just index scanning/joining, of course).

------
013a
Generally speaking, I don't like MongoDB. Though it has little to do with the
database itself: it's more a combination of the limited use cases of NoSQL and
the lack of managed hosting options directly on AWS or GCP (Azure can kind of
do Mongo via CosmosDB, but I don't use Azure and don't plan to at this time).

If I've just got a small project that needs to throw data somewhere, I'll
probably reach for Redis, DynamoDB, or (if it's simple enough) just
serializing JSON files to S3 (an underutilized and very cheap way of storing
structured data if all you need is key-value lookup with no latency or
consistency guarantees). It's likely my project is already on App Engine or
Lambda, so I don't want to have to go to a second login portal just to manage
the database.

I don't hate NoSQL. I prefer SQL, but I also respect the fact that if you
combine something like Mongo with in-app schemas, and are careful about
keeping those up to date, writing proper migrations, etc., you can get a very
production-ready and highly scalable DB out of Mongo. It's not as bad as
people make it out to be.

~~~
mcfedr
For hosting options, I've recently discovered MongoDB Atlas and have basically
been very happy with it. It's dead simple and solid - just like using RDS or
Cloud SQL.

------
reustle
I've used both Mongo and Postgres in large production environments. For small
projects that I plan on keeping small (like personal tools), I'm often happier
when I leave it as Mongo, vs. dealing with schemas and migrations in the
future.

~~~
colonelpopcorn
I do like the "just throw whatever you want in this document" approach that
Mongo has.

~~~
StavrosK
I've regretted that every time I've done it, with the usual "why is this
document not like the others" failure case. Nowadays I use schemaless
databases/fields extremely sparingly.

Relevant talk:
[https://www.youtube.com/watch?v=BN8Ne2JCGBs](https://www.youtube.com/watch?v=BN8Ne2JCGBs)

~~~
tyfon
We are using Azure DocumentDB for a project at work and it already has 3-4
structural versions of the documents in the db that have to be accounted for
by the applications and reporting tools, the latter of which I am working on.
It is somewhat similar to Mongo.

It's starting to become really painful and I'd be extremely happy if they
moved it to a relational database.

~~~
StavrosK
That's been my experience too. Very rarely is your data model non-relational
and schemaless, and if it isn't, you end up having to manage all the relations
and cascades and everything yourself, which ends up being a huge pain.

~~~
e12e
> Very rarely is your data model non-relational and schemaless

This is of course completely false - which is why we use filesystems and files
more than we use databases ;-)

But what I think is less clear and obvious is: when is a document database
better than files and the file system? This is kind of the question: are
maildirs/mbox any good at storing mail - and are they any good at searching
mail? And is SQL better at either?

This isn't clear cut - there are things like dbmail[d] that just use Postgres
and layer IMAP on top/in front - while a lot of servers prefer some form of
file storage augmented with custom indexing.

[d] [http://dbmail.org/](http://dbmail.org/)

~~~
StavrosK
> This is of course completely false - which is why we use filesystems and
> files more than we use databases

Filesystems and files are good at a lot of things, but not really great at
anything. They're a general-purpose abstraction, but the relational model is
much more specific.

~~~
e12e
Of course. I agree with the points made here and in sibling threads about
schemas and SQL databases. But I also think it's hard to dispute the fact that
most data lives in files and file systems.

And I think developers too often compare various no(t only)sql solutions with
RDBMSs - without really considering "file-based SQL" (SQLite) - or simple
file-system/DAV-based storage (WebDAV, S3 and work-alikes, local files, NFS,
etc.).

And again - I think it can be easy to see some examples where files are great
(storing the images for an image gallery), or where the relational model is
great (transaction data).

But I think it can be much harder to see where file storage - or file storage
augmented with some form of index - is beaten by a NoSQL DBMS.

------
rjkennedy98
Maybe not Mongo itself, but schema-less NoSQL databases are very useful for
Data Hubs that integrate many different data sources. If you work, as I do, on
a team that has 100+ members, trying to integrate 25 billion documents across
30+ data sources and 60+ APIs, then a schema-less database where you can more
flexibly handle changes to the document model (which happen every day) is
ideal. Trying to build that in MySQL would be a disaster.

~~~
arcticfox
Right, but you could just build that in Postgres with JSONB

~~~
ruiquelhas
MySQL also has JSON support
([https://dev.mysql.com/doc/refman/8.0/en/json.html](https://dev.mysql.com/doc/refman/8.0/en/json.html))
and even a Document Store plugin
([https://dev.mysql.com/doc/refman/8.0/en/document-store.html](https://dev.mysql.com/doc/refman/8.0/en/document-store.html)).

------
stevendgarcia
I think it depends on what platform you choose. For my Ruby/Rails work,
Postgres support has been nothing short of phenomenal. The addition of JSONB
and hstore added the dynamic schema features of MongoDB that I'd used in
previous projects. Nowadays I miss none of those features, plus I get free
fulltext search, Common Table Expressions and tons of other vetted goodies
that come with Postgres.

However, working with NodeJS was a different story, as there seems to be a lot
more community around projects like Mongoose as opposed to Sequelize. I found
that bootstrapping a project from scratch was a lot easier and the pace of
development was relatively hiccup free. The only downside for me was memory
consumption. Mongo stores all data in memory for fast reads and this quickly
proves unsustainable if you aren't prepared for it. By comparison, postgres
has some sort of crazy developer voodoo behind it because it has way more
features, yet consumes far less RAM and is more resilient to failure in my
experience.

NOTE: I haven't built a Twitter or anything "webscale". Most of my apps are
for small to mid-sized businesses, so take my input here with the appropriate
grain of salt.

------
33degrees
JSON storage is only half the story; the other half is ease of scaling.
Scaling Postgres is not simple, and given enough of the right kind of data, I
could see why someone might choose MongoDB. But for quickly iterating on a
schema for a new project, Postgres's JSON support is definitely the best of
both worlds.

------
jaxn
Not Mongo, but I am eyeing CouchDB, solely because of PouchDB. I am just not
finding an easy and well-supported way to build an offline-first SPA/PWA
backed by Postgres without handling all of the data syncing manually.

~~~
option_greek
Wouldn't Vue with Sequelize + PostgreSQL solve your use case?

~~~
pier25
(I'm not jaxn)

That solution does not sync.

Also, we've found that Knex with Objection is much better than Sequelize.

------
dijit
I can think of reasons.

I have a personal dislike of MongoDB, for many reasons, but I'm going to put
them aside for a moment.

MongoDB beats PostgreSQL because it's relatively easy to shard and scale. If
your application logic isn't quite capable of sharding, then MongoDB handles
that and replication for you.

It's also true that MongoDB attempts quorum over consistency, which is just
fundamentally different from how PostgreSQL works; it's possible that MongoDB
can be faster in a very large mesh topology.

What I'm hoping for in the future is that something like Elasticsearch exists
but with PostgreSQL as its engine instead of Lucene. Such a system would be a
complete death knell for MongoDB.

~~~
wenc
Yes.

I think the major advantage of NoSQL databases was easy horizontal scaling
(shard + scale + seamless failover) at a time when it was difficult to do
these things with SQL databases. However, NoSQL's original advantages have
started to fade with DBs like Citus and TimescaleDB, and even FB's MyRocks.

NoSQL databases are optimized for _straight data retrieval_ and _horizontal
scaling_. If you want to dump lots of naturally unstructured or semi- or
variable-structured data (like logs) into a distributed data store and
retrieve them quickly without doing any non-trivial manipulation on them,
Mongo will likely fit the use case. It's fast, strongly consistent, supports
secondary indices etc.

However, as soon as you need to manipulate your data (i.e. analytics use
cases), you're much better off doing a bit of data modeling up front and using
a SQL database.

------
vorpalhex
These are fundamentally different database technologies. Postgres is a
powerful SQL-based store, while Mongo was originally built around the idea of
a write never failing.

Postgres tends to keep a speed edge, but the jsonb syntax is awkward and some
operations can be trickier for a developer to navigate. Scaling uses
traditional SQL strategies.

Mongo has a pretty nifty query language, but more difficult queries start
becoming quite slow. Scaling is a bit different than with Postgres.

Like any tech choice, it's not about what is in vogue or trendy, but what best
fits your problem space.

~~~
kqr
> built around the idea of a write never failing.

What am I missing if I think this sounds like a very interesting thought
experiment, but completely disconnected from reality?

~~~
vorpalhex
The tradeoff is "consistency". As long as some node of Mongo is running, it'll
do its absolute damned best to accept a write, including buffering in memory
and all kinds of other tricks.

However, if you write to it and then immediately read.. you might not get back
what you _just_ wrote (but it will eventually show up in reads). Likewise, you
might get different responses from different nodes in some weird situations.

A few things to note: this weird behavior only shows up in a failure state
(e.g. a network partition), and I believe it's no longer the default but can
be enabled. It's very useful for lots of applications, but isn't the right
choice for all applications.

------
vandahm
In response to Question 2, here's the best I've got: At a previous job, we
used MongoDB with a write concern of 0 to store logs in such a way that we
could more easily search through them later on. The information we were
logging was useful to us as developers but wasn't critical to the business, so
we were willing to take some risks in order to achieve faster inserts.

It worked OK for us at the time, but if I had to do it over again today, I
would probably look at ELK before rigging something up with Mongo.

In the larger sense -- at least for conventional web stuff -- I don't like
Mongo. Usually, I'm building something that receives a web request, reads or
writes to a database, then generates either a web page or a JSON object as a
response. In theory, since you know what information you need to render the
response, you should be able to include all of that into the structure of your
Mongo document, then fetch everything you need with only one query. In
practice, however, business and UX requirements tend to change quickly enough
that it isn't feasible to do that. As your product evolves away from its
initial design, it gets harder to bridge the gap without joins and
transactions. At that point, at least for me, I just feel more comfortable
using PostgreSQL.

~~~
linsomniac
In support of ELK: I spent years glancing at all sorts of different options
(fluentd, Graylog, ELK, Splunk, logstash, Loggly, ...). I had set up
Elasticsearch but never got the parts working for log ingestion and grokking.
I got logs loading but no grokking. This was a "personal time" project at work
and I just didn't have the time.

Then we decided we wanted Apache logs to be centralized, so I set up an
rsyslog server and fed the logs to that. Once I had that component running, I
set up another ELK system and figured out the "grokking", ended up using
liblognorm.

A few weeks later our Elasticsearch cluster had some problem and people had
come to rely on it, so I started shoring up the deployment. Going from
experiment mode to something more production ready.

The stack I ended up with was rsyslog on the clients, shipping logs to the
rsyslog server using RELP; the rsyslog server using omelasticsearch to send
the logs to Elasticsearch, and the liblognorm module in rsyslog to process the
logs.

liblognorm is kind of a pain in the ass, but it was built into rsyslog. If I
were to do it again, I'd spend a bit more time figuring out if I could use
grok instead.

------
adreamingsoul
1) YES. But the schema design was terrible, which isn't a fault of Postgres.
The result was a SQL database that didn't scale well, was brittle, required
lots of compute resources, and wasn't always redundant. The original codebase
existed before me, and my team didn't have approval from management to fix the
core issues, so we chose Postgres as the replacement for MySQL.

Since then, we've built new versions of those products, re-architected, and
invested 100% in the cloud. Now that we are serverless, cellular, and highly
available, we use a NoSQL database that "scales".

The schema concepts from the relational era to the non-relational era haven't
changed that much, but we did learn from others' mistakes and changed the
types of the worst offenders to JSON. (Which Postgres now supports, yay!)

Now I live in a world where, instead of querying for the data from a
relational database, I have to load more data than I need into "serverless"
memory, join the relations, and return the output. That requires at least one
database request per relational object, more lines of code, increased
complexity, and expensive monthly service costs.

2) We all know that NoSQL is great for non-relational data, but most of my
projects have relational data objects. Example: a system has users; a user has
notes; a note has attributes.

99% of the time, I'm willing to bet a relational database is needed, and I
have yet to find a project that benefited from NoSQL.

Ironically, I've been thinking about the SQL to NoSQL trend for the last
couple of weeks and asked myself similar questions. I came to the conclusion
that a relational database is not only more natural and more efficient but
cheaper in the long run.

Just my thoughts.

------
scarface74
For me, whether to use Mongo or an RDBMS depends on what I see the need for
the data being.

If I see the database as only backing storage for a domain-specific
microservice, where all access should go through the service, I will probably
use MongoDB. C#/LINQ/MongoDB is a powerful, underused combination. You get
your data and model consistency via C#'s strong typing, and it feels more like
you are working with everything in memory even though your LINQ queries are
being translated to MongoQuery.

For instance, in C# with Mongo you're not working with an amorphous blob of
weakly typed data. You're working with a Collection<Customer> and an
IMongoQueryable<Customer> where you can write standard LINQ queries....

var seniors = from c in Customers where c.Age > 65 select c;

For storing data you can't put an Orders object in a Collection you defined to
hold Customers.

You don't get the whole object/relational mismatch with Mongo, and as good as
Entity Framework is, it still brings along all of the issues that are
unavoidable with an ORM.

On the other hand, my experience with large Mongo clusters is write once/read
many; once the documents got stored, they were never updated, and it was very
much a document store. I would still use an RDBMS most of the time. But why
choose? In the age of domain-specific services and polyglot persistence, do
both.

~~~
jordanab
In the case of C#/.NET, there are more alternatives for document DBs with
native LINQ-based drivers. For example, I've used RavenDB in a couple of web
app projects, and it has the extra benefit of being fully transactional and
having Lucene indexes built in (effectively giving you a built-in,
auto-syncing Elasticsearch/Solr server). For another upcoming (cloud-based)
project we're planning to use Azure Cosmos DB.

------
manishsharan
Not MongoDB, but Redis. I added Redis to my stack and I am absolutely loving
it for what it does. I have SQL Server for stuff that belongs in an RDBMS,
e.g. purchase data, accounts, etc. But the operations and analytics related
stuff for my application goes into Redis. And Redis outperforms SQL Server for
those purposes.

If you can represent your data in terms of data structures as opposed to
Table/records, you may want to take a look at Redis.

~~~
cagmz
> the operations and analytics related stuff

Can you elaborate on this? Is it ephemeral data that requires some operation
done with it, and then it can be discarded?

------
harrisreynolds
We are building a tool [1] that pulls in data from several different external
services as part of our data visualization platform. For this use case
MongoDB, with its schema-less design, is perfect.

We are however using Postgres for our internal app database where it is easy
to manage our schema a priori.

[1] [https://chart.ly](https://chart.ly)

------
nasalgoat
The only regret I have going with PostgreSQL over MongoDB is that the upgrade
path is not easy. I can't just upgrade in place or easily convert the older
pgsql files to the new version; I need to spin up a new instance and
_transfer_ the data via dump.

That's why I'm still running 9.5 in production, because the associated
downtime is too much.

------
crtlaltdel
most of the issues i have encountered and that i've heard peers complain of
stack up to human-error and the inability to adequately plan for
change...doesn't seem to matter what database or framework we're talking
about...

------
taf2
MongoDB is a fine database IMO. The main issue I have with it is that they
dropped the SQL interface and instead invented a less expressive query syntax.
Really, there are some great features in MongoDB and it's relatively easy to
maintain. I wish it had better tooling for modifying existing document
structures, such as ALTER TABLE... but the automatic sharding is nice.

------
paulie_a
I would never choose MongoDB after using it. Postgres has never failed me in a
decade, whereas Mongo has been a huge pain in the ass over the last year.
Honestly, the concept of schemaless is stupid. You just eventually have to
reimplement SQL on top of the documents. It's an obnoxious waste of time.

------
pjmlp
No, because I only use RDBMSs; I've never had a business reason to jump on the
NoSQL wave.

------
madhadron
I've been at two companies that used MongoDB for high-velocity, high-volume
data (packet capture and analysis). It was a bad decision that we tried to
change after the fact, both times.

------
qaq
Nope, but I would imagine that in a consulting context one can miss out on the
whole "and now let's switch to PostgreSQL to address all the Mongo issues"
revenue stream.

------
DanHulton
Postgres cannot (without some wild and wooly SQL) do an increment upsert on a
JSONB field. I.e., if I want to add 1 to a JSONB attribute that may not exist,
I cannot do this simply.

I'd, of course, love to be wrong about this, but when I looked into it about
six months ago, the best solution I could find was a hack with untested
performance characteristics. In Mongo, meanwhile, it's a very simple, standard
operation.
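
For reference, the hack in question was roughly the following (hypothetical counters table; performance characteristics untested, as noted):

    -- Increment data->'clicks', treating a missing key as 0.
    -- jsonb_set's create_missing flag (default true) adds the key if absent.
    UPDATE counters
    SET data = jsonb_set(
        data,
        '{clicks}',
        to_jsonb(COALESCE((data->>'clicks')::int, 0) + 1)
    )
    WHERE id = 42;

    -- The Mongo equivalent is a one-liner:
    -- db.counters.updateOne({_id: 42}, {$inc: {clicks: 1}})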

~~~
chatmasta
If you are incrementing a field, then by definition you know it exists and is
an integer. In that case, what advantage are you gaining by storing it in
JSONB? Just factor the integer field out of the JSONB column and into the
table schema.

~~~
rpedela
I deal with messy data often, and there are definitely times when a JSON row
should contain a field but doesn't. If I also needed to increment the field as
part of cleaning the data, it would be nice to upsert in the cases where the
field is missing. Extracting a field into its own column may make sense
sometimes, but usually I just need the data to be correct and stored, rather
than queried or updated frequently, so adding another column is a waste of
time and space.

~~~
brightball
It seems like you're talking about a couple of different cases here.

If you're going to be incrementing a field infrequently, then just resaving
the JSONB shouldn't be a big deal. If you're updating a field frequently,
having it in its own column makes more sense and should absolutely be worth
the time. It's tremendously more efficient to increment a typed number than it
is to increment a piece of a serialized chunk of JSON.

When you create a column in Postgres with a null default, it doesn't allocate
space for it. It's one of the things that makes Postgres so good for huge
databases: adding an empty column is an instantaneous operation rather than a
locking one. On the flip side, when you create a field in JSON, you duplicate
the space used just to name the field in every single row.

But the short version is: if you're updating something often enough that
you're willing to fight with upserting into a JSON column rather than just
creating that same column, it might be worth it to take a step back and
re-evaluate why doing it that way is important to you.

~~~
rpedela
> If you're updating a field frequently, having it in its own column makes
> more sense and should absolutely be worth the time. It's tremendously more
> efficient to increment a typed number than it is to increment a piece of a
> serialized chunk of JSON.

Yes. When did I say the opposite?

~~~
brightball
I think I assumed based on your need for increment upsert on a JSON field. My
mistake.

------
alexmorse
nope

------
HackerEthicDead
No. Mongo is garbage.

------
philip1209
No

------
themtutty
tl;dr Nope, although Mongo is fun to ride around in the backyard.

