
Why The Clock is Ticking for MongoDB - turrini
http://rhaas.blogspot.com/2014/04/why-clock-is-ticking-for-mongodb.html
======
overgard
Having used mongo in a professional context, I'm sort of amused by how much
vitriol it gets. It has its flaws, but it's not _that_ bad. I think it's been
a bit misbranded as the thing you use to "scale", which ticks people off. To
me, when I use mongo, I mostly use it because it's the most _convenient_
option. It's very easy to develop against, and awesome for prototyping and a
lot of times it's good enough that you never need to replace it after the
prototype phase.

Relational databases are great, but they're almost an optimization -- they're
way more useful after the problem set has been well defined and you have a
much better sense of data access patterns and how you should lay out your
tables and so on. But a lot of times that information isn't obvious upfront,
and mongo is great in that context.

~~~
brainlock
> Relational databases are great, but they're almost an optimization --
> they're way more useful after the problem set has been well defined and you
> have a much better sense of data access patterns and how you should lay out
> your tables and so on. But a lot of times that information isn't obvious
> upfront, and mongo is great in that context.

I think it's exactly the other way around. I prefer to lay out tables in a way
that reflects the meaning and the relationship in the data. Only later, if
there is a bottleneck somewhere, I might add a cache (i.e. de-normalize) to
better fit the specific data access patterns that were causing trouble.

~~~
threeseed
You missed the point.

Since most companies now are doing Agile development there isn't the big
upfront design process where the data model is clearly understood at the
beginning. Instead you have this situation where the schema is continually
evolving week by week, which is why schemaless systems can be
appealing. It isn't about performance.

~~~
hcarvalhoalves
IMO, this is the worst argument. There are multiple schema evolution tools for
SQL, there's nothing stopping your team from changing the schema every week -
plus, it's not hard, certainly less hard than having to maintain code that
deals with multiple document schemas at once.
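
As an illustration of the maintenance cost mentioned above, here is a minimal sketch, assuming a hypothetical user collection whose documents were written under three different shapes over time. The field names (`name`, `first_name`, `last_name`, `full_name`) and the `display_name` helper are invented for this example and do not come from any real codebase.

```python
def display_name(doc: dict) -> str:
    """Return a display name for a user document of any historical shape."""
    # Newest shape (hypothetical): a single "full_name" field.
    if "full_name" in doc:
        return doc["full_name"]
    # Middle shape (hypothetical): separate first and last name fields.
    if "first_name" in doc or "last_name" in doc:
        parts = [doc.get("first_name", ""), doc.get("last_name", "")]
        return " ".join(p for p in parts if p)
    # Oldest shape (hypothetical): a bare "name" field.
    if "name" in doc:
        return doc["name"]
    raise ValueError(f"unrecognized user document shape: {sorted(doc)}")


# Documents written at different times coexist in the same collection.
users = [
    {"name": "Ada Lovelace"},
    {"first_name": "Alan", "last_name": "Turing"},
    {"full_name": "Grace Hopper"},
]
names = [display_name(u) for u in users]
```

Every reader of the collection needs a branch like this for each shape that was never migrated, which is the cost being weighed against running a schema migration once.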

~~~
threeseed
The point is that if you are using an ORM and have domain classes then it is an
unnecessary and annoying step. You have to make changes in two places rather
than just one. Most people using MongoDB I know are seasoned enterprise Java
developers and we have used many schema management tools for the last decade.
It is a giant relief to be finally rid of them.

~~~
hcarvalhoalves
IMO, it sounds like the wrong remedy for the right diagnostic. I would never
throw away the database because I'm duplicating information between ORM and
domain classes. This seems more related to the particular constraints imposed
by your codebase/architecture than the database.

Right now I'm writing a REST API that will be consumed by web and mobile apps.
It would be impractical to duplicate validation across all codebases. Rather, I'm
leveraging the database checks to get form validation on all clients for free.
The application layer is thin, adding a field amounts to adding one line and
running a command.

I believe it boils down to which component rules the system: application or
data layer.
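
A minimal sketch of the "database checks as shared validation" idea, using SQLite from Python's standard library as a stand-in for whatever database the API actually uses. The `signup` table, its columns, the specific `CHECK` rules, and the `try_signup` helper are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE signup (
        email TEXT NOT NULL CHECK (email LIKE '%_@_%'),
        age   INTEGER NOT NULL CHECK (age >= 13)
    )
    """
)


def try_signup(email, age):
    """Insert a row; return None on success or the database's error text."""
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.execute(
                "INSERT INTO signup (email, age) VALUES (?, ?)", (email, age)
            )
        return None
    except sqlite3.IntegrityError as exc:
        # The constraint lives in the schema, so every client gets the same rule.
        return str(exc)


ok = try_signup("ada@example.com", 36)
rejected = try_signup("not-an-email", 36)
too_young = try_signup("kid@example.com", 9)
```

Because the rules are declared once in the schema, a thin API layer can relay the database's rejection to web and mobile clients instead of re-implementing the same checks in each codebase.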

------
pilif
I really don't see how a fixed schema is seen as such a bad thing by many
NoSQL advocates. In most databases, altering a schema is an operation that's
over quickly and in many databases it's easily reversible by just rolling back
a transaction.
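
A small sketch of a schema change that is undone by rolling back a transaction. It uses SQLite through Python's `sqlite3` module purely as a convenient stand-in; whether `ALTER TABLE` can be rolled back varies between database systems, so treat this as an illustration of the idea rather than a statement about any particular product. The `customer` table and the `column_names` helper are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / ROLLBACK statements below are issued explicitly by this script.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")


def column_names():
    """List the current column names of the customer table."""
    return [row[1] for row in conn.execute("PRAGMA table_info(customer)")]


conn.execute("BEGIN")
conn.execute("ALTER TABLE customer ADD COLUMN nickname TEXT")
during = column_names()  # the new column is visible inside the transaction
conn.execute("ROLLBACK")
after = column_names()   # the schema is back to its original shape
```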

The advantages of a fixed schema are similar to the advantages of a static
type system in programming: You get a lot of error checking for free, so you
can be sure that whenever you read the data, it'll be in a valid and
immediately useful format.

Just because _your_ database is bad at altering the schema (for example by
taking an exclusive lock on a table) or validating input data (like turning
invalid dates into '0000-00-00' and then later returning that) doesn't mean
that this is something you need to abolish the schema to solve.

Just pick a better database.

~~~
overgard
Well, I think the static typing analogy pretty much cuts to the heart of it. I
wouldn't be shocked if static typing advocates tend to like relational
databases, and dynamic typing advocates tend to favor nosql. (I can't prove
this -- but I wouldn't be surprised).

~~~
threeseed
Nope. MongoDB is very popular in enterprises which are largely dominated by
Java i.e. strongly typed. MongoDB actually suits strongly typed languages
since you enforce your domain in code rather than in the database.

And since Morphia
([https://github.com/mongodb/morphia](https://github.com/mongodb/morphia)) is
similar to JPA it is trivial to port your existing code to MongoDB. Which then
leaves you with the same experience as an SQL database except with a lot of
benefits (single binary, easy to query/administer/scale, geo spatial features
etc).

~~~
hibikir
I'd not count Java as a sign that people at a place like strongly typed
systems: Java is a default. The typing system has so few features, it pales in
comparison with the alternatives. Those that are in a JVM and really care
about a type system might be using Scala instead.

------
gdulli
"This is not to deny that MongoDB offers some compelling advantages. Many
users have found that they can get up and running on MongoDB very quickly, an
area where PostgreSQL and other relational databases have traditionally
struggled."

A few things about this I've never understood.

1\. Someone's going to make a technology decision with priority given to a
one-time cost over the strengths and weaknesses of running competing products
in perpetuity? The ability to avoid having to learn something is a pro in this
decision?

2\. If you don't have the skill set to install or maintain MySQL or Postgres,
you should not be in charge of installing or maintaining production systems,
period. You will hit your ceiling the first time you have to do anything non-
trivial with a system that happens to have been easier to "get started with."

~~~
astalwick
The critical point in the life of a startup is very often that short period at
the start, when the team is small and stretched in a hundred directions,
trying to 'execute' on everything at once. It's really important to choose
your battles.

Where I work (wavo.me), we went with MongoDB, and though imperfect, it was the
right DB for us. It's quick to get going, and the schemaless document store
makes it easy to change directions and iterate very quickly. That was (and
still is) super important to us. It's really not about skillset, it's about
cycles.

~~~
gdulli
My point is that MySQL/Postgres aren't difficult and this is a pretty low-
hanging fruit example of a decision that doesn't need to be made and shouldn't
be made with short-term thinking. Maybe it's necessary at times, but you can
go too far with that excuse. Code is much, much easier to change/fix than a
data store.

~~~
threeseed
MySQL/Postgres aren't difficult but they are more complex than MongoDB.

And in Agile you are taught to only focus on the short term and then move only
when you need to. In which case picking the simpler database has merits.

~~~
nasalgoat
It's a case where your short-term choice leads to long-term pain if you
actually _succeed_ in your start up.

Should you really be making short-term decisions like that if you're not
planning to fail?

------
m_mueller
I can't really speak about Mongo, but since the post seems to be talking about
relational vs. document based DBs in general, here's my perspective coming
from CouchDB:

\- Schemaless DBs make sense, when handled correctly within the application
framework, for what I'd call information systems with regularly changing
requirements. I'm currently building a rapid development platform for these
kinds of systems, where users can define an arbitrary relational data model
from a Web-UI and get the application with forms and views all pre-made. The
user design is changeable at any point without breaking anything and even the
underlying static data structures can be changed without any need for an
update or data migration process - it's all handled when opening or saving a
document with a previous version.

\- CouchDB's map/reduce view definitions are interesting when designing a DB
system, since they IMO restrict the developer in exactly the right way: One is
forced to write the right kind of index views instead of being able to just
join willy nilly. Making something slow usually means writing complex views
and chaining many lookups together - one has to work for it and, conversely,
being lazy and reducing everything to the simplest solution usually results in
a fast solution as well. The result usually scales up to a high number of
documents in terms of performance.

\- Being able to replicate without any precondition, including to mobile
devices with TouchDB, is a big plus - and in fact a requirement in our case.
Offline access is still important, especially in countries where people spend
a lot of time in trains or for systems that manager types want to be accessing
in flight.
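
To make the map/reduce view idea from the second point concrete, here is an in-memory analogy in Python. It is only a sketch: in CouchDB itself, view functions are typically written in JavaScript and the results are stored as an index. The document fields (`type`, `customer`, `total`) and the function names are assumptions for illustration.

```python
from collections import defaultdict


def map_doc(doc):
    """Map step: emit (key, value) pairs for one document."""
    if doc.get("type") == "order":
        yield doc["customer"], doc["total"]


def reduce_values(values):
    """Reduce step: combine all values emitted under one key."""
    return sum(values)


def build_view(docs):
    """Group emitted pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {key: reduce_values(vals) for key, vals in sorted(grouped.items())}


docs = [
    {"type": "order", "customer": "alice", "total": 30},
    {"type": "order", "customer": "bob", "total": 12},
    {"type": "customer", "name": "alice"},
    {"type": "order", "customer": "alice", "total": 5},
]
view = build_view(docs)
```

The map function sees one document at a time and cannot join against others, which is the kind of restriction described above: the index has to be designed for the question being asked.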

~~~
SideburnsOfDoom
> Schemaless DBs make sense ... for some cases

I think his real point is that on a SQL database like PostgreSQL you can add
schemaless parts to your data if you think that it's right for your case, and
with other DBMSs "capabilities in this area are improving rapidly".

But starting with a "Schemaless db" it's much harder to go in the other
direction. "The advantages of relational databases are not so easily
emulated."
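
A rough sketch of "schemaless parts inside a relational table". PostgreSQL has dedicated types for this (hstore, json and, from 9.4, jsonb); the example below instead uses SQLite with a plain `TEXT` column and Python's `json` module, so it only demonstrates the layout, not PostgreSQL's querying or indexing. The `product` table and the helper functions are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,              -- fixed, schema-checked column
        attrs TEXT NOT NULL DEFAULT '{}'  -- free-form JSON document
    )
    """
)


def add_product(name, **attrs):
    """Store the fixed column as-is and everything else as JSON."""
    with conn:
        conn.execute(
            "INSERT INTO product (name, attrs) VALUES (?, ?)",
            (name, json.dumps(attrs)),
        )


def load_products():
    """Return (name, attrs) pairs with the JSON column decoded."""
    rows = conn.execute("SELECT name, attrs FROM product ORDER BY id")
    return [(name, json.loads(attrs)) for name, attrs in rows]


add_product("kettle", watts=1500, color="red")
add_product("novel", pages=320)
products = load_products()
```

The fixed columns keep their constraints while each row carries whatever extra attributes it needs, which is the "add schemaless parts where it's right for your case" direction described above.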

~~~
m_mueller
Yes, that point is true - it's easier to go from the stricter environment to
the more open one - I guess that's basically just how entropy works. Note that
I don't necessarily disagree with the article, I just wanted to offer some
perspective on why I think the schemaless approach can often make sense,
especially in how CouchDB handles it. If at some future point, the worlds of
RDBMS and NoSQL fold into one DBMS handling all the use cases perfectly, I'm
all for it - it's just that right now I'm still skeptical about going all
schemaless on an RDBMS, since it's still a rather new feature, and the
underlying system is very complex and has grown for decades.

------
dkhenry
This is clearly a flamebait title. The article doesn't say anything about why
MongoDB is running out of time other than "PostgreSQL is making real progress
as a document store". I think unwittingly the author has identified why
MongoDB is not running out of time. It still has a huge lead on RDBMS in a
very common and useful workload. If they can continue to make progress while
other engines catch up with some of the document oriented features (as
Postgres has done) then they will still have compelling features to offer. If
they do nothing while other engines make progress then of course they will
fail.

If anything all this points out is that Document stores, like MongoDB, have a
real market where they excel and other engines are playing catchup.

~~~
simonw
The flame bait title was in response to a similar title on zdnet: "MongoDB
chief: Why the clock's ticking for relational databases"

The author should probably have quoted that title rather than expecting people
to follow the link.

------
craigching
As I always say when these sorts of articles come out (and I admit this is
_my_ use case for MongoDB and that doesn't necessarily match everyone else's
use case), where is the easy to setup HA and sharding for PostgreSQL? I know
it's coming, but right now it's not there.

For someone who redistributes a product that relies on end-users setting up
HA, this for me is MongoDB's killer feature, easy to configure replication and
sharding. I _love_ PostgreSQL, but this is the one big thing that keeps me
from using it right now.

------
danford
Often when I read articles like this I take them with a grain of salt. A lot
of hate for MEAN technologies seems to stem from people who don't know how to
properly use them. You're not supposed to use MongoDB like *SQL and if you try
you're gonna have a bad time.

Let's say I have a big pick-up truck I use to haul xWidgets. My friend gets a
little motorized scooter to drive around town hauling his yWidgets. Is it
proper for me to believe that my friend's scooter sucks because it doesn't haul
xWidgets like my truck? I don't know much about scooters, but my friend says
it doesn't have a steering wheel and it only has two wheels. How the heck can
it steer without a wheel? He says it has "handle bars" for steering. Seems
kind of dumb but I guess it works for him. He touts the fact that he gets
80mpg on gas but it doesn't matter how efficient it is if he can't haul
xWidgets. Scooters suck because they don't work the way my truck does.

~~~
Jweb_Guru
Okay. What does this have to do with MongoDB, though? People in this thread
have presented plenty of usecases for it, but every single one of them (pretty
much--someone correct me if I'm wrong) is also satisfied by jsonb in Postgres
9.4, which in your analogy would make it a truck that gets 85mpg.

------
justinsb
My personal "big picture" critique of MongoDB is that I see it evolving into a
SQL database with a different syntax. It is a strongly-consistent,
distributed-through-replication-and-sharding system. Additions to
'traditional' databases, like Postgres' HStore or MySQL's HandlerSocket show
that many of the MongoDB differences are not fundamental.

Much more interesting to me are systems that do something fundamentally
different. e.g. explore different parts of the CAP trade-off, like Cassandra.
Or systems that are built around the idea that data no longer lives
exclusively on a server, like CouchDB.

~~~
danpalmer
I think I disagree with you on a few points. MongoDB might be adding features
of SQL databases, and heading in that direction, but they are still missing
joins, and that's a crucial part of relational databases.

Also, I'd clarify a little by saying it's _trying to be_ a strongly
consistent, distributed through replication and sharding database. Right now
there are some bugs in the system, some improvements that need to be made, and
definitely some default configurations that need to be changed before it's
going to be a consistent database. It's definitely going for consistency
instead of availability, but I don't think it can call itself consistent yet.

~~~
justinsb
I don't think we disagree as much as you might think!

I'm trying not to critique MongoDB today, but rather thinking of what it will
become. Today's implementation problems will hopefully be fixed.

I agree that without joins, MongoDB querying is very limited. I credit them
enough to believe they will end up fixing that and supporting joins in some
way. But then, they have "SQL with a different syntax".

And I definitely agree on the data-loss bugs. But again, presuming they fix
those, they just end up with something very similar to a relational DB, in
terms of the CAP theorem etc.

In short, even when/if they fix all the bugs and add all the missing features,
I worry that we'll just end up back where we started.

------
ThePhysicist
I agree that document-oriented databases will probably not replace relational
databases in the near future. In my opinion though, the schemaless design of
MongoDB paired with its ease of use and its native support of JSON data makes
it a perfect choice for prototyping and (in some cases) a viable option for
use in production.

What you also have to consider when comparing document-oriented to relational
databases is that the former is still a very young technology: MongoDB was
founded in 2007, whereas Postgres has been around since 1986! So given
what the MongoDB team has achieved in such a short time span, I expect to see
some huge improvements in this technology over the next decade, especially
given the large amount of funding that 10gen received.

In addition, the root cause for most complaints ("it doesn't scale!", "it
loses my data!", "it's not consistent!") is that people try to apply design
patterns from relational databases to MongoDB, which often is just a horrible
idea. Document databases and relational databases are very different beasts
and need to be handled very differently: Most design patterns for relational
databases (data normalization, using joins to group data, using flat data
structures, scaling vertically instead of horizontally) are actually anti-
patterns in the non-relational world. If you take this into account I think
MongoDB can be an awesome tool.

~~~
theseoafs
To be fair this particular complaint:

> "it loses my data!"

is entirely legitimate. A database product should not be losing data unless
you specifically allow it to. Period.

~~~
laichzeit0
There are cases where occasional data loss is an acceptable trade-off for
performance. In a particular app I'm working on it's ok for me to have 99.9%
of the data and all my updates and inserts have write concern set to 0,
journaling off, etc.

~~~
theseoafs
Sure, hence:

> A database product should not be losing data unless you specifically _allow
> it to_.

------
AdrianRossouw
I've been asking a lot of people when MongoDB is actually the right tool for
the job.

[http://daemon.co.za/2014/04/when-is-mongodb-the-right-
tool](http://daemon.co.za/2014/04/when-is-mongodb-the-right-tool)

I'm starting to form this idea of what constitutes an ideal use case for mongo
in my head, and I'm trying to prove the model.

Imagine some kind of "realtime" multiplayer game, like Quake or
something:

1\. You have to have the state be shared between all the parties in a
reasonable time.

2\. The clients only need the data that is directly relating to the round they
are in, so you have the concept of cold and hot data.

3\. The data is all kind of ephemeral too, so that you don't specifically care
about who was on what bouncy pad when, but you do want to know what the kill
score/ratio is afterwards.

4\. You have a couple of entities that have some kind of lightweight
relationship to each other, which makes it just more complex than a key-value
store like redis is really suitable for.

5\. These entities are sort of a shared state, and thus get updated more often
than new unrelated documents get added, and couchdb's ref-counting and append-
only nature makes it really unsuited for constant updates of an existing
record.

Any feedback would be appreciated.

~~~
drhayes9
This seems totally reasonable to me. I was playing around with using Firebase
([https://www.firebase.com/](https://www.firebase.com/)) as the datastore for
a turn-based multiplayer game on the web and its data policies seemed very
well suited.

Essentially, each game was its own document and relevant stats were
updated/duplicated on the player documents as well.

There don't seem to be useful relationships here that I'd want to query.
_Maybe_ successive game states? But not across different games, right?

~~~
AdrianRossouw
Well, I was imagining a situation where you had multiple different pieces in a
game, relating to each player and distinct to that game. Maybe like an RTS
game or something?

So you might want to query how many pieces a certain player has available, or
do a check on who this piece belongs to.

Keep in mind I'm still not saying an actual game, but any problem that had
similar data storage and access patterns as a game along these lines would
have.

I don't know enough about firebase to confirm whether it is suited to this,
but if I had any kind of backend at all, and I had requirements for really
quick responses, I wouldn't be comfortable adding the extra hop to a hosted
service onto each call.

------
joeclark77
From reading Redmond & Wilson's book "Seven Databases in Seven Weeks", my
impression is that the CAP Theorem is a big idea (essentially, you get to
choose any two: consistency, availability, and partition-tolerance).
Relational databases have to be "consistent", i.e. transactions are atomic and
reads always get the latest value. NoSQL databases make it possible to choose
Availability instead of Consistency, so you can get faster response times even
if it means you sometimes get out-of-date data.

I would use Mongo or Riak or something like those for "write once, then read-
only" applications, for example a twitter or facebook type application, or
simple blog. In those cases when the user hits "refresh" they're never going
to know or care that they don't yet see a comment someone posted a half second
ago. They're just going to be happy that the page refreshed fast.
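
The "out-of-date data" trade-off can be pictured with a toy model: writes land on a primary and are copied to a replica later, so a read served by the replica in between returns the older value. This is a deliberately simplified sketch with made-up `Primary` and `Replica` classes, not a description of how MongoDB, Riak, or any other real system replicates.

```python
class Replica:
    """Serves reads; only knows what has been shipped to it so far."""

    def __init__(self):
        self.data = {}


class Primary:
    """Accepts writes immediately and ships them to the replica later."""

    def __init__(self, replica):
        self.data = {}
        self.replica = replica
        self.pending = []  # writes not yet applied on the replica

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        for key, value in self.pending:
            self.replica.data[key] = value
        self.pending.clear()


replica = Replica()
primary = Primary(replica)

primary.write("comments", ["first!"])
primary.replicate()
primary.write("comments", ["first!", "second"])

stale = replica.data["comments"]  # replica has not seen the second write yet
primary.replicate()
fresh = replica.data["comments"]  # after replication the replica catches up
```

For the blog-comment scenario described above, the `stale` read is the half-second-old page: fast to serve and briefly behind.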

------
remon
Given the title I expected a completely different article. The reason being
that the opinion that "MongoDB is running out of time" can be supported by a
few very strong arguments. There are a few very serious problems with the
current technology that give NoSQL alternatives a chance to catch up and grab
momentum. For example, MongoDB still has a questionable mmap based storage
engine, there are cluster state consistency issues, the cluster topology
management is outdated and so on. This article should have been titled "Why I
personally like relational databases better than MongoDB". That discussion is
incredibly subjective. Source: very early adopter of MongoDB who uses it
professionally.

~~~
joeclark77
It was a play on the title of the article that the author was responding to.

------
mathattack
I think many folks confuse normalization as a strategy with the underlying
database technologies. Oracle and other RDBMS technologies can create
normalized databases too. In the end it's a design judgment. There is a lot of
room between fully normalized and one-big-table. Even firms that logically map
things out fully normalized frequently decide that for some things that
doesn't make sense.

Taking a step back, there are still reasons to abandon Oracle. It may not
scale up, or be good for certain time series calculations, but that's another
story entirely.

------
hartator
It's funny to see this kind of post every now and then predicting the imminent
end of MongoDB... for several years now!

MongoDB is here to stay. It's opinionated, PostgreSQL isn't. It's faster out of
the box. The client drivers are pretty good. (Don't forget that SQL databases
still send raw text as requests and get raw text in return!) It fits the
bill for a lot of quick and dirty web apps and delivers early performance and
arguably scalable performance. Don't get me wrong, I still literally love
Postgres.

~~~
orthecreedence
> It's faster out of the box

...for reads. Once you get any _real_ traffic it hangs on its DB write lock.
Yes, I have experienced this, and the solution "just shard!" is especially
obnoxious because setting up sharding is a lot harder than they make it seem.
Not only is it complicated, you go from one server to three config servers,
two repsets with 3 servers each (so 9 fucking servers) just so you can make
more than 40 writes/s that most relational DBs wouldn't bat an eyelash at.

It's a poorly designed system, and this becomes painfully apparent once your
app moves out of the basement and into the real world. Luckily for MongoDB,
most of their users' apps never leave the basement and they think "wow, this
is great!!"

> arguably scalable performance

Not scalable. At all. Global locks do not scale. Complicated config setups are
hard to scale.

Check out RethinkDB. It's an open-source document DB that fixes just about all
the problems MongoDB has _and_ it has DB-side joins. It's just as easy to get
quick and dirty apps running against, and it doesn't actively flush your data
down the toilet like Mongo has been known to do.

------
yawz
Dramatic titles get the readers' attention, therefore I get the choice of
words. However, our industry is so big that there will never be a single
solution. MongoDB may never become #1 but as it's described in many comments,
it's a pretty good choice in various situations. So, as they say in Ireland
"Stall da beans der bi!" I don't hear a ticking clock.

------
loftsy
On document store indexes the article says:

> If all order data is lumped together, the user will be forced to retrieve
> the entirety of each order that contains a relevant order line - or perhaps
> even to scan the entire database and examine every order to see whether it
> contains a relevant order line.

Both of these are untrue. Author needs to read up on secondary indexes.
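
A toy illustration of how a secondary index over embedded order lines avoids scanning every order. MongoDB's documentation calls indexes on array fields multikey indexes; the sketch below is not that implementation, just an in-memory inverted index in Python. The `orders` data, the `sku` field, and `build_sku_index` are assumptions for the example.

```python
from collections import defaultdict

# Each order is one document with its order lines embedded.
orders = {
    1: {"customer": "alice", "lines": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}]},
    2: {"customer": "bob", "lines": [{"sku": "C3", "qty": 4}]},
    3: {"customer": "carol", "lines": [{"sku": "B7", "qty": 9}]},
}


def build_sku_index(docs):
    """Map each SKU to the ids of the orders containing a line for it."""
    index = defaultdict(set)
    for order_id, doc in docs.items():
        for line in doc["lines"]:
            index[line["sku"]].add(order_id)
    return index


sku_index = build_sku_index(orders)

# Lookup goes through the index; no order without a B7 line is examined.
matching_ids = sorted(sku_index.get("B7", set()))
```

The index is built once and maintained on write, so finding orders by a field inside their lines does not require examining the whole collection.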

------
jchrisa
Maybe the clock is ticking because their large production deployments are
migrating to other tech? At least we are seeing plenty of folks who realize
that a query API on top of mmap isn't really a database. :)

One high profile migration:
[http://www.couchbase.com/viber](http://www.couchbase.com/viber)

~~~
threeseed
MongoDB will be shipped as part of every RHEL installation. So pretty sure it
is actually going to be getting even more popular.

------
lucisferre
The author is conflating (or just ignoring) the very significant difference
between application databases and reporting databases. Not surprising since
most of us do this as well when we are building applications. However no
comparison of the relative value of database schema styles can responsibly
ignore this difference.

------
Yuioup
_In short, I don't expect MongoDB, or any similar product, to spell the end
of the relational database._

The author makes it sound like that was a possibility. SQL and NoSQL are two
different tools in the toolbox and should be considered as such.

------
ulisesrmzroche
Man, how long has it been since MongoDB has supposedly been dying? I swear
it's more than 3 or 4 years for sure.

------
EGreg
Actually, graph databases are better suited for many of today's social
applications.

