

Why I'm So Happy About MongoDB - ericingram
http://collaborable.com/blog/why-im-so-happy-about-mongodb

======
zzzeek
The downside is that you have to recreate full data constraints within your
application - the database offers none of it.

Of course constraints are tedious when you're in "experimentation" mode (to
quote another post I see here) and are doing rapid, early development. But
once you're in production with data that's critical/important (i.e., not
someone's list of their favorite songs; more like, their bank statements and
medical histories), constraints are the bee's knees.

Once you have data constraints in place, now migrations are hard - whether or
not you're on a SQL database. You need to either update all old documents to
match new schemas, or open up your constraints to "expect both" (where by
"both" I really mean, "any number of 18 different formats...oh make that 19")
- and that is the _potentially_ slippery slope here into a coding crapfest.
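
A minimal sketch of the "expect both" route, assuming each document carries a hypothetical `schema_version` field and the application upgrades old shapes on read. The field names and version steps are invented for illustration and do not come from any real schema:

```python
# Hypothetical document shapes, chosen only for illustration.
CURRENT_VERSION = 3


def _v1_to_v2(doc):
    # Assumed change: v1 stored one "name" string, v2 splits it in two.
    first, _, last = doc.pop("name").partition(" ")
    doc.update(first_name=first, last_name=last, schema_version=2)
    return doc


def _v2_to_v3(doc):
    # Assumed change: v3 adds an "email" field, None for old records.
    doc.setdefault("email", None)
    doc["schema_version"] = 3
    return doc


# One upgrade step per old version; every new format adds another entry.
UPGRADES = {1: _v1_to_v2, 2: _v2_to_v3}


def upgrade(doc):
    """Return a copy of doc brought up to CURRENT_VERSION."""
    doc = dict(doc)  # leave the caller's copy untouched
    version = doc.get("schema_version", 1)  # unversioned docs count as v1
    while version < CURRENT_VERSION:
        doc = UPGRADES[version](doc)
        version = doc["schema_version"]
    return doc


old = {"name": "Ada Lovelace"}
new = upgrade(old)
```

Every reader of the collection has to go through `upgrade`, and the chain grows with each format change, which is the slippery slope described above.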

Disclaimer: I'm the author of a very popular SQL tool for Python (SQLAlchemy)
as well as a new database migrations tool (Alembic).

~~~
rb2k_
It has to be noted that a lot of the _referential_ constraints are only
necessary because an RDBMS wants you to split your "object" over several
tables. What I found pleasant with the non-relational databases is that these
constraints CAN just fall away because you e.g. nest "comments" inside of an
array in a "blogpost" object. When deleting the blogpost all comments will be
deleted too and cascading deletes are just not necessary.
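
A rough sketch of that nesting, using a plain Python dict as a stand-in for a document collection (no database driver involved, and all names here are hypothetical):

```python
# In-memory stand-in for a "blogposts" collection, keyed by _id.
posts = {}


def add_post(post_id, title):
    # Comments live inside the post document, not in a separate table.
    posts[post_id] = {"_id": post_id, "title": title, "comments": []}


def add_comment(post_id, author, text):
    posts[post_id]["comments"].append({"author": author, "text": text})


def delete_post(post_id):
    # A single delete removes the post and every embedded comment,
    # so there is nothing left over for a cascading delete to clean up.
    return posts.pop(post_id, None)


add_post("p1", "Hello")
add_comment("p1", "alice", "First!")
add_comment("p1", "bob", "Nice post")
removed = delete_post("p1")
```

The trade-off is that comments can only be reached through their post; anything that needs them on their own would still need a reference of some kind.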

Some of the other constraints you'll have to implement in your software. The
advantage: you don't put application logic outside of your application. The
disadvantage: Every bit of code touching that value has to know the
limitations. I wonder if this could be solved by using a message queue and
just have dedicated step for updating/deleting data

~~~
zzzeek
> I wonder if this could be solved by using a message queue and just have
> dedicated step for updating/deleting data

See, but now you're building some big thing. Let's just include that in MongoDB
or whatever, a "constraints engine". So that you don't have to build it from
scratch each time, and can have some mature, well tested thing instead of
something ad-hoc and probably buggy. Now you need to carefully build
migrations again!

------
nosequel
OMG I'm so excited because it is schema-less! Was there something insightful
in that blog post that helped it bump to the frontpage, or is it just because
it had MongoDB in the title?

Downvote away, but is it too much to ask to upvote meaningful blog posts that
present something new?

~~~
hassy
HN is so easy to game - it takes very few votes in a short period of time to
get something on the front page. The guy probably sent the link around to his
buddies - I've been on the receiving end of a number of such requests (and
ignored them).

~~~
mofle
The main reason people upvote this blog post is because they agree with it,
not because it's any good.

------
wulczer
I'm curious, when you change the schema on the fly, the app code has to deal
with both versions, right? I'm afraid this means you _shift_ the pain instead
of _blowing it away_.

~~~
eternalban
Distributed system development is a zero-sum game. That is the dirty little
secret/feature of NoSQL. It is great if the shifted work can be partially
addressed; simply addressed; or entirely ignored (for your domain). But if you
find yourself reinventing an RDBMS, it is time to re-evaluate your choices.

(I'm a NoSQL OSS developer/contributor and enthusiast.)

------
matan_a
I've been developing using MongoDB for a few months now and the thing that
always comes back to me as missing is a way to group a set of actions so
that they perform atomically. It doesn't exactly mean I need transactions (I
don't care about the commit/rollback part); I just need a way to say: hey
server, do X, Y, and Z as a batch so that another thread won't do something in
the middle of that.

The general consensus on this is to structure your data so that it
encapsulates your business needs in one document structure (which is atomic on
changes), but I find that hard to always conform to in the real world.

So now I have to use ZooKeeper (memcached also works) to set up global locks on
those specific batch update actions. I guess it's a small price to pay right?
Right?
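
Roughly what that lock-around-a-batch pattern looks like, sketched with `threading.Lock` as an in-process stand-in for the distributed lock (a real setup would take the lock from ZooKeeper or similar; the `accounts` data and `transfer` function are hypothetical):

```python
import threading

# Stand-in for a global lock held in ZooKeeper/memcached.
batch_lock = threading.Lock()

# Stand-in for two separate documents that must change together.
accounts = {"a": 100, "b": 0}


def transfer(src, dst, amount):
    # Check and both updates run as one batch; no other thread can
    # observe or modify the accounts in the middle of it.
    with batch_lock:
        if accounts[src] < amount:
            return False
        accounts[src] -= amount
        accounts[dst] += amount
        return True


# 20 concurrent attempts to move 10 each; only 10 can succeed.
threads = [
    threading.Thread(target=transfer, args=("a", "b", 10)) for _ in range(20)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The price is that every writer touching those documents has to agree to take the same lock, since the database itself enforces nothing here.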

~~~
veesahni
I believe MongoDB's server side functions allow you to create an atomic
grouping of operations, i.e. store a function and call it with db.eval()

See the following:

<http://www.mongodb.org/display/DOCS/Server-side+Code+Execution#Server-sideCodeExecution-Storingfunctionsserverside>
<http://www.mongodb.org/display/DOCS/Server-side+Code+Execution#Server-sideCodeExecution-NotesonConcurrency>

~~~
matan_a
Thanks for the suggestion, but still not ideal. It performs a global lock on
the DB which would block _all_ operations while it computes.

------
13rules
I had read a lot about NoSQL stuff over the last few months and never really
got it. What were the advantages / disadvantages, etc.? I was happy with MySQL
and it worked for me over the years ... why change?

Then, last week I was working with a 3rd party API that returned a big JSON
response for their transactions. I wanted to store a lot of their response in
a database and it looked like a huge pain. Searching around for the best ways
to go about storing JSON in MySQL I found the following comment
(<http://stackoverflow.com/questions/3564024/storing-data-in-mysql-as-json>):

"CouchDB and MySQL are two very different beasts. JSON is the native way to
store stuff in CouchDB. In MySQL, the best you could do is store JSON data as
text in a single field. This would entirely defeat the purpose of storing it
in an RDBMS and would greatly complicate every database transaction.

Don't."

Wait, NoSQL systems' default store is JSON?!? A few clicks later and I was
playing with Mongo over at <https://mongolab.com/> ... installing the PHP
Mongo extension was a piece of cake. I was up and running in minutes.

Now, instead of developing one or several tables to store the information from
the 3rd party API, I just dump their JSON response right into a Mongo
collection. I can query whatever I want from that ... and there might be
information that I want later on, that I didn't think to store initially. If
I had created a regular MySQL table that info wouldn't be there, but with
Mongo I'm storing everything, so I'll be able to use that other info later if
I want.

I don't think that Mongo is a replacement for MySQL; they are two different
tools with distinct advantages and disadvantages. But Mongo definitely suits
certain applications better. So, use the tool that best suits your project!

------
giulivo
I'm having a lot of fun with Mongo's stored JavaScript (procedures); have a
look at this:
<http://dirolf.com/2010/04/05/stored-javascript-in-mongodb-and-pymongo.html>

~~~
ericingram
The patterns that could emerge from that are exciting.

------
ghc
I couldn't agree more with this blog post. The drastic reduction in friction
has allowed me to experiment a lot more painlessly. Mongo may not always be as
robust as a mature SQL solution, but the friction is so low I just don't care.

~~~
bmuon
The fact is that Mongo is _fun_ to develop with and that's invaluable. But
there's another side to databases and that's DB administration. Most Mongo
reviews I've seen so far are from developers, but I haven't seen many positive
reviews from DBAs.

~~~
ericingram
So far it feels difficult to administer outside of code, but I am confident
that will improve.

------
zwass
I'm still struggling to understand how embedded documents can be useful in
Mongo, considering there is no way to do a SELECT with a WHERE clause on their
contents. <https://jira.mongodb.org/browse/SERVER-142> . The age of this bug
report has me wondering whether there is some fundamental design flaw with a
schemaless database that causes this. Thoughts?

~~~
latch
Embedded documents currently serve a smaller purpose than one would assume at
first glance. I answer a lot of questions in the MongoDB-User group by telling
people to "pull it into its own collection".

However, there are a couple of planned features that'll change this. Virtual
collections is one, but the other is the $ operator in field selection, which
I believe is planned for 2.1.

Even as-is though, they are useful. The tag example is simplistic...let me
give you a real case from mogade.com. We have a scores collection, which looks
something like:

    
    
      {
        _id:  ObjectId('...'),
        leaderboard: ObjectId('...'),
        user: 'leto',
        daily: {
           points: 100,
           data: 'level 10',
           date: ....,
        },
        weekly: {
           points: 150,
           data: 'level 8',
           date: ....,
        },
        overall: {
           points: 300,
           data: 'level 15',
           date: ....,
        }
      }
    

We essentially store the user's top score for each scope (daily, weekly and
overall). You could store a scope-per-document, which is how it initially
was, but that isn't how it's modeled, and it takes a lot more space (user
and leaderboard get repeated 3x, along with the index which is on those two
fields).
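
To make the one-document-per-user shape concrete, here is a small sketch of updating it in application code. It is not mogade's actual logic: the `record_score` helper and the date strings are hypothetical, and it ignores resetting the daily and weekly scopes when their period rolls over.

```python
SCOPES = ("daily", "weekly", "overall")


def record_score(doc, points, data, date):
    """Replace a scope's embedded entry when the new score beats it."""
    for scope in SCOPES:
        best = doc.get(scope)
        if best is None or points > best["points"]:
            doc[scope] = {"points": points, "data": data, "date": date}
    return doc


# Same shape as the document above; dates are placeholder strings.
doc = {
    "user": "leto",
    "daily": {"points": 100, "data": "level 10", "date": "d1"},
    "weekly": {"points": 150, "data": "level 8", "date": "d2"},
    "overall": {"points": 300, "data": "level 15", "date": "d3"},
}

# 120 beats the daily best only; weekly and overall stay as they were.
record_score(doc, 120, "level 11", "d4")
```

All three scopes sit in one document, so `user` and `leaderboard` are stored (and indexed) once instead of three times.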

Also, I wrote about collections vs embedded documents:
<http://mongly.com/Multiple-Collections-Versus-Embedded-Documents/>

Edit: I wouldn't rely on the age of a feature request as a sign that there's
something fundamentally wrong with a design. Not everything can be top
priority.

------
tonyplee
I did it in SQLite. Putting JSON in a SQL table is trivial. Adding a second
table as a key-value search index is also trivial and extremely fast. Best of
all, it works on Android, iOS and Python with one simple wrapper class (<100
lines of code). I was able to implement a user authentication web server + DB
code in 300 lines of Python, with any kind of user attributes that can be
added later on. I read somewhere that FB uses MySQL as a pure key-value data
store.
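
A minimal sketch of that layout with Python's built-in `sqlite3` module. The table names, the `put`/`find` helpers and the sample data are assumptions for illustration, not the wrapper class described above:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Whole document stored as JSON text.
    CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
    -- Side table holding only the keys we want to search on.
    CREATE TABLE kv_index (
        key TEXT, value TEXT, doc_id INTEGER REFERENCES docs(id)
    );
    CREATE INDEX kv_lookup ON kv_index (key, value);
""")


def put(doc, indexed_keys=()):
    """Store doc as JSON and index the chosen top-level keys."""
    cur = conn.execute(
        "INSERT INTO docs (body) VALUES (?)", (json.dumps(doc),)
    )
    doc_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO kv_index (key, value, doc_id) VALUES (?, ?, ?)",
        [(k, str(doc[k]), doc_id) for k in indexed_keys if k in doc],
    )
    return doc_id


def find(key, value):
    """Return every stored document whose indexed key equals value."""
    rows = conn.execute(
        "SELECT d.body FROM docs d JOIN kv_index i ON i.doc_id = d.id "
        "WHERE i.key = ? AND i.value = ? ORDER BY d.id",
        (key, str(value)),
    )
    return [json.loads(body) for (body,) in rows]


put({"user": "alice", "plan": "free", "age": 30}, indexed_keys=("user", "plan"))
put({"user": "bob", "plan": "free"}, indexed_keys=("user", "plan"))
free_users = [d["user"] for d in find("plan", "free")]
```

New attributes can be added to a document at any time without a schema change; only keys that need to be searchable have to go into the index table.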

------
francesca
Awesome. Glad you found NoSQL. Scaling MongoDB is a breeze, you will love it.

