
Ten things I didn’t know about MongoDB - iqster
http://slowping.com/2011/ten-things-i-didnt-know-about-mongodb/
======
latch
Some things I didn't know:

1- While you can select only specific fields by using {fieldA: 1, fieldB: 1}
as a 2nd parameter to find, you can also exclude fields using 0.
Interestingly though, you can't mix inclusion and exclusion (which makes sense
if you think about it), except to exclude _id. So {name: 1, description: 1,
_id: 0} is possible.

2- Sending SIGUSR1 to the running mongod process will rotate the logs

3- We got a 25% memory/storage reduction by shrinking our field names. YMMV. I
knew how the data was stored, but I didn't know how much we would specifically
save :P

4- Count returns the number of matched documents regardless of paging
(skip/limit). This lets you pull a page of documents and get the total count
very easily. If you pass true to count, you'll get the number of documents
actually returned.

5- Replica set members have a priority of either 0 or 1; future versions will
introduce more flexibility.
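
The inclusion/exclusion rule in point 1 can be sketched in plain Python (the
apply_projection helper here is hypothetical, just mimicking what find()'s
projection argument does, not a real driver API):

```python
# Sketch of MongoDB's projection rule: a projection is either all-include
# or all-exclude, with _id as the one field you may exclude from an
# include list. apply_projection is a hypothetical stand-in for find().
def apply_projection(doc, projection):
    includes = [k for k, v in projection.items() if v == 1]
    excludes = [k for k, v in projection.items() if v == 0]
    if includes and [k for k in excludes if k != "_id"]:
        raise ValueError("cannot mix inclusion and exclusion (except _id)")
    if includes:
        keep = set(includes)
        if projection.get("_id", 1) != 0:   # _id comes back unless excluded
            keep.add("_id")
        return {k: v for k, v in doc.items() if k in keep}
    return {k: v for k, v in doc.items() if k not in set(excludes)}

doc = {"_id": 1, "name": "widget", "description": "round", "price": 5}
print(apply_projection(doc, {"name": 1, "description": 1, "_id": 0}))
# {'name': 'widget', 'description': 'round'}
```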

~~~
rb2k_
>3- We got a 25% memory/storage reduction by shrinking our field names. YMMV.
I knew how data was stored, but I didn't know what amount we would
specifically save :P

I really wish they just had an internal lookup table to do that. I don't want
to have to deal with keys like "c1, ba, la" in my application.

~~~
latch
Some drivers/mappers support it. However, as much as MongoDB likes to lean on
the drivers, this clearly belongs in the server.
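
The mapper-level workaround looks something like this sketch (the alias table
and helper functions are hypothetical, not any particular driver's API): the
application sees readable names while the short keys hit the disk.

```python
# Hypothetical lookup table: long, readable field names in the app,
# short keys ("c", "ba", "la") in storage to save per-document bytes.
ALIASES = {"created_at": "c", "balance": "ba", "last_login": "la"}
REVERSE = {short: long for long, short in ALIASES.items()}

def to_storage(doc):
    """Translate app-level field names to short stored keys."""
    return {ALIASES.get(k, k): v for k, v in doc.items()}

def from_storage(doc):
    """Translate stored keys back to readable names."""
    return {REVERSE.get(k, k): v for k, v in doc.items()}

doc = {"_id": 7, "created_at": "2011-05-01", "balance": 100}
stored = to_storage(doc)   # {'_id': 7, 'c': '2011-05-01', 'ba': 100}
assert from_storage(stored) == doc
```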

------
jacques_chester
In its defence, Mongo is basically brand new. They've done the cool CAP stuff
first, now they're discovering why everyone else keeps banging on about
wanting ACID guarantees.

edit: I suppose I should have expected to be downvoted -- I'll elaborate on my
thoughts.

The folk who wrote things in the pre-CAP theorem era were not ignorant of the
problems of scale. They did their best to attack them and did an amazing job
of it.

If you remove constraints, then yes, you can improve performance. But soon you
will discover why and how those constraints were imposed in the first place.

The implementers of Mongo are, less some genuine advances, doomed to repeat
history per Santayana.

------
strmpnk
Funny that they say multi-master is bad since we can't reason about it. It
always feels like they are out to insult users, telling them they are too
stupid to use other things because they're hard. I guess all that durability
didn't matter either, until they had time to implement write-ahead logging.
And I guess first-class conflict handling won't matter until they have real
distributed computing support across multiple data centers.

~~~
aaronblohowiak
MongoDB is a single-system database with support for replication and sharding.
If you want a true distributed database, you'll need to look into Riak or
CouchDB.

Riak also has no master!

~~~
strmpnk
Yeah. I was just pointing out that 10gen's arguments seem to evolve at its own
convenience. Developers need to examine things more critically (ask why) in
general instead of stopping at "that makes sense."

------
StavrosK
A few things I didn't know about MongoDB when I started using it (admittedly,
this was last year, things might have gotten better):

[http://www.korokithakis.net/posts/my-experience-with-using-m...](http://www.korokithakis.net/posts/my-experience-with-using-mongodb-for-great-science/)

------
Sym3tri
Another gotcha we found is that you can't do any type of real-time aggregation
in a production environment.

There is a group() function, and there is map-reduce, but since they both run
in the server's JavaScript engine, which is single-threaded, you can never
execute more than one aggregation query at a time. So if an aggregation query
takes a long time to run, all other aggregation queries will block and
potentially time out.

Supposedly we'll have better options in 2.0.
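
The serialization effect can be sketched with a single lock standing in for
the JS engine (js_engine_lock and run_aggregation are illustrative stand-ins,
not real MongoDB APIs):

```python
# A slow aggregation holding the single-threaded engine stalls every
# other aggregation behind it -- modeled here as one shared lock.
import threading

js_engine_lock = threading.Lock()
completed = []

def run_aggregation(name, n):
    with js_engine_lock:              # only one job may hold the engine
        completed.append((name, sum(range(n))))

# Simulate a long-running aggregation currently holding the engine...
js_engine_lock.acquire()
fast = threading.Thread(target=run_aggregation, args=("fast", 10))
fast.start()
fast.join(timeout=0.1)
assert not completed                  # the fast query is stuck waiting
js_engine_lock.release()              # the slow job finally finishes
fast.join()
print(completed)                      # [('fast', 45)]
```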

~~~
latch
In addition to supporting multi-threaded map-reduce, it'd be nice if they:

1- Eliminated any memory limits on inline map-reduce, or brought back output
to temporary collections. If they bring back output to temporary collections,
they should allow them to be run on slaves and not participate in replication

2- Polished and released a production-ready version of their MongoDB Hadoop
Adapter: <https://github.com/mongodb/mongo-hadoop>

------
mike_esspe
I didn't know about the global lock either. Because of this lock, we sometimes
see read queries take several seconds, even on collections we are not writing
to (we are very write-heavy).

------
democracy
#11: [http://www.quora.com/Why-did-Diaspora-abandon-MongoDB-for-My...](http://www.quora.com/Why-did-Diaspora-abandon-MongoDB-for-MySQL)

~~~
bricestacey
That is not much of an explanation for why. It's like saying Chinese food is
better for the Chinese.

------
thisjustin
You know, Mongo DB is Web Scale: <http://www.youtube.com/watch?v=b2F-DItXtZs>

------
jgmmo
Too bad their site doesn't scale as well as MongoDB

~~~
lwat
Crashed for me too

------
rb2k_
I always find it annoying that you can't really restrict MongoDB's memory
consumption.

I'd love to run it alongside an ElasticSearch process and a small Redis server
on one machine, but while I can limit ElasticSearch and could theoretically
limit Redis, MongoDB will just keep growing, as far as I understand, because
of its memory-mapped I/O. I'd love to use it as a second copy of the data in
ElasticSearch.

~~~
latch
I agree that being able to set a hard limit might be nice in some situations.
However, I believe the current implementation pretty much leaves memory
management up to the OS, which, hey, I'm no expert, but sounds reasonable to
me.

MongoDB won't starve other processes of memory unless the OS decides it
should. It seems like the best way to leverage the most amount of available
memory.

------
MostAwesomeDude
"MongoDB should likely not be run in a 32-bit environment." This is sending up
alarm bells for me. Why does it matter? Can a Mongo fan explain why this is?

~~~
gnaritas
Probably because of the file size limitations of memory-mapped I/O on a
32-bit system.
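
The rough arithmetic (a back-of-the-envelope sketch, not exact mongod
internals): with mmap-based storage the data files must fit inside the
process's virtual address space, and on 32-bit only part of that space is
available for mapped data.

```python
# Why 32-bit hurts an mmap-based database: the whole data set has to fit
# in the process address space, and roughly half of a 32-bit space is
# taken by the kernel split, code, stacks, heap, etc.
address_space = 2 ** 32                     # 4 GiB total on 32-bit
reserved = 2 ** 31                          # ~half unavailable for data maps
usable_for_data = address_space - reserved  # ~2 GiB left for mapped files
print(usable_for_data // 2 ** 30)           # -> 2 (GiB)
```

That ~2 GiB ceiling on total data is why 32-bit deployments were discouraged.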

~~~
latch
And god help you if you reach that limit on an already repaired/compacted db.
You can't even connect your shell to delete some data. I had to rm -fr a
collection out of the /data folder... thankfully it wasn't production... but I
can only imagine...

