

MongoMem: Memory usage by collection in MongoDB - rgarcia
http://eng.wish.com/mongomem-memory-usage-by-collection-in-mongodb/

======
fiatmoney
Mongo's internal storage mechanisms are a mess. BSON is incredibly inefficient
as a storage format (for instance, it stores an array as a document keyed by
each element's index written out as a string, i.e. {"0": "first element",
"1": "second element"}), it relies on mmap / fsync for read / write scheduling
(see [1] for some issues specifically with regard to database usage), and its
"linked list of documents" approach means that any appreciable number of
removals or updates badly fragments your storage space. Oh, and key names
aren't interned, so every document stores its own copy of every key: better
use the first N Unicode characters rather than sensible names if you need to
fit everything into the page cache (and due to the lack of caching algorithms
beyond the OS's LRU-by-page, you definitely need to fit everything into the
page cache). "Everything" really means "both data _and indexes_", since the OS
doesn't know the difference between your index and your data; it will quite
happily evict index entries from cache in favor of a data scan.

[1] http://rhaas.blogspot.co.uk/2014/03/linuxs-fsync-woes-are-getting-some.html
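
To make the array and key-name points above concrete, here is a rough sketch
that hand-encodes BSON using only Python's struct module. It is not MongoDB's
or any driver's code; the helper names (encode_string_element,
encode_document, encode_string_array) and the sample keys are made up for
illustration, and it only handles string values. The byte layout follows the
public BSON spec as I understand it.

```python
import struct


def encode_cstring(s):
    # BSON key names are NUL-terminated UTF-8 strings.
    return s.encode("utf-8") + b"\x00"


def encode_string_element(key, value):
    # Element = type byte 0x02 (string) + key + int32 length + bytes + NUL.
    data = value.encode("utf-8") + b"\x00"
    return b"\x02" + encode_cstring(key) + struct.pack("<i", len(data)) + data


def encode_document(elements):
    # Document = int32 total size (including itself) + elements + NUL.
    body = b"".join(elements) + b"\x00"
    return struct.pack("<i", len(body) + 4) + body


def encode_string_array(values):
    # An array body is just a document whose keys are "0", "1", "2", ...
    return encode_document(
        [encode_string_element(str(i), v) for i, v in enumerate(values)]
    )


values = ["first element", "second element"]
array_bytes = encode_string_array(values)
payload_size = sum(len(v.encode("utf-8")) for v in values)
print(len(array_bytes), "bytes encoded for", payload_size, "bytes of text")

# Key names are repeated in every document, so a longer name costs its
# extra length once per document. Both keys here are hypothetical.
long_key, short_key = "customer_email_address", "e"
long_doc = encode_document([encode_string_element(long_key, "a@b.c")])
short_doc = encode_document([encode_string_element(short_key, "a@b.c")])
print(len(long_doc) - len(short_doc), "extra bytes per document")
```

The index keys and per-element framing are what the "inefficient" complaint is
about, and the second comparison shows why short key names shrink the working
set: the saving is small per document but applies to every document in the
collection.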

~~~
ericingram
TokuMX solves most of the issues you describe without compromising on the
query language or other great Mongo features. If you are interested in Mongo
but concerned about these issues, I highly recommend TokuMX.

------
caio1982
I'm so going to use it. Thanks for open sourcing it!

~~~
adamaflynn
No problem! It's been pretty helpful for us. Results can be surprising
sometimes.

