Cache Reheating - Not to be Ignored (mongodb.org)
33 points by LiveTheDream on Sept 19, 2011 | 13 comments



"A very nice attribute of the MongoDB storage engine is its use of memory-mapped files. In this model the cache is the operating system’s file system cache. Restart the mongod process, and there is no reheat issue at all."

This is not correct: even if the operating system handles the cache, it still needs to reheat the pages that back the memory-mapped file.

Usually, for instance, you have a big MongoDB database but the application accesses it in a very biased way, so there is an obvious working set. Part of the working set ends up cached in memory pages by the operating system's virtual memory system, so many database read operations are actually served from memory.

If the server is rebooted and the MongoDB process restarted the file is not cached in memory, so the first access to every different "page" of the file will actually seek and read from the disk.
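The reheating described above can be paid up front instead of lazily on first access. A minimal, Linux-oriented sketch (the `preheat` helper is hypothetical, not anything from MongoDB; it just touches every page of a file so the faults happen now rather than during the first queries):

```python
import mmap
import os
import tempfile

def preheat(path):
    """Fault every page of a file into the OS page cache."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        # Hint the kernel to start reading the file in ahead of us
        # (os.posix_fadvise is Unix-only, hence the guard).
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, size, os.POSIX_FADV_WILLNEED)
        # Touching one byte per page via mmap forces any remaining
        # major faults immediately, which is exactly the "reheat" cost.
        with mmap.mmap(fd, size, prot=mmap.PROT_READ) as m:
            total = 0
            for off in range(0, size, mmap.PAGESIZE):
                total += m[off]
            return total
    finally:
        os.close(fd)

# Example: create a small file and warm it.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * (mmap.PAGESIZE * 4))
    path = f.name
checksum = preheat(path)
os.unlink(path)
```

After a full server reboot a tool along these lines (or `dd`/`vmtouch` on the database files) moves the reheat cost to a controlled window before traffic arrives.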


That line was referring to restarting just the mongod process on a running machine (e.g., when upgrading or changing config flags). In that case the data is still in the OS file cache, so mongod takes minor rather than major page faults when it accesses data and does not go to disk. The case where you do a full reboot is mentioned two sentences later: "Of course on a full server reboot MongoDB must reheat too."


Sorry, I missed the sentence where you mentioned the full server reboot case.

And I agree it is an advantage that at least in this scenario no reboot is needed, for instance if you plan to upgrade the MongoDB version. When you can plan the downtime it is usually simpler to handle anyway, since in the general case of many cache hosts running at the same time you can do the upgrade incrementally, but it is nice that a single big server can be upgraded without the problem of cache reheating.

Apologies again for not reading the article more carefully.


Yeah but the relevance drops precipitously as soon as you talk about only restarting the DB server. Planned downtime is less of a problem than unplanned downtime.

For example, at Google my caching strategies could be exported in production with only a few minor changes to the code. If we experienced data center downtime I could actually offload a hot cache to the other backing stores in round-robin configuration. The problem isn't when one database server gets taken down, it's when all the servers in the datacenter get taken down at once without any warning.


I'm not sure how your comment is substantively different from the article. He seems to cover both scenarios (just the mongodb process rebooting vs whole server rebooting) accurately to me.


Sorry, my fault; please read my other comment in this thread for more info.


One good reason to use Redis with snapshotting to disk instead of Memcached.

Even if ALL your cache servers go down, you'll at least have the reheat load distributed among a cluster of cache servers in a super-efficient direct memory dump rather than hitting your DB servers over the network.


Maybe I'm working with bigger databases than I used to, but lately cache reheating has become a bigger issue, particularly with mongo.

I'll build a big db and then run ad hoc queries interactively and be like... this is pretty cool. Then I'll reboot, start the query, go get coffee, come back, and see that it's still running.


Why the reboot?


In a similar vein, I like to partially randomize my cache invalidation period if I don't have the luxury of using some other cache invalidation scheme (http://the-robot-lives.com/index.php/2010/03/fractional-keys/). This prevents cyclic cache invalidation, which can lead to interesting issues such as having your server fail every 24 hours after your initial cache restart when a huge sum of 24-hour cached items invalidate and start hitting the database at the same time, or whenever the hourly, daily, and short-duration cache items all invalidate at the same time.
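The jitter idea above takes only a few lines. A hedged sketch (the `jittered_ttl` name and the 10% spread are illustrative choices, not anything from the linked post):

```python
import random

def jittered_ttl(base_seconds, spread=0.10, rng=random):
    """Return the base TTL perturbed by up to +/- `spread` of itself,
    so entries cached together do not all expire together and stampede
    the database at the same instant."""
    delta = base_seconds * spread
    return base_seconds + rng.uniform(-delta, delta)

# e.g. instead of cache.set(key, value, 24 * 3600):
ttl = jittered_ttl(24 * 3600)
```

With this, a batch of items cached at the same moment expires spread over a window rather than as a single spike.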


Does this mean mongo might be better than something like redis as a memcache replacement?


If you're suffering from cache reheating issues that only stem from restarting the process then maybe. If the reheat issues stem from a reboot then you're still screwed.

The numbers presented only make sense if your cache hit rate is uniformly distributed.

If instead you have a normal or Pareto distribution (more likely for a website), the majority of your cache hits come from very few entries, meaning your cache can probably be reheated to 80% with 20% of the data, i.e. 200 GB instead of 1 TB.
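A back-of-envelope check of that 80/20 claim, assuming a Zipf-like popularity distribution (key i gets weight 1/i), which many website access patterns roughly resemble; the function name and parameters are illustrative:

```python
def hot_fraction(n_keys, top_share=0.20, s=1.0):
    """Fraction of total hits absorbed by the hottest `top_share`
    of keys, under a Zipf distribution with exponent `s`."""
    weights = [1.0 / (i ** s) for i in range(1, n_keys + 1)]
    return sum(weights[: int(n_keys * top_share)]) / sum(weights)

# Fraction of hits served by the top 20% of a million keys.
share = hot_fraction(1_000_000)
```

Under these assumptions the hottest 20% of keys do cover well over 80% of hits, which is what makes partial reheating so effective.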

A much better idea, one that actually solves the problem instead of discussing implementation artifacts (memory-mapped files), is to reuse your tracking data (e.g. this page gets X hits) to pre-heat the caches before the server goes live.

All you need to do is build a histogram of your cache hits over the last X minutes and flush it to disk every now and then.
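The scheme sketched above fits in a few functions. A hedged illustration (the helper names and the `load_from_db` stand-in are hypothetical; swap in your real store):

```python
import json
import tempfile
from collections import Counter

hits = Counter()  # in-memory histogram of cache hits per key

def record_hit(key):
    hits[key] += 1

def flush_histogram(path):
    """Flush the hit histogram to disk, sorted hottest-first."""
    with open(path, "w") as f:
        json.dump(hits.most_common(), f)

def preheat(path, cache, load_from_db, top_n=1000):
    """On startup, replay the hottest keys against the backing
    store to warm a cold cache before taking live traffic."""
    with open(path) as f:
        histogram = json.load(f)
    for key, _count in histogram[:top_n]:
        cache[key] = load_from_db(key)

# Usage: record some traffic, flush, then rebuild a cold cache.
for key in ["home", "home", "home", "about", "contact"]:
    record_hit(key)

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    path = f.name
flush_histogram(path)

cache = {}
preheat(path, cache, load_from_db=lambda k: f"<page {k}>", top_n=2)
```

In production you would flush the histogram on a timer and keep a rolling window (e.g. the last X minutes) rather than an all-time counter, so the pre-heat reflects current traffic.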


"Not to be Ignored" but we ignore it



