
Downsides of Caching - arkenflame
https://msol.io/blog/tech/2015/09/05/youre-probably-wrong-about-caching/
======
makecheck
Never implement a cache that you can't _completely_ disable with the flip of a
switch somewhere. Otherwise, when wrong behavior is observed in the system,
you don't have an easy way to rule out the cached copies. And worse, as you
make "changes" to the system code or the data, you may not be able to tell the
difference between a change that had no effect on the problem and a change
that was hidden by a stale cache.

This is one of the things that drives me crazy about some of Apple's
technologies. For instance, somebody at Apple decided long ago that all
application HTML help pages should be cached. The "off switch" for this cache
remains a bit of black magic but it's something like "rm -Rf
com.apple.help.DamnedNearEverything" followed by "killall
UnnecessaryBackgroundHelpProcess" _every damned time you modify a page_ or
else the help system might show you an older version of the content that you
just "changed".

~~~
nstart
How I handle kill switches is to first make sure that all code related to
caching is just that. If it gets tied anywhere into the data-saving logic, or
even worse the actual business logic, you are screwed. I tend to manage this
by keeping caching as part of a strategy pattern, so I can enable or disable
it with config parameters when starting the app. End-to-end testing with
casper/selenium always runs with the cache turned off, unless I specifically
want to test the cache, which I actually never have, now that I think about it.
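
In sketch form, that pattern looks something like this in C (the flag name
CACHE_ENABLED and the toy one-slot cache are invented for illustration, not
taken from any real codebase):

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* The strategy interface: business logic only ever calls these. */
    typedef struct {
        const char *(*get)(const char *key);
        void (*put)(const char *key, const char *value);
    } cache_strategy;

    /* Real strategy: a toy one-slot cache. */
    static char slot_key[64], slot_val[256];
    static const char *real_get(const char *key) {
        return strcmp(key, slot_key) == 0 ? slot_val : NULL;
    }
    static void real_put(const char *key, const char *value) {
        snprintf(slot_key, sizeof slot_key, "%s", key);
        snprintf(slot_val, sizeof slot_val, "%s", value);
    }

    /* Kill-switch strategy: never hits, never stores. */
    static const char *noop_get(const char *key) { (void)key; return NULL; }
    static void noop_put(const char *key, const char *value) {
        (void)key; (void)value;
    }

    int main(void) {
        /* One config flag at startup selects the strategy everywhere. */
        bool cache_enabled = getenv("CACHE_ENABLED") != NULL;
        cache_strategy cache = cache_enabled
            ? (cache_strategy){ real_get, real_put }
            : (cache_strategy){ noop_get, noop_put };

        cache.put("user:42", "Alice");
        const char *hit = cache.get("user:42");
        printf("%s\n", hit ? hit : "(miss; fall through to the real source)");
        return 0;
    }

With the no-op strategy selected, every lookup misses, so cached copies are
ruled out of any bug hunt by flipping a single switch.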

------
ggreer
I agree with pretty much everything in this post, though I would add one more
thing. It's not so much a downside of caching as a misuse: Application-level
caches should never cache local data. Cache network responses. Cache the
results of computations. Don't cache files or disk reads. Operating systems
already implement disk caches, and they do a better job of it than you. That's
in addition to a modern computer's numerous hardware caches. For example, take
this code:

    
    
        #include <stdio.h>
    
        int main(void) {
            /* Read the first byte of example.txt and print it. */
            FILE *fp = fopen("example.txt", "r");
            if (fp == NULL)
                return 1;
            char dest;
            if (fread(&dest, 1, 1, fp) == 1)
                putchar(dest);
            fclose(fp);
            return 0;
        }
    

Think of how many caches likely contain the first byte of example.txt. There's
the internal cache on the hard disk or SSD. There's the OS's filesystem cache
in RAM. There's your copy (dest) in RAM, and also in L3, L2, and L1 cache.
(These aren't inclusive on modern Intel CPUs. I'm just talking about
likelihood.) Implementing your own software RAM cache puts you well into
diminishing returns. The increased complexity simply isn't worth it.

~~~
thrownaway2424
I don't think that's very good advice in a heavily-loaded shared hosting
environment. A disk read could easily stall for tens of seconds, just because
the kernel whimsically decided to throw out the cache (or because your server
crowded its memory container). I actually don't want any server touching a
disk while it's serving. Everything should be read before service begins and
never again.

~~~
ggreer
Your proposed solution (read from disk on startup and never again) is really a
memory-backed data store, not a cache. Caches can miss.

But let's analyze your example. If disk reads take tens of seconds and memory
usage is high enough to purge the kernel's disk cache, nothing can save you.
Had your process read in everything at the start, it would be using even more
memory. Given the same load, one of two things will happen:

1. If you have swap enabled, parts of your process's memory will be swapped
out. Accessing "memory" in this case would cause a page fault and tens of
seconds of delay.

2. If you have swap disabled, the OOM-killer will reap your process. When it
respawns, it's going to read lots of stuff from disk... and disk reads take
tens of seconds. Oops.

Even if an application-level data cache improved performance on heavily-loaded
shared hosts, the added costs of software development and maintenance far
exceed the cost of better hardware. Hardware is cheap. Developers are
expensive.

~~~
thrownaway2424
Here's an example. You have a 100MB C++ executable that needs 4GB for its own
various purposes and 20GB of data that it's serving. The machine has 64GB of
memory. If you allocate 24.1GB of memory to the container for this service,
disable swap, and mlock the binary and the data files, nothing will go wrong.

On the same machine is a batch process which is reading a 1TB file and writing
another 1TB file. If your serving process was reliant on the OS page cache, it
would find that its pages were routinely evicted in favor of the batch
process.

You're right about swap, that's why only a crank would enable swap. The moment
at which swap was a reasonable solution was already behind us 20 years ago.
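
For concreteness, pinning a data file in memory along those lines might look
roughly like this in C (data.bin is a stand-in file name; mlock() will fail
without CAP_IPC_LOCK or a sufficient RLIMIT_MEMLOCK):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("data.bin", O_RDONLY);    /* hypothetical data file */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }

        /* Map the file, then pin its pages so the kernel can't evict
           them in favor of a batch job's streaming I/O. */
        void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE,
                       fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        if (mlock(p, (size_t)st.st_size) != 0) {
            perror("mlock");  /* needs CAP_IPC_LOCK or a high RLIMIT_MEMLOCK */
            return 1;
        }

        /* ... serve traffic out of p, never touching the disk again ... */

        munlock(p, (size_t)st.st_size);
        munmap(p, (size_t)st.st_size);
        close(fd);
        return 0;
    }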

~~~
ggreer
In that example, I'm pretty sure forgoing containers and mlock would result in
similar performance while using less memory. Process startup time would also
be significantly improved. (If there's such high contention for disk I/O,
reading 20GB on startup is going to take a _very_ long time.)

The kernel's page cache eviction strategy is smarter than naïve LRU. On the
first read, a page is placed in the inactive file list. If it's read again,
it's moved to the active file list. Pages in the inactive file list are purged
before the active file list.[1] So large sequential reads may cause disk
contention, but they won't massacre the file cache.
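
A toy model of that two-list scheme (heavily simplified; the real kernel
tracks pages, referenced bits, and much more, and demotes rather than drops)
shows why a one-off scan can't flush the hot set:

    #include <stdio.h>
    #include <string.h>

    /* Pages enter the inactive list; a second access promotes them to
       the active list; eviction always takes from the inactive list. */
    #define CAP 4   /* pages per list, tiny for demonstration */

    static int active[CAP], inactive[CAP];
    static int n_active, n_inactive;

    static int find(int *list, int n, int page) {
        for (int i = 0; i < n; i++)
            if (list[i] == page) return i;
        return -1;
    }

    static void push_front(int *list, int *n, int page) {
        if (*n == CAP) (*n)--;                      /* drop the tail */
        memmove(list + 1, list, *n * sizeof *list); /* shift right */
        list[0] = page;
        (*n)++;
    }

    static void remove_at(int *list, int *n, int i) {
        memmove(list + i, list + i + 1, (*n - i - 1) * sizeof *list);
        (*n)--;
    }

    static void touch(int page) {
        int i;
        if ((i = find(active, n_active, page)) >= 0) {
            remove_at(active, &n_active, i);         /* hot: refresh */
            push_front(active, &n_active, page);
        } else if ((i = find(inactive, n_inactive, page)) >= 0) {
            remove_at(inactive, &n_inactive, i);     /* 2nd access: promote */
            push_front(active, &n_active, page);
        } else {
            push_front(inactive, &n_inactive, page); /* 1st access */
        }
    }

    int main(void) {
        /* A big sequential scan (pages 1..8) around a hot page (0). */
        touch(0); touch(0);                    /* page 0 becomes active */
        for (int p = 1; p <= 8; p++) touch(p); /* scan churns inactive only */
        printf("page 0 still active: %s\n",
               find(active, n_active, 0) >= 0 ? "yes" : "no");
        return 0;
    }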

This I/O situation isn't uncommon. Consumer systems also have big batch jobs
that can pollute file caches: large copies, rsyncs, backup software (Déjà Dup,
Time Machine, etc). They don't solve this with containers, limits, and
mlock()ing. Some programs add a couple calls to fadvise(), using the
FADV_NOREUSE or FADV_DONTNEED flags.[2] But for the most part, doing nothing
yields excellent performance. Operating systems are pretty good at their job.

1.
[https://www.kernel.org/doc/gorman/html/understand/understand...](https://www.kernel.org/doc/gorman/html/understand/understand013.html)

2. This is handy for applications like bittorrent, where multiple reads of
the same page are possible, but caching isn't desired.
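
For concreteness, the batch-side technique might look roughly like this
sketch (the file name is hypothetical; in the C API the call is
posix_fadvise() and the flag is spelled POSIX_FADV_DONTNEED):

    #define _XOPEN_SOURCE 600   /* for posix_fadvise() */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("huge_batch_input.dat", O_RDONLY); /* hypothetical */
        if (fd < 0) { perror("open"); return 1; }

        char buf[1 << 16];
        off_t off = 0;
        ssize_t n;
        while ((n = read(fd, buf, sizeof buf)) > 0) {
            /* ... process buf ... */

            /* Tell the kernel we're done with these pages, so it can
               drop them instead of evicting someone else's hot data. */
            posix_fadvise(fd, off, n, POSIX_FADV_DONTNEED);
            off += n;
        }
        close(fd);
        return 0;
    }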

------
sirgawain33
Great article. I'll add one: caching doesn't address underlying performance
issues, just "sweeps them under the rug".

I've seen many devs jump to caching before investing time in understanding
what is really causing performance problems (I was one of them for a time, of
course). Modern web stacks can scream without any caching at all.

Years ago, a talk by Rasmus Lerdorf really opened my eyes to this idea. [1]
He takes a vanilla PHP app (Wordpress, I think) and dramatically increases its
throughput by identifying and tweaking a few performance bottlenecks like slow
SSL connections. One of the best lines: "Real Performance is Architecture
Driven"

[1] I think it was a variation of this one:
[https://vimeo.com/13768954](https://vimeo.com/13768954)

------
gabbo
Good article which touches on real issues that a lot of developers won't
_really_ appreciate until it happens to them (unless they have a strong
background in distributed systems theory, and maybe not even then). A little
strange that it doesn't use the word "consistency" even once though. :)

By dropping a cache into an existing system, you're weakening consistency in
the name of performance. At best, your strongly-consistent system has started
taking on eventually-consistent properties (but maybe not even eventual
depending on how you invalidate/expire what's in your cache). Eventual
consistency can help you scale, but reasoning about it is really hard.

In some sense caching as described by OP is a tool to implement CAP theorem
tradeoffs, and Eric Brewer described the reality of trading off the C
(consistency) for A/P (availability/partition-tolerance) better than I ever
could:

    
    
      Another aspect of CAP confusion is the hidden cost of
      forfeiting consistency, which is the need to know the
      system’s invariants. The subtle beauty of a consistent
      system is that the invariants tend to hold even when the
      designer does not know what they are. Consequently, a
      wide range of reasonable invariants will work just fine.
      Conversely, when designers choose A, which requires
      restoring invariants after a partition, they must be
      explicit about all the invariants, which is both
      challenging and prone to error. At the core, this is the
      same concurrent updates problem that makes multithreading
      harder than sequential programming.

------
markbnj
I agree with all the main points here: caching adds a significantly complex
component to the system. You should only do it if you absolutely must pull
data closer to a consumer. Adding caching "to pick up quick wins" is always
dumb.

With that in mind, I do think most of the pitfalls listed here can be avoided
with well-understood tools and techniques. There's no real need to be running
your cache in-process with your GC'd implementation language. Cache refilling
can be a complex challenge for large scale sites, but I expect that a majority
of systems can live with slower responses while the cache refills organically
from traffic.

The points about testing and reproducible behavior are dead on - no
equivocation needed there. As always, keeping things as simple as possible
should be a priority.

~~~
gabbo

      There's no real need to be running your cache in-process with your GC'd implementation language.
    

Fundamentally there's no _need_ , but in-memory caching may still be the right
choice. As always, there are tradeoffs. Standing up a separate cache component
incurs non-trivial costs. Your service now has a new "unit of management" - a
new thing you need to deploy, monitor, and scale. It's a separate thing which
might go down unless it's provisioned for sufficient load, and you need to be
careful about unwittingly introducing a new bottleneck or failure mode in your
system. These are all solvable problems, but solving them comes at a cost.

You can totally argue that engineers should be forced to think about and
address these issues up front with more rigor, and in a perfect world I think
I'd agree. :)

------
armon
The article can probably be succinctly summarized as "Premature optimization
is the root of all evil". Most of the author's points are valid, in that
caching adds more complexity.

That said, caching is absolutely critical to almost every piece of software
ever. Even if caching isn't explicitly used, a wide variety of caches are
likely still being depended upon, including CPU caches (L1, L2, L3), OS
filesystem caching, DNS caching, ARP caching, etc.

Caching certainly adds complexity but it's also one of the best patterns for
solving a wide range of performance problems. I would recommend developers
spend more time learning and understanding the complexities so that they can
make use of caching correctly and without applying it as a premature
optimization.

~~~
TheLoneWolfling
What bugs me is not so much caching as _redundant_ caching.

I've seen applications that have 5 redundant caches, if not more (on-disk
cache, OS cache, VM OS cache, stdlib cache, programmer-visible cache). And
then you end up killing the actually-important caches (CPU caches, etc) from
the amount of redundant copying required...

------
zkhalique
Caching is the classic memory-time tradeoff, everything from memoizing a
function, to DNS caching, to storing a web resource that didn't change.
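
As a minimal illustration of that tradeoff, a memoized function in C spends a
small table of memory to avoid exponential recomputation (a sketch, nothing
more):

    #include <stdio.h>

    /* Memoized Fibonacci: O(n) memory buys us out of
       exponential recomputation. */
    static unsigned long long memo[94];   /* fib(93) still fits in 64 bits */

    static unsigned long long fib(int n) {
        if (n < 2) return n;
        if (memo[n] == 0)             /* zero doubles as "not yet computed" */
            memo[n] = fib(n - 1) + fib(n - 2);
        return memo[n];
    }

    int main(void) {
        printf("fib(90) = %llu\n", fib(90));
        return 0;
    }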

I think that, if a cache is combined with a push indicating a change, then
it's basically a local "eventually consistent replica" which catches up as
soon as there is a connection to the source of truth.

Seriously, many times you are READING data which changes rarely (read: every X
minutes / hours / days). So, in the meantime, every code path that will need
access to the data may as well look in the local snapshot first.

The question about consistency is an interesting one. The client's view of the
authoritative server state may be slightly out of date, when the user issues a
request. If certain events happened in the meantime that affect the user's
view, then the action can just be kicked back to the user, to be resolved. But
90%+ of the time, the view depends on 10 things that "change rarely", so a
cache is a great improvement.

Related issues involve batching / throttling / waiting for already-sent
requests to complete.

PS: That was quick. I posted this and literally 10 seconds later it got a
downvote.

------
chubot
Caching is also bad in distributed systems, because by definition you're
creating tail latency: the cache miss case. In a distributed system, you're
more likely to hit the worst case in one component, so the cache may not buy
you any end user benefit. It might just make performance more difficult to
debug.

A cache can still be useful to reduce load and increase capacity... but
latency becomes more complex.

~~~
thrownaway2424
That's kinda weird reasoning. Are you saying there's no benefit to an
improvement of median latency, if the tail latency remains long? I would
disagree. I also would point out that not all systems that can benefit from a
cache are latency-sensitive.

~~~
chubot
Not that there's no benefit, but just that it's more complicated in a
distributed system.

Certainly caching is vital to many distributed systems, but it has to be done
from a systems perspective. In my experience a lot of caches are just slapped
on top of individual components without much thought, and without even some
basic monitoring of what the hit rate is. I think it helps to actually measure
what the cache is doing for you -- but this is more work than adding the cache
itself.
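
That measurement can be as cheap as two counters around the lookup. A minimal
sketch, with a toy one-entry cache standing in for whatever you actually use:

    #include <stdio.h>
    #include <string.h>

    /* Toy one-entry cache standing in for the real thing. */
    static const char *cached_key = "user:42";
    static const char *cached_val = "Alice";

    static unsigned long hits, misses;

    /* Every lookup goes through this wrapper, so the hit rate
       is always one counter read away. */
    static const char *cache_get(const char *key) {
        if (strcmp(key, cached_key) == 0) { hits++; return cached_val; }
        misses++;
        return NULL;
    }

    int main(void) {
        cache_get("user:42");
        cache_get("user:7");
        cache_get("user:42");
        printf("hit rate: %.0f%% (%lu hits, %lu misses)\n",
               100.0 * hits / (hits + misses), hits, misses);
        return 0;
    }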

And I agree with another poster in that I've seen many systems with caches
papering over severe and relatively obvious performance problems in the
underlying code.

I was thinking of this Google publication which outlines some problems with
latency variability:
[http://www.barroso.org/publications/TheTailAtScale.pdf](http://www.barroso.org/publications/TheTailAtScale.pdf)

Interestingly they didn't seem to list caches as one of the causes; they list
shared resources, cron jobs, queuing, garbage collection, power saving
features, etc.

------
velox_io
Implementing caching isn't hard conceptually: if an object is modified, flush
it from the cache; if an object isn't cached, build it.
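
In sketch form (a made-up one-slot cache, not anyone's production code):

    #include <stdio.h>
    #include <string.h>

    /* Read-through with invalidate-on-write, one slot for brevity. */
    static char ckey[64];
    static char cval[256];
    static int  cvalid;

    /* Stand-in for the expensive build (DB query, render, etc.). */
    static void build(const char *key, char *out, size_t n) {
        snprintf(out, n, "built:%s", key);
    }

    static const char *get(const char *key) {
        if (cvalid && strcmp(key, ckey) == 0)
            return cval;                   /* cached: serve it */
        build(key, cval, sizeof cval);     /* not cached: build it */
        snprintf(ckey, sizeof ckey, "%s", key);
        cvalid = 1;
        return cval;
    }

    static void modify(const char *key) {
        /* ... write the object to the source of truth ... */
        if (cvalid && strcmp(key, ckey) == 0)
            cvalid = 0;                    /* modified: flush it */
    }

    int main(void) {
        puts(get("page:home"));   /* miss, builds */
        puts(get("page:home"));   /* hit */
        modify("page:home");      /* invalidates */
        puts(get("page:home"));   /* miss again, rebuilds */
        return 0;
    }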

The problem is that implementing caching is a bit of a canary in a coal mine.
If there are problems with the architecture, then trying to add caching into
the mix will make things much more difficult.

I wouldn't say adding caching up front to parts which you know will be
heavily read (or at least adding hooks to make it easier to implement later)
is a waste of time or "Premature Optimisation". The 80-20 rule is alive and
well; just use your judgement.

------
0xcde4c3db
It occurs to me that sharding shares most of these disadvantages. It avoids
the problem of "you no longer read from your source of truth", but the overall
complexity and set of failure modes looks strikingly similar.

I wonder how many sleepless nights have been caused by combining the two.

~~~
rileymat1
I have worked with a couple of systems that used very coarse-grained sharding
at the application level, and I did not notice these drawbacks. I have not
worked with one that did auto-sharding on the back end; that might be
trickier.

------
contingencies
Some people may benefit from the academic exercise: _How many caches are
utilized in serving a typical website?_ (Assume a LAMP-like stack; if your
number is less than about seven, keep thinking.)

------
patsplat
Sometimes it's better to fix the database in production rather than add
another database to production.

------
amelius
We need smart tools that can automatically make programs utilize a cache. That
way, we can have the best of both worlds.

