

IdentityCache: Improving Performance one Cached Model at a Time - jduff
http://www.shopify.com/technology/7617983-identitycache-improving-performance-one-cached-model-at-a-time#axzz2PV7hpXfu

======
TheCloudlessSky
I'm curious: how does Shopify handle database migrations with the cached
models? Do you explicitly invalidate the caches when a migration is run?

The reason I ask is because in the .NET/NHibernate world, this is akin to a
2nd level cache provider. I maintain a Redis caching provider for NHibernate
(<https://github.com/TheCloudlessSky/NHibernate.Caches.Redis>) and I've been
trying to find a decent solution to this problem. When NHibernate fetches a
cached model with mismatching columns etc., it blows up with an exception.

I like the generational approach that is part of Rails 4 but it really only
works for the views. Maybe incorporating some sort of generational identity of
the model's configuration could be used for cache busting?

~~~
camilolopez
Schema migrations are handled by making a hash of the current schema part of
the cache key, so yeah, effectively every time a cached model's underlying
table schema changes, all of its cached entries become invalid.

------
elbee
One risk of not using expiration at all is that if the database is updated but
the after_commit hook doesn't finish (crash, out of resources etc.) then the
cached data remains outdated until the record is updated again (which could be
never). Setting a generous TTL won't increase your load much, but will let
problems like that eventually fix themselves.
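The backstop described above can be sketched as a write-through cache whose
entries carry a generous TTL, so a record left stale by a crashed
after_commit hook eventually expires on its own. The one-week figure and the
in-memory store are illustrative assumptions; in production this would be the
expiry argument on a memcached set.

```ruby
ONE_WEEK = 7 * 24 * 60 * 60

# Tiny in-memory cache with per-entry expiry, standing in for memcached.
class TtlCache
  Entry = Struct.new(:value, :expires_at)

  def initialize
    @store = {}
  end

  # Every write gets a TTL, even if invalidation is normally explicit.
  def set(key, value, ttl: ONE_WEEK)
    @store[key] = Entry.new(value, Time.now + ttl)
  end

  # Expired entries read as misses, forcing a refetch from the database.
  def get(key, now: Time.now)
    entry = @store[key]
    return nil if entry.nil? || now >= entry.expires_at
    entry.value
  end
end

cache = TtlCache.new
cache.set("idc:Product:42", { title: "Old Title" })
cache.get("idc:Product:42")                               # still cached
cache.get("idc:Product:42", now: Time.now + ONE_WEEK + 1) # nil: self-healed
```

The load cost is small because only entries that were never rewritten in a
whole week ever expire this way.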

~~~
hbrundage
Core dev on IdentityCache here. Excellent point, and truth be told we hadn't
considered setting an explicit expiry to let those unavoidable problems fix
themselves. We do have a finite amount of space in memcached, so the LRU
eviction there accomplishes something similar, but gaining complete control
over the expiry duration does make sense for the reasons you listed. That
said, I think I value the flip side of no explicit expiry more: we can use any
corrupt or un-updated information for debugging, which we have done and found
really useful in the past. We're also in a way forced to deal with anything
that might interrupt our after_commit hooks instead of letting the problem
just go away in a day. Hooks firing is also critical for other services we
have (like Elasticsearch) that rely on them, and for which I'd rather not
create other auto-healers.

Thanks for the intelligent suggestion!

~~~
elbee
In some systems the problem is that you can never guarantee that the
after_commit hook will run. This is especially true in multi-server
systems where the cache, database and front-end servers are separated. The
front-end server can update the database and then completely die (power
outage, networking, reboot) before talking to the cache.

On the other hand I can see how you would want to drive out bugs instead of
just sweeping them under the carpet with auto-healing...

~~~
hbrundage
Indeed: it will never be 100%, so perhaps a very, very long TTL on the keys
would be wise. We do end up flushing the cache incrementally if we ever ship a
bug by accident or notice there is more in the cached blob than we want, which
I think accomplishes the same thing.

