
Continuous Cache Warming for Rails - foobar2k
http://stdout.heyzap.com/2012/03/21/continuous-cache-warming-for-rails/
======
oconnore
This should never be an issue. A page load should trigger page generation code
(I'm not a Rails guy, but the view in traditional MVC), not some massive 10+
second operation.

What you should be doing is caching that big, time-consuming job (I guarantee
you aren't timing out someone's browser with template code) at the model
level, and then generating pages off of that cached result. Feel free to
further cache the HTML too, but the bulk of the win is from decoupling long
running jobs from your display code.

If you want to ensure that the user sees the absolute latest, use an AJAX call
to pull the more recent result after serving the previously cached result.
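
The decoupling described above can be sketched in plain Ruby (the class,
method names, and TTL below are all illustrative, not from the article; in a
real Rails app you would likely reach for Rails.cache.fetch plus a background
worker instead):

```ruby
# Cache the expensive computation at the model level; page rendering
# only ever reads the cached result, so requests stay fast.
class ReportModel
  TTL = 300 # seconds; tune to how stale a page you can tolerate

  def initialize
    @cache = {} # key => [computed_at, value]
  end

  # The slow job, decoupled from rendering. A background worker can
  # call this on a schedule so user requests never pay the cost.
  def refresh!(key)
    @cache[key] = [Time.now, expensive_computation(key)]
  end

  # Cheap read used by the view layer; recomputes only when missing or stale.
  def fetch(key)
    at, value = @cache[key]
    return value if at && (Time.now - at) < TTL
    refresh!(key)
    @cache[key][1]
  end

  private

  def expensive_computation(key)
    # stand-in for the 10+ second job
    "report for #{key}"
  end
end

model = ReportModel.new
model.refresh!(:dashboard)                    # warmed ahead of time
page = "<h1>#{model.fetch(:dashboard)}</h1>"  # rendering is now just string work
```

The HTML string itself can still be cached on top of this, but as the comment
says, the bulk of the win is that display code never runs the slow job inline.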

~~~
kellysutton
We had a similar conundrum at LayerVault a few months ago. It can be quite
costly to rebuild file histories quickly. We offloaded the tasks and
denormalized the data a bit, and now our pages scream.

------
briandoll
I feel the need to point out that regardless of how you're "pre-warming" the
cache, you're missing the primary benefit of page caching.

By default, Rails' page caching persists to disk. An immense advantage of this
solution is that you can use your web server to serve these pages directly,
without ever hitting the Rails stack.

You're saving the page generation time, but you're still hitting the Rails
stack + Redis for every single page request, both of which are entirely
unnecessary.

~~~
FooBarWidget
Page caching is only useful if your page looks the same for everybody, e.g. it
has no 'Welcome $LOGGED_IN_USER' bar. Which already rules out 99% of all web
apps.

~~~
davedx
The last big web app I worked on cached the page components. For example, the
"Latest images" section is cached, but the "Logged in user (11 new mails)"
section isn't. (Or as another poster suggests, use AJAX for that stuff).

You can still get away with caching large parts of your pages though.

~~~
bhousel
In Rails that's called "Fragment Caching", which is different from "Page
Caching".
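
Roughly, the distinction in plain Ruby with ERB (the fragment store and keys
here are made up for illustration; Rails' cache view helper handles this for
you inside templates): the shared fragment is rendered once and reused across
users, while the per-user bar is rendered fresh on every request.

```ruby
require 'erb'

FRAGMENTS = {} # stand-in for the fragment cache store

def cache_fragment(key)
  FRAGMENTS[key] ||= yield # render once, then reuse the stored HTML
end

def render_page(user)
  # Same for everybody: safe to fragment-cache.
  latest = cache_fragment('latest_images') do
    ERB.new('<ul><li>img1</li><li>img2</li></ul>').result
  end
  # Personalized: rendered on every request, never cached.
  bar = ERB.new('<p>Logged in as <%= user %></p>').result(binding)
  bar + latest
end

page_a = render_page('alice') # renders both fragments
page_b = render_page('bob')   # reuses the cached "latest images" fragment
```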

------
bigiain
"There are only two hard things in Computer Science: cache invalidation and
naming things." -- Phil Karlton

(though in this case, I think it's safe to assume the second hard thing is
solved - "kludge" seems to fit just fine)

------
trustfundbaby
_It should never be used in production or for user-facing or critical client
purposes_

_script/console production < worker/cache_page.rb_

?

 _Our slow endpoint was on a back-end administrative page only; faking the
session data in curl would have been annoying. Also, it was exceeding the
timeout limits of our production server_

I think you have a bigger problem here.

You're right, this is very hacky; it makes me itch, but I'm not sure I have a
better solution. Why not just use wget to load and cache the page (passing in
a unique parameter that you use to expire the cache and skip the filters)?

~~~
foobar2k
I don't think that's a better solution, it's similar but less integrated with
the app stack.

~~~
trustfundbaby
Good point ... how about just throwing in rufus-scheduler and using excon to
make the request?

------
dools
As far as I know, Varnish will let a single request through to generate the
cache, hold subsequent requests in a queue, and then serve everyone from the
same cached data.

If your page takes longer than a second or two to load when uncached, you
should really be moving towards a batch processing model.
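
The coalescing behavior described above can be sketched in plain Ruby
(Varnish does this internally; the class and keys below are illustrative).
The first request for a key generates the page; concurrent requests for the
same key block and are all served that one result instead of each triggering
a regeneration.

```ruby
class CoalescingCache
  def initialize
    @lock = Mutex.new
    @cond = ConditionVariable.new
    @cache = {}
    @inflight = {}
    @computations = 0 # how many times the slow path actually ran
  end

  attr_reader :computations

  def fetch(key)
    @lock.synchronize do
      loop do
        return @cache[key] if @cache.key?(key)
        break unless @inflight[key]
        @cond.wait(@lock) # another request is already generating this page
      end
      @inflight[key] = true
    end

    value = yield # slow page generation, done outside the lock
    @lock.synchronize do
      @cache[key] = value
      @inflight.delete(key)
      @computations += 1
      @cond.broadcast # wake the queued requests
    end
    value
  end
end

cache = CoalescingCache.new
threads = 5.times.map do
  Thread.new do
    cache.fetch('/slow-page') do
      sleep 0.1 # stand-in for an uncached render
      '<html>expensive</html>'
    end
  end
end
results = threads.map(&:value)
# all five requests get the same page, generated only once
```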

------
yummyfajitas
I ran into a similar problem since some of my pages were taking upwards of 5
seconds to render. But rather than pre-warming the cache I opted for static
generation. Here is a small library that does this for django:

<https://github.com/stucchio/Stiletto>

It's nearly always better for nginx to serve up pre-gzipped content than for
nginx to ask django/rails to ask memcached for the same content. It reduces
your CPU load as well, so you need fewer servers to scale up.

On EC2, make absolutely sure you are storing the pre-rendered files in
ephemeral storage (/mnt, not /var) - EBS-backed storage is slow.
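
The generation step might look something like this in plain Ruby (the paths
and file names are illustrative; nginx's gzip_static module can then serve
the .gz copy directly to clients that accept gzip, with no app server or
memcached in the path):

```ruby
require 'zlib'
require 'tmpdir'

# Render each page once, write it to disk, and write a pre-compressed
# copy alongside it so the web server never gzips on the fly.
def write_static_page(dir, name, html)
  path = File.join(dir, "#{name}.html")
  File.write(path, html)
  Zlib::GzipWriter.open("#{path}.gz") { |gz| gz.write(html) }
  path
end

Dir.mktmpdir do |dir|
  html = '<html><body>pre-rendered</body></html>'
  path = write_static_page(dir, 'index', html)
  # the gzipped copy round-trips to the same bytes
  unzipped = Zlib::GzipReader.open("#{path}.gz", &:read)
  raise 'mismatch' unless unzipped == html
end
```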

------
mnutt
We used to do this at Gilt. The problem we found was that eventually the
warmer wouldn't be able to get through every page of the site before the cache
expired, and users would start falling through to the rails instances. Despite
there being hundreds of rails instances, this would kill the site in seconds.

We eventually moved to decoupled services, which were able to do much smarter
caching.

------
spicyj
Unless I'm misunderstanding something, script/runner could be used instead of
script/console. Then the whole Rails environment is properly loaded but in a
manner that's actually meant to be used for prewritten scripts instead of in
an interactive environment.

~~~
agius
script/runner does not have the app.get facility. That's only available in
script/console.

~~~
wycats
app.get is using the same facility as integration testing
([https://github.com/rails/rails/blob/master/actionpack/lib/ac...](https://github.com/rails/rails/blob/master/actionpack/lib/action_dispatch/testing/integration.rb)).

Check out the code that instantiates it at
[https://github.com/rails/rails/blob/master/railties/lib/rail...](https://github.com/rails/rails/blob/master/railties/lib/rails/console/app.rb)

------
rubyrescue
Incidentally, the irb trick to prevent tons of output works just fine if you
do ;0

saves you a few keystrokes...

