

Serving static files with Django and AWS - going fast on a budget - brox
http://eventseer.net/p/thomas_brox_roest/whiteboardentry/13/

======
ajross
_Web servers such as lighttpd and Apache are blazingly fast at serving static
pages. Reading a file from disk and sending its contents back to the browser
is as simple as it gets._

Except that it's not. One of the reasons the classic web servers are so
amazingly fast with static files is that they, along with the operating
systems they run on, have spent significant time trying to optimize this
process. For Linux, for example, look at the sendfile(2) system call, or the
TCP_CORK socket option. These were expressly designed to permit a userspace
program to get a file through the kernel and onto the wire with as few CPU
cycles and memory copies as possible.
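
To make that concrete, here is a minimal sketch in Python of what using these two facilities might look like on Linux. The function name `send_file_corked` and the header argument are made up for illustration; `os.sendfile` and `socket.TCP_CORK` are the real Python wrappers, and the cork step is skipped on platforms that do not expose it.

```python
import os
import socket


def send_file_corked(conn: socket.socket, header: bytes, path: str) -> int:
    """Send `header`, then the file at `path`, over a connected TCP socket.

    Hypothetical helper: the file body goes through sendfile(2), so the
    kernel copies it to the socket without a round trip through userspace.
    Returns the number of file bytes sent.
    """
    cork = getattr(socket, "TCP_CORK", None)  # only defined on Linux
    if cork is not None:
        # Hold back partial segments so header and body are packed together.
        conn.setsockopt(socket.IPPROTO_TCP, cork, 1)
    try:
        conn.sendall(header)
        sent = 0
        with open(path, "rb") as f:
            size = os.fstat(f.fileno()).st_size
            while sent < size:
                n = os.sendfile(conn.fileno(), f.fileno(), sent, size - sent)
                if n == 0:  # file shrank underneath us
                    break
                sent += n
        return sent
    finally:
        if cork is not None:
            # Uncork: flush whatever is still queued.
            conn.setsockopt(socket.IPPROTO_TCP, cork, 0)
```

A real server would do this on non-blocking sockets inside an event loop; the blocking version is only meant to show the order of the calls.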

One of the most frustrating things about the Web 2.0 crowd (from the
perspective of curmudgeons like me) is that they really don't have a clue
about any of this complexity. They just figure that they'll stuff everything
into a Django/Rails/whatever request and scale up later. Then when they run
into trouble, they end up turning to tools like Apache as black boxes and
designing Rube Goldberg apparatuses around them when they really should be
looking at the problem more directly.

Really, folks: those low level APIs are your friends. They're not nearly as
scary as they look. Even if you end up with an off-the-shelf solution,
knowledge of this stuff can only be good for you.

------
briansmith
0\. Make sure your HTTP responses return the correct value in the Vary header.
Otherwise, everything else will fall apart.

1\. Ensure that all your HTTP responses have proper ETags, and make sure you
process the If-Match and If-None-Match request headers appropriately to avoid
doing unnecessary work.

2\. Put a simple caching reverse proxy (e.g. Squid, Varnish, mod_disk_cache)
in front of your application. Then, tune the cache-control directives to allow
the cache to return cached responses without hitting the back end. For this to
work for HTTPS responses, you need to put an SSL-terminating (HTTPS-to-HTTP)
proxy in front of the caching proxy.

3\. Add a system like the one described in the article, that immediately
purges entries from the cache when they are updated in the back-end.

4\. Purchase a caching, SSL-enabled, load-balancer appliance (e.g. Big-IP)
that is built to do all of the above nearly automatically.

Most people never need to go beyond step #2.
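
As a rough illustration of step 1, the If-None-Match handling can be sketched in a few lines of framework-free Python. Everything here is an assumption for the sake of the example: the function names, the hash-based ETag, and the Cache-Control value are not from the article, and a real Django app would more likely lean on the framework's own conditional-request support.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the response body (illustrative choice)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(request_headers: dict, body: bytes):
    """Return (status, headers, body), answering 304 when the client's
    If-None-Match already names the current ETag."""
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        # Assumed policy: let a shared cache reuse the response for a minute.
        "Cache-Control": "public, max-age=60",
    }
    raw = request_headers.get("If-None-Match", "")
    candidates = [t.strip() for t in raw.split(",") if t.strip()]

    def opaque(tag: str) -> str:
        # If-None-Match uses weak comparison, so ignore a W/ prefix.
        return tag[2:] if tag.startswith("W/") else tag

    if "*" in candidates or etag in (opaque(t) for t in candidates):
        return 304, headers, b""
    return 200, headers, body
```

In practice the ETag should come from something cheaper than hashing the rendered body (a version number or modification time), otherwise the 304 saves bandwidth but not back-end work.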

------
babul
Being into Django+AWS development at the moment (and using
<http://eucalyptus.cs.ucsb.edu/> for a private cloud with AWS as an extension
as AWS is not cost effective in many ways), I find this a good article to
improve performance in a basic way. Many people seem to forget that not all
content needs to be dynamic, and some basic modifications can seriously improve
performance.

It is great to see articles that include code snippets and architecture
diagrams. People should use direct examples more often instead of trying to
describe things in words.

Lastly, I'd suggest using nginx (<http://wiki.codemongers.com/Main>) as I
found it to be a serious improvement on lighttpd for many of my projects.

~~~
smoody
I've heard great things about nginx. In relation to this article, how would
you implement the lua script presented in this article with nginx? I know that
Perl can be embedded in the server, but I'm not sure if that would be the
proper solution for deciding which files to serve statically and which to
serve dynamically at the httpserver level.

~~~
jsn
you probably don't need perl for that. afaics from the snippet, it fits into
nginx's rewrite module. maybe something like this:

if ($http_cookie ~* "auth=1") { proxy_pass http://backend; break; }

if (!-f $request_filename) { proxy_pass http://backend; break; }
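
For anyone who finds the config terse, the same decision can be spelled out as a small Python sketch. The function name `route`, the `docroot` argument, and the "static"/"backend" labels are invented for illustration, and the path containment check is an extra precaution that the two nginx rules above do not express.

```python
import os
import re

# Same test as the first rule: case-insensitive match anywhere in the cookie header.
AUTH_COOKIE = re.compile(r"auth=1", re.IGNORECASE)


def route(cookie_header: str, docroot: str, uri_path: str) -> str:
    """Return "backend" or "static" for one request (hypothetical helper)."""
    # Rule 1: logged-in users always get the dynamic page.
    if AUTH_COOKIE.search(cookie_header or ""):
        return "backend"

    root = os.path.realpath(docroot)
    candidate = os.path.realpath(os.path.join(root, uri_path.lstrip("/")))

    # Extra precaution: never serve anything outside the document root.
    if os.path.commonpath([root, candidate]) != root:
        return "backend"

    # Rule 2: no pre-rendered file on disk, so let the backend build it.
    if not os.path.isfile(candidate):
        return "backend"

    return "static"
```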

