

Django-medusa: Rendering Django sites as static HTML - mtigas
http://v3.mike.tig.as/blog/2012/06/30/django-medusa-rendering-django-sites-static-html/

======
natrius
_"The process that actually generates the output simply uses (or abuses)
Django’s internal testclient to request each URL and store the resulting
data"_

It warms my heart to see another example of test client abuse. We used the
test client to implement a pure Python ESI processor[1]. We ran it in
production for a while, but no one should ever do that. Varnish is the answer.
The code was still running on our development machines when I left.

[1] <https://github.com/armstrong/armstrong.esi/>

------
ashray
I just don't see the point to this kind of stuff.. =/

For high traffic apps, Varnish is the answer as you don't hit the application
layer.

If you think that's too complicated, try nginx-memcached - also an excellent
solution.

If not that, try django's template caching with memcached - also extremely
fast but will hit the application layer.

If you're in some shared hosting environment (you probably are too small still
to warrant this kind of aggressive caching on static assets - but hey,
efficiency never hurt anybody :P) without access to memcached, use django's
cache backend with a file-based cache. It does almost exactly what this does, and
you don't have any additional overhead.
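For reference, a minimal sketch of the file-based cache setup mentioned above. The backend path is Django's built-in file cache; the directory and timeout are placeholder values, not anything given in this thread.

```python
# settings.py fragment (sketch): Django's built-in file-based cache backend.
# LOCATION is an assumed path; it must be a directory the web process can write to.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.filebased.FileBasedCache",
        "LOCATION": "/var/tmp/django_cache",
        "TIMEOUT": 60 * 60,  # seconds a cached entry stays valid (assumed value)
    }
}
```

Requests still pass through the application layer with this setup; only the work of rebuilding each cached response is skipped.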

Beats me why people are re-inventing the wheel - or am I missing something ?

~~~
mtigas
It was an itch I wanted to scratch. I’d been tempted to convert my entire
(Django-based) blog over to something like Hyde[1], but wanted a bit more
flexibility than the framework provided.

In most applications I work on, I lovingly use and abuse Memcached, Redis, and
Varnish. If you’re working on an application that warrants using a live
website and the whole application server shebang, then yes, I’d agree with
you.

But for something like my blog and other non-dynamic websites which don’t
update very much at all, I’m not sure if I see a pressing _need_ for an
application server. My blog previously ran varnish-nginx-uwsgi-django, but I
was moving off of a VPS and it was the last thing left on that server. I got
curious.

In the case of something like the L.A. Times’ Data Desk[2] projects (who use
their own django-bakery app), if some views are very expensive/slow to
generate, you can offload the work from the application server and do it in
advance. (This makes sense if you want to just render everything out on a fast
workstation or if you have a local database of several hundred gigabytes that
you don’t want a live server querying to crunch the data.) It’s not out of the
question to pregenerate HTML pages, JSON for visualizations, and simple image
files (generated in PIL).

In any case: it’s not so much a question of “high traffic apps” as much as the
tradeoff between (computation cost + server maintenance cost) and (app that is
server-side dynamic or updates frequently). Most people don’t want to
configure and maintain an app server (with cache layers and all) for a simple
app and those that don’t seem to have uptime issues the moment they get any
legitimate traffic: see [3].

So:

* I decided I didn’t want to maintain an app server for my blog, and my historical average for updates is about once every four weeks (or even more infrequent).
* People seemed to be big fans of Jekyll/Hyde, Movable Type’s static publishing mode[4], WP Super Cache, etc.
* I felt a Django-friendly analogue to those would be cool.
* Like any developer tinkering with their own blog, _there didn’t have to be a point_.

[1]: <http://pypi.python.org/pypi/hyde/>
[2]: <http://datadesk.latimes.com/>
[3]: <http://inessential.com/2011/03/16/a_plea_for_baked_weblogs>
[4]: <http://daringfireball.net/linked/2011/03/18/brent-baked>

~~~
ericingram
I'm wondering, what kind of dynamic features were you interested in for a blog
that you update once per month? Why not just edit static files on S3?

~~~
mtigas
I'm actually coming up on ten years of having the same blog: waaaayyyyy back
when I started, I was editing all my pages manually. (Though blog posts didn’t
have permalinks, it was just a growing massive "list page" that I’d break off
and paginate every so often.) It started to become a pain in situations where
I wanted to modify some basic aspect of _every_ page: I’ve consistently gotten
the itch to either re-architect or re-design my blog annually.

You really can’t beat a system that uses templates.

I'd tried my own Javascript-based content system in the past (where everything
is based on one HTML page and JS loads the page content), but those add a bit
more client-side complexity (not to mention search engine reachability).

I think the ability to regenerate an entire 500+ page static HTML site is
pretty powerful and useful. (Also: who wants to manually update date-based URL
paths every time there’s a new thing? <http://v3.mike.tig.as/blog/2012/06/30/>
<http://v3.mike.tig.as/blog/2012/06/> <http://v3.mike.tig.as/blog/2012/> etc.)
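To make the date-based archive point concrete, here is a rough sketch of deriving the day, month, and year paths from post dates instead of maintaining them by hand. The function names and the `/blog/` prefix default are illustrative assumptions, not part of django-medusa.

```python
from datetime import date


def archive_paths(day, prefix="/blog/"):
    """Return the day, month, and year archive paths for one post date."""
    return [
        f"{prefix}{day:%Y/%m/%d}/",
        f"{prefix}{day:%Y/%m}/",
        f"{prefix}{day:%Y}/",
    ]


def all_archive_paths(days, prefix="/blog/"):
    """Collect the de-duplicated, sorted archive paths for many post dates."""
    paths = set()
    for day in days:
        paths.update(archive_paths(day, prefix))
    return sorted(paths)
```

Feeding every post date through `all_archive_paths` gives the full list of archive pages that would need regenerating after a new post.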

EDIT: As to your first question wrt "dynamic features I wanted": CMS and full
control over my site's behavior and templates. It’s Django-based, too, so I
can theoretically extend it with any features as necessary. (I also have
plenty of "non-published" content that I can view on the local, "hot type"
development server of my site, but the "renderer" file is only configured to
upload blog posts marked as "live". I find that feature pretty useful.)

~~~
ashray
I see how you can use this :) It's actually a pretty great idea for some
scenarios so kudos on that!

TBH, I still think Django is overkill for a static site, there are templating
systems and frameworks that would be far lighter (flask + jinja ?) -
especially given that you really don't need any dynamic usage at all.

Still, great work on creating and releasing a tool that makes your (and
possibly others..) life easier!

------
bifrost
That's kinda neat, I suspect this will be used a lot. I did something similar
to this 5-6 years ago with wget but this is a lot more elegant IMHO.

~~~
mtigas
It’s very wget-like, due to the use of the Django HTTP test client — just
_slightly_ more elegant due to the programmatic definition of what gets
scraped/rendered and the addition of the "direct to S3" backend, which allows
arbitrary mimetypes.
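The general "request each path and store the result" idea can be sketched as below. This is not django-medusa's actual code: `output_file_for` and `render_paths` are hypothetical names, and `fetch` is left as a plain callable so the example runs without Django. With Django installed, `fetch` could wrap the test client, for instance `lambda p: django.test.Client().get(p).content`.

```python
import os
import tempfile


def output_file_for(url_path):
    """Map a URL path to a relative file path, wget-style."""
    relative = url_path.lstrip("/")
    # Directory-style URLs ("/blog/2012/") become ".../index.html".
    if relative == "" or relative.endswith("/"):
        relative += "index.html"
    return relative


def render_paths(paths, fetch, output_dir):
    """Fetch each path and write the response body under output_dir.

    `fetch` is any callable that takes a URL path and returns bytes.
    """
    written = []
    for path in paths:
        target = os.path.join(output_dir, output_file_for(path))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as handle:
            handle.write(fetch(path))
        written.append(target)
    return written


if __name__ == "__main__":
    # Stand-in fetcher so the sketch runs without Django installed.
    fake_site = {"/": b"<h1>home</h1>", "/blog/2012/06/": b"<h1>June</h1>"}
    with tempfile.TemporaryDirectory() as out:
        for name in render_paths(sorted(fake_site), fake_site.__getitem__, out):
            print(name)
```

Keeping the list of paths explicit is what separates this from a crawl: only what is listed gets rendered.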

Glad you like it.

------
55pts
Very cool project, thanks for open sourcing it. I was looking for this type of
library this week and found medusa and aymcms.

Does it handle images? How about multiple sites/subdomains?

~~~
mtigas
For my blog, static files (anything stored in an app's "static" directory,
basically[1]) and the like are handled transparently through django-storages' S3
support [2]. (The STATICFILES_STORAGE option in settings.) If you’ve used
Django's staticfiles framework before, it’s pretty much plug-and-play.

For more dynamic file storage (say, using FileField or ImageField in a model),
I believe django-storages would work, too. (Make sure you also configure
django-storages with the DEFAULT_FILE_STORAGE option set to S3.)
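A rough sketch of what those two settings could look like, assuming the boto-based S3 backend that django-storages shipped around the Django 1.4 era. The bucket name is a placeholder and the credentials are read from environment variables as an assumption. Newer releases of both projects have changed these names, so check the current documentation before copying this.

```python
import os

# settings.py fragment (sketch): send both static files and uploaded media to S3.
# The dotted path is the boto-era django-storages backend (assumed for this setup).
STATICFILES_STORAGE = "storages.backends.s3boto.S3BotoStorage"
DEFAULT_FILE_STORAGE = "storages.backends.s3boto.S3BotoStorage"

# Placeholder bucket name; credentials come from the environment (assumption).
AWS_STORAGE_BUCKET_NAME = "example-bucket"
AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID", "")
AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY", "")
```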

Assuming you’re managing your site via a local dev server (or a server that
"hosts" the "hot type" version of the site), any time you "upload" a file to
your local server, it'll actually upload to S3 (and any calls to "field.url"
will actually map to the S3 URL). Not sure how well it'll work in all use
cases: I haven't actually used FileField or ImageField myself in the
django-medusa+django-storages use case, but I _have_ used both separately so I’m
fairly sure this is possible.

This is a pretty darn good question though, so I’ll likely make a follow-up
blogpost with a more comprehensive walkthrough regarding handling staticfiles
and FileField/ImageField. Sometime in the near future.

Multiple sites/subdomains is a bit more complicated. I’d say you should
probably use separate Django instances for each and render them separately.
(For S3, you’d need to use separate buckets, anyway.) If they _need_ to share
data, you can configure multiple Django settings.py configurations for each
site but still use the same source tree and local database. (See the Django
sites framework: [3])

[1]: <https://docs.djangoproject.com/en/dev/howto/static-files/>
[2]: <http://django-storages.readthedocs.org/en/latest/backends/amazon-S3.html>
[3]: <https://docs.djangoproject.com/en/1.4/ref/contrib/sites/>

------
megaman821
Why all these static site generators? Just use Varnish, it is the static site
generator for any and all frameworks.

~~~
rglullis
Do you have access to Varnish in your standard run-of-the-mill shared host
service?

~~~
nilved
Or GitHub pages/Amazon S3/etc.

