

How I Made my Blog 2.3x Faster - polymathist
http://blog.alexbrowne.info/how-i-made-my-blog-faster/

======
exabrial
"Ruby/Rails" <- There's the problem

I know HN loves these frameworks, but seriously, the JVM is waaay faster...
Too bad there wasn't a dead simple JVM or Java based blog engine.

~~~
Cowen
Did you catch that afterwards he still wasn't using the JVM?

Any time you're serving up gzipped static pages instead of loading _any_
language runtime, JVM or not, you're going to see huge speedups.
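
A minimal sketch of the "gzipped static pages" idea, assuming the page is rendered once at build time and the compressed bytes are what the server hands out. The `precompress` helper and the sample page are hypothetical, not taken from the article.

```python
import gzip


def precompress(html: str) -> bytes:
    """Gzip a rendered page once, ahead of time.

    A static file server can then send these bytes as-is with
    'Content-Encoding: gzip' instead of running any application code.
    """
    return gzip.compress(html.encode("utf-8"), compresslevel=9)


# Hypothetical page body; repetitive HTML like this compresses very well.
page = "<html><body>" + "<p>Hello, static world.</p>" * 200 + "</body></html>"
blob = precompress(page)
print(len(page.encode("utf-8")), "->", len(blob), "bytes")
```

The compression cost is paid once per deploy rather than once per request, which is where much of the saving comes from.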

~~~
exabrial
My apologies for "implying" that. I facepalmed when I saw, "slow" and
"ruby/rails". So I digressed into a rant about the lack of a simple blog
engine for Java, which would likely run much faster than a Rails solution. I'm
fully aware his solution is to just push static files, and I even bet he's
having Apache do Linux sendfile, which would pretty much idle the CPU.

Better?

~~~
chc
What do you find lacking in the countless existing blog engines for the JVM?
It seems to me that hosting blogs on the JVM is about as well-supported as
hosting them with Ruby.

------
csense
Everybody says it's the bees' knees, but Ruby on Rails is terrible. I speak
from experience, I administer non-public installations of Redmine and Gitlab.

First of all, the language syntax is awful and incomprehensible.

Then there's the Byzantine deployment process (although, to be fair, it seems
endemic to modern web development; it's just more obvious in Ruby since most
Ruby programs are web apps).

Finally, the startup for a Ruby application is _incredibly slow_ -- worse
than Java ever was, even in the Bad Old Days running 1.2 on a 300-ish MHz
Pentium II (which was a state-of-the-art desktop at the time).

Hint: If someone makes a post to the Github issue tracker for your web
application saying "Increase timeout from 30 seconds to 300 seconds" [1], a
one-line patch changing a number from 30 to 300 does _not_ count as solving
the problem. There is _no reason_ that a web app should _ever_ have an
expected load time greater than 30 seconds.

[1] <https://github.com/gitlabhq/gitlabhq/issues/694>

------
rb2k_
While I personally like Jekyll, and the added simplicity when it comes to
hosting, there should be almost no performance difference between a
"dynamically" generated blog based on Rails/Sinatra behind Cloudflare and
static hosting on S3.

The only user generated content on a blog are usually comments. With just a
bit of added caching headers, I'd say that 99+% of all page hits on a blog can
be served out of Varnish. You can even use rack-cache if you don't mind the
performance penalty. If you're ok with using e.g. Disqus (as Octopress does)
for comments, you can serve 100% of them and just purge the cache when adding
new content. This would probably also work when doing a purge for new
comments. This way, you have to do the dynamic computation exactly once per
page and you're done.
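
The "compute once per page, purge on new content" pattern can be sketched roughly like this. It is a toy in-process stand-in for what Varnish or rack-cache would do at the HTTP layer; `PageCache`, `render_calls`, and the `posts` dict are hypothetical names used only for illustration.

```python
from typing import Callable, Dict


class PageCache:
    """Toy full-page cache: render each path once, until it is purged."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render
        self._store: Dict[str, str] = {}
        self.render_calls = 0  # counts how often the "dynamic" code ran

    def get(self, path: str) -> str:
        if path not in self._store:
            self.render_calls += 1
            self._store[path] = self._render(path)
        return self._store[path]

    def purge(self, path: str) -> None:
        # Called when a post is added or edited (or a new comment arrives).
        self._store.pop(path, None)


posts = {"/hello": "first draft"}
cache = PageCache(lambda path: f"<h1>{posts[path]}</h1>")

for _ in range(1000):          # a thousand hits, one render
    cache.get("/hello")

posts["/hello"] = "edited"     # new content ...
cache.purge("/hello")          # ... so purge, and the next hit re-renders
fresh = cache.get("/hello")
print(cache.render_calls, fresh)
```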

Adding the CDN properties of CloudFront to S3 is nice, but Cloudflare seems to
provide a very similar solution for "regular" sites.

I'd say this is similar to comparing AOT compilation (Jekyll) vs JIT (dynamic
generation + caching).

~~~
johnbellone
I converted my blog[1] to be static this past weekend using all AWS
(S3+CF+Route53) and it went relatively painlessly. But I decided to use
middleman instead, and after using that I don't think I'll be going back to
Jekyll.

I still have a Github.com pages account that uses Jekyll because it's free, but
since it's so damn cheap I may convert this to the AWS stack as well.

The largest problem I was running into was permissions and invalidating the CF
CDN. I am thinking of rolling a simple Markdown editor using node-webkit and
scripting up some things to automate everything.
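
One way the invalidation step might be narrowed down is sketched below: compare local file hashes against the ETags already stored in the bucket, and only invalidate paths whose content changed or disappeared. This is not taken from any specific tool. It assumes the remote ETag is the plain MD5 hex digest of the object, which holds for simple (non-multipart) S3 uploads but not in general, and `paths_to_invalidate` is a hypothetical helper that only computes the list; sending it to the CDN is left out.

```python
import hashlib
from typing import Dict, List


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def paths_to_invalidate(local: Dict[str, bytes],
                        remote_etags: Dict[str, str]) -> List[str]:
    """Return CDN paths whose cached copy is stale.

    local:        object key -> file contents about to be deployed
    remote_etags: object key -> ETag currently stored (assumed to be MD5 hex)
    """
    stale = []
    for key, data in local.items():
        # Brand-new keys were never cached, so only changed ones count.
        if key in remote_etags and remote_etags[key] != md5_hex(data):
            stale.append("/" + key)
    for key in remote_etags:
        if key not in local:  # deleted locally, still cached at the edge
            stale.append("/" + key)
    return sorted(stale)


local_files = {
    "index.html": b"new home",
    "about.html": b"about",
    "new-post.html": b"hi",
}
remote = {
    "index.html": md5_hex(b"old home"),
    "about.html": md5_hex(b"about"),
    "gone.html": md5_hex(b"x"),
}
stale_paths = paths_to_invalidate(local_files, remote)
print(stale_paths)
```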

[1]: <http://thoughtlessbanter.com>

~~~
HeyImAlex
I wrote a deployment script for s3/cf/r53 in Python last week. It's pretty
janky right now and is very opinionated on a lot of things, but it handles
invalidation requests for you and the code is simple enough to easily modify
to your liking.

<http://www.github.com/heyimalex/s3tup>

------
greggman
I don't keep up on what various platforms offer but wasn't the original
movabletype a static site generator? You entered a new post, clicked generate,
it made a bunch of pages. It was only later that they added a dynamic option.

Blogger was static too. It was just a cloud based DB. You gave it an FTP
name/pass and it logged into your site and uploaded the new files.

------
RyanZAG
Web 3.0 - static content? ;)

~~~
lumberjack
That or its antithesis of sorts, i.e. websites that are rendered almost
entirely on the client side.

~~~
debacle
Websites are already rendered entirely client side.

~~~
RyanZAG
I think he is referring to having the clients download a copy of the database
and a big chunk of javascript to create views and forms to use the database.
Syncing the database with the server at specified intervals, etc.

Maybe that can be Web 4.0 - Any ideas for Web 5.0? ;)

------
tomfakes
I recently wrote a blog post about how Rack::Cache and ETag work
<http://blog.craz8.com/articles/2012/12/19/rack-cache-and-etags-for-even-faster-rails>,
which is particularly important to know for Heroku based apps.

The key part is that, for public content (and blog posts tend to be public),
the Rack Cache can be used to serve these pages directly from the cache store
(usually Memcache) with minor database traffic needed, _even for people who
have never seen this content_.

That last part was the surprise for me - surely ETags are only used by return
visitors! Rack Cache makes ETags work for new visitors too.
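
Roughly, the mechanism looks like the sketch below. It is a simplified Python model of a gateway cache doing ETag revalidation, not Rack::Cache's actual code; `App`, `GatewayCache`, and the timestamp-based ETag are assumptions made for illustration. The cache holds the body and, on each request, asks the app only "is this ETag still current?". The app can answer 304 after a cheap lookup, and the cached body goes out even to a visitor who has never sent an ETag.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(updated_at: str) -> str:
    return '"' + hashlib.md5(updated_at.encode("utf-8")).hexdigest() + '"'


class App:
    """Stand-in for the blog app: cheap freshness check, costly render."""

    def __init__(self) -> None:
        self.updated_at = "2012-12-19T10:00:00Z"  # hypothetical post timestamp
        self.full_renders = 0

    def handle(self, if_none_match: Optional[str]) -> Tuple[int, str, str]:
        etag = make_etag(self.updated_at)  # the "minor database traffic"
        if if_none_match == etag:
            return 304, etag, ""           # nothing rendered
        self.full_renders += 1
        return 200, etag, f"<article>version {self.updated_at}</article>"


class GatewayCache:
    """Sits in front of the app and revalidates on behalf of every visitor."""

    def __init__(self, app: App) -> None:
        self.app = app
        self.stored: Optional[Tuple[str, str]] = None  # (etag, body)

    def fetch(self) -> str:
        known_etag = self.stored[0] if self.stored else None
        status, etag, body = self.app.handle(known_etag)
        if status == 304 and self.stored is not None:
            return self.stored[1]          # served from the cache store
        self.stored = (etag, body)
        return body


app = App()
cache = GatewayCache(app)
first = cache.fetch()    # first visitor: full render, body stored
second = cache.fetch()   # a *new* visitor with no ETag: still no render
renders_before_edit = app.full_renders

app.updated_at = "2012-12-20T09:00:00Z"   # post edited
third = cache.fetch()                     # ETag mismatch, rendered again
```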

I think I can get my blog to run almost as fast as a static site, and I'm
working towards getting that done and documenting it as I go.

(In addition, Heroku seems to be adding Varnish headers to my responses. They
say they don't use Varnish in Cedar apps, but this is clearly not correct)

------
timmillwood
I was looking to move my Drupal site to Jekyll but quickly got bored. I
therefore moved the site to Sinatra and now Rails. Yes, it's slower than Jekyll,
and the Rails site is slower than the Sinatra one, but it's more fun.

~~~
neumann_alfred
re: "more fun". Yes, having a faster site that uses less resources is great,
and very probably important for those seeking audiences as large as possible.
But coding my own CMS and making it just like I want it (minus where my
abilities limit me in that) is something I never would go back on. It's simply
so much more fun than, say, just using wordpress (not to mention something
even more basic).

Sure, my site is just "my stuff", quotes and links I collect, maybe a rambling
here and there. Therefore my comment is not quite on-topic, it's not the same
kind/league of site. But I just can't find anything exciting in pure speed, I
want something that I like when I look at it and use it, not something I like
because I know it's very popular. I like having variable and filterable views
on content, user preferences, etc. too much.

I don't want to pretend I'm some kind of artist; I'm a shoddy coder, and not a
great designer either. My CMS is unusable for anyone except me I'm sure. But
still I take pride and enjoyment in whatever it is I'm doing, and at the very
least I would encourage everyone to make something that is about ideas and
features more than performance, even if as a secondary/private site.
Minimalism is sometimes overrated. It may be more effective when dealing with
many people, but it's also a bit sterile. So maybe do both, one for business,
the other for inspiration.

~~~
CGamesPlay
Agree. My first reaction when reading the article was "your blog was massively
overengineered". And while it was (after all, it got replaced with static HTML
pages), I'm sure it was fun measuring and tweaking the performance and
learning about building a scalable infrastructure.

------
arrowgunz
Isn't it obvious that static HTML is served faster than dynamically generated
HTML? What's so new about that? Please let me know if I am missing something.

~~~
ahoge
It's actual data from one actual (and very common) use case.

Also, the document itself is just one fairly small piece of the whole puzzle.

------
EwanG
Ummm... not to be the wet blanket, but what business advantage is there to
your blog loading 2.3x (or even 5x) faster? I understand you may occasionally
be adding content that helps plug your work, but in general I am less
concerned about the speed of a long article loading than I am the content
being worth the much longer time it will take to read.

Maybe it's just me?

~~~
polymathist
There's not really a tangible business advantage. Sure, I put a brief plug in
some of my posts, but that's not really the point. For me the whole point was
just to explore new technology and push it as fast as it can go. I learned a
lot from the process, I found it personally intriguing, and maybe I can even
use some of what I learned in future business ventures. Other people might be
able to take away something from it as well, however small.

(I know it's a pretty hefty post, hence the tl;dr at the top. No one is
forcing you or even asking you to read the whole thing)

------
lzm
Wow. I'm actually impressed by the load speed of that page.

As someone who lives in South America I've gotten used to sites loading in 2+
seconds, but this one feels almost instantaneous.

------
viseztrance
I'm sure this was a fun exercise but I see this just as a limitation of
Heroku. Normally you could've switched to full page caching in your Rails app
and gotten the same results.

~~~
polymathist
True, but there's still an advantage to using a CDN in terms of worldwide
performance.

------
dave456
From previous HN threads on Static Site Generators I have seen people
recommend Punch and Nanoc. How do those compare with Jekyll?

~~~
jacquesm
Jekyll is slow as molasses on anything over the trivial level, I haven't used
Punch or Nanoc so I can't comment on those.

------
jevin
I guess the price difference between the two setups is huge. I'd love to hear
more about that.

~~~
polymathist
Actually the price will end up being pretty similar. It was more or less free
when I was hosting through Heroku (pennies a month, and that was just for
images on CF). It's hard to predict exactly (depends on traffic) but I'd be
surprised if it costs me more than $1 per month with the new setup. Will find
out for sure at the end of the billing cycle.

~~~
polymathist
Update: For the month of December, Amazon charged me $1.06 for S3 and
CloudFront combined. This includes the ~million requests I performed during
benchmarking.

