
Is Google (and mod_pagespeed) the Poor Man's CDN? - narcissus
http://www.codefromaway.net/2011/12/is-google-and-modpagespeed-poor-mans.html
======
SquareWheel
This article really doesn't have anything to say other than that they found a
new feature and we should install it. What was the speed difference? Did it
alleviate load on their servers?

I spoke to my web host about mod_pagespeed before; they reported that it was
far too buggy to be used just yet. Just an anecdote, but I happen to like my
host a lot.

------
127001brewer
How is this different from simply creating a sub-domain on your server for
CDN-like functions?

For example, Google's "Page Speed, Performance Best Practices" recommends
setting up your own sub-domain(s) to serve static content[1]. Yahoo!
recommends using a CDN service provider[2], but the general idea of using
other domains to serve static content is the same.

Also, some static content can be referenced from third-party hosts, such as
jQuery on Google Code or Twitter's Bootstrap CSS on GitHub.

As an aside, I'm surprised that, seemingly, a large number of web developers
have never heard of - or read - the performance best practices from Google or
Yahoo!.

1\. <http://code.google.com/speed/page-speed/docs/rtt.html#ParallelizeDownloads>

2\. <http://developer.yahoo.com/performance/rules.html#cdn>

~~~
LogicX
It's different; it seems you're confusing CDNs with parallelizing downloads.
These are not one and the same.

They're different because the recommendation to create one or more subdomains
for static content has two purposes:

1\. So that you can exclude overhead such as cookies being passed on each
request, since it's a separate hostname from the one the dynamic content loads
from.

2\. So that there are multiple hostnames to download static content from,
letting web browsers take advantage of their maximum-connections-per-host
setting and download your static content in parallel.

Neither of those points involves a CDN: a geographically dispersed high-speed
network serving content to your viewers from the CDN server nearest to them.
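A minimal sketch of those two purposes, with hypothetical cookieless shard
hostnames (the browser opens a separate connection pool per hostname, and a
separate domain keeps cookies out of the request headers):

```python
import hashlib

# Hypothetical cookieless shard hostnames. More hostnames means more
# parallel downloads under the browser's per-host connection limit.
SHARDS = ["static1.example.com", "static2.example.com"]

def shard_url(path: str) -> str:
    """Map an asset path to a stable hostname so repeat visits hit the
    browser cache instead of re-downloading from a different shard."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    host = SHARDS[digest[0] % len(SHARDS)]
    return f"http://{host}{path}"
```

Hashing on the path (rather than round-robin) matters: the same asset must
always map to the same hostname, or you defeat the browser cache.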

The proposal from the linked article is that they are able to take advantage
of a CDN service for free, without having to pay the likes of Amazon
Cloudfront.

~~~
127001brewer
True, which is why I said "CDN-like functions": for smaller web applications
and websites, parallelized downloads would increase performance (without
having to use a CDN service).

Also, what about hosting your static scripts and style sheets on a service
like Google Code or GitHub? Would that give you a "CDN-like" performance
increase as well?

I'm advocating non-CDN techniques that can provide increased performance for
smaller web applications and websites.

~~~
LogicX
I don't believe any such static host would necessarily improve your
performance over your own host or s3.

Cloudfront is surprisingly inexpensive, especially for smaller web
applications and websites: <http://aws.amazon.com/cloudfront/pricing/>

You probably have only a few MBs of static content. CloudFront can be
configured with a CloudFront URL that just fetches from
<http://yournormalwebsite.com>: no S3 bucket involved, no uploading content.
Just change your static content URLs to point at the CloudFront URL.
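A sketch of that last step. The distribution hostname here is made up
(`d1234abcd.cloudfront.net`); yours comes from the CloudFront console:

```python
import re

ORIGIN = "yournormalwebsite.com"
CDN = "d1234abcd.cloudfront.net"  # hypothetical distribution hostname

def use_cdn(html: str) -> str:
    """Point /static/ asset URLs at the CDN; with a custom origin, the
    CDN pulls cache misses from the original site automatically."""
    return re.sub(
        rf"(https?://){re.escape(ORIGIN)}(/static/)",
        rf"\g<1>{CDN}\g<2>",
        html,
    )
```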

Even with tens of thousands of visitors a month, you'll owe... < $5.

Or as suggested elsewhere here, just use cloudflare, which is $0.

------
ck2
Funny how I made a couple of busy sites faster recently by taking them off CDN
and just moving them to a faster, better configured server instead (that
ironically was less expensive). All the CDN in the world would not have made
them as fast as they are now.

Unless your site is heavily image based, I simply do not understand CDN for
static content.

If you are trying to trick browsers into accepting more connections, first ask
yourself why your site is so poorly designed that it needs so many
connections; then, if necessary, use additional domains pointing to your own
servers.

~~~
modoc
A good CDN provides "close" (network latency wise) servers so if there are 30
assets to be loaded, most are done over a 10 ms hop, not a 60 ms hop. Get a
complex page with 100+ assets, or try to serve it to another continent and the
difference is even more impressive.

Also if you're running a "busy" site off one server, it may not be busy enough
to warrant a CDN's traffic off-loading perk. When you start talking about
thousands and tens of thousands of requests per second, being able to off-load
80% of those requests makes a huge difference in how much hardware you need to
deal with locally.

~~~
ck2
There should not be 100 externally loaded elements on an initial page load,
ever.

Any site that does that needs a better developer.

~~~
modoc
So say 50 elements. With a 50ms difference that's 2.5 seconds of total time.
Most browsers will do 2-4 simultaneous connections, so let's say 4, that's
still over 1/2 second faster page load. Now put that end user in Europe or
Africa, and you're looking at 200-500 ms latency per call, versus 10 ms. You
can see how the math works.
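The arithmetic above, spelled out (all numbers illustrative):

```python
assets = 50
latency_saved_ms = 50        # e.g. a 60 ms hop vs a 10 ms hop to a CDN edge
parallel_connections = 4     # typical per-host browser limit

serial_saving_s = assets * latency_saved_ms / 1000
effective_saving_s = serial_saving_s / parallel_connections

print(serial_saving_s, effective_saving_s)  # 2.5 0.625
```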

Assets loaded on some of your recently submitted URLs:

21 - <http://www.mozilla.org/en-US/firefox/10.0/releasenotes/>

60 - <http://www.adequatelygood.com/2010/2/JavaScript-Scoping-and-Hoisting>

104 - <http://www.hollywoodreporter.com/thr-esq/falling-skies-lawsuit-tnt-font-285103>

282 - <https://plus.google.com/u/0/111314089359991626869/posts/HQJxDRiwAWq>

54 - <http://www.computerworld.com/s/article/9223601/Anonymous_dupes_users_into_joining_Megaupload_attack>

67 - <http://abcnews.go.com/blogs/politics/2012/01/rand-paul-in-pat-down-standoff-with-tsa-in-nashville/>

70 - <http://www.wptavern.com/bad-behavior-in-the-wordpress-community>

etc....

The impact of reducing latency for each of those calls, and offloading those
requests from your server(s) is a big deal for big sites.

~~~
ck2
Strong upvote for defending your argument with researched examples.

I guess I am old school; I'd never serve that many elements on an initial page
load, CDN or not. That's crazy.

Each additional DNS lookup can add up to 2 seconds if it's a cold cache-miss.

Most modern browsers/servers use persistent connections (keep-alive, and
sometimes pipelining), so it's not a 50ms connect each time.

Different continent I might understand the desire. But I have a server in VA
that can serve western Europe at 75-100ms connect time, which is not horrible.

------
krmmalik
Mildly off-topic.

We've been using MaxCDN since it's not terribly expensive for the entire year,
and since it's being used with a WordPress site we've coupled it with a
caching plugin and also incorporated CloudFlare. MaxCDN seems to be very good
at caching static content such as JS and CSS files, but I don't believe it
caches the HTML itself, nor does CloudFlare. That's only being done locally.

I do wonder if there is some other service out there, not terribly expensive,
that caches HTML too. I've noticed that even with a CDN and services like
CloudFlare, if you don't have a dedicated server to handle the high HTML load,
the setup is going to fail.

I'll be interested to learn if anyone has any further experience with things
like nginx or varnish to cache their html and what services in their cloud
they are using to achieve this.

~~~
tyler
Fastly (fastly.com) is designed for exactly this purpose.

~~~
krmmalik
I have to admit I found the pricing structure on Fastly very confusing. Are
you affiliated with Fastly?

I'd be quite interested in it as a service if it can help me handle high
volumes of traffic on a server that's running as a shared hosting account,
negating the need to move to a dedicated server.

------
Gigablah
Alternatively, just use Cloudflare.

~~~
phoboslab
We've been using CloudFlare for two weeks now for an image-serving site. It
saves us about 1.5 TB of traffic per day.

I still don't understand how they can offer this for free; I'd be happy to pay
them a few hundred USD per month - it would still be a lot cheaper than AWS or
any other cloud hoster out there.

~~~
tonfa
> I still don't understand how they can offer this for free

You allow the following:

\- Add tracking codes or affiliate codes to links that do not previously have
tracking or affiliate codes.

\- Add script to your pages to, for example, add services or perform
additional performance tracking.

~~~
phoboslab
We only enabled CloudFlare for two of our subdomains. These subdomains only
serve images, no HTML or script files.

So I guess we're circumventing their monetization model and will be kicked
anytime now?!

~~~
jgrahamc
You certainly won't be kicked off CloudFlare and I don't know what the parent
is talking about. CloudFlare doesn't modify your pages to make money for
itself. It makes money by people paying for premium services.

~~~
tonfa
> I don't know what the parent is talking about.

I'm talking about the ToS for cloudflare. I don't know whether they modify the
pages to make money or not, but they make it clear it is a possibility.

~~~
jgrahamc
That section of the ToS begins: "Depending on the features you select,
CloudFlare may modify the content of your site." The examples listed are
things that you control, by default it doesn't modify the page.

~~~
tonfa
I see, thanks. Might be worth stating it explicitly ("we won't modify the
content unless you allow us to").

------
modoc
Related question: I started playing with mod_pagespeed when it first came out,
however at the time it was very buggy and broke RichFaces based apps, PHP
Gallery, and a few WordPress plugins that sites I host run, so I had to ditch
it. That was ages ago though. Has anyone been running a recent version of it?
What is your feedback?

------
codesuela
Call me paranoid, but I would rather not see this being utilized too much.
Personally I block Google Analytics (along with other tracking scripts), and
many people use Adblock. If you start moving your important scripts and
stylesheets to Google, they will still be able to track you throughout the web.

~~~
LogicX
Tracking available via Google Analytics != the tracking Google could achieve
by having your site use their CDN.

Google Analytics is browser-side JavaScript code.

Static content loaded through them would not send cookies^, there would be no
dynamic URLs with session IDs. The best they could do is track by IP, which
they also probably wouldn't bother with across the globally dispersed CDN
network.

^ Should not

~~~
jackalope
IP address + User-Agent + Referer is a potent combination for tracking, even
without cookies, and is available to all CDNs.

~~~
LogicX
True, this is possible, and they could combine this information with IP + UA
when you are visiting google properties that also have the google analytics
code in it.

Though depending on the network you're on (how many people are NAT'd behind
that IP) and how generic your User-Agent string is, this could be useless.
Comparison from Urchin (owned by Google):
<http://www.urchintools.com/urchin6/configuration/tracking-methods>

------
kenrik
If you need a CDN, you most likely can afford to pay for one, no? I mean, even
if you're just loading your images through S3, unless you're getting some
crazy number of hits it's only going to cost you a few bucks a month. I'm all
for CDNs, but many people think you need one (thanks to YSlow and Google) for
a basic website, and that's just not the case. If you're not in a high-traffic
situation, that extra hop can actually make your site slower than if you just
served the files from your server.

If you're running a PHP/database-driven site (early on), I would be willing to
wager that your optimization time would be better spent following that string.
If your website can only handle 25 hits/s because you're killing MySQL with
SELECT * FROM ALL type commands, the CDN is not going to do you much good.
After you're sure your code is optimized and you feel you need it, sure, have
at it! There are plenty of good CDNs out there.

~~~
deno
> If your website can only handle 25 hits/s because you're killing MySQL with
> SELECT * FROM ALL type commands, the CDN is not going to do you much good.

It will, if you set proper caching headers.
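For instance, a minimal sketch (header values illustrative, not a
recommendation):

```python
def cache_headers(max_age_s: int = 300) -> dict:
    """Response headers that let a shared cache (the CDN) absorb requests
    for a page that is expensive to render at the origin."""
    return {
        # "public" permits shared caches to store the response; max-age
        # says how long to serve it without hitting the origin database.
        "Cache-Control": f"public, max-age={max_age_s}",
        "Vary": "Accept-Encoding",  # cache gzipped and plain variants separately
    }
```

With a 60-second max-age, the overloaded origin renders each page at most once
a minute per CDN edge, no matter how many hits arrive.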

