
Amazon CloudFront - Support For Dynamic Content - jeffbarr
http://aws.typepad.com/aws/2012/05/amazon-cloudfront-support-for-dynamic-content.html
======
simonw
We got burned by CloudFront about 18 months ago... we were serving our static
assets (CSS, JS, etc.) through CloudFront and had bug reports from some users
in Eastern Europe (I forget where, it might have been Slovenia) that our site
was displaying without CSS. I got them to check and they couldn't load CSS
for GitHub (which used CloudFront) either. We went back to serving directly
from S3.

It's an infuriating bug, because I can't see how we could confirm that this
kind of thing isn't an issue any more. I'd love to go back to CloudFront but
I'm just not confident that it will reach all of our users.

~~~
jeffbarr
Please feel free to send me some details (address is in my profile) and I'll
pass it along to the team.

~~~
simonw
I've emailed you. For anyone else who's interested, here's the support email
we got (back in January 2011 it turns out):

> The following URLs fail to load:
>
> <http://cdn.lanyrd.net/css/core.221dbc4b.min.css>
> <http://cdn.lanyrd.net/js/jquery-1.4.3.min.97be02d1.min.js>
> <http://cdn.lanyrd.net/js/jquery.jplayer.min.72d89d00.min.js>
> <http://cdn.lanyrd.net/js/lang.ENG.4f594a71.min.js>
> <http://cdn.lanyrd.net/js/global.f0851851.min.js>
>
> This is on the basic page - <http://lanyrd.com/services/badges/>. As far
> as I can tell, no files from the domain cdn.lanyrd.net will load.
>
> Also, it seems the Lanyrd.com site can't load any resources from the CDN
> domain either - the homepage is totally broken for me.
>
> Oh, and I'm situated in Slovenia, if that helps.

I replied and asked them to run "host" and "ping" against cdn.lanyrd.net and
they sent back the following:

> Host cdn.lanyrd.net not found: 3(NXDOMAIN)
>
> ping: unknown host cdn.lanyrd.net

I also had an incident a few months later where our assets failed to load
for a while as I sat at my desk in London - GitHub's assets were affected as
well, which led me to suspect it was a CloudFront failure. Unfortunately I
don't have any notes from that.

~~~
ceejayoz
How do you know that wasn't your DNS provider having troubles there? Should
have had them do `dig` to see if it was a DNS issue on your end instead of
blaming Amazon right off the bat...
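
For what it's worth, a minimal sketch of that kind of check - this uses the
third-party dnspython package, and the resolver IPs are just examples -
queries a couple of public resolvers directly, so an NXDOMAIN from the
user's ISP resolver can be told apart from one coming back from the
authoritative nameservers:

    import dns.resolver

    # Google and OpenDNS public resolvers (example choices)
    for server in ["8.8.8.8", "208.67.222.222"]:
        r = dns.resolver.Resolver(configure=False)  # skip /etc/resolv.conf
        r.nameservers = [server]
        try:
            answer = r.resolve("cdn.lanyrd.net", "CNAME")
            print(server, "->", [str(rr) for rr in answer])
        except dns.resolver.NXDOMAIN:
            print(server, "-> NXDOMAIN")

If every resolver returns NXDOMAIN the record itself is broken; if only the
user's local resolver does, the problem is on their side.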

~~~
simonw
It could well have been (that's why I'm sharing the details: so people can
make their own mind up). Like I said, this was over a year ago so it's pretty
hard to debug-in-hindsight.

~~~
ceejayoz
Starting with "We got burned by CloudFront..." seems a little harsh when the
only piece of actual data you have could just as easily point at your own DNS
provider rather than Amazon's systems...

------
j2labs
Still no gzip support, though. I had to jump through some hoops to get this
to work: uploading duplicate copies of each file, gzipped ahead of time,
that are served to every request with a static header saying the content is
gzipped. It works, but it'd be a LOT better if CloudFront could do that for
us.
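
For the curious, here's a minimal sketch of that workaround as it would
look today with boto3 (bucket and key names are made up): compress each
asset ahead of time and store it with a static Content-Encoding header.

    import gzip

    import boto3

    s3 = boto3.client("s3")

    # Compress the asset ahead of time...
    with open("app.js", "rb") as f:
        body = gzip.compress(f.read())

    # ...and store it with a static header declaring the encoding, so every
    # request served through CloudFront gets the pre-gzipped bytes.
    s3.put_object(
        Bucket="my-static-assets",  # hypothetical bucket
        Key="js/app.js",
        Body=body,
        ContentType="application/javascript",
        ContentEncoding="gzip",
    )

The catch is exactly the hoop-jumping described above: every client gets
the gzipped copy whether or not it sent Accept-Encoding: gzip, because the
header is static rather than negotiated per request.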

~~~
notmyname
Rackspace Cloud Files supports this. The file "test_javascript.js" was
saved uncompressed, and it works the other way too (compressed ->
uncompressed if the client doesn't support compression):

    $ curl -i http://d.not.mn/test_javascript.js
    $ curl -i -H "Accept-Encoding: gzip" http://d.not.mn/test_javascript.js

This isn't trying to take away from their announcement. I'm always impressed
by Amazon's ability to rapidly deliver features.

~~~
boundlessdreamz
Rackspace Cloud Files didn't have origin pull last time I checked. Without
origin pull, serving assets via a CDN is a pain to set up and maintain.

~~~
notmyname
True. Your content needs to be in Cloud Files, not on your own server. The
storage and CDN services are tied together into the product; they have not
been separated to allow the CDN to sit on top of an arbitrary endpoint.

I don't see the requirement of storing the data in Cloud Files as a very
heavy burden, but I'm not the most unbiased source on that.

------
kennu
It sounds like being able to run a bunch of Varnish servers to cache stuff
at edge locations around the world. I wonder if it really works that way,
or whether you have to change your web app a lot to work with it?

~~~
tyler
If that's what you're after, you might want to check out Fastly. We're a
CDN built entirely on Varnish, with all the features that implies.

------
eli
_If you set the TTL for a particular origin to 0, CloudFront will still cache
the content from that origin. It will then make a GET request with an If-
Modified-Since header, thereby giving the origin a chance to signal that
CloudFront can continue to use the cached content if it hasn't changed at the
origin._

I wonder how well this works for content that is truly dynamic. It seems
like it would necessarily be slower for pages that change on every request,
since the edge still has to complete a round trip to the origin before it
can respond.
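
To make the handshake concrete, here is a toy sketch of the origin's side
of it - plain Python stdlib, nothing CloudFront-specific, and the timestamp
is made up. The origin answers the conditional GET with a 304 when nothing
has changed, which is what lets the edge keep serving its cached copy:

    from email.utils import formatdate, parsedate_to_datetime
    from http.server import BaseHTTPRequestHandler, HTTPServer

    LAST_MODIFIED = 1336780800  # hypothetical mtime of the page

    class Origin(BaseHTTPRequestHandler):
        def do_GET(self):
            ims = self.headers.get("If-Modified-Since")
            if ims and parsedate_to_datetime(ims).timestamp() >= LAST_MODIFIED:
                self.send_response(304)  # edge may reuse its cached copy
                self.end_headers()
                return
            body = b"<html>fresh content</html>"
            self.send_response(200)
            self.send_header("Last-Modified",
                             formatdate(LAST_MODIFIED, usegmt=True))
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    HTTPServer(("", 8080), Origin).serve_forever()

A 304 carries no body, so for unchanged content the per-request cost is
roughly one origin round trip rather than a full render and transfer; for
pages that change every time, you pay both.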

~~~
byoung2
I bet they use a grace period, like Varnish does. The grace period would
let the cache serve stale data for a few seconds or more while the
If-Modified-Since call is made and, if necessary, the cache is refreshed.

~~~
mckoss
I don't think that's what the post said. Doing so would break the semantics of
the 0 second cache time. They must wait for your 304 Not Modified response
before serving from their cache.

------
teoruiz
Supporting the query string as part of the cache key is big news for CF users.

Thanks a lot.

------
marcamillion
Hrmm... the more services Amazon rolls out, the more tempted I am to go
whole-hog on AWS.

I only use S3 now - and Heroku - but I am excited about where this will
continue going.

The future looks bright, and I can't wait until the right application comes
along for me to build on top of a fully scalable infrastructure that I only
pay for as I use it.

~~~
taligent
Don't.

My suggestion for almost all businesses is to use AWS for S3, SQS, SWF,
etc., and then get dedicated/VPS servers in the same data center. I
actually get faster ping times to SQS from my dedicated server than from
EC2 (both in US-East).

EC2 is the biggest ripoff going; the other AWS services are some of the
best around.

~~~
netmau5
What are some good dedicated hosting options in US-East? I've tried looking
them up, but the info is usually buried so deep in the host's site that
it's impossible to find.

~~~
shimon_e
Try burst.net

OVH should have their East Coast data centre open for customers in August:
[http://www.datacenterknowledge.com/archives/2012/04/30/europ...](http://www.datacenterknowledge.com/archives/2012/04/30/european-giant-ovh-to-enter-us-hosting-market/)

I am an alpha tester and their service is great. They are not just a data
centre that leases bandwidth from others; they are an internet backbone
with ownership in backhaul fibre, and big enough to add 290 Gbps to their
network in days:
[http://forum.ovh.co.uk/showpost.php?p=42216&postcount=66](http://forum.ovh.co.uk/showpost.php?p=42216&postcount=66)

Most of the links are 10 Gbps, so it is a simple hardware upgrade to 40 or
100 Gbps.

------
mattgreenrocks
Blah, still no mention of SSL CNAME support.

~~~
WALoeIII
This cannot be reliably done with Windows XP in the wild.

<http://en.wikipedia.org/wiki/Server_Name_Indication>

Essentially, the cloudfront server doesn't know the certificate to present.
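
As a quick illustration (Python stdlib, with example.com as a stand-in
host), this is the field an SNI-capable client sends; an XP-era client
omits it, leaving the server a single default certificate to answer with:

    import socket
    import ssl

    ctx = ssl.create_default_context()
    with socket.create_connection(("example.com", 443)) as sock:
        # server_hostname goes into the ClientHello as the SNI extension;
        # it is what lets one IP present per-site certificates.
        with ctx.wrap_socket(sock, server_hostname="example.com") as tls:
            print(tls.getpeercert()["subject"])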

------
ilaksh
I have been messing around with caching and combining files quite a bit
over the last few days, mainly because the latency between me and my web
server really adds up, especially as the number of files I need to transfer
increases.

The difference between San Diego and Chicago, or San Diego and New York,
when I compare it to running my application locally, makes me really want
some kind of quantum instant communication. But since we don't have that, I
really would like something like CloudFront - for pretty much every single
application or web page that I make.

Actually, I think it would be better if everyone and every web site had that,
a way that sites could automatically be cached in servers local to everyone's
city. Wouldn't that be nice?

That reminded me of the whole concept of content-centric networking.

Here is one related project that I found: <http://code.google.com/p/haggle/>

<http://en.wikipedia.org/wiki/Content-centric_networking>

The hard part is that, to really be effective, this probably means deeply
changing the way things work. It is tough to ease into.

This could also help reduce the amount of data that needs to be
transferred. Maybe we could figure out a way for every website in the world
to be compressed against a very large global dictionary shared by every
client (or possibly partitioned for local clusters, but that is more
complicated...)

Regardless of the level of compression, it would still probably be possible to
distribute quite a lot of the trending web content to be cached locally. Maybe
it could be a bit like a torrent client for people's desktops, or maybe web
application servers could have an installed program that participates in the
distribution system and also publishes to it.

Maybe it could be a browser extension or just a userscript (Greasemonkey) -
though it probably has to be an extension - that would cache and distribute
web pages you view. So, for example, as we click on Hacker News headlines
we cache those pages on our local machines. Then when another person who
has the same script/extension installed clicks on that headline, it will
first check his local peers, and if I am in the same city and have that
file already, I can give him all of the content in a fraction of the time.
If a lot of people used that extension, the web would be much faster, and
it would solve a lot of problems.

I wonder if there isn't already a system like that. There are probably RSS
feeds that come off of Hacker News and Reddit, and a reader could precache
all of that content. But more comprehensively, I bet there is quite a bit
of content that large numbers of, say, programmers are constantly accessing
that could benefit from that type of system.

Could we make something like gzip, but not limited to a 32 KB window -
instead using a giant dictionary, a gigabyte in size, holding the most
common sequences from all of the software engineering web sites that are
popular today? Then instead of sending a request to San Francisco or
Chicago, I could just send a request to a guy less than a mile away who
also happens to be interested in Node.js or whatever.
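
As a toy sketch of the idea: zlib already supports a preset dictionary
(zdict). The dictionary below is a tiny stand-in for the giant shared one,
and both sides would have to hold an identical copy:

    import zlib

    # Tiny stand-in for the "giant global dictionary" of common sequences
    dictionary = b"<html><head><script src=jquery.min.js></script></head><body>"

    page = (b"<html><head><script src=jquery.min.js></script></head>"
            b"<body>hello</body></html>")

    c = zlib.compressobj(zdict=dictionary)
    compressed = c.compress(page) + c.flush()

    d = zlib.decompressobj(zdict=dictionary)
    assert d.decompress(compressed) == page

    print(len(page), "->", len(compressed), "bytes")

DEFLATE still only matches against the previous 32 KB, so a real
gigabyte-scale scheme would need a different format, but the pattern -
agree on a shared dictionary out of band, send only the residual - is the
same.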

Maybe something like <http://en.wikipedia.org/wiki/Freenet> or an open source
CDN.

Or something like this
[http://en.wikipedia.org/wiki/Osiris_(Serverless_Portal_Syste...](http://en.wikipedia.org/wiki/Osiris_\(Serverless_Portal_System\))

<http://www.osiris-sps.org/download/>

~~~
bigiain
"Actually, I think it would be better if everyone and every web site had that,
a way that sites could automatically be cached in servers local to everyone's
city. Wouldn't that be nice?"

HTTP over NNTP. I see a great need…

~~~
shakesbeard
Aren't ISPs already doing that? (the caching part)

------
urbanjunkie
I assume this means that versioning of CSS/JS files now works - one of the
problems I've experienced with CloudFront is updating files that don't
normally change very often (e.g. CSS). Because CloudFront didn't support
query strings, bumping the version in a URL like
stylesheet.css?ver=201200505 didn't work, but now it should.

~~~
ceejayoz
We've just been using mod_rewrite to rewrite stylesheet_([0-9]+).css to
stylesheet.css for our CloudFront stuff. Our build scripts pop the file
modification time into the name, so CF sees a new URL any time we update a
file.
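
A minimal sketch of the build-script half of that scheme (the helper name
is made up; the server-side half is the one-line rewrite rule mapping
stylesheet_<mtime>.css back to stylesheet.css):

    import os

    def versioned_url(path):
        """Embed the file's mtime in the asset name so CloudFront sees a
        brand-new cache key whenever the file changes."""
        mtime = int(os.path.getmtime(path))
        base, ext = os.path.splitext(path)
        return f"{base}_{mtime}{ext}"

    print(versioned_url("stylesheet.css"))  # e.g. stylesheet_1336780800.css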

