

Show HN: AWS CloudFront (CDN) for dynamically generated content - zubairov
http://blog.elastic.io/post/22773181715/how-we-use-amazon-cloudfront-for-dynamically-generated

======
mryan
Nice post, thanks for sharing.

I recently had a similar problem - I needed to serve dynamically resized
images via CloudFront (with the images being stored in S3). Images are resized
on-demand - if they exist they are served directly from S3. If not, they are
resized and stored in S3, then the next request will serve them directly from
S3.

It works rather well, although it is quite susceptible to the thundering herd
problem - putting it behind a reverse proxy that supports collapsed forwards
would probably help here.
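
For illustration, here is a minimal sketch of what collapsing concurrent requests could look like inside the app itself, rather than in a reverse proxy. It is not taken from s3cacher: the `CollapsingCache` class, the dict standing in for S3, and the `build` callback standing in for the resize step are all hypothetical names.

```python
import threading


class CollapsingCache:
    """Serve from a store if present; otherwise build each key only once,
    even when many requests for the same missing key arrive together."""

    def __init__(self, store, build):
        self._store = store              # stand-in for S3: any dict-like object
        self._build = build              # hypothetical resize step: key -> bytes
        self._locks = {}                 # one lock per key seen so far
        self._guard = threading.Lock()   # protects the _locks dict itself

    def get(self, key):
        # Fast path: the object already exists, serve it directly.
        if key in self._store:
            return self._store[key]
        # Slow path: all concurrent callers for this key share one lock.
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            # Re-check: another caller may have built it while we waited.
            if key not in self._store:
                self._store[key] = self._build(key)
        return self._store[key]
```

The per-key lock table is never pruned here, so a real deployment would need eviction, and this only collapses requests within a single process.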

I've put the WSGI app I used for this on github - hopefully someone finds it
useful! <https://github.com/mikery/s3cacher>

------
tfennelly
Impressive. On FoxWeave, we're using Rackspace (at the moment). Fairly sure
they don't offer anything like this, which probably means we'll have to use a
caching reverse proxy.

Nice post... thanks.

------
drobiazko
What about the CloudFront pricing? Is it cheaper than scaling your own servers?

~~~
zubairov
Good question. It depends on the caching strategy. Since we essentially serve
Gists, we have two types of URLs: [gist-id]/, meaning the latest version of
that particular gist, and [gist-id]/[revision-id], which is a particular
revision. For the 'latest' URL we just redirect to the current latest
revision. The first type of resource expires after 60 seconds, but it is also
not too expensive to regenerate. The second type (a particular revision) has
a much longer expiry time (several days). With this caching strategy I assume
only about 20% of all requests will reach our servers, so the saving is
significant.
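
The two-tier expiry described above could be sketched roughly as follows. This is only an illustration, not the actual elastic.io code: the function name `cache_control_for` is hypothetical, "several days" is assumed to be 3 days, and the `no-store` fallback for other paths is also an assumption.

```python
LATEST_MAX_AGE = 60                  # from the comment: 60 seconds
REVISION_MAX_AGE = 3 * 24 * 3600     # assumption: "several days" taken as 3 days


def cache_control_for(path):
    """Pick a Cache-Control value from the URL shape.

    /[gist-id]/                -> short expiry (latest version)
    /[gist-id]/[revision-id]   -> long expiry (a fixed revision)
    """
    parts = [p for p in path.split("/") if p]
    if len(parts) == 1:
        return f"public, max-age={LATEST_MAX_AGE}"
    if len(parts) == 2:
        return f"public, max-age={REVISION_MAX_AGE}"
    # Assumption: anything else is not cached at the edge.
    return "no-store"
```

CloudFront honours the origin's `Cache-Control: max-age` value (subject to the distribution's TTL settings), so the header is where this policy would live.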

~~~
madarco
Given the 1005 ms latency for a cache miss, are loading times actually shorter
for the average user? (Based on your data on the hit/miss ratio.)

I was thinking of using CloudFront to lower the latency (not the cost) of
dynamic resources, because there isn't an EC2 datacenter in Italy, but there
is a CloudFront edge location.

~~~
zubairov
Hi,

No, 1005 ms was not a cache miss; it was a request to the origin server.
Latency was 799 ms.

I assume CloudFront will add some latency. However, in our case (we are using
Heroku) our servers were already on the Amazon platform, so the latency from
CloudFront to our servers should be significantly lower. CloudFront cache-miss
latency for the end user should therefore not be significantly higher than
origin-server latency for the end user.
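
As a rough way to reason about the average-user question, the expected latency is a weighted average of hit and miss latency. The sketch below is illustrative only: `expected_latency_ms` is a hypothetical helper, the 80% hit ratio follows from the 20% estimate earlier in the thread, a miss is assumed to cost about as much as a direct origin request (1005 ms), and the 100 ms edge-hit latency is a made-up placeholder, not a measurement.

```python
def expected_latency_ms(hit_ratio, hit_ms, miss_ms):
    """Average latency seen by users, weighted by cache hit ratio."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


# Assumed inputs: 80% hits, 100 ms per hit (placeholder), 1005 ms per miss.
average = expected_latency_ms(0.8, 100.0, 1005.0)
```

With these assumed inputs the average works out to roughly 281 ms, so the average user would still come out ahead even if a miss were somewhat slower than going to the origin directly.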

~~~
madarco
Thanks. I asked because I read somewhere that an average request for a
"Miss from cloudfront" object took ~1 second, and I thought your data
confirmed that value.

