
How To: Hosting with Amazon S3, CloudFront and Route 53 - PStamatiou
http://paulstamatiou.com/hosting-on-amazon-s3-with-cloudfront/
======
sehrope
Nice write-up. We have a very similar setup ( _Jekyll-generated static site +
S3_ ) for our website[1], and reading through this brought back memories of
getting it set up ( _and was a friendly reminder to go back and gzip some of
our CSS files_ ).

The biggest plus of this setup is that once it's deployed you don't think
about it. It just works and you never have to think about scaling. Oh, and
it's cheap ( _seriously, it's peanuts a month, as all you pay for is bandwidth
at $.10/GB_ ).
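As a rough sanity check of that pricing claim, here is a sketch that assumes only the $.10/GB transfer-out rate quoted above and ignores storage and per-request fees; the page size and traffic numbers are made up for illustration:

```python
# Rough monthly bandwidth cost for a static site on S3, assuming the
# $.10/GB transfer-out rate mentioned above and ignoring per-request
# and storage charges (both are typically a rounding error here).
def monthly_bandwidth_cost(page_size_kb, visits_per_month, rate_per_gb=0.10):
    gb_transferred = page_size_kb * visits_per_month / (1024 * 1024)
    return gb_transferred * rate_per_gb

# e.g. a 500 KB page served to 20,000 visitors a month:
cost = monthly_bandwidth_cost(500, 20_000)
print(f"${cost:.2f}/month")  # roughly $0.95
```

Even generous traffic assumptions keep a text-heavy static site well under a dollar a month, which is where the "peanuts" claim comes from.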

The biggest negative is getting SSL. CloudFront supports it, but it's
expensive ( _$600/mo, see [2]_ ). Compare that to the pennies it costs to host
the non-HTTPS site on S3. In our case, our cloud app is on a completely
separate domain (SSL-only) and our public site is informational only, so the
trade-off works. The only SSL-enabled link on our public site is for our
contact GPG key, and it's linked directly to the HTTPS S3 URL.

[1]: [http://www.jackdb.com/](http://www.jackdb.com/)

[2]:
[http://aws.amazon.com/cloudfront/pricing/](http://aws.amazon.com/cloudfront/pricing/)

~~~
relix
Why put the GPG key on an HTTPS page linked from an HTTP page? If the HTTP
site is compromised through a MITM attack, the attacker can easily change the
link to point at a bucket he controls that is also served over HTTPS (i.e.
[https://s3.amazonaws.com/secure.jackdb.com/pgp/security_at_j...](https://s3.amazonaws.com/secure.jackdb.com/pgp/security_at_jackdb_dot_com.asc)).

I don't think it adds anything to security; it actually provides a false
sense of safety.

~~~
sehrope
You're absolutely right about being able to MITM the HTTP piece and replace
the content. That's true for any mixed-content site. In this case, though, I
disagree that having the HTTPS link to S3 is entirely useless. It's used
specifically as an SSL link to download our GPG key, which is additionally
available on a number of key servers and indexed by search engines as
well[1]. In that usage it's one of many ways of getting the key and, like all
GPG keys, it should really be verified before use anyway. For just about
anything else, though, I agree that mixed content is a very bad idea.

[1]:
[https://www.google.com/search?q=jackdb+gpg](https://www.google.com/search?q=jackdb+gpg)

~~~
relix
Alright, I thought I was missing something :)

------
ctcliff
I wrote an npm module to automate this workflow. You can read about it at
[http://caisson.co/](http://caisson.co/).

Simplifies the process to a couple of commands:

      $ caisson init yoursite.com
      $ caisson push

~~~
primitivesuave
This is so awesome! You've taken a 15-minute process and condensed it to 15
seconds.

------
neals
Hi! I hope somebody can answer this for me.

Why do you need this DNS routing? I tried to Google it and see a large offering
of "hosted DNS" services, but I don't understand something:

I have a small site. It runs over at Digital Ocean. I point the DNS records of
the domain name to the Digital Ocean server by entering them into the text
boxes at my domain registrar.

Where in all this would I require a more advanced solution?

~~~
timdorr
CloudFront has different IPs for different regions and manages this via DNS.
So, if you're on an ISP peered directly with CloudFront and make a request
using your ISP's DNS servers, CloudFront's DNS might give you back the IP of a
server directly on that peering circuit. These kinds of relationships and new
servers are added/removed all the time, so the IPs change regularly. A hard-
coded IP in an A record won't work because of this. It works for your DO VPS
because its IP is static and there is only one server.
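A toy sketch of that DNS-based routing; all region names, hostnames, and IPs here are invented for illustration:

```python
# Toy model of latency-based DNS routing: the authoritative server
# answers with an edge IP chosen based on the resolver's region.
# Region names and IPs are made up for illustration.
edge_pools = {
    "us-east": ["203.0.113.10", "203.0.113.11"],
    "eu-west": ["198.51.100.20"],
}

def resolve(hostname, resolver_region):
    # CloudFront-style: return a currently-healthy edge near the resolver
    return edge_pools[resolver_region][0]

# Two clients asking for the same hostname get different answers:
print(resolve("d123.cloudfront.net", "us-east"))  # 203.0.113.10
print(resolve("d123.cloudfront.net", "eu-west"))  # 198.51.100.20

# If you hard-code one of those IPs in an A record and that edge is
# later removed from the pool, your record silently goes stale:
edge_pools["us-east"].pop(0)
print(resolve("d123.cloudfront.net", "us-east"))  # 203.0.113.11, but your A record still says .10
```

A CNAME (or Route 53 ALIAS at the apex) delegates the answer to CloudFront's own DNS, so the pool can churn without you noticing.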

~~~
timrivera
There's also the IP anycast [1] option. That's what you get with OVH CDN [2],
for example, but not with CloudFront. Anycast routing is totally independent
of DNS, so it works fine with an A record and also makes it easy to install
SSL certificates.

[1]
[http://en.wikipedia.org/wiki/Anycast](http://en.wikipedia.org/wiki/Anycast)

[2] [https://www.ovh.co.uk/cdn/](https://www.ovh.co.uk/cdn/)

~~~
hhw
Anycast just refers to BGP announcing an IP address/range out of multiple
locations. Most commonly, it's done with DNS because UDP is connectionless.
It's becoming increasingly acceptable to do full anycast, i.e. have actual
TCP/HTTP(S) on an anycast IP, but there are risks to that approach. If routing
changes and you end up changing to a different location during a TCP session,
the new location won't have all the state information needed, thus dropping
the connection. It's theoretically possible to keep such state information
synchronized, but it's sufficiently complex that nobody is doing it. This is
why most of the larger, more established CDNs anycast DNS only and don't do
full anycast.
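A minimal sketch of why mid-session rerouting drops connections; the location names and session IDs are invented, and real TCP state is of course far richer than a set of IDs:

```python
# Toy illustration of the anycast problem described above: connection
# state lives at one location, so a BGP reroute mid-session lands
# packets at a location that has never seen the connection.
class EdgeLocation:
    def __init__(self, name):
        self.name = name
        self.sessions = set()      # per-location connection state

    def open_session(self, session_id):
        self.sessions.add(session_id)

    def handle_packet(self, session_id):
        if session_id not in self.sessions:
            return "RST"           # unknown connection -> reset it
        return "ACK"

nyc, lon = EdgeLocation("nyc"), EdgeLocation("lon")

nyc.open_session("conn-1")
print(nyc.handle_packet("conn-1"))  # ACK -- state is local to nyc

# BGP reroutes the anycast prefix mid-session; the next packet hits
# London, which has no record of the connection:
print(lon.handle_packet("conn-1"))  # RST -- the connection is dropped
```

DNS over UDP dodges this entirely: each query/response pair is self-contained, so it doesn't matter which location answers.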

------
mattdeboard
One big warning here.

If you want to serve static content for multiple domains (e.g. somefont.ttf
for foo.example.com, bar.example.com and baz.example.com from a single
CloudFront distribution), CloudFront is not your solution, because CloudFront
does not vary its cache on the Origin header. So if your first visitor loads
foo.example.com/static/fonts/somefont.ttf, the Access-Control-Allow-Origin
header for somefont.ttf will be set to "foo.example.com". Subsequent requests
for that file from (bar|baz).example.com will fail with a CORS error.

It was a pretty shocking thing to find out. We've concluded AWS/CloudFront
isn't a viable CDN until this is fixed. Based on the following thread, it
isn't clear when or if it will be fixed:
[https://forums.aws.amazon.com/thread.jspa?threadID=114646#](https://forums.aws.amazon.com/thread.jspa?threadID=114646#)
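The failure mode can be sketched with a toy cache that, like the behavior described above, keys only on the URL; the domains, path, and header handling are illustrative, not CloudFront's actual internals:

```python
# Toy CDN cache that keys entries on the URL alone, ignoring the
# Origin request header -- mimicking the CloudFront behavior described
# above. Domains and paths are examples.
cache = {}

def origin_fetch(url, origin):
    # The origin server echoes the requester's Origin back as ACAO,
    # which is a common CORS setup for multi-domain assets.
    return {"Access-Control-Allow-Origin": origin}

def cdn_get(url, origin):
    if url not in cache:               # cache key is the URL only
        cache[url] = origin_fetch(url, origin)
    return cache[url]

first = cdn_get("/static/fonts/somefont.ttf", "https://foo.example.com")
second = cdn_get("/static/fonts/somefont.ttf", "https://bar.example.com")

# The second caller gets foo's cached header, so bar's browser
# rejects the response with a CORS error:
print(second["Access-Control-Allow-Origin"])  # https://foo.example.com
```

The fix would be for the cache key (or a Vary: Origin-aware cache) to include the Origin header, so each requesting domain gets its own cached copy.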

~~~
prostoalex
Any reason to serve this from a single distribution? Their UI basically tells
you in no uncertain terms that subdomain == distribution.

~~~
mattdeboard
Where does it say that? I'm not talking about assigning foo.example.com as the
domain name for the distribution.

~~~
prostoalex
Right, this is still the scenario where you're using the distribution hostname
they give you, e.g. abcdef1234.cloudfront.net.

[http://docs.aws.amazon.com/AmazonCloudFront/latest/Developer...](http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/WorkingWithDownloadDistributions.html#DownloadDistValuesDomainName)

"Changing the origin does not require CloudFront to repopulate edge caches
with objects from the new origin. As long as the viewer requests in your
application have not changed, CloudFront will continue to serve objects that
are already in an edge cache until the TTL on each object expires or until
seldom-requested objects are evicted."

------
HeyImAlex
If you need an S3 deployment library for stuff like this, I'm planning a
major merge on mine (s3tup) later tonight or tomorrow. It uses YAML files to
declaratively control the configuration of buckets and keys, and makes it
nicer to do more complex things like setting appropriate headers based on
pattern rules. Check it out here.

[http://github.com/heyimalex/s3tup](http://github.com/heyimalex/s3tup)

------
bobfunk
I built BitBalloon ([https://www.bitballoon.com](https://www.bitballoon.com))
to simplify all of this, while bringing benefits such as atomic deploys,
built-in form processing, automatic gzipping, bundling and minification of
your assets and perfect cache headers.

We have a comparison with S3 here:
[https://www.bitballoon.com/blog/2013/12/03/bitballoon-amazon-s3-comparison](https://www.bitballoon.com/blog/2013/12/03/bitballoon-amazon-s3-comparison)

------
davidcollantes
I use Namecheap DNS (free, as they are my registrar). I can control
everything, including the apex domain. It has never failed me.

And for hosting, Github pages (Jekyll rocks!) do a great job. I think you are
still paying too much, Paul.

~~~
itafroma
> And for hosting, Github pages (Jekyll rocks!) do a great job.

Using GitHub Pages to auto-build your Jekyll site is great if and only if you
have no need to customize your Jekyll build or website environment:

\- You must use the versions deployed by GitHub; most of the time they're up
to date, but if there's a bug fix you're relying on in any of the libraries,
you're out of luck. I ran into this on my own site: RedCarpet had a Markdown
parsing bug that was fixed in version 3, but the Jekyll that GitHub Pages uses
depends on version 2. Jekyll loosened this dependency weeks ago, but it's not
in a full release yet.

\- You can't use any Jekyll plugins, even the ones marked safe.

You can avoid the above by building your site locally and uploading the
output, but there are still a few other caveats:

\- Like timrivera mentioned, there is no support for server-side redirects
outside the baked-in ones (i.e. www to or from naked domain; name.github.io to
domain). S3 has these out of the box.

\- You can only use one domain per GitHub Pages repository.

\- GitHub allows Googlebot to index your repository's master branch, and that
exception is hardcoded into github.com's robots.txt. If you keep your site
source on master, you're out of luck: you need to use a different name for
your mainline branch.

\- You cannot use a private repo for GitHub Pages, and GitHub's terms of
service require you to allow other people to fork your public repositories,
regardless of license.

\- There is no support for SSL.

If you're okay with all of that, GitHub Pages is totally fine. But if you
aren't, non-VPS alternatives like S3 are pretty attractive.

~~~
56267gs6h7u5uu
Re: Google: if you use a CNAME for your site on GitHub Pages, will your site
also show up as name.github.io in Google search results?

~~~
itafroma
It shouldn't: name.github.io → example.com is a 301 redirect. However,
[https://github.com/name/name.github.io/blob/master/*](https://github.com/name/name.github.io/blob/master/*)
and
[https://github.com/name/name.github.io/tree/master/*](https://github.com/name/name.github.io/tree/master/*)
_will_ show up in Google searches.

Here's an example, using the atmos.org example the GitHub Pages documentation
uses:
[https://www.google.com/search?q=Saying+how+it+was+%E2%80%9Cs...](https://www.google.com/search?q=Saying+how+it+was+%E2%80%9Csuper+light+and+fast+webserver+with+really+good\(better+than+pound\)+proxy+module.%E2%80%9D)

One result for atmos.org, and then a duplicate result for
[https://github.com/atmos/atmos.github.io/blob/master](https://github.com/atmos/atmos.github.io/blob/master).
Here's a screenshot in case you see something different:
[http://i.imgur.com/TavoyuW.png](http://i.imgur.com/TavoyuW.png)

The only way to prevent that from happening is to avoid using a branch named
"master" in your repository.

------
Wouter33
I'm already hosting a website with this setup. Works perfectly and is blazing
fast. I recommend it for everyone.

The website is a static marketing front for a web app that is served from an
SSL subdomain on another cluster. The only thing I'm unsure about is that I
want to offer a single-field e-mail signup on the front page, which, of
course, will be served without SSL in this setup. What would you do? Skip this
quick signup and put the whole signup on the subdomain, or keep the signup and
POST to the SSL page (less secure)?

~~~
Kudos
I'd probably resort to an iframe on the SSL site.

~~~
Wouter33
Isn't this just as (in)secure as posting to an SSL page? If there's a MITM
attack, they can just replace the contents of the iframe with another page,
can't they?

------
subpixel
I built my first mobile-first layout using his writeup(s), and will likely
move from Heroku to S3 using this one. High five.

~~~
PStamatiou
thanks for reading!

------
dirktheman
Stammy! Nice writeup! There's a really simple way of redirecting your naked
domain to the www bucket on S3: just point the naked domain to the IP
174.129.25.170 and it will redirect automatically. Just be warned it's a free
service.

~~~
rattray
Is there documentation of this anywhere?

~~~
HeyImAlex
[http://wwwizer.com/naked-domain-redirect](http://wwwizer.com/naked-domain-redirect)

------
nthitz
Great writeup. At the end he links to the AWS docs for this whole process,
which I found equally if not more helpful:
[http://docs.aws.amazon.com/gettingstarted/latest/swh/website-hosting-intro.html](http://docs.aws.amazon.com/gettingstarted/latest/swh/website-hosting-intro.html).
But OP's tutorial definitely has some extra informative tips.

~~~
PStamatiou
Yeah, the key point is that if you know you're going to use CloudFront, the
AWS docs make you do extra steps. They have you set up DNS for S3, then
change it for CloudFront.

------
SkyMarshal
Paul's general workflow also works with just Grunt and one of its many S3
plugins. For example, you can clone the Bootstrap github repo (which comes
with a nice Grunt build config), npm install an S3 plugin, add S3 deployment
tasks to Gruntfile.js, and boom - static site generator and deployer.

------
applecore
I feel like SSL/TLS is a requirement for websites in 2014.

Do Amazon S3 and CloudFront support HTTPS?

~~~
HeyImAlex
Even worse than the $600 a month, there's no way to _disable_ SSL on your
site if you're serving off CloudFront. So if one of your users makes an HTTPS
request and you're _not_ shelling out the cash for a custom cert, they'll be
greeted by a big red warning page in Chrome (the returned cert is for
*.cloudfront.net).

~~~
timrivera
This is actually a great point _against_ using custom domains (CNAMEs) with
CloudFront, at least if you can't afford the custom SSL certificate option.

CloudFlare somehow got this right. They serve non-HTTPS-enabled web sites
from different IP addresses so that you can never reach them over HTTPS (a
plain "This webpage is not available" error beats Chrome's scary red "This is
probably not the site you are looking for!" warning). Plus, they have a great
free anycast DNS network that is comparable to Route 53. And best of all, you
never pay for the bandwidth.

------
elliottkember
Nice! We actually built a service to do this:
[https://getforge.com/](https://getforge.com/), with a few other nice static
hosting tweaks. Takes the hassle out of dealing with Amazon.

------
justinhj
A cost-effective alternative to this (I'm open to being corrected) is to use
a cheap server (say, a $5/month box from Digital Ocean) and CloudFlare (which
is free).

------
LogicX
FWIW, after evaluating many solutions, I'm switching my DNS from zerigo to
DNS.he.net - free, featureful, and backed by a company I believe will be
around.

