
GitHub Pages with a custom root domain is slow - dieulot
http://instantclick.io/github-pages-and-apex-domains
======
jnewland
Hey folks, Jesse from GitHub Ops here.

First off, if you use a DNS provider that has support for ALIAS records or
something similar, pointing your apex domain to <username>.github.io will
ensure your GitHub Pages site is served by our CDN without these redirects.
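
For illustration, the two setups look roughly like this in zone-file style (the domain and username are placeholders, and ALIAS is a provider-specific pseudo-record, not a standard DNS type, so the exact syntax varies by provider):

```
; With a provider that supports ALIAS/ANAME at the apex,
; the root rides the CDN:
example.com.        ALIAS   username.github.io.

; Subdomains can always use a plain CNAME and get the CDN:
www.example.com.    CNAME   username.github.io.
```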

I wish we could provide better service for folks without a DNS provider that
supports ALIAS records in the face of the constant barrage of DDoS attacks
we've seen against the IPs we've advertised for GitHub Pages over the years.
We made the decision to keep DDoS mitigation enabled for apex domains after
seeing GitHub Pages attacked and going down a handful of times in the same
week. It's a bummer that this decision negatively impacts performance, but it
does certainly improve the overall availability of the service.

FWIW, we considered pulling support for GitHub Pages on apex domains about a
year ago because we knew it'd be slower than subdomains and would require DNS
configuration that would be challenging and frustrating for a large number of
our users. However, we ended up deciding not to go that route because of the
number of existing users on apex domains.

~~~
thisishugo
I think anyone tech savvy enough to be using Pages should also be savvy enough
to understand[0] why the A records can't (realistically) be as fast as the
CNAME alternative, and understand if you make it de facto redundant (i.e.
available, but not actively encouraged or supported).

I think it's fantastic that you provide apex support for _everyone_ even
though it must be exponentially harder to do than just providing CNAMEs, but
if you're upfront about the limitations the only people who are going to
complain are the type of people you don't want to be listening to anyway.

[0] I mean that in the sense that they'll comprehend the explanation, not that
they'll grok it inherently.

------
pilif
_> The short solution is, instead of using yourdomain.com, use
www.yourdomain.com. Then, redirect the root domain to the www subdomain using
a DNS CNAME record._

The root can't be a CNAME because no other record with the same name can
coexist with a CNAME. Your domain root also has one SOA record and at least two
NS records (and probably one or more MX records if you want to receive mail).

See RFC 1912 (Section 2.4)

~~~
dieulot
Damn, didn’t know that. Thanks, I’ll update.

Edit: Done, seeing changes will need an F5.

~~~
pilif
Note that some DNS providers hack around the issue (like CloudFlare, by
pretending your CNAME is in fact an A record:
[http://blog.cloudflare.com/introducing-cname-flattening-rfc-compliant-cnames-at-a-domains-root](http://blog.cloudflare.com/introducing-cname-flattening-rfc-compliant-cnames-at-a-domains-root)),
but if you're self-hosting DNS or your DNS provider doesn't do any special
handling, then you can't have a root CNAME.

------
dknecht
You can actually use CloudFlare and stay on GitHub Pages. In the CloudFlare
DNS editor you can point your root at GitHub's CNAME address and everything
will work. If you choose not to enable CloudFlare's proxy service, you can
still use the DNS to flatten GitHub's CNAME. See
[http://blog.cloudflare.com/introducing-cname-flattening-rfc-compliant-cnames-at-a-domains-root](http://blog.cloudflare.com/introducing-cname-flattening-rfc-compliant-cnames-at-a-domains-root)

~~~
dkulchenko
You can, but then you get automated emails from Github Support telling you
that your DNS config is wrong and that you should be using CNAMEs rather than
A records (since Cloudflare flattens the virtual CNAMEs to As if you do a DNS
lookup).

~~~
jnewland
What domain? I'll get an issue filed to stop sending warning emails in cases
like this. Thanks!

~~~
dkulchenko
It's studio.zerobrane.com (pointing to pkulchenko.github.io/ZeroBraneStudio);
thanks for looking into it!

~~~
jnewland
If you have a subdomain, there's no need to use Cloudflare - if you CNAME this
domain to pkulchenko.github.io you'll use GitHub's CDN automatically.

------
bluesmoon
Your math is somewhat incorrect. First, average[1] page load time is only
relevant if your data distribution is a perfect bell curve. It never is. It's
more likely to be log-normal, in which case a geometric mean is a better
number, but again it's unlikely to be perfectly log-normal. It's likely to
be double-humped (though you may not notice it), so the median and the entire
distribution are very necessary. You'll find that the median load time is
typically lower than the arithmetic mean, but the 95th or 98th percentile is
typically much higher.
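
The gap between these statistics is easy to see on synthetic data. A minimal sketch (the distribution parameters are invented purely for illustration):

```python
import random
import statistics

# Simulated page load times (seconds) drawn from a log-normal
# distribution, a common shape for real-world load-time data.
random.seed(42)
samples = [random.lognormvariate(1.0, 0.6) for _ in range(10_000)]

arithmetic_mean = statistics.mean(samples)       # pulled up by the long tail
geometric_mean = statistics.geometric_mean(samples)
median = statistics.median(samples)
p95 = statistics.quantiles(samples, n=100)[94]   # 95th percentile

# Typical log-normal pattern: median and geometric mean sit close together,
# below the arithmetic mean, with the 95th percentile far above all three.
print(f"mean={arithmetic_mean:.2f}s geo={geometric_mean:.2f}s "
      f"median={median:.2f}s p95={p95:.2f}s")
```

On real traffic the same ordering typically holds, which is why a single "average" hides so much.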

Secondly, you cannot simply divide by 70% to get the load time for 70% of your
users, because again, that assumes a very specific distribution (a linear
distribution, which doesn't exist for any site with more than 5 hits). What
you really need to measure is the "empty-cache" experience, which is different
from the "first-visit" experience, and is harder to measure since it's hard
(but not impossible) to tell when the user's cache is empty.

Lastly, you're assuming a user drop-off rate without looking at your own data
for user drop-off.

You should probably use a real RUM tool that shows you your entire
distribution, but also shows you how users convert or bounce based on page
load time. Looking at actual data can be surprising and enlightening (I've
been looking at this kind of data for almost a decade and it still surprises
me and forces me to change my assumptions).

My company (SOASTA) builds a RUM tool (mPulse), which you can use for free.
Other companies like Pingdom, Neustar, Keynote, etc. also have RUM solutions,
or you can use the open-source boomerang library
([https://github.com/lognormal/boomerang/](https://github.com/lognormal/boomerang/);
disclaimer: I wrote this... BSD licensed) along with the open-source boomcatch
server
([https://github.com/nature/boomcatch](https://github.com/nature/boomcatch)).

------
probonogeek
Can someone explain how "Visitors to this site’s index page have an average
page load time of 3.5 seconds. 70% of those are here for the first time. 3.5 ÷
70% = 5. So first time visitors have an average page load time of 5 seconds."
makes any kind of mathematical sense?

If only 10% of visitors were first time, would that mean their average page
load speed was 35 seconds? This is some crazy use of the word "average".

~~~
msuss
The assumption is that repeat visitors have a load time of 0.
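
Under that (rather generous) assumption, the arithmetic does check out. A quick sketch of the implied weighted average:

```python
# The article's arithmetic: 3.5 s overall average, 70% first-time
# visitors, therefore 3.5 / 0.7 = 5 s for first-timers. This only
# holds if repeat visits are assumed to cost ~0 seconds.
first_time_share = 0.70
overall_average = 3.5  # seconds, across all visits

first_visit_average = overall_average / first_time_share  # about 5.0 s

# Sanity check: the weighted average of the two groups recovers 3.5 s
# only because the repeat-visitor term contributes nothing.
repeat_average = 0.0
reconstructed = (first_time_share * first_visit_average
                 + (1 - first_time_share) * repeat_average)
print(first_visit_average, reconstructed)
```

With a 10% first-time share the same assumption would indeed imply a 35-second first visit, which shows how fragile the reasoning is.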

~~~
probonogeek
As a builder of an internal app that is essentially 99.999% repeat visitors, I
need to figure out a way to replicate that behavior!

------
IceCreamYou
This article links to [1] as an explanation for the delay, but that article
says at the top that Github has since updated their configuration instructions
to help people avoid the issue.

[1] [http://helloanselm.com/2014/github-pages-redirect-
performanc...](http://helloanselm.com/2014/github-pages-redirect-performance/)

------
dublinben
I think that this can vary quite a bit. My simple GitHub page loads in roughly
1 second. I don't think it's ever taken as long as 5 seconds.

The linked site loads in less than half a second, but it costs $5 a month just
for a simple page.

~~~
buttscicles
If you're using github pages you're only hosting flat files, so S3 is another
viable option.

~~~
davidcelis
That's not necessarily true. My own site is a Jekyll site. To host that on S3,
I'd need to generate it first and upload the generated files as opposed to my
source files. Now that's not really a big deal, but I do enjoy the convenience
of only having to do a `git push` to deploy my site on Pages.

That being said, I notice times similar to another commenter above, around
1-2s usually. I don't think I've seen a five second load time.

~~~
lazerwalker
This sounds like the sort of thing that could easily be automated using a
five-line Bash (Ruby, Python, etc) script.
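
For a Jekyll-to-S3 deploy, something along these lines would do it (the bucket name is a placeholder, and the jekyll and aws CLIs are assumed to be installed and configured):

```shell
#!/usr/bin/env sh
# Sketch: build the Jekyll site, then sync the generated output to S3.
set -e
jekyll build                                 # writes the site to _site/
aws s3 sync _site/ s3://example-bucket/ --delete
```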

~~~
walesmd
I do something very similar to this, using Wintersmith and shell scripts. It
essentially boils down to using two repositories for my site: the first being
the raw/ungenerated files including the shell scripts, the second being the
generated files that are served by GitHub pages.

------
mayop100
If you’re having trouble with a root domain on Github pages you may want to
check out the Hosting product we (Firebase) just announced. It handles naked
domains by having your root A record point to an Anycast IP that serves
content from a global CDN. It’s lightning fast. We also support SSL (full SSL,
not just SNI) and do the cert provisioning automatically for you.

Check it out: [https://www.firebase.com/blog/2014-05-13-introducing-
firebas...](https://www.firebase.com/blog/2014-05-13-introducing-firebase-
hosting.html)

~~~
vulf
That's also $50/mo minimum to use a custom domain. Github's static page
hosting is free.

------
nfriedly
BTW, if your GitHub Pages site is www.example.com, you can point the root
domain (example.com) to the GitHub pages IP and they will redirect any 'naked'
visitors to the www version.

In other words, they make it really easy to make your site fast but still
catch users that didn't bother typing 'www.'

[https://help.github.com/articles/setting-up-a-custom-
domain-...](https://help.github.com/articles/setting-up-a-custom-domain-with-
github-pages#step-2-configure-dns-records)
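
In zone-file style, that setup is roughly the following (the names are placeholders, and the IP is a documentation placeholder, not GitHub's; use whatever their docs currently list):

```
example.com.        A       203.0.113.10    ; GitHub's advertised apex IP (placeholder)
www.example.com.    CNAME   username.github.io.
```

With the Pages CNAME file set to www.example.com, GitHub 301-redirects naked-domain visitors to the www version, which is served from the CDN.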

------
shawnz
The article notes that DNSimple's ALIAS records avoid this problem. Would the
same thing be true of CloudFlare's new "flattened CNAME" records?

~~~
ninkendo
I didn't see such a note, but I'm not sure it would be true, either.

DNSimple doesn't actually implement a new DNS record type, it simply puts a
TXT record on your domain that says "ALIAS for some.fqdn", and presumably it
causes their DNS servers to do a recursive lookup for you (to whatever's in
the TXT record) when you try and look at the A record for the naked domain.

From github's DDoS prevention's point of view the result is the same: an A
record lookup points to their IP. They don't know that you got there by way of
looking at DNSimple's servers and their ALIAS technique.

~~~
X-Istence
No, the result is not the same. When you look up the records for
<yourusername>.github.io you get a different set of records from the singular
IP address they tell you to add if you want to use the apex domain!

So from Github's DDoS prevention's point of view, the result is different.

~~~
ninkendo
So the answer to the issue is that the IP github tells you to use is the slow
one? That sounds strange.

What's to stop users from doing their own lookup, and setting their A record
to what the result is?

~~~
shawnz
I believe the reason is that the *.github.io hosts point to a CDN rather than
just having a single A record, and it is only when going through the CDN that
you bypass the "neutering". Regarding your second question, it seems that
github issues a warning if you do that:

[https://news.ycombinator.com/item?id=7738913](https://news.ycombinator.com/item?id=7738913)

------
philip1209
If you are technical enough to understand (and care about) the implications of
this issue, consider hosting on S3. Hosting costs me about $2 per month on
lower-traffic websites. The s3_website gem makes it straightforward. Response
times are reasonable and inelastic with regard to traffic.

If you are aiming for the fastest speed possible, check out the s3_website gem
support for Cloudfront - you can host your whole static website through a CDN.
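
For reference, a minimal s3_website.yml would look something like this (the keys follow the gem's documented format, but treat this as a sketch; the bucket name is a placeholder):

```
s3_id: <%= ENV['S3_ID'] %>
s3_secret: <%= ENV['S3_SECRET'] %>
s3_bucket: example.com

# Optional, for the CloudFront setup mentioned above:
# cloudfront_distribution_id: YOUR_DISTRIBUTION_ID
```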

~~~
atmosx
Aren't we all _technical enough_ to understand the implications? The author
makes it clear that the blog is _hosted on DigitalOcean_.

The thing is that very few bloggers drive enough traffic to make money out of
their blog[1]. If that's not the case, then why bother? If the content is good
and _free_, then waiting 2 seconds more is acceptable IMHO. :-)

[1]: [http://daringfireball.net/](http://daringfireball.net/) \- JG being
probably the most prominent example.

------
colevscode
Cole with Brace here ([http://brace.io](http://brace.io)). We recommend
redirecting the apex domain to a "www" subdomain. Note that apex CNAME
records (ALIAS records) are still a new idea, and depending on the
implementation may reduce reliability or performance.
([https://iwantmyname.com/blog/2014/01/why-alias-type-
records-...](https://iwantmyname.com/blog/2014/01/why-alias-type-records-
break-the-internet.html))

Here are a few resources from our blog that explain the www redirect approach:

\- [http://blog.brace.io/2014/01/17/cnames-
aliases/#cnameconfig](http://blog.brace.io/2014/01/17/cnames-
aliases/#cnameconfig)

\- [http://blog.brace.io/2014/01/19/custom-domains-
godaddy/](http://blog.brace.io/2014/01/19/custom-domains-godaddy/) (step 3)

(edited: added resources)

~~~
treitnauer
I also wrote an article about ALIAS-type DNS records for CNAME functionality
on naked domains and alternatives last week:

[https://iwantmyname.com/blog/2014/05/alias-type-dns-
records-...](https://iwantmyname.com/blog/2014/05/alias-type-dns-records-for-
cname-functionality-on-naked-domains.html)

Hope it's helpful!

~~~
aeden
I'm glad to see the turnaround from the original post that IWMN published
earlier this year, Timo, thanks for that.

------
geuis
I'm not sure I agree 100% with the title being changed. Originally it was
"GitHub Pages with a custom root domain loses you 35% of your visitors", which
after reading the story is not really what it's about.

Also, if mods are going to change titles at least get the grammar right.
"Pages ... ARE slow" not "Pages ... IS slow".

~~~
dang
Grammar, huh. Fighting words!

"GitHub Pages" is the name of a product, therefore it's singular. (One
giveaway is the capital 'P'. If "pages" had been a generic plural, it would
not have been capitalized.) The New York Times publishes articles every day,
The Royal Tenenbaums is a Wes Anderson movie, and GitHub Pages, according to
this article, is sometimes slow.

As for the claim "loses you 35% of your visitors", it is (a) dubious, (b)
linkbait, and (c) violates the HN guideline against putting arbitrary numbers
in titles. Happy to change it to something better if you or anyone suggests
one—but editing that bit out was not a borderline call.

------
Joeboy
Is 5s really such a problem? I don't think I'd bail out of a website because
it took 5s to load, unless it was something I didn't particularly want to see
anyway. Which I guess might be why people didn't stick around for the tests
from which the 35% number is drawn.

~~~
krenoten
Yes. Numerous reputable entities have published reports demonstrating that
users notice quite a lot. Amazon claims that every 100ms costs them 1% of
revenue. Google claims 500ms costs them 20% of traffic. 5 seconds is a fucking
eternity, and anything you expose to users on the web with such horrible
performance will suffer greatly because of it. One exception may be banks.
Users are more forgiving of latency as their financial connection to it
increases.

~~~
Joeboy
I guess I find this plausible if we're talking about n ms multiplied by the
number of resources loaded, and your page doesn't render progressively. If
we're talking about total load time, I don't see why you'd even bother
clicking a link if you weren't prepared to wait a few seconds for it to load.

Edit: in the case of Google and Amazon, I can believe that being slow will
cause users to defect to other services. I don't believe that anybody will not
bother to read documentation because it takes a second to load.

Edit2: If this is true, can anybody explain why users behave in this seemingly
bizarre way? Do _you_ give up on pages after 500ms? Have you seen anybody else
do that? What is going on?

~~~
insensible
Look at page speed vs. bounce rate in Google Analytics for any well-trafficked
site. People are frequently casually clicking around, and the faster you have
content up on the screen the more casual users will engage.

By analogy, to get into a different mindset, think about channel-surfing on
the TV. If other channels show a picture in 0.2s, and as you flip around
there's a channel that takes 0.8s to show a picture, are you more or less
likely to surf past the slow channel?

~~~
Joeboy
Thank you for replying! I got a staggering number of unexplained downvotes
before anybody was prepared to talk to me.

Thinking about it, I can well believe that if you want people to stay on your
site and click around, probably because you want to show them ads or products,
a small delay will impact the number of clicks. I would imagine that's not
the motivation for most GitHub Pages sites though.

------
atmosx
Nice study. A bit debatable, but a good catch nonetheless, and it makes some
fair points.

However, GitHub hosting is made by programmers for programmers, OR _at least_
computer-literate people. So it's exactly the group who ought to _know_ when
it's time to move to private hosting :-)

------
RazorX
Even after reading these comments and the new docs, it's still unclear to me
how to correctly use an apex domain on a Project Pages site with a CNAME on
the root (via CloudFlare) to avoid this issue.

I use a subdomain of that main domain as the User Pages site.

How should the DNS be set up, and what should the CNAME files on GitHub contain?

As an example, the domain on the left should load the site normally hosted
from the location on the right:

example.com -> username.github.io/blog

io.example.com -> username.github.io

Is this possible?

------
kyledrake
Hi, Kyle with Neocities here. We support custom domains for sites too!

We use an A record right now for root domains because DNS does not support
root domain CNAMEs, and as a consequence have very similar problems.

The only practical way to deal with the problem is to redirect root visitors
to www. If you go to google.com, you will notice that they do the same thing
and redirect to www from a proxy somewhere. Our next implementation will
probably do the same.
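
For the curious, the root-to-www redirect described above is a one-liner in most web servers. A hypothetical nginx version (the domain is a placeholder):

```
server {
    listen 80;
    server_name example.com;
    return 301 http://www.example.com$request_uri;
}
```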

~~~
dieulot
Aw, that’s a shame. :(

Thanks for your honesty! I updated the article.

------
wyuenho
I am running backgridjs.com and I can confirm the author's results. I guess
that means I should try putting www in front and see how that goes.

------
cmalpeli
As I understand it, this is a similar issue with any app hosted on Heroku. You
need to CNAME to www and then 301-redirect non-www to www. Alternatively you
can use a DNS provider such as DNSimple that supports ALIAS records.

[https://devcenter.heroku.com/articles/moving-to-the-
current-...](https://devcenter.heroku.com/articles/moving-to-the-current-
routing-stack)

------
suedadam
Many CDNs such as Akamai, Incapsula, CDNSolutions, etc. would be able to do
the same thing; however, I wouldn't go as far as saying to leave GitHub Pages
completely. I've found that CDNSolutions in front of GitHub Pages loads
insanely fast. That could be the case for any site set up properly on a
service such as Incapsula, CDNSolutions, etc.

------
bobfunk
Great demonstration of the importance of load times!

BitBalloon ([https://www.bitballoon.com](https://www.bitballoon.com)) will
give you better speed with a root domain, but as with any other host you'll
still lose out on some of our baked-in CDN support if you don't have a DNS
host with ALIAS support for apex records.

------
albertoleal
You can delegate your DNS to CloudFlare. See the following for the specific setup:
[http://davidensinger.com/2014/04/transferring-the-dns-
from-n...](http://davidensinger.com/2014/04/transferring-the-dns-from-
namecheap-to-cloudflare-for-github-pages/)

------
rudyrigot
Well, that other news of the day seems to bring an alternative:
[https://news.ycombinator.com/item?id=7738801](https://news.ycombinator.com/item?id=7738801)

------
tmenari
"Then, redirect the root domain to the www subdomain using a DNS CNAME
record."

But you aren't meant to CNAME the root zone, since you'll have other records
at that level (MX, NS, SOA, etc.)?

------
michaeljdeeb
I believe these redirects are also the reason why open graph data for Facebook
and Twitter cards won't render.

Running my site through their validators said too many redirects occurred.

------
hrjet
Why can't GitHub employ the DDOS mitigation behavior only during an active
DDOS attack? I assume such attacks are not that frequent; perhaps once a week
at most?

------
mslot
You could create an Amazon CloudFront distribution with your github domain as
the origin and use Route 53 to set up a root domain without CNAME tricks.

------
igor47
a better solution than doing DNS trickery is just doing proxying, but this
requires some other machine to serve as the proxy. i serve my github pages
blog on multiple domains, as described here: [http://igor.moomers.org/github-
pages:-proxying-and-redirects...](http://igor.moomers.org/github-
pages:-proxying-and-redirects/)

------
qwook
Thanks for the heads-up. Just updated my GitHub page to redirect to www and I
can see a massive improvement.

------
tedchs
Sounds like if you use Github Pages with a zone apex URL, you're losing out on
their CDN.

~~~
jnewland
That's exactly right.

~~~
naturalethic
Definitely

------
bitJericho
You shouldn't use a naked domain anyway; you'll never be able to grow a site
on a naked domain properly, for various reasons.

~~~
robinson-wall
Would you like to expound on that at all?

~~~
bitJericho
Sure, here are a couple references that form my opinion:

[https://devcenter.heroku.com/articles/apex-
domains](https://devcenter.heroku.com/articles/apex-domains)
[http://www.hyperarts.com/blog/www-vs-non-www-for-your-
canoni...](http://www.hyperarts.com/blog/www-vs-non-www-for-your-canonical-
domain-url-which-is-best-and-why/)

No doubt there are ways around any problem with a naked domain, but why work so
hard on something so trivial? No user has ever turned away from a website
because it had "www." in front. That said, your naked domain surely needs to
redirect to your "www." address if you set it up this way.

~~~
clarkdave
It's not hard work to skip the "www" these days. DNS providers like Cloudflare
support CNAME-like functionality on the apex domain, and if you're using AWS
then Route 53 provides special "alias" records which let you hook the zone
apex on to an ELB, for example. I'm sure other providers have similar
functionality.

As for why, well, personally I prefer the look of a domain without the "www".
It looks cleaner to me.

~~~
bitJericho
Those are fair enough reasons. I see being tied (permanently) to a provider
like Cloudflare or AWS as a problem. I'd rather use the www and be allowed to
move to providers that don't necessarily offer the same features, or to my own
infrastructure where that is or is not an option.

Let's agree that for the most part it's a bad idea to change from www to naked
or the other way around after the launch of a website (for seo reasons). So
you have to pick one at launch and try to stick with it. Why choose the option
that looks nicer but has problems associated with it and potential vendors not
supporting it, vs the one that arguably looks messy but that all users
everywhere are well accustomed to and has none of the configuration issues
that affect naked domains?

There are postmortems out there about naked domains and DDoS attacks.
There are issues with load balancing, with domain configuration, with cookies.

If your website gets overrun by HNers, what's your plan to compensate quickly?
How much of your plan is bogged down by the fact that you're on a naked
domain?

------
prottmann
I don't believe the loss.

A GitHub page hosts programming-specific solutions to a problem a developer
has.

When somebody searches for a problem or finds a link to a GitHub project,
he/she will visit the page.

Everyone else doesn't have an urgent problem to solve, so you only lose users
who don't need your solution. I can live with that. ;-)

~~~
Joeboy
That's what I thought too, although as far as I can tell it's possible to host
any content on github pages. I suspect the majority of it is programming
projects though, and programmers are not going to give up that easily.

~~~
prottmann
That's what I mean: if I want some information, I try to get it and don't give
up because a server takes 5 seconds "the first time".

Common end users surfing the web are another species, but what would they be
looking for on a GitHub page?

