First off, if you use a DNS provider that has support for ALIAS records or something similar, pointing your apex domain to <username>.github.io will ensure your GitHub Pages site is served by our CDN without these redirects.
I wish we could provide better service for folks without a DNS provider that supports ALIAS records in the face of the constant barrage of DDoS attacks we've seen against the IPs we've advertised for GitHub Pages over the years. We made the decision to keep DDoS mitigation enabled for apex domains after seeing GitHub Pages attacked and going down a handful of times in the same week. It's a bummer that this decision negatively impacts performance, but it does certainly improve the overall availability of the service.
FWIW, we considered pulling support for GitHub Pages on apex domains about a year ago because we knew it'd be slower than subdomains and would require DNS configuration that would be challenging and frustrating for a large number of our users. However, we ended up deciding not to go that route because of the number of existing users on apex domains.
I think anyone tech savvy enough to be using Pages should also be savvy enough to understand[0] why the A records can't (realistically) be as fast as the CNAME alternative, and understand if you make it de facto redundant (i.e. available, but not actively encouraged or supported).
I think it's fantastic that you provide apex support for everyone even though it must be considerably harder than just providing CNAMEs, but if you're upfront about the limitations, the only people who are going to complain are the type of people you don't want to be listening to anyway.
[0] I mean that in the sense that they'll comprehend the explanation, not that they'll grok it inherently.
Possibly not a question you can answer, but maybe someone else here can: what are the typical patterns for changing the content of the record? Is the content dynamic based on the requesting resolver's address and other factors? If so, does the EDNS Client Subnet option come into play at all?
(I work on DNS things and am curious about what exactly a CDN's needs are.)
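For anyone who wants to poke at this themselves, here is a minimal sketch using the third-party dnspython package (the hostname and the choice of resolver are placeholders, and whether a given public resolver honors a client-supplied ECS option varies). It sends the same A query with different EDNS Client Subnet options so you can see whether the answers change with the advertised client subnet:

```python
# Minimal sketch; assumes the third-party dnspython package is installed.
# Sends the same A query with different EDNS Client Subnet options to see
# whether the authoritative answers vary by advertised client location.
import dns.edns
import dns.message
import dns.query

HOSTNAME = "username.github.io"  # placeholder hostname
RESOLVER = "8.8.8.8"             # placeholder resolver; ECS handling varies

def query_with_subnet(subnet, prefix_len):
    ecs = dns.edns.ECSOption(subnet, prefix_len)
    q = dns.message.make_query(HOSTNAME, "A", use_edns=0, options=[ecs])
    resp = dns.query.udp(q, RESOLVER, timeout=5)
    return sorted(str(r) for rrset in resp.answer for r in rrset)

for subnet in ("203.0.113.0", "198.51.100.0"):  # two example client subnets
    print(subnet, query_with_subnet(subnet, 24))
```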
>The short solution is, instead of using yourdomain.com, use www.yourdomain.com. Then, redirect the root domain to the www subdomain using a DNS CNAME record.
The root can't be a CNAME because no other record can exist at the same name alongside a CNAME, and your domain root also has one SOA and two NS records (and probably one or more MX records if you want to receive mail).
Note that some DNS providers hack around the issue (like CloudFlare, by pretending your CNAME was in fact an A record: http://blog.cloudflare.com/introducing-cname-flattening-rfc-...), but if you're self-hosting DNS or your DNS provider doesn't do any special handling, then you can't have a root CNAME.
Instead of "redirect the root domain to the www subdomain using a DNS CNAME record" it should say "... using domain forwarding".
Most DNS hosts offer some mechanism for forwarding traffic from your apex domain to the www subdomain using a 301 (permanent) redirect. Then the www subdomain can be configured with a CNAME record.
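For illustration, here is a bare-bones sketch of what that apex-to-www forwarding amounts to, using only the Python standard library (the domain names are placeholders); in practice your DNS host or a small always-on box does this for you:

```python
# Minimal sketch of apex -> www domain forwarding (placeholder domain names).
# A request for http://example.com/anything gets a 301 to http://www.example.com/anything.
from http.server import BaseHTTPRequestHandler, HTTPServer

TARGET = "http://www.example.com"  # placeholder www host

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(301)                       # permanent redirect
        self.send_header("Location", TARGET + self.path)
        self.end_headers()

    do_HEAD = do_GET  # HEAD requests get the same redirect

if __name__ == "__main__":
    HTTPServer(("", 8080), RedirectHandler).serve_forever()
```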
You can actually use CloudFlare and stay on GitHub Pages. In the CloudFlare DNS editor you can point your root at GitHub's CNAME address and everything will work. If you choose not to enable CloudFlare's proxy service, you can still use the DNS to flatten GitHub's CNAME. See http://blog.cloudflare.com/introducing-cname-flattening-rfc-...
You can, but then you get automated emails from Github Support telling you that your DNS config is wrong and that you should be using CNAMEs rather than A records (since Cloudflare flattens the virtual CNAMEs to As if you do a DNS lookup).
Your math is somewhat incorrect. First, average[1] page load time is only relevant if your data distribution is a perfect bell curve. It never is. It's more likely to be log-normal, in which case a geometric mean is a better number, but again it's unlikely to be perfectly log-normal. It's likely to be double-humped (though you may not notice it), so the median and the entire distribution are very necessary. You'll find that the median load time is typically lower than the arithmetic mean, but the 95th or 98th percentile is typically much higher.
Secondly, you cannot simply divide by 70% to get the load time for 70% of your users, because again, that assumes a very specific distribution (a linear distribution, which doesn't exist for any site with more than 5 hits). What you really need to measure is the "empty-cache" experience, which is different from the "first-visit" experience, and is harder to measure since it's hard (but not impossible) to tell when the user's cache is empty.
Lastly, you're assuming a user drop-off rate without looking at your own data for user drop-off.
You should probably use a real RUM tool that shows you your entire distribution, but also shows you how users convert or bounce based on page load time. Looking at actual data can be surprising and enlightening (I've been looking at this kind of data for almost a decade and it still surprises me and forces me to change my assumptions).
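To make the distribution point concrete, here is a small stdlib-only sketch with made-up log-normal "load times" (the parameters are arbitrary), showing how far apart the usual summary statistics can sit on skewed data:

```python
# Stdlib-only sketch with made-up log-normal "load times" (in seconds),
# showing how the usual summary statistics diverge on skewed data.
import math
import random
import statistics

random.seed(42)
# mu/sigma chosen arbitrarily; the median lands around ~2.2s
load_times = [random.lognormvariate(0.8, 0.7) for _ in range(10_000)]

arith_mean = statistics.mean(load_times)
geo_mean = math.exp(statistics.mean(math.log(t) for t in load_times))
median = statistics.median(load_times)
p95 = sorted(load_times)[int(0.95 * len(load_times))]

print(f"arithmetic mean: {arith_mean:.2f}s")
print(f"geometric mean:  {geo_mean:.2f}s")
print(f"median:          {median:.2f}s")
print(f"95th percentile: {p95:.2f}s")
```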
My company (SOASTA) builds a RUM tool (mPulse), which you can use for free. Other companies like pingdom, neustar, keynote, etc. also have RUM solutions, or you can also use the opensource boomerang library (https://github.com/lognormal/boomerang/ disclaimer, I wrote this... BSD licensed) along with the opensource boomcatch server (https://github.com/nature/boomcatch).
Can someone explain how "Visitors to this site’s index page have an average page load time of 3.5 seconds. 70% of those are here for the first time. 3.5 ÷ 70% = 5. So first time visitors have an average page load time of 5 seconds." makes any kind of mathematical sense?
If only 10% of visitors were first time, would that mean their average page load speed was 35 seconds? This is some crazy use of the word "average".
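For what it's worth, the article's division only works under an unstated extra assumption: the overall average is a weighted mix of first-time and repeat visitors, and dividing by 70% implicitly assumes repeat visitors load in roughly zero seconds. A quick sketch of the arithmetic with the article's numbers:

```python
# The article's 3.5 / 0.70 = 5 only holds if repeat visitors load in ~0 seconds.
# Overall average is a weighted mix of the two groups:
#   overall = share_first * first_time + share_repeat * repeat
share_first, share_repeat = 0.70, 0.30
overall = 3.5  # seconds, from the article

for repeat in (0.0, 1.0, 2.0):  # hypothetical repeat-visitor averages
    first_time = (overall - share_repeat * repeat) / share_first
    print(f"repeat visitors avg {repeat:.1f}s -> first-time avg {first_time:.2f}s")
# repeat = 0.0s reproduces the article's 5.00s; anything higher gives less.
```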
This article links to [1] as an explanation for the delay, but that article says at the top that Github has since updated their configuration instructions to help people avoid the issue.
But I noticed that my DNS zone is quite different to how Github now tell you to do it (I have an A record to 204.232.175.78). So perhaps that is a factor.
Came here to say that. My github page (just flat html) loads in ~65ms. Granted, 65ms to load a couple kb of text isn't awesome, but it's not nearly slow enough to optimize for me.
That's not necessarily true. My own site is a Jekyll site. To host that on S3, I'd need to generate it first and upload the generated files as opposed to my source files. Now that's not really a big deal, but I do enjoy the convenience of only having to do a `git push` to deploy my site on Pages.
That being said, I notice times similar to another commenter above, around 1-2s usually. I don't think I've seen a five second load time.
I do something very similar to this, using Wintersmith and shell scripts. It essentially boils down to using two repositories for my site: the first being the raw/ungenerated files including the shell scripts, the second being the generated files that are served by GitHub pages.
s3_website[0] is a very neat solution to this. It integrates automatically with Jekyll. A simple 'jekyll build && s3_website push' uploads all your changes to S3. I'm using it to power all my static sites. It'll even automatically invalidate your Cloudfront distributions, if you like.
It could cost zero with a free host if I’d wanted to and still be as fast. The real performer here is CloudFlare and its edge cache, which doesn’t hit the server most of the time.
If you’re having trouble with a root domain on Github pages you may want to check out the Hosting product we (Firebase) just announced. It handles naked domains by having your root A record point to an Anycast IP that serves content from a global CDN. It’s lightning fast. We also support SSL (full SSL, not just SNI) and do the cert provisioning automatically for you.
BTW, if your GitHub Pages site is www.example.com, you can point the root domain (example.com) to the GitHub Pages IP and they will redirect any 'naked' visitors to the www version.
In other words, they make it really easy to make your site fast but still catch users that didn't bother typing 'www.'
I didn't see such a note, but I'm not sure it would be true, either.
DNSimple doesn't actually implement a new DNS record type, it simply puts a TXT record on your domain that says "ALIAS for some.fqdn", and presumably it causes their DNS servers to do a recursive lookup for you (to whatever's in the TXT record) when you try and look at the A record for the naked domain.
From github's DDoS prevention's point of view the result is the same: an A record lookup points to their IP. They don't know that you got there by way of looking at DNSimple's servers and their ALIAS technique.
Anthony from DNSimple here. The ALIAS does synthesize the A record set, but it's the same A record set that is used when a username.github.io domain is resolved, which means it should work fine with Github's DDOS prevention.
The TXT record is only there for informational purposes and could be removed without affecting the system.
Since we have an Anycast network, you will also get a result similar to a CNAME resolution: you'll typically get a "close" set of IPs, much like what you would get from resolving a CNAME (which ultimately resolves down to A records as well).
No, the result is not the same. When you look up the records for <yourusername>.github.io you get a different set of records than the singular IP address they tell you to add if you want to use the apex domain!
So from Github's DDoS prevention's point of view, the result is different.
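One way to see that difference for yourself: a small sketch using the third-party dnspython package (the apex domain and username are placeholders) that compares the A record set behind the apex with the set behind the <username>.github.io hostname.

```python
# Sketch comparing the A record set behind an apex domain with the set behind
# the corresponding <username>.github.io hostname (both names are placeholders).
# Requires the third-party dnspython package (>= 2.0; older versions use dns.resolver.query).
import dns.resolver

APEX = "example.com"               # placeholder apex domain
PAGES_HOST = "username.github.io"  # placeholder GitHub Pages hostname

def a_records(name):
    return sorted(r.address for r in dns.resolver.resolve(name, "A"))

apex_ips = a_records(APEX)
pages_ips = a_records(PAGES_HOST)

print("apex:  ", apex_ips)
print("pages: ", pages_ips)
print("same record set?", apex_ips == pages_ips)
```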
I believe the reason is that the *.github.io hosts point to a CDN rather than just having a single A record, and it is only when going through the CDN that you bypass the "neutering". Regarding your second question, it seems that github issues a warning if you do that:
If you are technical enough to understand (and care about) the implications of this issue, consider hosting on S3. Hosting costs me about $2 per month on lower-traffic websites. The s3_website gem makes it straightforward. Response times are reasonable and inelastic with regard to traffic.
If you are aiming for the fastest speed possible, check out the s3_website gem support for Cloudfront - you can host your whole static website through a CDN.
Aren't we all technical enough to understand the implications? As the author makes clear, the blog is hosted on DigitalOcean.
The thing is that very few bloggers drive enough traffic to make money out of their blog[1]. If that's not the case, then why bother? If the content is good and free then waiting 2 seconds more is acceptable IMHO. :-)
Cole with Brace here (http://brace.io). We recommend redirecting the apex domain to a "www" subdomain. Note that apex CNAME-type records (ALIAS records) are still a new idea and, depending on the implementation, may reduce reliability or performance. (https://iwantmyname.com/blog/2014/01/why-alias-type-records-...)
Here are a few resources from our blog that explain the www redirect approach:
I'm not sure I agree 100% with the title being changed. Originally it was "GitHub Pages with a custom root domain loses you 35% of your visitors", which after reading the story is not really what it's about.
Also, if mods are going to change titles at least get the grammar right. "Pages ... ARE slow" not "Pages ... IS slow".
"GitHub Pages" is the name of a product, therefore it's singular. (One giveaway is the capital 'P'. If "pages" had been a generic plural, it would not have been capitalized.) The New York Times publishes articles every day, The Royal Tenenbaums is a Wes Anderson movie, and GitHub Pages, according to this article, is sometimes slow.
As for the claim "loses you 35% of your visitors", it is (a) dubious, (b) linkbait, and (c) a violation of the HN guideline against putting arbitrary numbers in titles. Happy to change it to something better if you or anyone can suggest it, but editing that bit out was not a borderline call.
Is 5s really such a problem? I don't think I'd bail out of a website because it took 5s to load, unless it was something I didn't particularly want to see anyway. Which I guess might be why people didn't stick around for the tests from which the 35% number is drawn.
Yes. Numerous reputable entities have published reports demonstrating that users notice quite a lot. Amazon claims that every 100ms costs them 1% of revenue. Google claims 500ms costs them 20% of traffic. 5 seconds is a fucking eternity, and anything you expose to users on the web with such horrible performance will suffer greatly because of it. One exception may be banks: users are more forgiving of latency as their financial connection to a site increases.
I guess I find this plausible if we're talking about n ms multiplied by the number of resources loaded, and your page doesn't render progressively. If we're talking about total load time, I don't see why you'd even bother clicking a link if you weren't prepared to wait a few seconds for it to load.
Edit: in the case of Google and Amazon, I can believe that being slow will cause users to defect to other services. I don't believe that anybody will not bother to read documentation because it takes a second to load.
Edit2: If this is true, can anybody explain why users behave in this seemingly bizarre way? Do you give up on pages after 500ms? Have you seen anybody else do that? What is going on?
Look at page speed vs. bounce rate in Google Analytics for any well-trafficked site. People are frequently casually clicking around, and the faster you have content up on the screen the more casual users will engage.
By analogy, to get into a different mindset, think about channel-surfing on the TV. If other channels show a picture in 0.2s, and as you flip around there's a channel that takes 0.8s to show a picture, are you more or less likely to surf past the slow channel?
Thank you for replying! I got a staggering number of unexplained downvotes before anybody was prepared to talk to me.
Thinking about it, I can well believe that if you want people to stay on your site and click around, probably because you want to show them ads or products, a small delay will impact the number of clicks. I would imagine that's not the motivation for most GitHub Pages sites though.
Regarding giving up after 500ms: I don't think the issue is that people are consciously abandoning a site after a single page load that seems a bit slow. It's the cumulative burden of slightly slow pages that makes the site slightly less attractive compared to other alternatives that respond faster. The differences are noticeable, if only subconsciously, and the result is that a portion of the users will move to the other service that just feels more responsive. Responsiveness of the site is part of the value being offered, even if people don't recognize it explicitly. For any site with a significant volume of users and some effective competition for their service, this distinction results in measurable changes in use/conversions. I think the _actual_ change in user activity or conversions for a specific site would depend a whole lot on the nature of the service being offered and the alternatives available.
I would imagine less serious viewers will drop off quicker than motivated viewers.
Nothing can stop me if I need to buy something on Amazon or need to pay a bill online. If I'm just filling time and here are three interesting links to "fad of the day (hour?)", then the slowest link might lose.
A simple A/B tester could insert an additional 50 ms to half the requests and some data analysis could calculate the slope of the graph in that area. Assuming that slope is perfectly linear for no good reason at extremes like 1500 seconds or 0.0000001 nanoseconds would be unwise.
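A rough sketch of what that analysis could look like, with made-up logged data (each record notes whether the extra 50 ms was added and whether the visit bounced); the simulated behavior is invented purely for illustration:

```python
# Rough sketch: estimate the local slope of bounce rate vs. added latency
# from A/B logs. The visit records here are made up for illustration.
import random

random.seed(1)
EXTRA_MS = 50

def simulate_visit(delayed):
    # Made-up behavior: 30% baseline bounce rate, plus a small penalty when delayed.
    p_bounce = 0.30 + (0.01 if delayed else 0.0)
    return {"delayed": delayed, "bounced": random.random() < p_bounce}

logs = [simulate_visit(delayed=random.random() < 0.5) for _ in range(100_000)]

def bounce_rate(records):
    return sum(r["bounced"] for r in records) / len(records)

control = bounce_rate([r for r in logs if not r["delayed"]])
treated = bounce_rate([r for r in logs if r["delayed"]])
slope = (treated - control) / EXTRA_MS  # bounce-rate change per extra millisecond

print(f"control bounce rate: {control:.3f}")
print(f"treated bounce rate: {treated:.3f}")
print(f"slope near current latency: {slope * 100:.4f} percentage points per ms")
```

As the comment says, that slope only tells you about the neighborhood of your current latency; extrapolating it linearly to extreme values would be unwise.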
Nice study; a bit debatable in places, but a good catch nonetheless, and it makes some good points.
However, GitHub hosting is made by programmers for programmers, or at least computer-literate people. So it's exactly the group who ought to know when it's time to move to private hosting :-)
Even after reading these comments and the new docs, it is still unclear to me how to correctly use an apex domain on a Project Pages site (using a CNAME on the root with CloudFlare) to avoid this issue.
I use a subdomain of that main domain as the User Pages site.
How should DNS be setup and how should the CNAME files on GitHub read?
As an example, the domain on the left should load the site normally hosted from the location on the right:
Hi, Kyle with Neocities here. We support custom domains for sites too!
We use an A record right now for root domains because DNS does not support root domain CNAMEs, and as a consequence have very similar problems.
The only practical way to deal with the problem is to redirect root visitors to www. If you go to google.com, you will notice that they do the same thing and redirect to www from a proxy somewhere. Our next implementation will probably do the same.
As I understand it, this is a similar issue for any app hosted on Heroku. You need to CNAME the www subdomain and then 301 redirect non-www to www. Alternatively you can use a DNS provider such as DNSimple that supports ALIAS records.
Many CDNs, such as Akamai, Incapsula, CDNSolutions, etc., would be able to do the same thing; however, I wouldn't go as far as saying to leave GitHub Pages completely. I've found that CDNSolutions in front of GitHub Pages loads insanely fast. That could be the case for any site set up properly on a service such as Incapsula, CDNSolutions, etc.
Great demonstration of the importance of load times!
BitBalloon (https://www.bitballoon.com) will give you better speed with a root domain, but as with any other host you'll still lose out on some of our baked-in CDN support if you don't have a DNS host with ALIAS support for apex records.
Why can't GitHub employ the DDOS mitigation behavior only during an active DDOS attack? I assume such attacks are not that frequent; perhaps once a week at most?
You could create an Amazon CloudFront distribution with your github domain as the origin and use Route 53 to set up a root domain without CNAME tricks.
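If you go that route, the Route 53 piece is an "alias" A record at the zone apex pointing at the CloudFront distribution. A rough boto3 sketch (the hosted zone ID and distribution domain are placeholders; the alias hosted-zone ID for CloudFront targets is the fixed value AWS documents):

```python
# Rough sketch (placeholder hosted zone ID and distribution domain).
# Creates an "alias" A record at the zone apex pointing at a CloudFront
# distribution, which is how Route 53 avoids needing a CNAME at the root.
import boto3

route53 = boto3.client("route53")

route53.change_resource_record_sets(
    HostedZoneId="ZEXAMPLE12345",  # placeholder: your hosted zone
    ChangeBatch={
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "example.com.",  # the apex domain
                "Type": "A",
                "AliasTarget": {
                    # Fixed hosted-zone ID AWS documents for CloudFront alias targets
                    "HostedZoneId": "Z2FDTNDATAQYW2",
                    "DNSName": "d1234abcd.cloudfront.net.",  # placeholder distribution
                    "EvaluateTargetHealth": False,
                },
            },
        }]
    },
)
```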
A better solution than doing DNS trickery is just doing proxying, but this requires some other machine to serve as the proxy. I serve my GitHub Pages blog on multiple domains, as described here:
http://igor.moomers.org/github-pages:-proxying-and-redirects...
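A bare-bones sketch of that kind of proxy, stdlib only (hostnames are placeholders, and error handling and caching are omitted); a real setup would normally use nginx or similar and pass caching headers through:

```python
# Bare-bones reverse proxy sketch (stdlib only, placeholder hostnames).
# Fetches each request from the GitHub Pages origin and relays the response.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

ORIGIN = "https://username.github.io"  # placeholder Pages hostname

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # No error handling: urlopen raises on 4xx/5xx responses in this sketch.
        upstream = urlopen(Request(ORIGIN + self.path,
                                   headers={"User-Agent": "proxy-sketch"}))
        body = upstream.read()
        self.send_response(upstream.status)
        self.send_header("Content-Type",
                         upstream.headers.get("Content-Type", "text/html"))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), ProxyHandler).serve_forever()
```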
No doubt there are ways around any problem with a naked domain, but why work so hard on something so trivial? No user has ever turned away from a website because it had "www." in front. That said, your naked domain surely needs to redirect to your "www." address if you set it up this way.
It's not hard work to skip the "www" these days. DNS providers like Cloudflare support CNAME-like functionality on the apex domain, and if you're using AWS then Route 53 provides special "alias" records which let you hook the zone apex on to an ELB, for example. I'm sure other providers have similar functionality.
As for why, well, personally I prefer the look of a domain without the "www". It looks cleaner to me.
Those are fair enough reasons. I see being tied (permanently) to a provider like Cloudflare or AWS as a problem. I'd rather use the www and be allowed to move to providers that don't necessarily offer the same features, or to my own infrastructure where that is or is not an option.
Let's agree that for the most part it's a bad idea to change from www to naked or the other way around after the launch of a website (for SEO reasons). So you have to pick one at launch and try to stick with it. Why choose the option that looks nicer but has problems associated with it and potential vendors not supporting it, over the one that arguably looks messy but that all users everywhere are well accustomed to and has none of the configuration issues that affect naked domains?
There are postmortems out there about using naked domains and DDoS attacks. There are issues with load balancing, with domain configuration, with cookies.
If your website gets overrun by HNers, what's your plan to compensate quickly? How much of your plan is bogged down by the fact that you're on a naked domain?
That's what I thought too, although as far as I can tell it's possible to host any content on github pages. I suspect the majority of it is programming projects though, and programmers are not going to give up that easily.