The optimization target is in the sub-second range, so not having to pay one large RTT penalty for a DNS lookup is quite important. I measured 300+ ms RTT to Australia from the previous box I was using, which impacted load times quite severely.
I noticed this line at the bottom of the page: "When it comes to picking a solution, I often choose the less traveled road". I don't agree with that at all, and it sounds a bit like NIH syndrome. It's always better to choose the most-traveled roads, especially in DevOps. If there's a problem, then you can join those communities and contribute to the projects.
I forgot to add that this applies only to R&D and hobby projects, for production setups I'm a bit more careful. :)
(I'm the author.)
This is misleading. CloudFront doesn't charge anything for putting your domains on an SSL cert that uses SNI. They only charge you if you need a cert without SNI, which requires them to allocate a dedicated IP address to you.
I'm hosting my personal blog on S3 and CloudFront, with SSL, for less than a dollar a month.
Performance and capabilities are fine for me, too. I get 0.15 seconds to first byte from Chicago, vs 0.24 for the author's site.
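For anyone wanting to reproduce that kind of comparison, curl can break down where the time goes for a single request (the URL below is just a placeholder):

```shell
# Print DNS lookup time, connect time, and time-to-first-byte for one request.
# Replace example.com with the site you want to measure.
curl -so /dev/null \
  -w 'dns: %{time_namelookup}s  connect: %{time_connect}s  ttfb: %{time_starttransfer}s\n' \
  https://example.com/
```

Running it from boxes in different regions gives you a rough picture of how the edge placement is working out.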
I plan on expanding on the featureset, so no S3 for me. :)
Maybe. I agree it's not as clear-cut as "always pick the less traveled road", but the difference between the two may include a competitive advantage that you'd be unwise to overlook. I mean, you're on HN; Paul Graham and Common Lisp back in the '90s is an excellent example.
If you need to handle bursty traffic, you're likely going to get best value in shared tenancy services until you can fully utilize your servers. Otherwise, you will probably end up paying for idle infrastructure.
This has been disproved empirically and academically for close to ten years. Route flaps generally result in convergence to the exact same destination if it has another path and is still online. If it's offline, then it's working as intended, which is no different from a server in a DNS pool going down.
1: Quick search: https://www.google.com/search?q=tcp+anycast+paper&ie=utf-8&o...
But it is the case that transit and peering connections are not stable (in the sense of going up and down randomly or suddenly experiencing high levels of packet loss) and active monitoring is a must.
Assuming the end POP is still reachable along another regional route, I believe all the data I've seen shows that the client almost always hits the same destination they did before the flap.
For example, Cloudflare has a PoP in Seoul, but it has such limited bandwidth that most sites using Cloudflare are routed to Tokyo, Hong Kong, and even Los Angeles. Several of my clients in Korea signed up for Cloudflare a few years ago when the local PoP was still usable, but now all but two of them have canceled their subscriptions. Instead, I've been building a lot of caching proxies for them lately.
If anyone is here for the Winter Olympics right now and some of your favorite sites don't seem to be living up to Korea's reputation for ultra-fast internet, Cloudflare might be one reason. (Meanwhile, Amazon's PoP in Seoul is perfectly fine, albeit expensive.)
However, I spoke about this with some ISP/NANOG folks that I trust, and they said that running a real CDN is a nightmare because all of your links (providers) hate you: you're producing the exact opposite of the traffic they want, and they will not give you any breaks, help, or benefits, since you are their worst customer.
How accurate was that assessment?
Running a Global CDN and ISP might be a tad too ambitious.
> Using SSL/TLS certificates
> The next pain point is using SSL/TLS certificates. Actually, let’s call them what they are: x509 certificates. Each of your edge locations needs to have a valid certificate for your domain. The simple solution, of course, is to use LetsEncrypt to generate a different certificate for each, but you have to be careful. LE has a rate limit, which I ran into on one of my edge nodes. In fact, I had to take the London node down for the time being until the weekly limit expires.
> However, I am using Traefik as my proxy of choice, which supports using a distributed key-value store or even Apache Zookeeper as the backend for synchronization. While this requires a bit more engineering, it is probably a lot more stable in the long run.
The drawback of the DNS method without synchronization between the nodes is that you run into the LetsEncrypt rate limit quite easily. My expansion to ap-southeast-1 and sa-east-1 is waiting for the LE cooldown.
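For reference, this is roughly what the shared-storage setup might look like in Traefik v1, assuming Consul as the KV store (the article only says "a distributed key-value store or even Apache Zookeeper"; the endpoint, prefix, and email below are hypothetical). The point is that the ACME account and certificates live in the KV store, so one certificate is issued once and shared by every edge node instead of each node hitting Let's Encrypt separately:

```toml
# traefik.toml on each edge node -- sketch only, values are hypothetical
defaultEntryPoints = ["http", "https"]

[entryPoints]
  [entryPoints.http]
  address = ":80"
  [entryPoints.https]
  address = ":443"
    [entryPoints.https.tls]

# Consul as the shared backend; all edge nodes point at the same cluster.
[consul]
  endpoint = "consul.internal:8500"
  prefix = "traefik"

# ACME state stored as a key in the KV store rather than a local file,
# so nodes share certificates instead of each requesting their own.
[acme]
  email = "admin@example.com"
  storage = "traefik/acme/account"
  entryPoint = "https"
  [acme.httpChallenge]
    entryPoint = "http"
```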
Disclaimer: I'm the author of the article.
A side effect of this tool, as you might have guessed, is that using it will actually prolong the time your content stays in their cache.
Ansible is running docker-compose up -d on deployment, and Traefik is doing the magic. I want to extend it to host multiple sites in the future. (Btw., Ansible run from a central location is painfully slow because of the high latency to the edge nodes.)
The content itself is deployed using rsync, Ansible was just too painfully slow for that.
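The rsync fan-out can be as simple as a loop over the nodes, run in parallel so the slow links don't serialize (node names are hypothetical; the echo is there so the sketch is safe to run as-is — drop it to actually sync):

```shell
#!/bin/sh
# Push the built site to every edge node in parallel.
NODES="edge-lon edge-sin edge-gru"

for node in $NODES; do
  # echo shows the command that would run; remove it to really deploy
  echo rsync -az --delete ./public/ "deploy@${node}:/srv/www/site/" &
done
wait
```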
I thought Cloudflare uses anycast to avoid targeted DDoS? How do they handle routes changing during HTTP requests?
People often mix and match anycast/GeoDNS and anycast/unicast HTTP.
Some even go a step further and, for video files, anycast to a node that 302s to its own unicast address.
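The 302 trick might look something like this as an nginx config fragment (hostname and paths are hypothetical): the anycast IP handles the cheap redirect, and the long-lived video transfer is then pinned to one specific machine via its unicast hostname, so a route shift mid-download can't land the TCP session on a different box.

```nginx
# Sketch: on an anycast edge node, bounce large video files to this
# node's own unicast hostname so the long transfer sticks to one machine.
location /video/ {
    return 302 https://edge-sin-1.example.com$request_uri;
}
```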
Right way: 1-2 major Tier1 carriers across all of your PoPs with local peering for regional eyeball networks.
Wrong way: Using a different set of transit carriers at each location.
You really don't want that many AS paths to reach your content from a given location (3-4 is more than enough). What you're really going for with BGP anycasting is that your local ISP has a direct route to the closest PoP via exchange peering, or that the Tier1 path drops you off at the "closest" route. Transit carriers do this for a living, and they're usually quite good at figuring out route weighting inside their own networks.
Yes, I know Netflix does it differently but they use a lot more smart geo DNS routing than anycasting.
Edit: IMHO it's also better to choose a Tier1 with a moderate-sized network that values stability and performance over size. So someone like NTT over, say, Level3.
The route will not "change" unless Cloudflare changes their routing, or you change your location/IP so that a shorter route exists. Once you've changed your IP, you've already interrupted any TCP sessions anyway.
You might find these two blog posts from LinkedIn to be helpful:
Depending on how the routing is set up, it doesn't matter if the route changes, so long as you end up on the same host consistently (or one that can at least pretend it's the same host, if you do some kind of fancy session mirroring, perhaps).
That's what I thought, too. But the article explicitly states this as a potential issue.
They are OK, not great, for smaller accounts.