The main question for a site like this is: how are you collecting your data? CDNs have dozens of nodes and the performance a user sees will depend on what the network looks like between them and the nearest one. For example, if all your measurements were from an EC2 instance in us-east you'd only be measuring a tiny and easily-gamed slice of CDN performance.
On their about page they write that they're "grateful to Pingdom for providing access to their data" but I can't find anything more about which and how many places they're measuring from or how they combine measurements from different places.
Disclaimer: I work for Google on mod_pagespeed and ngx_pagespeed, and I sit near the people who handle developers.google.com/speed/libraries/ But I don't know anything about how they serve the hosted library traffic.
I'm an engineer on Facebook's CDN and Edge network, and, in my experience, the hardest places to serve people quickly are South America, Africa, and Asia. You should try to get some timing data for those places.
Pingdom does not offer servers in these locations.
Other services do, but we can't afford them. Pingdom is actually sponsoring us with a free Pro account :)
If Facebook or Google were interested in sponsoring us then sure, we could add more locations and do more awesome stuff.
I'm actually not sure where the data for this site is coming from (there's some mention of Pingdom on the about page), but CDN performance can vary significantly based on user location. For that reason the only CDN comparisons I really trust are ones taken from the end-user computers of a representative sample of the viewers you're trying to optimize for.
If this data is indeed from Pingdom monitoring locations, it's particularly bad for estimating performance of a JS library that's mostly loaded by end-users who are not browsing the web from well-connected data centers.
Agreed that the look is nice, and admittedly, in the absence of other data, I would probably trust this data.
Why not let website visitors make your measurements by loading the JavaScript files from the CDNs in the background?
Something like (untested):
var i = 0;
var urls = [
  "http://cdn.jsdelivr.net/jquery/2.0.3/jquery-2.0.3.min.js",
  "http://code.jquery.com/jquery-2.0.3.min.js",
  ...
];

function measureRemaining() {
  if (i >= urls.length) return;
  var url = urls[i++];
  measureLatency(url, function(latency) {
    // TODO: post (url, latency) to back-end
    measureRemaining();
  });
}

function measureLatency(url, responseFn) {
  var script = document.createElement("script");
  // Bust through the browser cache so the request actually hits the network
  script.src = url + "?bust=" + Math.random();
  script.onload = function() {
    var end = new Date().getTime();
    responseFn(end - start);
  };
  // Start the clock just before appending the script triggers the download
  var start = new Date().getTime();
  document.getElementsByTagName("head")[0].appendChild(script);
}

measureRemaining();
In this way, you'll get actual end-user performance from a (hopefully) large number of different network locations. You will probably want to do this in a separate iframe to avoid changing the behaviour of your webpage.
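A rough sketch of that iframe isolation, assuming a dedicated measurement page (the "/cdn-measure.html" path is made up):

// Run the measurement code in a hidden iframe so it can't interfere with the host page.
// "/cdn-measure.html" is a hypothetical page containing the script above.
window.addEventListener("load", function() {
  var frame = document.createElement("iframe");
  frame.style.display = "none";     // invisible to the user
  frame.src = "/cdn-measure.html";
  document.body.appendChild(frame); // appended only after the host page finishes loading
});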
In the context of full web pages, the way I've seen this done is by embedding a JavaScript call that fires after the page has loaded and reports the timings back to your site. I suspect you could do something similar with this resource.
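For the reporting step, something like this (untested; the "/beacon" endpoint and payload format are made up) would fill in the TODO in the snippet above:

// Post a (url, latency) pair back to the site; the "/beacon" endpoint is hypothetical.
function reportMeasurement(url, latency) {
  var xhr = new XMLHttpRequest();
  xhr.open("POST", "/beacon", true);   // async, so it never blocks the page
  xhr.setRequestHeader("Content-Type", "application/json");
  xhr.send(JSON.stringify({ url: url, latency: latency }));
}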
Alternatively you can look at what CDNs are hosting the resources and then just use the existing tools to compare the expected performance of each CDN. Of course this comes with a few caveats:
1/ Performance may differ on a given CDN depending on which "package" the customer has purchased. I know this was the case ~1-2 years ago for Akamai.
2/ A CDN may perform well for small objects but not for larger objects, so make sure you look at representative benchmarks.
3/ JS CDNs that sit behind load-balancing services like Cedexis mean more work, since you have to find all the CDNs being balanced across (jsDelivr uses Cedexis).
Isn't actual real-world usage (market share) more important? If 99% of your users already have the resource cached, no request will be fired whatsoever.
I have no data to support this, but I wonder if that really matters. You have other resources you need to serve from your site no matter what happens. It seems possible that resources from your own site would end up being the bottleneck, so you might as well just serve up your own libraries anyway.
Yes, although this can be more limited in practice: unless you're using the exact same version of a resource as everyone else, you get a cache miss. In the case of something like jQuery, there is such a wide range of in-the-wild versions that most users will have cached several releases other than the one you're using.
That's not the end of the world as long as the CDN's DNS + connection overhead is reliably better than your own, but if you already use a decent CDN that's not a given, and it can prove a de-optimization if the resource in question can be delivered in less time than it takes to perform an otherwise unnecessary extra DNS lookup and server connection. On high-latency networks that's often a net loss, so it's worth monitoring and reviewing closely.
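If you want to see how much of that overhead is DNS and connection setup on your own pages, here's a rough sketch using the Resource Timing API (browser support varies, and cross-origin entries need the CDN to send Timing-Allow-Origin):

// Split out DNS and TCP connect time for each resource on the page.
var entries = window.performance.getEntriesByType("resource");
entries.forEach(function(e) {
  // Cross-origin resources report 0 here unless the server sends Timing-Allow-Origin.
  var dns = e.domainLookupEnd - e.domainLookupStart;
  var connect = e.connectEnd - e.connectStart;
  console.log(e.name, "dns:", dns + "ms", "connect:", connect + "ms");
});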
The multiple version problem is even worse on mobile devices. It's not uncommon to have less than 100MB total dedicated to the browser cache. Visit a few dozen non-mobile friendly websites and your entire cache is blown out.
You bring up a good point. For stuff like this, where the resource is plausibly already cached before the user even visits your site, you may want to look at which version of the library your users are most likely to have preloaded.
You probably want to devise a scoring function based on the likelihood that the resource is already cached and the cost if it is not, and rank your CDN options based on that.
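A rough sketch of what such a scoring function could look like (the numbers and field names are invented for illustration):

// Hypothetical scoring: expected cost = miss probability * miss latency.
// hitProbability and missLatencyMs would come from your own usage and timing data.
function expectedCost(cdn) {
  return (1 - cdn.hitProbability) * cdn.missLatencyMs;
}

var options = [
  { name: "CDN A", hitProbability: 0.30, missLatencyMs: 120 },
  { name: "CDN B", hitProbability: 0.05, missLatencyMs: 60 }
];

options.sort(function(a, b) { return expectedCost(a) - expectedCost(b); });
// options[0] is now the cheapest choice in expectation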
My security-conscious nature has always been distrustful of using a public CDN. It represents a potential security and privacy concern for my clients: injecting malicious JavaScript into a wide pool of sites at once would be an interesting attack vector. There's also an operational concern about whether these systems will keep operating for the long term without service interruptions.
I'm completely skeptical about the benefit of CDNs to users.
I would like to see some hard data about the number of web sites a typical user visits to understand whether this is a meaningful argument. As of now, I lean toward thinking these CDNs are just yet another way to track users.
Anyways, I block them all by default, and my browsing works just fine.
How do you "block" a CDN? Is that a mis-wording? Do you have some means to automatically discover and detect the origin servers and connect directly to them, bypassing the CDN?
And, to answer your question, the benefits are substantial, well-documented, and provable on multiple levels.
First off, a CDN (when working properly) greatly improves the average latency for browsing a site, and in some cases even the bandwidth usage. Additionally, use of a CDN can increase the number of users a site can simultaneously serve. The best CDNs can not only withstand but actively deflect various types of DOS attacks. Some can even serve resources like images and video dynamically optimized for the browsing software or device.
There are many more benefits, and believe it or not, a huge percentage of the Internet's web and media traffic flows through CDN services - bypassing all of them is near-impossible (unless you somehow don't use any of the most popular sites and services).
The main benefit isn't any of this "cached forever" business, although that's great. It's the fact that the download is coming from 1 or 2 hops away instead of from all the way across the country or the world.
Indeed, any CDN could have their servers hacked, and that would be a nightmare. But it seems that most of the biggest ones have had a reasonably (some unbelievably) good track record of preventing and protecting against such problems.
I'm not saying it hasn't happened before, or that it won't happen again, but the CDNs that it happens to and/or don't handle it with the utmost care and expertise do not survive very long.
IIRC, Twitter is also concerned about this, and recently proposed a hash-based validation for externally included resources (though I can't seem to find their proposal right now …).
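I can't speak to that particular proposal, but for illustration, hash-pinning an external script looks roughly like this (it mirrors the "integrity" attribute from the Subresource Integrity work; the hash value below is a placeholder, not a real digest):

// Illustrative only: pin the script to a hash of its expected contents.
var script = document.createElement("script");
script.src = "http://code.jquery.com/jquery-2.0.3.min.js";
script.integrity = "sha256-PLACEHOLDER_BASE64_DIGEST";  // placeholder, not a real hash
script.crossOrigin = "anonymous";  // cross-origin loads need CORS for the hash check
document.head.appendChild(script);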
I'm curious if we ever will see the most popular JS/CSS frameworks/libraries integrated into the browser itself, and a simple attribute in the tag would allow loading the internal version, but still allow failover to the hosted one.
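Browsers don't offer that today; the closest widely used pattern is loading from the CDN with a local fallback, roughly like this (the local path is hypothetical):

// Not the built-in-library idea above, just the common fallback pattern:
// try the hosted copy, and fall back to a local copy if the load fails.
var s = document.createElement("script");
s.src = "http://code.jquery.com/jquery-2.0.3.min.js";
s.onerror = function() {
  var local = document.createElement("script");
  local.src = "/js/jquery-2.0.3.min.js";  // assumed local copy
  document.head.appendChild(local);
};
document.head.appendChild(s);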
Why would this be an administrative nightmare? I can see how deciding which JS/CSS libraries should be included can cause dispute, but apart from that, why not?
I've actually always wondered why the default CSS applied to elements isn't standardised, so that pages without a reset.css or normalize.css stylesheet render differently, or why certain JavaScript methods differ from browser to browser.
But I guess that is a rather different discussion.
I think it's part of the same discussion. It makes total logical sense for default CSS to be standardised. Why isn't it? Because coordination between browser manufacturers is extremely patchy.
So, different browsers would have different libraries included depending on who made them. Possibly different versions too. It would just be very messy, with little reward.
Using the Extensions API I could even stop the DNS check and inject the JavaScript before it happened, which was pretty awesome.
My version simply looks for cdnjs.cloudflare.com links in the source before rendering, but the same approach could be applied less strictly to other assets, e.g. Google's jQuery CDN.
> I'm curious if we ever will see the most popular JS/CSS frameworks/libraries integrated into the browser itself
No - release management would be a nightmare on both sides (“Is feature X worth not using the built-in previous version?” “Ooops, new jQuery point release. Time to ship a Firefox update!”) and it offers no advantages over simply using HTTPS to prevent injection attacks and Cache-Control headers to allow saving a properly versioned URL forever.
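For reference, the Cache-Control part just means serving a versioned URL with a far-future max-age, e.g. (illustrative value, one year):

Cache-Control: public, max-age=31536000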
The problem with many of these publicly available "free" monitoring systems that deal with ping / http response time is that a lot of the time it ends up being a monitoring system for its own network.
You're better off evaluating CDNs using real user/last-mile testing than a few Pingdom servers in data centers. Cedexis has some decent analysis of this type on their website: http://www.cedexis.com/country-reports/