The main question for a site like this is: how are you collecting your data? CDNs have dozens of nodes and the performance a user sees will depend on what the network looks like between them and the nearest one. For example, if all your measurements were from an EC2 instance in us-east you'd only be measuring a tiny and easily-gamed slice of CDN performance.
On their about page they write that they're "grateful to Pingdom for providing access to their data" but I can't find anything more about which and how many places they're measuring from or how they combine measurements from different places.
Disclaimer: I work for Google on mod_pagespeed and ngx_pagespeed, and I sit near the people who handle developers.google.com/speed/libraries/ But I don't know anything about how they serve the hosted library traffic.
I'm an engineer on Facebook's CDN and Edge network, and, in my experience, the hardest places to serve people quickly are South America, Africa, and Asia. You should try to get some timing data for those places.
Pingdom does not offer servers in these locations.
Other services do, but we can't afford them. Pingdom is actually sponsoring us with a free Pro account :)
If Facebook or Google were interested in sponsoring us then sure, we could add more locations and do more awesome stuff.
I'm actually not sure where the data for this site is coming from (there's some mention of Pingdom on the about page), but CDN performance can vary significantly based on user location. For that reason the only CDN comparisons I really trust are ones taken from the end-user computers of a representative sample of the viewers you're trying to optimize for.
If this data is indeed from Pingdom monitoring locations, it's particularly bad for estimating performance of a JS library that's mostly loaded by end-users who are not browsing the web from well-connected data centers.
Agreed that the look is nice, and admittedly, in the absence of other data, I would probably trust this data.
Why not let website visitors make your measurements by loading the JavaScript files from the CDNs in the background?
Something like (untested):
var i = 0;
var urls = [
  "http://cdn.jsdelivr.net/jquery/2.0.3/jquery-2.0.3.min.js",
  "http://code.jquery.com/jquery-2.0.3.min.js",
  ...
];

function measureRemaining() {
  if (i >= urls.length) return;
  var url = urls[i++];
  measureLatency(url, function(latency) {
    // TODO: post (url, latency) to back-end
    measureRemaining();
  });
}

function measureLatency(url, responseFn) {
  var script = document.createElement("script");
  // Bust through the browser cache so the request actually hits the network
  script.src = url + "?bust=" + Math.random();
  script.onload = function() {
    var end = new Date().getTime();
    responseFn(end - start);
  };
  // Start the clock just before appending the script triggers the download
  var start = new Date().getTime();
  document.getElementsByTagName("head")[0].appendChild(script);
}

measureRemaining();
In this way, you'll get actual end-user performance from a (hopefully) large number of different network locations. You will probably want to do this in a separate iframe to avoid changing the behaviour of your webpage.
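A rough sketch of that iframe isolation, assuming a dedicated measurement page (the "/cdn-measure.html" path is made up):

// Run the measurement code in a hidden iframe so it can't interfere with the host page.
// "/cdn-measure.html" is a hypothetical page containing the script above.
window.addEventListener("load", function() {
  var frame = document.createElement("iframe");
  frame.style.display = "none";     // invisible to the user
  frame.src = "/cdn-measure.html";
  document.body.appendChild(frame); // appended only after the host page finishes loading
});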
In the context of full web pages, the way I've seen this done is by embedding a JavaScript call that fires after the page has loaded and reports the timings back to your site. I suspect you could do something similar with this resource.
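For the reporting step, something like this (untested; the "/beacon" endpoint and payload format are made up) would fill in the TODO in the snippet above:

// Post a (url, latency) pair back to the site; the "/beacon" endpoint is hypothetical.
function reportMeasurement(url, latency) {
  var xhr = new XMLHttpRequest();
  xhr.open("POST", "/beacon", true);   // async, so it never blocks the page
  xhr.setRequestHeader("Content-Type", "application/json");
  xhr.send(JSON.stringify({ url: url, latency: latency }));
}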
Alternatively you can look at what CDNs are hosting the resources and then just use the existing tools to compare the expected performance of each CDN. Of course this comes with a few caveats:
1/ Performance may differ on a given CDN depending on which "package" the customer has purchased. I know this was the case ~1-2 years ago for Akamai.
2/ A CDN may perform well for small objects but not for larger objects, so make sure you look at representative benchmarks.
3/ JS CDNs that sit behind load-balancing services like Cedexis mean more work, since you have to find all the CDNs being balanced across (jsDelivr uses Cedexis).
Isn't actual real-world usage (market share) more important? If 99% of your users already have the resource cached, no request will be fired whatsoever.
I have no data to support this, but I wonder if that really matters. You have other resources you need to serve from your site no matter what happens. It seems possible that resources from your own site would end up being the bottleneck, so you might as well just serve up your own libraries anyway.
Yes, although this can be more limited in practice: unless you're using the exact same version of a resource as everyone else, you get a cache miss. In the case of something like jQuery, there is such a wide range of in-the-wild versions that most users will have cached several releases other than the one you're using.
That's not the end of the world as long as the CDN's DNS + connection overhead is reliably better than your own, but if you already use a decent CDN that's not a given, and it can prove a de-optimization if the resource in question can be delivered in less time than it takes to perform an otherwise unnecessary extra DNS lookup and server connection. On high-latency networks that's often a net loss, so it's worth monitoring and reviewing closely.
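If you want to see how much of that overhead is DNS and connection setup on your own pages, here's a rough sketch using the Resource Timing API (browser support varies, and cross-origin entries need the CDN to send Timing-Allow-Origin):

// Split out DNS and TCP connect time for each resource on the page.
var entries = window.performance.getEntriesByType("resource");
entries.forEach(function(e) {
  // Cross-origin resources report 0 here unless the server sends Timing-Allow-Origin.
  var dns = e.domainLookupEnd - e.domainLookupStart;
  var connect = e.connectEnd - e.connectStart;
  console.log(e.name, "dns:", dns + "ms", "connect:", connect + "ms");
});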
The multiple version problem is even worse on mobile devices. It's not uncommon to have less than 100MB total dedicated to the browser cache. Visit a few dozen non-mobile friendly websites and your entire cache is blown out.
You bring up a good point. For stuff like this, where the resource is plausibly already cached before the user even visits your site, you may want to look at which version of the library your users are most likely to have preloaded.
You probably want to devise a scoring function based on the likelihood that the resource is already cached and the cost if it is not, and rank your CDN options based on that.
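A rough sketch of what such a scoring function could look like (the numbers and field names are invented for illustration):

// Hypothetical scoring: expected cost = miss probability * miss latency.
// hitProbability and missLatencyMs would come from your own usage and timing data.
function expectedCost(cdn) {
  return (1 - cdn.hitProbability) * cdn.missLatencyMs;
}

var options = [
  { name: "CDN A", hitProbability: 0.30, missLatencyMs: 120 },
  { name: "CDN B", hitProbability: 0.05, missLatencyMs: 60 }
];

options.sort(function(a, b) { return expectedCost(a) - expectedCost(b); });
// options[0] is now the cheapest choice in expectation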
My security-conscious nature has always been distrustful of using a public CDN. It represents a potential security and privacy concern for my clients: injecting malicious JavaScript into a wide pool of sites at once would be an interesting attack vector. There's also an operational concern about whether these systems will keep operating for the long term without service interruptions.
I'm completely skeptical about the benefit of CDNs to users.
I would like to see some hard data about the number of web sites a typical user visits to understand whether this is a meaningful argument. As of now, I lean toward thinking these CDNs are just yet another way to track users.
Anyways, I block them all by default, and my browsing works just fine.
How do you "block" a CDN? Is that a mis-wording? Do you have some means to automatically discover and detect the origin servers and connect directly to them, bypassing the CDN?
And, to answer your question, the benefits are substantial, well-documented, and provable on multiple levels.
First off, a CDN (when working properly) greatly improves the average latency for browsing a site, and in some cases even the bandwidth usage. Additionally, use of a CDN can increase the number of users a site can simultaneously serve. The best CDNs can not only withstand but actively deflect various types of DOS attacks. Some can even serve resources like images and video dynamically optimized for the browsing software or device.
There are many more benefits, and believe it or not, a huge percentage of the Internet's web and media traffic flows through CDN services - bypassing all of them is near-impossible (unless you somehow don't use any of the most popular sites and services).
The main benefit isn't any of this "cached forever" business, although that's great. It's the fact that the download is coming from 1 or 2 hops away instead of from all the way across the country or the world.
Indeed, any CDN could have their servers hacked, and that would be a nightmare. But it seems that most of the biggest ones have had a reasonably (some unbelievably) good track record of preventing and protecting against such problems.
I'm not saying it hasn't happened before, or that it won't happen again, but the CDNs that it happens to and/or don't handle it with the utmost care and expertise do not survive very long.
IIRC, Twitter is also concerned about this, and recently proposed a hash-based validation for externally included resources (though I can't seem to find their proposal right now …).
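I can't speak to that particular proposal, but for illustration, hash-pinning an external script looks roughly like this (it mirrors the "integrity" attribute from the Subresource Integrity work; the hash value below is a placeholder, not a real digest):

// Illustrative only: pin the script to a hash of its expected contents.
var script = document.createElement("script");
script.src = "http://code.jquery.com/jquery-2.0.3.min.js";
script.integrity = "sha256-PLACEHOLDER_BASE64_DIGEST";  // placeholder, not a real hash
script.crossOrigin = "anonymous";  // cross-origin loads need CORS for the hash check
document.head.appendChild(script);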
I'm curious if we ever will see the most popular JS/CSS frameworks/libraries integrated into the browser itself, and a simple attribute in the tag would allow loading the internal version, but still allow failover to the hosted one.
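Browsers don't offer that today; the closest widely used pattern is loading from the CDN with a local fallback, roughly like this (the local path is hypothetical):

// Not the built-in-library idea above, just the common fallback pattern:
// try the hosted copy, and fall back to a local copy if the load fails.
var s = document.createElement("script");
s.src = "http://code.jquery.com/jquery-2.0.3.min.js";
s.onerror = function() {
  var local = document.createElement("script");
  local.src = "/js/jquery-2.0.3.min.js";  // assumed local copy
  document.head.appendChild(local);
};
document.head.appendChild(s);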
Why would this be an administrative nightmare? I can see how deciding which JS/CSS libraries should be included can cause dispute, but apart from that, why not?
I've actually always wondered why the default CSS applied to elements isn't standardised, so that pages without a reset.css or normalize.css stylesheet render differently, or why certain JavaScript methods differ from browser to browser.
But I guess that is a rather different discussion.
I think it's part of the same discussion. It makes total logical sense for default CSS to be standardised. Why isn't it? Because coordination between browser manufacturers is extremely patchy.
So, different browsers would have different libraries included depending on who made them. Possibly different versions too. It would just be very messy, with little reward.
Using the Extensions API I could even stop the DNS check and inject the JavaScript before it happened, which was pretty awesome.
My version simply looks for cdnjs.cloudflare.com links in the source before rendering, but the same approach could be applied less strictly to other assets, e.g. Google's jQuery CDN.
> I'm curious if we ever will see the most popular JS/CSS frameworks/libraries integrated into the browser itself
No - release management would be a nightmare on both sides (“Is feature X worth not using the built-in previous version?” “Ooops, new jQuery point release. Time to ship a Firefox update!”) and it offers no advantages over simply using HTTPS to prevent injection attacks and Cache-Control headers to allow saving a properly versioned URL forever.
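For reference, the Cache-Control part just means serving a versioned URL with a far-future max-age, e.g. (illustrative value, one year):

Cache-Control: public, max-age=31536000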
The problem with many of these publicly available "free" monitoring systems that deal with ping / http response time is that a lot of the time it ends up being a monitoring system for its own network.
You're better off evaluating CDNs using real user/last-mile testing than a few Pingdom servers in data centers. Cedexis has some decent analysis of this type on their website: http://www.cedexis.com/country-reports/