DNSMadeEasy has a Global Traffic Director service ( http://www.dnsmadeeasy.com/services/global-traffic-director/ )
That then sends a request to the closest Linode data center.
The Linode instances run nginx, which passes requests to Varnish, and the Varnish backend is connected via VPN to the main app servers (based in the London datacenter, as the vast majority of my users are in London).
I use Varnish behind nginx to additionally place a fast cache close to the edge to prevent unnecessary traffic over the VPN.
Example: USA to London traffic passes over the VPN running within Linode, and the SSL connection for an East Coast user only goes as far as Newark. If the request was for a static file that some other user had recently requested, the file would come from Varnish and the request would not even leave the Newark data center.
The edge case is that some DNS providers (Google, OpenDNS) already pick what they feel is the closest end point.
I read about that stuff over here a while ago:
And this comment explains it best:
I haven't fully investigated this, and I don't know whether it is affecting some users. But when I implemented my solution I was aware that, for some small subset of users, this might not result in a faster connection than if I'd done nothing at all (the Google resolver they end up on may actually be further from the customer than the local server I run).
I'm just betting that for the vast majority of users this does bring about a noticeable increase in speed.
If you're using a North American Google DNS server, you'll get answers that say NA. If you use the DNS server in Europe, you'll get answers that say EU.
I'm assuming Google doesn't try to sync and cache between 184.108.40.206 instances, and I don't see why they would. That's a lot of work for no benefit.
From what I've seen, end users hit a Google DNS cluster in their approximate geographic area. However, I've definitely seen odd peering where a public DNS node in the EU hits provider anycast nodes in NA.
If a request via 220.127.116.11 surfaced at a DNS server in North America, and DNSMadeEasy (in my example) then answered that request with a "Oh, you must be in North America, well for you the IP address of the web site is"... then you might not have got the answer you expected.
i.e. You might be in Spain, and using OpenDNS (or Google) the DNS query against DNSMadeEasy might surface on the East Coast of the USA, and as such you'd end up at Linode Newark rather than Linode London.
That particular example is pure speculation, but it illustrates the point.
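To make the Spain example concrete, here is a minimal sketch in Python. It is a toy model, not how DNSMadeEasy actually decides: real geo DNS works from IP geolocation data rather than coordinates, and the resolver location below is an assumption picked purely for illustration. The point it shows is that the answer depends on where the resolver sits, not on where the user sits.

```python
import math

# Two of the edge locations mentioned in this thread (approximate lat, lon).
EDGES = {
    "Linode London": (51.5, -0.1),
    "Linode Newark": (40.7, -74.2),
}

def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

def closest_edge(location):
    """Pick the edge nearest to whatever location the DNS provider can see."""
    return min(EDGES, key=lambda name: distance_km(EDGES[name], location))

client_in_spain = (40.4, -3.7)     # the user, roughly Madrid
resolver_us_east = (39.0, -77.5)   # hypothetical place where the query surfaces

print(closest_edge(client_in_spain))   # Linode London (what the user wants)
print(closest_edge(resolver_us_east))  # Linode Newark (what the user gets)
```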
As I said, I believe this happens rarely enough that it's only a slight edge case and, on the whole, isn't worth troubling about. But it is an edge case I'm aware of.
And if someone reads this thread and thinks, "Hey, this distributing SSL stuff is a great idea.", then as always, caveat emptor and check whether any potential issues that might arise are an issue for you and your application.
In my instance (forums with current discussions) most static file requests are for image attachments in the very latest discussions, the hot topics. So Varnish fits this scenario really well. I didn't need a long-term storage of images in the CDN, I just needed to store the most recently requested items in the CDN.
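The "keep only the most recently requested items" behaviour can be sketched as a small least-recently-used cache. This is a toy model in Python, not Varnish's implementation or configuration (Varnish also applies TTLs, among other things); the class name, capacity and file paths are all made up for illustration.

```python
from collections import OrderedDict

class EdgeCache:
    """Toy edge cache: keeps the most recently requested items, evicts the rest."""

    def __init__(self, capacity, fetch_from_origin):
        self.capacity = capacity
        self.fetch_from_origin = fetch_from_origin  # stands in for the trip over the VPN
        self.items = OrderedDict()
        self.origin_fetches = 0

    def get(self, path):
        if path in self.items:
            # Cache hit: mark as most recently used, no origin traffic.
            self.items.move_to_end(path)
            return self.items[path]
        # Cache miss: go back to the origin and remember the result.
        self.origin_fetches += 1
        body = self.fetch_from_origin(path)
        self.items[path] = body
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used item
        return body

cache = EdgeCache(capacity=2, fetch_from_origin=lambda path: f"bytes of {path}")
for path in ["/hot1.jpg", "/hot1.jpg", "/hot2.jpg", "/hot1.jpg", "/old.jpg", "/hot1.jpg"]:
    cache.get(path)

print(cache.origin_fetches)  # 3 origin fetches for 6 requests
```

The hot image stays cached the whole time, while the one that stopped being requested gets pushed out, which is all a forum with a handful of hot topics really needs.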
Linodes are cheap, I was already using them in a distributed fashion to reduce SSL roundtrips, and introducing Varnish was a small configuration change.
I have tried a few other providers (most recently CloudFlare). But I was generally not happy with them, usually due to a lack of visibility.
I proxy http:// images within the user generated content over https:// when the sites are accessed over https:// . And occasionally I found that images would not load when I used a CDN provider for that. But I never had enough data and transparency from the CDN to know why. Users notice this stuff though, so I'd have isolated users complaining of images not loading and no way to debug or reproduce it.
So I found that as my scenario made Varnish a good fit, and the bandwidth was within my allowance, and it was easy to do... well, I just did it.
I still experiment with CDNs every now and then, but largely I get more reliability and transparency from my own solution. I've also found this to be cost effective, though I would be OK with paying a premium if I found the reliability and transparency rivalled my home-rolled solution.
In the time it takes the user to pick their file(s) to upload, the initial SSL negotiation will most likely have finished. And if you upload multiple files serially, the browser should even reuse the current SSL context, so it wouldn't be ~300ms per file.
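A rough back-of-the-envelope version of that argument, in Python. The 300ms figure is the one quoted above and is taken as given; the pick time and the idea that the connection is opened as soon as the file picker appears are assumptions for the sake of the sketch.

```python
HANDSHAKE_MS = 300  # the "~300ms" figure quoted above, taken as given

def handshake_overhead_ms(files, reuse_connection):
    """Total time spent on SSL handshakes for a serial upload of `files` files."""
    if files <= 0:
        return 0
    handshakes = 1 if reuse_connection else files
    return handshakes * HANDSHAKE_MS

def visible_delay_ms(pick_time_ms):
    """Handshake time the user still waits for, assuming the connection was
    opened when the file picker appeared and they took `pick_time_ms` to choose."""
    return max(0, HANDSHAKE_MS - pick_time_ms)

print(handshake_overhead_ms(5, reuse_connection=True))   # 300
print(handshake_overhead_ms(5, reuse_connection=False))  # 1500
print(visible_delay_ms(2000))                            # 0
```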
We ran a test of this approach using a similar stack in 2010. We had Ireland, Singapore, Sydney backhauling to Dallas, TX for a reasonably large population of users. Managing the backend pool was a bit of a challenge without custom code. nginx didn't yet support HTTP 1.1 backend connections. The two best options I could find at that time were Apache Traffic Server and perlbal. perlbal won and was pretty easy to set up with a stable warm connection pool.
Despite good performance gains we didn't put the system into production. The monitoring and maintenance burden was high and we lacked at that time a homogeneous network -- I tested Singapore and Australia using VPS providers as Amazon and SoftLayer (our vendors of choice) weren't there yet.
As a side-effect of using the VPS vendors we did and trying to keep costs in control, we had to ratchet the TTL for this service down uncomfortably low to allow for cross-region failover. In Australia the additional DNS hit nearly wiped out the gains in SSL negotiation.
With today's increased geographical coverage and rich set of services from Amazon, this is a much less daunting project if you can stomach the operational overhead.
Note that the lack of sanely-priced bandwidth and hosting providers in Australia is a huge problem. When Amazon lands EC2 there, it's going to really shake up that market.
It seems like in the diagram, the West Coast Client, rather than making a direct connection to the App servers on the right, is making a connection to the ELB on the left, which then forwards the traffic to the nginx server, which forwards it to another ELB, which forwards it to the App servers.
If the client connected directly to the ELB in front of the App Servers, they would incur the SSL handshake latency, but would avoid the four extra hops (two per send and two per receive) on the ELB and nginx.
Over the lifetime of the connection, is it possible that this latency could be longer than 200 ms?
For the total latency to be longer than 200ms, about 20 requests would need to be made on the same connection, which will not happen given the number of requests we do at a time.
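Spelling out the arithmetic behind that answer, as a quick Python sketch: the ~10ms of extra latency per request is not stated anywhere above, it is just what 200ms over about 20 requests implies, so treat it as an inferred number.

```python
import math

HANDSHAKE_SAVING_MS = 200  # saving from terminating SSL close to the client
EXTRA_PER_REQUEST_MS = 10  # inferred: 200ms spread over ~20 requests

def break_even_requests(saving_ms, extra_per_request_ms):
    """Requests on one connection before the extra hops eat the whole saving."""
    return math.ceil(saving_ms / extra_per_request_ms)

def net_saving_ms(requests):
    """Positive means the early SSL termination still comes out ahead."""
    return HANDSHAKE_SAVING_MS - requests * EXTRA_PER_REQUEST_MS

print(break_even_requests(HANDSHAKE_SAVING_MS, EXTRA_PER_REQUEST_MS))  # 20
print(net_saving_ms(5))                                                # 150
```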
They even have an optimized version called railgun (https://www.cloudflare.com/railgun) that only ships the diff across country.
Edit: I'm clear that latency is reduced and how that's accomplished. I just wanted to get clarification that the connections between the early SSL termination and the web servers were also encrypted.
The trick here is to cut down on the latency of establishing the session.
One issue with CloudFront is that the POST, PUT and DELETE verbs aren't currently supported, which is a kink for modifying data. You could use Route 53's LBR (latency-based routing) feature to route requests to nearby EC2 instances, then proxy back to your origin.
Sounds cool, but this would only work on Amazon or datacenters w/ cross-data center private networks (SoftLayer has this, for example).
Backlog. That increases the latency until a new connection can be accepted. However, the number of pooled connections can be increased to a fairly large number at the expense of more memory consumption. This isn't an issue when nginx is used as an HTTPS proxy.
Amazon's ELB (the EC2 load balancer) used to send HTTPS traffic to your back-end unencrypted, but I believe they have since fixed this.
That's what I mean. In that mode it's sending traffic that should be HTTPS over HTTP.
For example, the discussion of nginx could be abstracted into a discussion of graph theory, where a handshake has to occur with a secure cluster of nodes.
This is all just IMHO. Great post though!