In simplified terms, the server usually stores a public and private key, and sends the public key to the client. The client generates a random password, encrypts it with the server's public key, and sends it to the server. Only someone with the private key can decrypt the message, and that should only be the server.
Now you don't want to hand over this private key to Cloudflare if you don't need to, because then they can read all traffic. Up until now, you needed to.
What they did was take the private key and move it to a keyserver, owned by your bank or whomever. Every time the Cloudflare server receives a random password (which is encrypted with the public key) it just asks the keyserver "what does this encrypted message say?" After that it has the password to the connection and can read what the client (the browser) is sending, and write data back over the same encrypted connection. Without ever knowing what the private key was.
The connection from Cloudflare to your bank's webserver and keyserver can be encrypted in whatever way. It could be a fixed key for AES, it could be another long-lasting TLS connection (the overhead is mostly in the connection setup)... this isn't the interesting part and can be solved in a hundred fine ways.
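To make the flow above concrete, here is a minimal sketch. It assumes textbook RSA with tiny primes (not secure, and not what TLS actually uses), and the `KeyServer`, `EdgeServer` and `client_hello` names are made up for illustration. The only point is to show which party holds which key.

```python
import secrets

# Toy textbook RSA with tiny primes: illustration only, NOT secure.
P, Q, E = 61, 53, 17
N = P * Q                              # public modulus
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent, stays on the keyserver


class KeyServer:
    """Runs at the bank; the only place the private exponent lives."""

    def __init__(self, private_exponent: int, modulus: int) -> None:
        self._d = private_exponent
        self._n = modulus

    def decrypt(self, ciphertext: int) -> int:
        return pow(ciphertext, self._d, self._n)


class EdgeServer:
    """Runs at the CDN; holds only the public key and a keyserver handle."""

    def __init__(self, keyserver: KeyServer) -> None:
        self.public_key = (E, N)
        self._keyserver = keyserver

    def handshake(self, encrypted_secret: int) -> int:
        # Ask the keyserver "what does this encrypted message say?"
        return self._keyserver.decrypt(encrypted_secret)


def client_hello(public_key: tuple[int, int]) -> tuple[int, int]:
    """Client picks a random secret and encrypts it with the public key."""
    e, n = public_key
    secret = secrets.randbelow(n - 2) + 2
    return secret, pow(secret, e, n)


edge = EdgeServer(KeyServer(D, N))
client_secret, wire_message = client_hello(edge.public_key)
edge_secret = edge.handshake(wire_message)
print(edge_secret == client_secret)
```

After the handshake the edge knows the session secret, so it can read and write traffic, but it never stores the private exponent itself.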
Edit: Removed my opinion from this post. Any downvotes for my opinion would also push the explanation down (which I hope is useful to some). I mostly agree with the other comments anyway.
(And hopefully it doesn't incur any downvotes and readers understand it is just an opinion.)
Let's first see what the problem is with the old method, storing the private key on the Cloudflare server:
1. Cloudflare can read all traffic.
2. Cloudflare can read traffic they can intercept, even if it's not directed towards their server. Think public WiFi access points.
3. Anyone able to break into Cloudflare, physically or technically (or legally, hello NSA!), has the same capabilities.
Number two is a bit far-fetched without taking number 3 into account, except that there might be national security letters to obtain the private key, after which government agencies can use their own intercepts. Still slightly far-fetched, but a plausible concern if you're the bank that the next Edward Snowden just happened to use (Lavabit's private key needed to be handed over, disregarding all other customers' privacy).
Now we implement this new system, where you can only obtain the decryption for the encrypted session key (as sent by the client, e.g. a browser).
1. Cloudflare can still read all traffic.
2. They can no longer passively read any traffic, or hand over private keys to law enforcement. However, the encrypted session keys could still be queried at the keyserver. The keyserver would need to actively maintain a blacklist of session keys that are not in use at Cloudflare, e.g. by having webservers send them as they were used, and raise alarm flags if a false one comes up. But I'd need to work out this attack scenario further to be sure there is no loophole that could make it look like Cloudflare traffic even if it's direct (but intercepted by whomever) traffic. I think something like that might be possible, but I'd need to work it out to be sure.
3. Anyone hacking Cloudflare is now limited in the same way Cloudflare is (so the same as point number two).
So far the security concerns. Now as for speed and DDoS mitigation...
- Cloudflare now needs to query an external server (if it's in their datacenter, it can be pretty much considered under their control) for every https connection. Assuming there is a direct peering connection to a datacenter next-door, this would only be a couple of milliseconds in network latency, but so is DNS and building a TCP connection and fetching a page... it all adds up and in the end you notice that pages need to load, even if only for a sec.
- The SSL termination point is now nearer to the client, as Cloudflare has datacenters all around the world. You could in theory build the keyservers next to many of them, but then you're rolling a CDN for a CDN (a delivery network of content (keys) for a content delivery network (cloudflare)). Perhaps not a bad idea, but it just sounds a bit silly. All things considered, at best the connection gets only a tiny bit slower.
- Cloudflare is still useful for mitigating attacks, just like with normal http traffic. I mean, they can read the data, so it's more or less the same as with unencrypted http. The difference is the keyserver: they might run over capacity. Luckily you can now solve that problem by throwing money at it (you can deploy more keyservers, Cloudflare already has enough front-end servers to handle the requests). So it's now a money issue, not a technical one, which is how you want things to be.
- Session reuse makes this somewhat less of a pain, as the keyserver need not be queried for repeated https connections. We'd need to see numbers to know how much of a win this is, though. In any case, it's never faster than not having the keyserver.
So in this area it's still a win, but so was handing over your keys. Having keyservers makes it slightly less of a win.
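A rough sketch of the session reuse point, assuming a hypothetical edge that caches session secrets by session id. The `CountingKeyServer` stub and its byte-reversing "decryption" are placeholders, and real TLS resumption does not resend the encrypted secret; this only counts keyserver queries.

```python
class CountingKeyServer:
    """Hypothetical keyserver stub that counts how often it is asked."""

    def __init__(self) -> None:
        self.queries = 0

    def decrypt(self, encrypted_secret: bytes) -> bytes:
        self.queries += 1
        # Stand-in for the real private-key operation.
        return encrypted_secret[::-1]


class Edge:
    """CDN front end that remembers session secrets it has already resolved."""

    def __init__(self, keyserver: CountingKeyServer) -> None:
        self._keyserver = keyserver
        self._sessions: dict[str, bytes] = {}

    def connect(self, session_id: str, encrypted_secret: bytes) -> bytes:
        cached = self._sessions.get(session_id)
        if cached is not None:
            return cached                      # resumed: no keyserver round trip
        secret = self._keyserver.decrypt(encrypted_secret)
        self._sessions[session_id] = secret
        return secret


keyserver = CountingKeyServer()
edge = Edge(keyserver)

# Three returning visitors, four connections each.
for visit in range(4):
    for visitor in ("alice", "bob", "carol"):
        edge.connect(visitor, visitor.encode())
returning_queries = keyserver.queries

# Twelve connections that never reuse a session id.
for n in range(12):
    edge.connect(f"fresh-{n}", b"x")
fresh_queries = keyserver.queries - returning_queries

print(returning_queries, fresh_queries)
```

In the first loop the keyserver is queried once per visitor; in the second it is queried once per connection, which is the case where session reuse buys nothing.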
In conclusion, you have three options:
- Don't use Cloudflare. Fastest website, but risk of DDoS.
- Old-style Cloudflare https setup: handing over your keys. DDoS risk mitigated.
- New technique: providing keyservers. Slightly slower than the old style, but with a tiny security advantage that might help with either management or rare kinds of high-profile attacks.
Disclaimer: I'm not a Cloudflare tech, I'm a software engineering and information security student (and a major nerd in general, I might add). I might be missing something, but many other comments seem to be saying similar things.
- If you discontinue using CloudFlare, you don't need to generate a new key pair or deal with all of the issues associated with the infrastructure for revoking the current one.
- There is also this: https://news.ycombinator.com/item?id=8336041 , which mentions protection against key-recovery due to Heartbleed-type bugs, which is true. If the SSL-stack is attacked, the attacker only has the same access that CloudFlare does. They can't recover the keys from the key server.
This is a little bit like saying "From an engineering perspective, gold is a good conductor but otherwise I don't see the value."
Whether the desire to avoid sharing SSL keys with 3rd parties stems from management, regulatory or contractual requirements is irrelevant. The bottom line, from CloudFlare's perspective, is that they have been unable to address a significant and potentially highly-profitable market.
Keyless SSL might not be groundbreaking from a technical/engineering perspective but, as a service offering, it could unlock hundreds of millions in revenue for CloudFlare.
If a DOS is actively attacking the bandwidth of the keyserver, none of the attacking connections will reuse old sessions, though.
Generally the key you would give them is for, and limited to, the resources that they cache/reverse proxy, so the same "read all traffic" concern exists.
What Cloudflare did is essentially, as others have mentioned, PKCS11 over the internet. PKCS11 is an existing, very well proven technique of sequestering the key in a hardware device, that hardware device doing what Cloudflare is moving to the client location here. It's neat enough, but they seem to be kind of exaggerating the innovation a bit.
And while I think banks and others will find this useful, it's even more useful if I can do it from a far away place. When you create an SSL connection in Europe to a west coast server there are a number of round trips. If those can be avoided you can cut the latency of the transaction. That is a good thing too.
Note that this will at most cut the connection setup time by half, as your premaster secret (RSA) or handshake data (ephemeral DH) still has to cross the Atlantic and get back to you (and the TLS handshake has 2 roundtrips).
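As a back-of-the-envelope check of the "at most half" figure, here is a small calculation. The round-trip times are assumed numbers, TCP setup is ignored, and the edge is assumed to already hold an open connection to the keyserver.

```python
TLS_HANDSHAKE_ROUND_TRIPS = 2  # figure used in the comment above


def direct_setup_ms(rtt_to_origin_ms: float) -> float:
    """Client handshakes with the faraway origin directly."""
    return TLS_HANDSHAKE_ROUND_TRIPS * rtt_to_origin_ms


def keyless_setup_ms(rtt_to_edge_ms: float, rtt_edge_to_keyserver_ms: float) -> float:
    """Client handshakes with a nearby edge, which makes one keyserver query."""
    return TLS_HANDSHAKE_ROUND_TRIPS * rtt_to_edge_ms + rtt_edge_to_keyserver_ms


RTT_TRANSATLANTIC_MS = 150.0  # assumed
RTT_NEARBY_EDGE_MS = 10.0     # assumed

direct = direct_setup_ms(RTT_TRANSATLANTIC_MS)
keyless = keyless_setup_ms(RTT_NEARBY_EDGE_MS, RTT_TRANSATLANTIC_MS)
best_case = keyless_setup_ms(0.0, RTT_TRANSATLANTIC_MS)

print(f"direct: {direct:.0f} ms, keyless: {keyless:.0f} ms, floor: {best_case:.0f} ms")
```

Even with a zero-latency hop to the edge, one full transatlantic round trip remains, so the floor is half of the direct setup time.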
Furthermore, because of
> Generally the key you would give them is for, and limited to, the resources that they cache/reverse proxy,
all dynamic data still is supposed to come from your server. So the browser will have to wait for that, while everything from the European mirror is cached anyway.
You'd think if this was a known technique, the mentioned banks would already have been asking for it, implementing it, or doing it.
Personally, I think CloudFlare is one of the few companies on here doing innovative stuff, and solving real issues.
And if not - if they've pulled the wool over my eyes - then at least I can respect their marketing.
Isolating the key for security/control purposes is a well known technique. It's what ssh-agent and PuTTY's Pageant do, where you can even forward key usage over the network to different machines. It's what PKCS11 does to SSL/TLS, where generally the key is in a hardware security module (CAs and banks don't have keys sitting on their public facing machines, but instead generally have dedicated hardware for it -- they are generally doing, at a different scale, what is described), though there have been various implementations and proposals over the years to do it over the network (e.g. https://svn.opendnssec.org/docs/p11proxy.png and pkcs11-proxy).
Why isn't it commonly done as Rackspace proposes? Well it's kind of an odd need for limited gain, but is effectively PKCS11 used in a different context. If they are solving a specific need for some of their clients then fantastic, but it's the narrative of the innovation that makes it a bit weird, and yielded a lot of the purportedly negative commentary.
It's worth remembering, no-one else was offering this or doing this... no-one else had phrased the problem such that the answer (even though it was known) became apparent.
That got me wondering: Why don't attackers then only DOS the dynamic pages? Those that can't be migrated easily, like the login?
(We've been working on a lot of great updates to the WAF for just this kind of thing)
Generally with CloudFlare people put as much of their site behind the CloudFlare proxy as possible, for origin-hiding and other reasons, even if we can't cache it (yet).
Here is an example of how it could be used (in my TLS terminator):
Basically, if you have ever used async SSL API, you should be
aware of things like:
// Returns EVP_PKEY_RSA, EVP_PKEY_ECC
After performing sign/decrypt (which could happen in other
thread, or on a different server) you should call:
'His' apparently. I didn't know either, but apparently he was the first to crack CloudFlare's Heartbleed challenge:
His LinkedIn / twitter for reference:
I'm not at all suggesting that CF hasn't thought of this; rather I want to see their mitigation of the risk.
That said, silliness can lead to a recursive/"turtles all the way down" kind of issue.
It's _far_ easier to handle revocation on a couple trusted endpoints (especially when you own the CA), so 'normal' TLS does the job here just fine. No need for more turtles.
Currently, if someone compromises the Cloudflare servers, they gain the bank's private key and can impersonate the bank until the bank revokes their keys.
With this solution, if someone compromises the Cloudflare servers, they can impersonate the bank by relaying the decryption of the premaster secret through Cloudflare's compromised servers back to the bank. They can do this until Cloudflare notices and closes the security hole.
It's not clear that the difference is all that great in reality, as most of the damage will be done in the first 24 hours of either compromise.
Do you mean SSL key revocation is broken in general, or in this proposed solution by Cloudflare? If it's the former, would you care to elaborate on how it's broken?
This is not strictly true, is it? My interpretation is that at best they get temporary access to a server that will sign for them using the key, but the bank can terminate their signing servers at any time and then safely resume using their key without having to revoke it, since it never left their server.
This does somewhat increase the attack surface, but it lets the bank keep control over their keys and is better than having their keys get compromised and thus having to revoke them.
A compromise of an SSL terminator, or the ability to pretend to be one, gives the same capabilities as a compromise of the oracle itself. The advantage you get is that only the last requires you to revoke and reissue the key when you figure it out (in the other cases, you simply unplug the oracle). That's relatively minor most of the time: an open decryption oracle is already hair-on-fire levels of bad.
I can see why some organizations (for whom this process is unusually expensive) might be interested, but it probably isn't beneficial most of the time.
However the reality on the ground is that certificate revocation is somewhere between unreliable and downright broken. Right now the primary way certificates are revoked is via operating system updates or via browser updates (e.g. Chrome has a revocation list they update semi-regularly).
Covered somewhat here:
So while adding an oracle does definitely increase your attack surface, at least you are never dependent on certificate revocation for your ultimate security.
I guess it really boils down to this: What do you trust more, certificate revocation or your own ability to secure the oracle? For me the answer is a no brainer, simply because I distrust certificate revocation so much.
To me, the biggest question is whether they can keep the oracle highly available even in the face of a DDoS. It still seems like the weakest link, just slightly more difficult for the attacker to target since they don't have a publicly advertised direct connection to it.
It may be pretty cynical to call that situation a "feature". But I'm sure it came up. And I'm sure they noticed.
It certainly wouldn't be the first time "trivial functional difference" translated to "massive legal difference."
"might not, under the current rules, which are always subject to change"
Aren't you now still depending on certificate revocation, having just shifted the problem downstream (it is now the bank's job to revoke you, rather than the user's browser's job)?
Or do you yourselves use "Keyless" technology, so that CloudFlare servers contact some CloudFlare oracle so that they can communicate with the bank oracle?
>but have just shifted the problem downstream (it is now the bank's job to revoke you, rather than the user's browser's job).
You make it sound like that's not a _huge_ win... Which would you rather do, ensure revocation on millions of end user systems, or a handful of systems that are controlled by a party you have a close partnership with?
Do you require the customer to size their key services to some theoretical peak, or can you offload so much of the process that the customer just needs to make a one time investment?
(Full disclosure. I work for a competitor, but not on problems like this).
The keyserver is stateless, so you can have a bunch of keyservers (which don't need to be particularly near your own infrastructure) to deal with this, though.
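Because no per-connection state lives on the keyserver, any replica can answer any request. A minimal sketch of that, assuming a hypothetical `KeyServerPool` that round-robins over replicas and skips ones that are down (toy RSA numbers again, not secure):

```python
import itertools

# Toy textbook RSA key shared by every replica: illustration only, NOT secure.
P, Q, E = 61, 53, 17
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))


class KeyServerReplica:
    """Stateless: answering a request needs nothing but the private key."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy

    def decrypt(self, ciphertext: int) -> int:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unreachable")
        return pow(ciphertext, D, N)


class KeyServerPool:
    """Round-robin over replicas, moving on when one does not answer."""

    def __init__(self, replicas: list[KeyServerReplica]) -> None:
        self._replicas = list(replicas)
        self._order = itertools.cycle(range(len(self._replicas)))

    def decrypt(self, ciphertext: int) -> int:
        for _ in range(len(self._replicas)):
            replica = self._replicas[next(self._order)]
            try:
                return replica.decrypt(ciphertext)
            except ConnectionError:
                continue
        raise ConnectionError("no keyserver reachable")


pool = KeyServerPool([
    KeyServerReplica("ks-a", healthy=False),   # e.g. knocked over by an attack
    KeyServerReplica("ks-b"),
    KeyServerReplica("ks-c"),
])
recovered = pool.decrypt(pow(42, E, N))
print(recovered)
```

Adding capacity is then just a matter of adding replicas to the list, which is the "throw money at it" property discussed earlier in the thread.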
There's also session caching (tickets or session id), so this only affects people who haven't connected to a cloudflare site recently.
It IS an issue, but it's not that hard to deal with.
If your infrastructure is unavailable, users won't get content anyway, SSL or not.
I guess technically the TLS-secured static content would now be at more risk than it was before, when the cached content would have been served. But existing session tickets would probably be honored, so only new clients would get rejected. Still, it's a tradeoff some people would probably be fine with.
Think of a large organization - you've been there (or not), there are 30 internal applications with self-signed certificates. Fail. The organization had purchased an HSM, but never really got it deployed because - well, that was too complex and it didn't integrate well with 3rd party network hardware and failed miserably in your *nix web stack.
This could be interesting - and I'm not commenting with regard to the efficacy or security concerns around this, but mainly the workflow simplicity it provides to large organizations who end up in self-signed-cert-hell because HSMs don't interoperate easily in a lot of use cases.
But to my original statement - this is a very good thing or a very bad thing for Thales and the like. The only requirement for an actually certified HSM, really, is certification against some hardware and software standard you have a checkbox to fulfill. Beyond that this would be a killer in the middleground for those who want an HSM like functionality but don't have any requirements to meet other than housing a secure segment where key management can be done in a more controlled manner.
Why not offer a "pass through" mode where the proxying is done on the network layer rather than the application layer? Of course in such a mode all CDN-like functionality could no longer be offered, but it could still do a fair amount of DDoS protection, no?
But yes, users' plaintexts would still be compromised.
"Security theatre" indeed.
> An institution should notify its primary Federal regulator as soon
> as it becomes aware of the unauthorized access to or misuse of
> sensitive customer information or customer information systems.
It means people can use a cloud service without giving them private keys. This seems much better than giving them keys.
I too was hoping there would be some clever math to make it so some part of the conversation between client and back end was an encrypted tunnel that CloudFlare can route but not read. This doesn't seem to be the case.
But all the same, however small a step, it's not theatre.
Of course you still need to trust CloudFlare to let them provide SSL functionality for you in the first place. But often the issue of "do I trust them today?" is separate from "will I still trust them tomorrow? and what happens if I don't?".
E.g. consider if I trust CloudFlare today, but want to be protected in the event that they fail and suffer intrusion. With this service, I could as an emergency just flip a switch (shut down the devices providing the key signing) and instantly disable CloudFlare's ability to accept SSL connections on my behalf, without having to trust that they manage to deal with the intrusion properly.
By allowing me to put less trust in them, I may trust them more, because the ability to counteract misplaced trust is much better.
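The "flip a switch" idea can be sketched like this, assuming a hypothetical keyserver with an `enabled` flag controlled by the site owner (toy RSA numbers, not secure). It only models new handshakes; sessions the edge has already established are outside this sketch.

```python
# Toy textbook RSA key: illustration only, NOT secure.
P, Q, E = 61, 53, 17
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))


class KeyServer:
    """Owned by the site operator, who can switch it off at any time."""

    def __init__(self) -> None:
        self.enabled = True

    def decrypt(self, ciphertext: int) -> int:
        if not self.enabled:
            raise PermissionError("keyserver switched off by its owner")
        return pow(ciphertext, D, N)


def edge_accepts_new_connection(keyserver: KeyServer, ciphertext: int) -> bool:
    """The edge can finish a handshake only if the keyserver answers."""
    try:
        keyserver.decrypt(ciphertext)
        return True
    except PermissionError:
        return False


keyserver = KeyServer()
hello = pow(1234, E, N)

before = edge_accepts_new_connection(keyserver, hello)
keyserver.enabled = False          # the emergency switch
after = edge_accepts_new_connection(keyserver, hello)
print(before, after)
```

The private key never left the owner's machine, so switching the keyserver back on later does not require issuing a new key.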
The big deal is that CloudFlare's ability to sign SSL
sessions can be yanked with the flip of a switch, with no
need to trust CloudFlare.
Surely now the federal regulators will let us keep the barn door open?!
You always have to trust some server with the plaintext, unless you want a barn that has no door at all.
CloudFlare practices good security. This makes the marginal risk from moving sites onto their servers minimal or even negative, since you only have to worry about the current good security and not CloudFlare-in-six-years security.
Imagine an internet user who uses online banking with one of these banks; registered his domains with Namecheap; keeps some bitcoins at Coinbase; pseudonymously frequents Reddit; and posts anonymously at 4chan every now and then.
By merely passively eavesdropping on all traffic from and to your IP address, Cloudflare can build a profile that links your real world identity to all your Reddit posts you thought were anonymous; to the naughty threads you visited on 4chan; to the amount of money in your bank account; and to the domains and bitcoin addresses you own.
This is something that neither your ISP nor any of those individual site owners could ever do, and which might even make the NSA a bit jealous. Now consider what Cloudflare could do on your behalf (and make it look like it was you) if, say, a disgruntled employee was actively out to get you.
I'm not at all saying that there is any reason to distrust Cloudflare, but is it not an enormous amount of power to place in the hands of one entity? Even more so than we do with parties like Google or Facebook, which get immensely more scrutinized over it?
With this, CloudFlare cannot impersonate you without your endpoint's active consent. As soon as you terminate the service that solves this puzzle for CF, they cannot impersonate you any longer.
So, yes, it is better. Maybe not "great" if your threat model includes CF getting 0wned, but definitely better than giving over your RSA key. ;)
You can't claim that this improves the bank clients' security. It's clearly worse than doing nothing.
"It's clearly worse than doing nothing." It's not very clear to me. Please explain.
This is not the case with any mechanism that requires you to hand over your keys. Not even issuing a revocation effectively terminates use of a compromised key.
What does it matter security-wise if the HSM module (which is what banks use) physically sits on Cloudflare premises or off Cloudflare premises? It's still connected to the same equipment.
It does matter for a variety of other reasons, for example what security clearances are required to work there, but it does not matter much for security.
Please summarize the differences between this protocol and PKCS instead of downvoting.
I have, incidentally, downvoted your comment, because you are complaining about downvotes. Don't do that.
(On the subject of HSM physical attacks: That's another issue altogether, and does not stop at the HSM. But normally that's not an attack you defend against, because you have the relevant contractual obligations against your infrastructure provider.)
I promise not to ask about downvotes again. But the question was honest; if I'm wrong I want to know it.
Could you point to some credible expert commentary (as opposed to anonymous noise on HN) describing why what CloudFlare has done here is wrong?
In this case Cloudflare apparently thought it was the right solution in theory but developed their own instead of using existing products and/or standards. I don't know the rationale for this, but I'd be interested in knowing more, as you can read in my comment above.
I don't know why I should point out that Cloudflare did the wrong thing. Perhaps you are confusing me with someone else?
What I did say is that the alternative to the described solution is to use a HSM, and that their solution should offer equivalent security.
If CloudFlare has not done something wrong, then why did you say that?
Before you answer that question, remember: A hypothetical solution is not a practical solution. A practical solution is always a practical improvement over the case where there was no existing practical solution offered.
And before you say "They should have used HSMs", remember: CloudFlare has made it clear that HSMs being under their control was simply not an option. It was clear in their first blog post, and just for good measure, it was made absolutely explicit in an interview with Ars where CloudFlare's CEO said "there’s no vault we can ever build that they’ll trust us with their SSL keys".
So, how is there no practical improvement?
And that is not true. The alternative is not to let other organizations impersonate you without your cooperation. That is very clear from the article. Storing plaintext keys with Cloudflare was never on the table. That's not why they built it.
There are reasons why Cloudflare built their own, probably good ones because Cloudflare employs some talented people, and I would think they have to do with the scale Cloudflare operates at.
Network attached HSMs are off the shelf devices. If you've worked with PKI, you've seen them. And that is what they would have gone with if they hadn't built this. Whether it was right or wrong to go with a home-grown HSM instead of an off the shelf one is not something I could possibly know -- but I know it's not a "huge improvement in security" to build your own. The fact that it offers comparable security is probably why the bank chose it.
If there is one thing to take away from the article, it should be: Don't invent your own security protocols. Buy off the shelf devices. If you really need to build your own, this is how.
That's a really fast turnaround for CloudFlare, and not sure how the Akamai filing isn't prior art. Actually, the Akamai filing is referenced in the CloudFlare patent.
Yes, the SSL key doesn't leave the bank, but everything it is protecting does.
Ehh... I'd say Keyless SSL implements the opposite of that principle: encryption terminates with CloudFlare but authentication terminates in some bank.
Although the diagrams in the blog post show the non-PFS RSA handshake, I'm sure the architecture already supports the PFS DH handshake too.
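For the DH case the keyserver would sign rather than decrypt. A hedged sketch of that variant, assuming toy textbook RSA signatures, an arbitrary toy DH group, and a simplified signed message (real TLS also signs the client and server randoms):

```python
import hashlib
import secrets

# Toy textbook RSA key (NOT secure); the private exponent stays at the bank.
P, Q, E = 61, 53, 17
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))

# Toy Diffie-Hellman group (assumed parameters, illustration only).
DH_P = 2**127 - 1
DH_G = 3


def digest(message: bytes) -> int:
    """Hash reduced into the tiny RSA modulus so the toy key can sign it."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def keyserver_sign(message_digest: int) -> int:
    """Bank side: the only operation that touches the private exponent."""
    return pow(message_digest, D, N)


def client_verify(message: bytes, signature: int) -> bool:
    """Client side: check the signature against the public key (E, N)."""
    return pow(signature, E, N) == digest(message)


# Edge: pick an ephemeral DH key and ask the keyserver to sign the public half.
edge_private = secrets.randbelow(DH_P - 2) + 1
edge_public = pow(DH_G, edge_private, DH_P)
params = edge_public.to_bytes(16, "big")
signature = keyserver_sign(digest(params))

# Client: verify the signature, then do its own half of the exchange.
signature_ok = client_verify(params, signature)
client_private = secrets.randbelow(DH_P - 2) + 1
client_public = pow(DH_G, client_private, DH_P)

client_shared = pow(edge_public, client_private, DH_P)
edge_shared = pow(client_public, edge_private, DH_P)
print(signature_ok, client_shared == edge_shared)
```

In this variant the keyserver only ever sees a digest of public parameters; the ephemeral private value and the shared secret stay on the edge.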
Not to say that it's not useful, but the article describes it as some grand invention.
Free SSL is still in the works. More info soon-ish.
My money is on AOL as the CA.
Instead they separate the actual key signing, delegating it to the customer's device. That's nice and useful, but isn't quite what I was expecting.
It is obvious, and they effectively implemented a custom approach for PKCS11/ssh-agent. Yet the narrative implies some brilliant period of insight and innovation, when really it kind of isn't.
Which is where the "silly" notion that they must have done something novel came from -- their narrative claims it.
The fact that Cloudflare is the first global-level CDN to implement this kind of keyless SSL termination to me is innovation, even though it's based on pulling PKCS11 at the IP level. It's solving a real-world problem in their context, which nobody has solved before, and customers pay for it.
What's new is the whole "edge calls home with a signing request" piece.
A goal of total political cooperation or submission leads to economic sanctions leading to serious human health effects leading to defensive denial of service attacks. This accelerates the need to decentralize the financial network systems to make them more robust.
How can we imagine, though, that even after a complete transition to next generation systems that are ground-up distributed designs (not just stop-gap tweaks like this) we won't have new types of attacks to deal with?
The starting point is the belief system that provides such fertile ground for conflict. We have to promote the idea that human lives have value and that lethal force is not an acceptable way to resolve conflict.
As long as decision makers are living in a sort of 1960s James Bond fantasy world we will all be subject to the insecurity of that type of world. It's largely built upon a type of primitive Social Darwinism that is still much more prevalent than most will acknowledge.
It's much easier to accept a compartmentalization of these problems and focus on a narrow technical aspect, but that does not integrate nearly enough information.