Announcing Keyless SSL (cloudflare.com)
509 points by jgrahamc on Sept 18, 2014 | 184 comments

For those who want to understand how it works (it took me a minute, so I'll try to explain it more simply):

In simplified terms, the server usually stores a public and private key, and sends the public key to the client. The client generates a random password, encrypts it with the server's public key, and sends it to the server. Only someone with the private key can decrypt the message, and that should only be the server.

Now you don't want to hand over this private key to Cloudflare if you don't need to, because then they can read all traffic. Up until now, you needed to.

What they did was take the private key and move it to a keyserver, owned by your bank or whomever. Every time the Cloudflare server receives a random password (which is encrypted with the public key) it just asks the keyserver "what does this encrypted message say?" After that it has the password to the connection and can read what the client (the browser) is sending, and write data back over the same encrypted connection. Without ever knowing what the private key was.
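The flow described above can be sketched in a few lines. This is a toy illustration, not Cloudflare's protocol: it uses textbook RSA with tiny primes and no padding, and the function names (`keyserver_decrypt`, `client_make_secret`, `edge_handle_handshake`) are hypothetical. The point is only that the edge ends up with the session secret while the private exponent stays on the key server.

```python
# Toy illustration only: textbook RSA with tiny primes and no padding.
# Real TLS uses much larger keys and padded encryption.
import secrets

# --- Key server (run by the bank); the private exponent never leaves it ---
P, Q = 61, 53
N = P * Q                            # public modulus
E = 17                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, kept on the key server

def keyserver_decrypt(ciphertext: int) -> int:
    """Answer "what does this encrypted message say?" using the private key."""
    return pow(ciphertext, D, N)

# --- Client (browser): only knows the public key (N, E) ---
def client_make_secret():
    """Pick the "random password" and encrypt it with the public key."""
    secret = secrets.randbelow(N - 2) + 2
    return secret, pow(secret, E, N)

# --- Edge server (Cloudflare's role): holds no private key ---
def edge_handle_handshake(encrypted_secret: int) -> int:
    """Forward the ciphertext to the key server and get the secret back."""
    return keyserver_decrypt(encrypted_secret)

secret, encrypted = client_make_secret()
edge_secret = edge_handle_handshake(encrypted)
assert edge_secret == secret   # client and edge now share the session secret
```

Note that the edge still learns the session secret, which is why it can still read the traffic; what it never learns is `D`.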

The connection from Cloudflare to your bank's webserver and keyserver can be encrypted in whatever way. It could be a fixed key for AES, it could be another long-lasting TLS connection (the overhead is mostly in the connection setup)... this isn't the interesting part and can be solved in a hundred fine ways.

Edit: Removed my opinion from this post. Any downvotes for my opinion would also push the explanation down (which I hope is useful to some). I mostly agree with the other comments anyway.

Thanks for explaining this. I would also love to read your opinion. Would you be willing to share it in a separate comment?

(And hopefully it doesn't incur any downvotes and readers understand it is just an opinion.)

Sure. TL;DR: I think it's mostly useless. From an engineering perspective this is a creative way to circumvent management's requirements ("our encryption keys must be kept on premises"), but otherwise I don't see the value.

Let's first see what the problem is with the old method, storing the private key on the Cloudflare server:

1. Cloudflare can read all traffic.

2. Cloudflare can read traffic they can intercept, even if it's not directed towards their server. Think public WiFi access points.

3. Anyone able to break into Cloudflare, physically or technically (or legally, hello NSA!), has the same capabilities.

Number two is a bit far-fetched without taking number 3 into account, except that there might be national security letters to obtain the private key, after which government agencies can use their own intercepts. Still slightly far-fetched, but a plausible concern if you're the bank that the next Edward Snowden just happened to use (Lavabit's private key needed to be handed over, disregarding all other customers' privacy).

Now we implement this new system, where you can only obtain the decryption for the encrypted session key (as sent by the client, e.g. a browser).

1. Cloudflare can still read all traffic.

2. They can no longer passively read any traffic, or hand over private keys to law enforcement. However, the encrypted session keys could still be queried at the keyserver. The keyserver would need to actively maintain a blacklist of session keys that are not in use at Cloudflare, e.g. by having webservers send them as they were used, and raise alarm flags if a false one comes up. But I'd need to work out this attack scenario further to be sure there is no loophole that could make it look like Cloudflare traffic even if it's direct (but intercepted by whomever) traffic. I think something like that might be possible, but I'd need to work it out to be sure.

3. Anyone hacking Cloudflare is now limited in the same way Cloudflare is (so the same as point number two).

So far the security concerns. Now as for speed and DDoS mitigation...

- Cloudflare now needs to query an external server (if it's in their datacenter, it can be pretty much considered under their control) for every https connection. Assuming there is a direct peering connection to a datacenter next-door, this would only be a couple of milliseconds in network latency, but so is DNS and building a TCP connection and fetching a page... it all adds up and in the end you notice that pages need to load, even if only for a sec.

- The SSL termination point is now nearer to the client, as Cloudflare has datacenters all around the world. You could in theory build the keyservers next to many of them, but then you're rolling a CDN for a CDN (a delivery network of content (keys) for a content delivery network (cloudflare)). Perhaps not a bad idea, but it just sounds a bit silly. All things considered, at best the connection gets only a tiny bit slower.

- Cloudflare is still useful for mitigating attacks, just like with normal http traffic. I mean, they can read the data, so it's more or less the same as with unencrypted http. The difference is the keyserver: they might run over capacity. Luckily you can now solve that problem by throwing money at it (you can deploy more keyservers, Cloudflare already has enough front-end servers to handle the requests). So it's now a money issue, not a technical one, which is how you want things to be.

- Session reuse makes this somewhat less of a pain, as the keyserver need not be queried for repeated https connections. We'd need to see numbers to know how much of a win this is, though. In any case, it's never faster than not having the keyserver.

So in this area it's still a win, but so was handing over your keys. Having keyservers makes it slightly less of a win.
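To put rough numbers on the session-reuse point, here is a back-of-the-envelope model. The 20 ms key server round trip and the reuse rates are made-up inputs for illustration, not measurements, and `expected_extra_latency_ms` is a hypothetical helper.

```python
def expected_extra_latency_ms(keyserver_rtt_ms: float, reuse_rate: float) -> float:
    """Average added handshake latency when only full handshakes hit the key server.

    reuse_rate is the fraction of connections that resume an existing session
    and therefore skip the key server entirely.
    """
    if not 0.0 <= reuse_rate <= 1.0:
        raise ValueError("reuse_rate must be between 0 and 1")
    return (1.0 - reuse_rate) * keyserver_rtt_ms

# Hypothetical inputs: a 20 ms round trip to the key server.
for rate in (0.0, 0.5, 0.9):
    extra = expected_extra_latency_ms(20.0, rate)
    print(f"reuse {rate:.0%}: +{extra:.1f} ms per connection on average")
```

The added cost never drops below zero, which matches the observation that this is never faster than not having the keyserver.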


In conclusion, you have three options:

- Don't use Cloudflare. Fastest website, but risk of DDoS.

- Old-style Cloudflare https setup: handing over your keys. DDoS risk mitigated.

- New technique: providing keyservers. Slightly slower than the old style, but with a tiny security advantage that might help with either management or rare kinds of high-profile attacks.

Disclaimer: I'm not a Cloudflare tech, I'm a software engineering and information security student (and a major nerd in general, I might add). I might be missing something, but many other comments seem to be saying similar things.

There's also:

- If you discontinue using CloudFlare, you don't need to generate a new key-pair or deal with all of the issues associated with the infrastructure for revoking the current key-pair.

- There is also this: https://news.ycombinator.com/item?id=8336041 , which mentions protection against key-recovery due to Heartbleed-type bugs, which is true. If the SSL-stack is attacked, the attacker only has the same access that CloudFlare does. They can't recover the keys from the key server.

One more thought occurred to me (I didn't see it mentioned around here): you keep some control of the relationship with Cloudflare. You can pull the plug instantly at any moment if need be, and because of that they must, in theory, stay "on their toes" all the time and can't become complacent too easily. Or at least, should they become so, you have some concrete leverage. Also, if some competition for CF arrives, you now have the freedom to switch.

> From an engineering perspective this is a creative way to circumvent management's requirements ("our encryption keys must be kept on premises"), but otherwise I don't see the value.

This is a little bit like saying "From an engineering perspective, gold is a good conductor but otherwise I don't see the value."

Whether the desire to avoid sharing SSL keys with 3rd parties stems from management, regulatory or contractual requirements is irrelevant. The bottom line, from CloudFlare's perspective, is that they have been unable to address a significant and potentially highly-profitable market.

Keyless SSL might not be groundbreaking from a technical/engineering perspective but, as a service offering, it could unlock hundreds of millions in revenue for CloudFlare.

> Session reuse makes this somewhat less of a pain, as the keyserver need not be queried for repeated https connections.

If a DOS is actively attacking the bandwidth of the keyserver, none of the attacking connections will reuse old sessions, though.

True, but I meant for the performance for legitimate users.

Is this not subject to side channel attacks (for recovering the key)? Also, what's the point if you are letting a third party read all of your traffic at any rate?

For starters, you can move away from CloudFlare without having to issue a new key-pair (which you would otherwise need to do, b/c CloudFlare, or an unscrupulous CloudFlare employee, could use the old private key to impersonate your site).

> Now you don't want to hand over this private key to Cloudflare if you don't need to, because then they can read all traffic.

Generally the key you would give them is for, and limited to, the resources that they cache/reverse proxy, so the same "read all traffic" concern exists.

What Cloudflare did is essentially, as others have mentioned, PKCS11 over the internet. PKCS11 is an existing, very well proven technique of sequestering the key in a hardware device, that hardware device doing what Cloudflare is moving to the client location here. It's neat enough, but they seem to be exaggerating the innovation a bit.

I think the exaggeration is because it's a sales pitch, but it illustrates an important concept -- doing something "known" but in a generic way can be more valuable than having a proprietary solution. Of course someone could come up and offer a 'Cloudflare' box that just does key sequestration and sell that.

And while I think banks and others will find this useful, it's even more useful if I can do it from a far away place. When you create an SSL connection in Europe to a west coast server there are a number of round trips. If those can be avoided you can cut the latency of the transaction. That is a good thing too.

> When you create an SSL connection in Europe to a west coast server there are a number of round trips. If those can be avoided you can cut the latency of the transaction. That is a good thing too.

Note that this will at most cut the connection setup time by half, as your premaster secret (RSA) or handshake data (ephemeral DH) still has to cross the Atlantic and get back to you (and the TLS handshake has 2 roundtrips).
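The "at most half" claim can be checked with simple arithmetic. The sketch below assumes a two-round-trip handshake as stated above, ignores TCP setup, and uses made-up round-trip times (150 ms across the Atlantic, 10 ms to a nearby edge); the function names are hypothetical.

```python
def direct_setup_ms(client_origin_rtt_ms: float, handshake_round_trips: int = 2) -> float:
    """Handshake time when the client talks straight to the far-away origin."""
    return handshake_round_trips * client_origin_rtt_ms

def keyless_setup_ms(client_edge_rtt_ms: float,
                     edge_keyserver_rtt_ms: float,
                     handshake_round_trips: int = 2) -> float:
    """Handshake time via a nearby edge that makes one trip to the key server."""
    return handshake_round_trips * client_edge_rtt_ms + edge_keyserver_rtt_ms

# Hypothetical numbers: 150 ms transatlantic RTT, 10 ms RTT to a local edge.
direct = direct_setup_ms(150.0)
keyless = keyless_setup_ms(10.0, 150.0)
print(f"direct: {direct:.0f} ms, via edge + key server: {keyless:.0f} ms")
```

Even if the client-to-edge round trips cost nothing, one long round trip to the key server remains, so the setup time cannot fall below half of the direct case under these assumptions.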

Furthermore, because of

> Generally the key you would give them is for, and limited to, the resources that they cache/reverse proxy,

all dynamic data still is supposed to come from your server. So the browser will have to wait for that, while everything from the European mirror is cached anyway.

It is very obvious... In hindsight?

Not sure why all the negative reactions here... Was anyone else doing or providing this type of "Keyless SSL" setup?

You'd think if this was a known technique, the mentioned banks would already have been asking for it, implementing it, or doing it.

Personally, I think CloudFlare is one of the few companies on here doing innovative stuff, and solving real issues.

And if not - if they've pulled the wool over my eyes - then at least I can respect their marketing.

PKCS11 and Hardware Security Modules (HSMs) have been around for a long time. They're in wide use at companies like Netflix and Square, including "over the network". This is not new.

> You'd think if this was a known technique

Isolating the key for security/control purposes is a well-known technique. It's what ssh-agent/PuTTY Pageant does, where you can even forward key usage over the network to different machines. It's what PKCS11 does for SSL/TLS, where generally the key is in a hardware security module (CAs and banks don't have keys sitting on their public facing machines, but instead generally have dedicated hardware for it -- they are generally doing, at a different scale, what is described), though there have been various implementations and proposals over the years to do it over the network (e.g. https://svn.opendnssec.org/docs/p11proxy.png and pkcs11-proxy).

Why isn't it commonly done as CloudFlare proposes? Well it's kind of an odd need for limited gain, but is effectively PKCS11 used in a different context. If they are solving a specific need for some of their clients then fantastic, but it's the narrative of the innovation that makes it a bit weird, and yielded a lot of the purportedly negative commentary.

Sometimes the innovation is merely being able to rephrase the apparently impossible question into the one that has an obvious answer.

It's worth remembering, no-one else was offering this or doing this... no-one else had phrased the problem such that the answer (even though it was known) became apparent.

> Generally the key you would give them is for, and limited to, the resources that they cache/reverse proxy

That got me wondering: Why don't attackers then only DOS the dynamic pages? Those that can't be migrated easily, like the login?

The dynamic pages still go through CloudFlare; there's a Web Application Firewall which can block abuse directed at a specific URL, pattern, etc. The cache is only one part of the DDoS mitigation.

(We've been working on a lot of great updates to the WAF for just this kind of thing)

Generally with CloudFlare people put as much of their site behind the CloudFlare proxy as possible, for origin-hiding and other reasons, even if we can't cache it (yet).

And my patch for OpenSSL that does the same thing: https://gist.github.com/indutny/1bda1561254f2d133b18 , ping me on email if you want to find out how to use it in your setup.


Here is an example of how it could be used (in my TLS terminator):


Basically, if you have ever used async SSL API, you should be aware of things like:

In addition to these two, my patch adds:

If one of these is returned - you may get the data that should be signed/decrypted with:

Get the key type (in case of SIGN):

And get signature digest nid with:

Please be aware of the fact that `md` could be `NID_md5_sha1`, take a look at bud's code to figure out what should be done in this case (basically, you'll need to use raw `RSA_decrypt_private()`).

After performing sign/decrypt (which could happen in other thread, or on a different server) you should call:

to supply the result and continue handshake process. At this point `SSL_read()`/`SSL_write()` will start returning proper values.

-----BEGIN PGP SIGNATURE----- Version: GnuPG v1

iQIcBAEBAgAGBQJUG2D2AAoJENcGPM4Zt+iQJdoQAKZxbcGpzHFktSbU3uDocy3R fywWmqkYnoJ5jWF3xn4Excv4dAGhMfb/7tm9nt9zyV8g0Qsu8ChqWTl+kgK+hj9o mV+3jhqPDWR2VhmAC3J5ZsCpNm3IW/iNgGiU+u/k9N2i0WHjYSoTHM/NooN5GIu2 KKhNXPw1Y05yxOZWmbUInMl/uscGWDtzylRNyJpfLFFu3JDQy1sBTKD6UAZC5ERY 7LUZ1TqVdk1DPY3Tf/j4IaB9Ds9teGLGj63J8upJhDjWHibFzV5bx6X+FjknUB9M xaebV4yfHZNRHseBu2ZqTQ2f2MNnXVisdzJRX6oyYeyq872MsJjAFhbFhFTi0sTI T8Y9n8cjuctbn+zTISVyVqEEBl8udWTY1t14SJ9lNcdU3xAf9OzEBVdORpUDqFl+ zteRC145o7gs7mEtJjyBpy8mhXB3mc13ZkC2qaJIyqkqAPODu/xlqCga7oaogHNy Q2wy0HUeX69Ra0ada3TcJQgB14qESj3Uvq1hcgFk7SEXBxkU5NJ2OcItvU1+emd7 hRlQvDqiiQcK9WgsdOIKZpovtT3FswhsIy0Tv77Nx9PY04urOTEgmhPJHveCJOQq i0apvI09YgimXs4Sd5h3rs9TsKrDtG0BG0jM1zfo5zbcKE2IbMpmzOc84MxkwUSl tPV48uw46UVpu4zOOByM =zJGs -----END PGP SIGNATURE-----

No disrespect meant, but from a security perspective the idea of patching security-critical software with a patch from a stranger on the Internet is kind of crazy, isn't it?

All open source software is made up of patches from strangers on the internet.

Not really. Gatekeepers of important open-source software are usually people who are known in the community and often employed by companies who work in the area.

Indeed :) On a serious note, I'm waiting for a review from the OpenSSL team.

I think I heard that quote on XKCD once.

indutny is no stranger. Highly respected coder.

Though Fedor isn't just some complete stranger online.

Maybe that's the part I was missing. There was no link to more information in his (her?) HN profile.

> There was no link to more information in his (her?) HN profile.

'His' apparently. I didn't know either, but apparently he was the first to crack Cloudflare's Heartbleed challenge:


His LinkedIn / twitter for reference:



Let's be honest about it, this patch hasn't got any attention from OpenSSL team yet, but I heard that some people from the team are interested in it. Never got a response, but it looks correct to me.

Instead of keeping the key in a potentially vulnerable place, they're putting it in an oracle: pass ciphertext to the oracle, get plaintext back. I'm interested in the authentication between CloudFlare and the oracle. Cryptographic examples involving an oracle tend to refer to the oracle as a black box that just blindly accepts data, transforms it, and replies. Of course, then the oracle's content (a key, an algorithm) risks exposure through deduction if an attacker can submit limitless requests. See http://en.wikipedia.org/wiki/Chosen-plaintext_attack

I'm not at all suggesting that CF hasn't thought of this; rather I want to see their mitigation of the risk.

The key server only accepts mutually authenticated TLS 1.2 connections with a strong cipher suite. We also require both certificates to be signed by CloudFlare's internal Certificate Authority.
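As a rough idea of what "only accepts mutually authenticated TLS 1.2 connections" can look like on the key server side, here is a minimal sketch using Python's standard `ssl` module. It is not Cloudflare's implementation: the function name, the file path parameters, and the `ECDHE+AESGCM` cipher string are assumptions chosen for illustration (the comment above only says "a strong cipher suite").

```python
import ssl
from typing import Optional

def make_keyserver_context(ca_file: Optional[str] = None,
                           cert_file: Optional[str] = None,
                           key_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a server-side TLS context that refuses unauthenticated clients."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # nothing older than TLS 1.2
    ctx.verify_mode = ssl.CERT_REQUIRED            # client must present a cert
    ctx.set_ciphers("ECDHE+AESGCM")                # assumed "strong" suite set
    if ca_file:
        # Trust only the private CA that signs the edge's client certificates.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # The key server's own certificate, signed by the same private CA.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx

ctx = make_keyserver_context()   # paths omitted here; supply real files in practice
print(ctx.verify_mode, ctx.minimum_version)
```

Because the only trusted root would be the private CA, a client holding a certificate from a public CA would be rejected during the handshake, before any decryption request is processed.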

A nice thing about supporting a key oracle is that you keep the key material out of the process handling TLS. In the case of Heartbleed like bugs, this would protect against the loss of keys.

What stops them from using SSL between CloudFlare and the 'oracle'?

Not a thing. Not exactly an interesting implementation, but then it doesn't have to be-- normal SSL should work just fine.

That said, silliness can lead to a recursive/"turtles all the way down" kind of issue.

Not really. This whole dance is necessary because it's (effectively) impossible to revoke certs on millions of end user systems.

It's _far_ easier to handle revocation on a couple trusted endpoints (especially when you own the CA), so 'normal' TLS does the job here just fine. No need for more turtles.

Would it be enough to simply only allow connections from Cloudflare IP addresses?

As Google and Yahoo will tell you after they found out the US government broke into their dedicated lines between data centers... No. It must be encrypted at every transfer without exception.

There's a huge difference between passively tapping a fiber optic cable and infiltrating a network to inject malicious traffic. All we've ever seen evidence of is NSA's passive tapping of Google & others.

We know the NSA targets routers. They have rootkits and remote exploits for them. They can do packet injection or anything else with the traffic passing through routers they take over.

There are more exploited routers than passive taps.

Wouldn't you want to prevent "passive" tapping? Passive in quotes because depending on what they find people will be killed, tortured, executed, kidnapped, arrested, etc. It's not passive when the NSA scoops up everyone's data and sends it to their spook friends around the world.

Attempting to solve a cryptographic attack with a configuration policy doesn't sound like a wise move.

One of the issues with that is latency; I wonder how they work around that. Or maybe they just don't care about latency.

The handshake is probably not much slower than it would be if the web browser were connecting directly to the bank, since there is only one round trip with the key server per handshake, and the latency between Cloudflare and web browsers tends to be pretty low since Cloudflare has so many POPs.

Yes, but over thousands of connections it's significant. What was a millisecond or less becomes 20ms, so 20x slower. Not sure how much impact this has in real life, but this is one of the issues with this design (and one of the reasons it's rarely implemented like this, but with HSMs instead).

This seems to only slightly reduce the threat to the banks.

Currently, if someone compromises the Cloudflare servers, they gain the bank's private key and can impersonate the bank until the bank revokes their keys.

With this solution, if someone compromises the Cloudflare servers, they can impersonate the bank by relaying the decryption of the premaster secret through Cloudflare's compromised servers back to the bank. They can do this until Cloudflare notices and closes the security hole.

It's not clear that the difference is all that great in reality, as most of the damage will be done in the first 24 hours of either compromise.

Since key revocation is fundamentally broken it's the difference between having a limited time period where you're exposed and being exposed until the cert actually expires.

"Since key revocation is fundamentally broken"

Do you mean SSL key revocation is broken in general, or in this proposed solution by Cloudflare? If it's the former, would you care to elaborate on how it's broken?

Ahh... now I remember reading/hearing about the OCSP ineffectiveness and stapling etc. a few weeks ago (after watching Ilya Grigorik's talk "is ssl fast already" or something like that). Thanks for the reminder.

> Currently, if someone compromises the Cloudfare servers, they gain the bank's private key

This is not strictly true, is it? My interpretation is that at best they get temporary access to a server that will sign for them using the key, but the bank can terminate their signing servers at any time and then safely resume using their key without having to revoke it, since it never left their server.

This does somewhat increase the attack surface, but it lets the bank keep control over their keys and is better than having their keys get compromised and thus having to revoke them.

Here, "currently" is the status quo, which is contrasted to "this solution". So "currently" means "without this solution".

That's how I read it too. You can gain control of a session, but the private key should remain safe. I could be wrong though.

Actually, thanks to session tickets, they can continue to impersonate the bank to existing users for potentially quite a while after Cloudflare lock them out.

This. SSL/TLS can ensure perfect forward secrecy when the private key is compromised, as long as the session information is ephemeral. With session tickets, compromise of the key used to create the tickets means every session created under that key can be compromised, regardless of whether the private key is lost. See https://www.imperialviolet.org/2013/06/27/botchingpfs.html for a great discussion of this.

As described, this scheme doesn't provide forward secrecy either. Anyone who can make requests to the key server can decrypt any past session made using the same SSL private key.

Actually, it does. I posted the RSA implementation, but Keyless supports ECDHE too. Tech details in tomorrow's post.
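For context on why an ephemeral key exchange changes the picture: in TLS's (EC)DHE cipher suites the server's long-term key is used to sign the ephemeral parameters, not to decrypt the session secret. The sketch below assumes the key server would play that signing role; that is an inference from how TLS works, not a description of Cloudflare's design. It substitutes toy finite-field Diffie-Hellman for ECDHE and textbook RSA with tiny primes for real signatures, so it shows the shape of the exchange and nothing more.

```python
import hashlib
import secrets

# Toy RSA signing key held by the key server (tiny primes, no padding).
N, E = 61 * 53, 17
D = pow(E, -1, 60 * 52)

def _digest(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def keyserver_sign(message: bytes) -> int:
    """The only operation the key server performs: sign, never decrypt."""
    return pow(_digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    return pow(signature, E, N) == _digest(message)

# Toy Diffie-Hellman group standing in for ECDHE (chosen for brevity).
DH_P, DH_G = 2**127 - 1, 5

def dh_keypair():
    private = secrets.randbelow(DH_P - 3) + 2
    return private, pow(DH_G, private, DH_P)

edge_priv, edge_pub = dh_keypair()       # ephemeral, generated per handshake
client_priv, client_pub = dh_keypair()   # ephemeral, generated per handshake

# The edge asks the key server to vouch for its ephemeral public value.
signature = keyserver_sign(str(edge_pub).encode())
assert verify(str(edge_pub).encode(), signature)

# Both sides derive the same secret; the key server never sees it.
client_secret = pow(edge_pub, client_priv, DH_P)
edge_secret = pow(client_pub, edge_priv, DH_P)
assert client_secret == edge_secret
```

Once the ephemeral private values are discarded, a later request to the signing oracle yields only signatures, which do not help decrypt a recorded session.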

How so? Forward secrecy is public key crypto; the private key never leaves the oracle, so all Cloudflare needs to do is refuse requests made with the public key. Ideally the oracle would block it as well, but having Cloudflare do everything means minimal administration for their clients.

So the communication between Cloudflare and the actual SSL key holder is secured by… what? Another key? In that case, any compromise of Cloudflare’s key is the same as a compromise of the original SSL key (at least in the short term).

Yes, I've seen designs for this sort of thing. It's super scary, because you're essentially offering a decryption oracle with your private key to the Internet. It's authenticated, sure, but even still you've actually increased your attack surface.

A compromise of an SSL terminator, or the ability to pretend to be one, gives the same capabilities as a compromise of the oracle itself. The advantage you get is that only the last requires you to revoke and reissue the key when you figure it out (in the other cases, simply unplug the oracle). That's relatively minor most of the time: an open decryption oracle is already hair-on-fire levels of bad.

I can see why some organizations (for whom this process is unusually expensive) might be interested, but it probably isn't beneficial most of the time.

Here's the problem: If certificate revocation worked and was reliable this type of setup would not be needed and would actually make you less secure (for the reasons you stated).

However the reality on the ground is that certificate revocation is somewhere between unreliable and downright broken. Right now the primary way certificates are revoked is via operating system updates or via browser updates (e.g. Chrome has a revocation list they update semi-regularly).

Covered somewhat here:


So while adding an oracle does definitely increase your attack surface, at least you are never dependent on certificate revocation for your ultimate security.

I guess it really boils down to this: What do you trust more, certificate revocation or your own ability to secure the oracle? For me the answer is a no brainer, simply because I distrust certificate revocation so much.

Beyond the brokenness of certificate revocation, just the simple fact that a financial institution in the US must formally notify the Federal Reserve if any of their private SSL keys are compromised is a huge factor that is better mitigated by this strategy.

To me, the biggest question is whether they can keep the oracle highly available even in the face of a DDoS. It still seems like the weakest link, just slightly more difficult for the attacker to target since they don't have a publicly advertised direct connection to it.

A notable difference is that a compromise of Cloudflare's key wouldn't [edit: might not] require the bank to notify the Federal Reserve.

It may be pretty cynical to call that situation a "feature". But I'm sure it came up. And I'm sure they noticed.

It certainly wouldn't be the first time "trivial functional difference" translated to "massive legal difference."

You should probably say "might not", because although we know a key compromise would require notification, you'd have to have an actual lawyer tell you if a compromise of CloudFlare would not. Unless the lawyers are on board with this as a solution to the notification problem, the whole solution doesn't do anything (legally).

It gets a little cumbersome to hedge casual conversation the way attorneys do, but sure, it would have been more accurate to have said:

"might not, under the current rules, which are always subject to change"

You would think the Federal Reserve would just react by increasing the scope of what they have to notify for - seeing as a compromised CloudFlare server could do something nasty in this scenario.

The communication between CloudFlare and the Keyless SSL Server at the customer site is mutually authenticated TLS 1.2 with a specific set of cipher suites.

So if someone breaks into a CloudFlare server, can they steal the CloudFlare private key and then make unlimited numbers of requests against the e.g. bank's oracle?

Aren't you now still depending on certificate revocation but have just shifted the problem downstream (it is now the bank's job to revoke you, rather than the user's browser's job).

Or do you yourselves use "Keyless" technology, so that CloudFlare servers contact some CloudFlare oracle so that they can communicate with the bank oracle?

The mutually authenticated TLS connection between Cloudflare and the oracle uses a certificate that has been signed by Cloudflare's internal CA. When you manage the CA and have influence over both endpoints, revocation can be trusted.

>but have just shifted the problem downstream (it is now the bank's job to revoke you, rather than the user's browser's job).

You make it sound like that's not a _huge_ win... Which would you rather do, ensure revocation on millions of end user systems, or a handful of systems that are controlled by a party you have a close partnership with?

It is a heck of a lot easier to revoke one certificate on one oracle you control yourself than to revoke a certificate on every end user though.

Breaking into a CloudFlare server does not get you this private key. CloudFlare does not keep this authentication key unencrypted on disk.

If you have access to a machine, why would you ever look for a key on disk instead of in memory?

There are different levels of "accessing a machine", and some of them might not give you access to the memory area where the key is stored.

I would imagine that that link would be IP Restricted, VPN'd maybe, carefully monitored etc.

Same thought here. And what if that connection breaks? Another single point of failure?

Multiple connections between CloudFlare and the Keyless SSL Server running at the customer site. Connections are reused, pipelined, load balanced etc.

Interesting. Multiple connections AND multiple servers?


Won't this approach increase the cap-ex investment on the part of the customer, and the complexity to scaling?

Do you require the customer to size their key services to some theoretical peak, or can you offload so much of the process that the customer just needs to make a one time investment?

(Full disclosure. I work for a competitor, but not on problems like this).

We don't expect there to be a cap-ex spend need here because we will be offloading most of the SSL processing to our servers (as well as offloading other parts of the web session).

The customer site now is responsible for providing key signing in a highly available fashion. It makes sense they would need to spend more (probably no more than for an hsm) to ensure the connection never goes down. If you can take down the connection from Cloudflare to the customer site, their website goes down.

I would also observe that given the constraint that the customer doesn't want to actually give out their private key (and nobody should be willing to give it out, really...), this is the minimal possible maintenance burden they could possibly incur. Yes, the customer has to do something, but we already take that as a given when we say we won't hand out the private key. If the customer wants to have the responsibility, being able to reduce that cost to the bare mathematical minimum is a big deal.

Key services are used to kickstart the SSL connection, not support it the whole way through (I believe?), so the 'load' is SIGNIFICANTLY less than terminating the entire connection for the whole session.

Keyless SSL is basically an analogue of ssh-agent(1) for OpenSSL. It's a nice feature that you no longer have to trust CloudFlare with your private key, but there's a huge tradeoff: if your keyserver is unavailable (ironically, due to any of the things CloudFlare is supposed to protect you from or buffer you against -- DDoS, network/server issues, etc.), they can no longer authenticate requests served on your behalf and properly serve traffic.

True -- if the keyless server is down, we can't even complete the ssl handshake for new clients.

The keyserver is stateless, so you can have a bunch of keyservers (which don't need to be particularly near your own infrastructure) to deal with this, though.

There's also session caching (tickets or session id), so this only affects people who haven't connected to a cloudflare site recently.

It IS an issue, but it's not that hard to deal with.
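Because the keyserver is stateless, failover can be as simple as trying the next one. The sketch below models each key server as a plain callable so it runs without a network; `ask_any_keyserver`, `AllKeyServersDown`, and the stand-in servers are hypothetical names, and a real client would also want timeouts and load balancing.

```python
from typing import Callable, Sequence

# A key server is modelled as a function from request bytes to response bytes.
KeyServer = Callable[[bytes], bytes]

class AllKeyServersDown(Exception):
    """Raised when no key server could answer; new handshakes cannot complete."""

def ask_any_keyserver(servers: Sequence[KeyServer], request: bytes) -> bytes:
    """Try each stateless key server in turn and return the first answer."""
    last_error = None
    for server in servers:
        try:
            return server(request)
        except ConnectionError as exc:
            last_error = exc          # remember the failure, try the next one
    raise AllKeyServersDown("no key server answered") from last_error

# Stand-ins for real network calls.
def down(_request: bytes) -> bytes:
    raise ConnectionError("unreachable")

def up(request: bytes) -> bytes:
    return b"signed:" + request

print(ask_any_keyserver([down, up], b"handshake-1"))
```

Resumed sessions never reach this code path, which is why an outage only affects clients that need a full handshake.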

Please remember that CloudFlare is a sort of reverse proxy with some network protections and enhancements.

If your infrastructure is unavailable, users won't get content anyway, SSL or not.

CloudFlare can serve cached responses in the event that your origin server is unavailable. This is particularly useful for static websites, but that's not the most common use case.

In the use case for keyless SSL the caching is pretty useless. Banks (what the article described as the reason this was created) are interactive. They want CloudFlare to block the DDoS while letting their customers through. The DDoS attackers aren't firing legit requests (sign-in attempts or whatever) and would never make it through to the SSL handshake.

If your origin server can be found on the internet, it can be DDoS'd/attacked, and your (dynamic) site will go down. This is true of all CDNs. The way to avoid this is to either have the CDN host your app servers or provide a private connection from your origin to the CDN.

I guess technically the TLS-secured static content would now be at more risk than it was before, when the cached content would have been served. But existing session tickets would probably be honored, so only new clients would get rejected. Still, it's a tradeoff some people would probably be fine with.

Cloudflare already has a heavy dependence on the origin server being up, particularly for non-cacheable dynamic websites, so adding a key server doesn't really change anything. It has always been critical that you keep the address of your origin server, and now your key server, secret so that it's not targeted directly by DDoS attackers.

CloudFlare can have a private connection to your private key infrastructure.

All other technicalities aside it's rather interesting. From an HSM perspective it either makes that hardware now very useful or very useless.

Think of a large organization - you've been there (or not), there are 30 internal applications with self-signed certificates. Fail. The organization had purchased an HSM, but never really got it deployed because - well, that was too complex and it didn't integrate well with 3rd party network hardware and failed miserably in your *nix web stack.

This could be interesting - and I'm not commenting with regard to the efficacy or security concerns around this, but mainly the workflow simplicity it provides to large organizations who end up in self-signed-cert-hell because HSMs don't interoperate easily in a lot of use cases.

But to my original statement - this is a very good thing or a very bad thing for Thales and the like. The only requirement for an actually certified HSM, really, is certification against some hardware and software standard you have a checkbox to fulfill. Beyond that this would be a killer in the middleground for those who want an HSM like functionality but don't have any requirements to meet other than housing a secure segment where key management can be done in a more controlled manner.

HSM technology and their vendors are doing fine and will continue to do well in the cloud. Physical control of identity keys is actually one of the few ways to link the cyber world with the physical world, and proxying key access is certainly one reasonable way to do it.

That's Hardware Security Module.

HSM = Hardware Security Module, yes, that was implied and my point.

While this is a cool feature, I wouldn't say the improvement is more than marginal: all potentially sensitive customer data is still available to Cloudflare in plain text. And after all, with a Business plan you can already use your own ("custom") SSL certificate which you can then revoke at any time.

Why not offer a "pass through" mode where the proxying is done on the network layer rather than the application layer? Of course in such a mode all CDN-like functionality could no longer be offered, but it could still do a fair amount of DDoS protection, no?

Well, for the use case given, with "Keyless SSL", if Cloudflare is compromised, then the bank doesn't need to report the incident to the Federal Reserve.

But yes, users' plaintexts would still be compromised.

"Security theatre" indeed.

I am not sure that a Cloudflare compromise would not rise to the level of a reportable event. In my experience/opinion "users' plaintext compromise" is certainly an instance of unauthorized access to customer information. I understand the whole framework here is risk-based so it is a matter of interpretation; but I do not want to be the person who has to explain to the nice folks from the OCC the intricacies of Cloudflare's implementation and why I deemed the compromise low risk.

  >  An institution should notify its primary Federal regulator as soon
  >  as it becomes aware of the unauthorized access to or misuse of
  >  sensitive customer information or customer information systems.
FDIC: Supervisory Insights https://www.fdic.gov/regulations/examinations/supervisory/in...

It seems harsh to call it theatre, and I expect all cloud services will adopt it.

It means people can use a cloud service without giving them private keys. This seems much better than giving them keys.

I too was hoping there would be some clever math to make it so some part of the conversation between client and back end was an encrypted tunnel that Cloudflare can route but not read. This doesn't seem to be the case.

But all the same, however small a step, it's not theatre.

So, this is not actually keyless SSL but SSL using something like a Hardware Security Module over networked PKCS#11. Did I miss something?

So CloudFlare won't get your private key, but will still get to see unencrypted plaintext for all traffic? Sounds like a huge improvement...

The big deal is that CloudFlare's ability to sign SSL sessions can be yanked with the flip of a switch, with no need to trust CloudFlare.

Of course you still need to trust CloudFlare to let them provide SSL functionality for you in the first place. But often the issue of "do I trust them today?" is separate from "will I still trust them tomorrow? and what happens if I don't?".

E.g. consider if I trust CloudFlare today, but want to be protected in the event that they fail and suffer intrusion. With this service, I could as an emergency just flip a switch (shut down the devices providing the key signing) and instantly disable CloudFlare's ability to accept SSL connections on my behalf, without having to trust that they manage to deal with the intrusion properly.

By allowing me to put less trust in them, I may trust them more, because the ability to counteract misplaced trust is much better.
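
The "flip a switch" scenario can be sketched in a few lines of Python. This is an illustration under loud assumptions: the RSA parameters are a toy (two Mersenne primes, no padding, never usable in practice), and `KeyServer`, `enabled`, and `edge_handshake` are hypothetical names, not part of any CloudFlare API. It shows the RSA-style handshake where the edge asks the key server to decrypt the client's secret.

```python
# Toy RSA parameters built from two Mersenne primes. Far too small and
# unpadded for real use; they only make the example runnable.
P, Q, E = 2**61 - 1, 2**89 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, stays with the site owner


class KeyServiceDisabled(Exception):
    """Raised once the owner has switched the key service off."""


class KeyServer:
    """Hypothetical key server run by the site owner."""

    def __init__(self, d: int, n: int) -> None:
        self._d = d
        self._n = n
        self.enabled = True  # the "switch"

    def private_op(self, value: int) -> int:
        if not self.enabled:
            raise KeyServiceDisabled("owner has switched the key service off")
        return pow(value, self._d, self._n)


def edge_handshake(key_server: KeyServer, encrypted_premaster: int) -> int:
    """The edge never sees D; it can only ask the key server to decrypt."""
    return key_server.private_op(encrypted_premaster)
```

A client would encrypt its secret as `pow(secret, E, N)`. While `enabled` is true the edge gets the secret back; after `key_server.enabled = False` every new handshake raises `KeyServiceDisabled`. Sessions already established (or resumable from a cache) are not affected by the switch in this sketch.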

  The big deal is that CloudFlare's ability to sign SSL
  sessions can be yanked with the flip of a switch, with no
  need to trust CloudFlare.

Sounds like we'll be able to close the stable door within seconds of the horse bolting.

Surely now the federal regulators will let us keep the barn door open?!

They're worried about the other 5 million horses in there.

You always have to trust some server with the plaintext, unless you want a barn that has no door at all.

CloudFlare practices good security. This makes the marginal risk from moving sites onto their servers minimal or even negative, since you only have to worry about the current good security and not CloudFlare-in-six-years security.


Imagine an internet user who uses online banking with one of these banks; registered his domains with Namecheap; keeps some bitcoins at Coinbase; pseudonymously frequents Reddit; and posts anonymously at 4chan every now and then.

By merely passively eavesdropping on all traffic from and to your IP address, Cloudflare can build a profile that links your real world identity to all your Reddit posts you thought were anonymous; to the naughty threads you visited on 4chan; to the amount of money in your bank account; and to the domains and bitcoin addresses you own.

This is something that neither your ISP nor any of those individual site owners could ever do, and which might even make the NSA a bit jealous. Now consider what Cloudflare could do on your behalf (and make it look like it was you) if, say, a disgruntled employee was actively out to get you.

I'm not at all saying that there is any reason to distrust Cloudflare, but is it not an enormous amount of power to place in the hands of one entity? Even more so than we do with parties like Google or Facebook, which get immensely more scrutinized over it?

4chan's posting API is actually not (as of this very second) behind Cloudflare.

What threat are you concerned about?

You get to see all the credentials, passwords, balances, private information... of banks' customers. This is no doubt an improvement, you may even be more technically capable than the bank itself, but if someone breaches CloudFlare, they can still get everything they would want from hijacking a connection to a bank.

I don't really see the security improvement. Actually the key server would be a new attack vector, although one that could be firewalled pretty well.

Using CloudFlare implicitly requires trusting CloudFlare.

With this, CloudFlare cannot impersonate you without your endpoint's active consent. As soon as you terminate the service that solves this puzzle for CF, they cannot impersonate you any longer.

So, yes, it is better. Maybe not "great" if your threat model includes CF getting 0wned, but definitely better than giving over your RSA key. ;)

And that's why regulations make it almost impossible for banks to use CloudFlare. But CloudFlare sees that as a problem, and makes some effort to create a loophole.

You can't claim that this improves the bank clients' security. It's clearly worse than doing nothing.

Think of it as a compromise: You can leverage CloudFlare's CDN to mitigate DDoS and other sorts of nasty attacks AND assure TLS is used with every connection, without giving up your RSA key.

"It's clearly worse than doing nothing." It's not very clear to me. Please explain.

I think "It's clearly worse than doing nothing" is from a purely security perspective. This opens up a larger attack surface, so it's worse in terms of security. But it's better than doing nothing from a business perspective, since it allows for better performance and better handling of DDoS attacks. It's a compromise. Potentially it is slightly worse in security, but it is much better for performance and uptime.

I think the parent means the key server the bank exposes to CF is a new attack surface. If an attacker could impersonate CF and connect to the key server, they too could authenticate connections as the bank without having the key.
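
One way to picture the defence against that: the key server refuses any request it cannot tie to an authorized edge. The thread does not say how CloudFlare authenticates its edges, so the following Python sketch is purely an assumption. It uses a shared secret and an HMAC tag as a stand-in, and `tag_request` and `AuthenticatingKeyServer` are hypothetical names.

```python
import hashlib
import hmac
from typing import Callable


def tag_request(shared_secret: bytes, payload: bytes) -> bytes:
    """Compute the authentication tag an authorized edge attaches to a request."""
    return hmac.new(shared_secret, payload, hashlib.sha256).digest()


class AuthenticatingKeyServer:
    """Hypothetical key server that only answers correctly tagged requests."""

    def __init__(
        self, shared_secret: bytes, private_op: Callable[[bytes], bytes]
    ) -> None:
        self._secret = shared_secret
        self._private_op = private_op

    def handle(self, payload: bytes, tag: bytes) -> bytes:
        expected = tag_request(self._secret, payload)
        # Constant-time comparison so the check does not leak tag bytes.
        if not hmac.compare_digest(expected, tag):
            raise PermissionError("request is not from an authorized edge")
        return self._private_op(payload)
```

An attacker without the shared secret cannot produce a valid tag, so `handle` raises `PermissionError`. This sketch has no replay protection or transport encryption; a real deployment would more likely use something like mutually authenticated TLS plus firewalling.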

It is a huge improvement. Nobody can impersonate the bank without the bank's cooperation. When cooperation ends, so does the ability to impersonate.

This is not the case with any mechanism that requires you to hand over your keys. Not even issuing a revocation effectively terminates use of a compromised key.

There is no practical improvement here.

What does it matter security-wise if the HSM module (which is what banks use) physically sits on Cloudflare premises or off Cloudflare premises? It's still connected to the same equipment.

It does matter for a variety of other reasons, for example what security clearances are required to work there, but it does not matter much for security.

Now Cloudflare employees have no possible access to the private key, nor do intruders who break into the Cloudflare servers. This keeps the bank in full control of who has access to the key: they can stop responding to signing requests at any time and still keep trusting the key on their own servers in the future.

Cloudflare employees will have no access to the key inside an HSM even if it is colocated on their premises. That's why you use them.

Please summarize the differences between this protocol and PKCS instead of downvoting.

Without a system like this, you would require many HSMs physically co-located with every server around the world, you would be trusting entirely in the ability of the HSM to withstand prolonged physical attack/analysis by a highly-resourced adversary (I'd consider this security suicide), and you would still not have the ability for the bank to cut off impersonation at any moment.

I have, incidentally, downvoted your comment, because you are complaining about downvotes. Don't do that.

A HSM does not need to be physically attached to "every server around the world". This is what they've built here, yet another network attached HSM, but not by following the standard PKCS protocols.

(On the subject of HSM physical attacks: That's another issue altogether, and does not stop at the HSM. But normally that's not an attack you defend against, because you have the relevant contractual obligations against your infrastructure provider.)

I promise not to ask about downvotes again. But the question was honest; if I'm wrong I want to know it.

If there's an established "correct" solution to this problem, why hasn't anyone pointed to it directly, and why didn't the banks use it?

Could you point to some credible expert commentary (as opposed to anonymous noise on HN) describing why what CloudFlare has done here is wrong?

Why do you think banks don't use HSMs? They do. They are off the shelf products. Whether it's the "correct" solution to your problems depends on what your problem actually is.

In this case Cloudflare apparently thought it was the right solution in theory but developed their own instead of using existing products and/or standards. I don't know the rationale for this, but I'd be interested in knowing more, as you can read in my comment above.

I don't know why I should point out that Cloudflare did the wrong thing. Perhaps you are confusing me with someone else?

What I did say is that the alternative to the described solution is to use a HSM, and that their solution should offer equivalent security.

You said "There is no practical improvement here.".

If CloudFlare has not done something wrong, then why did you say that?

Before you answer that question, remember: A hypothetical solution is not a practical solution. A practical solution is always a practical improvement over the case where there was no existing practical solution offered.

And before you say "They should have used HSMs", remember: CloudFlare has made it clear that HSMs being under their control was simply not an option. It was clear in their first blog post, and just for good measure, it was made absolutely explicit in an interview with Ars[0] where CloudFlare's CEO said "there’s no vault we can ever build that they’ll trust us with their SSL keys".

So, how is there no practical improvement?

[0] http://arstechnica.com/information-technology/2014/09/in-dep...

That was in response to: "It is a huge improvement. Nobody can impersonate the bank without the bank's cooperation."

And that is not true. The alternative is not to let other organizations impersonate you without your cooperation. That is very clear from the article. Storing plaintext keys with Cloudflare was never on the table. That's not why they built it.

There are reasons why Cloudflare built their own, probably good ones because Cloudflare employs some talented people, and I would think they have to do with the scale Cloudflare operates at.

Network attached HSMs are off the shelf devices. If you've worked with PKI, you've seen them. And that is what they would have gone with if they hadn't built this. Whether it was right or wrong to go with a home-grown HSM instead of an off the shelf one is not something I could possibly know -- but I know it's not a "huge improvement in security" to build your own. The fact that it offers comparable security is probably why the bank chose it.

If there is one thing to take away from the article, it should be: Don't invent your own security protocols. Buy off the shelf devices. If you really need to build your own, this is how.

previously they had your private key, so they could see unencrypted plaintext anyway.

See: Secure session capability using public-key cryptography without access to the private key.


And here's a similar patent from Akamai (via @cloudpundit) http://www.google.com/patents/US20130156189

Looks like the Akamai patent for basically the exact same process, filed 2012-12-14, priority date 2011-12-16 is still being examined... yet the CloudFlare patent filed 2013-03-07 was granted 2014-07-14.

That's a really fast turnaround for CloudFlare, and not sure how the Akamai filing isn't prior art. Actually, the Akamai filing is referenced in the CloudFlare patent.

From what I can tell, the main difference is that the claims in CloudFlare's patent describe more of the SSL handshaking process and the subsequent handling of requests (all of which is exactly the same as in standard HTTPS proxying). They both claim exactly the same technique for a web proxy to handle SSL connections without having the private key, but I guess the USPTO decided the fact that CloudFlare described more of that process in their patent claims constituted an improvement on Akamai's invention for some weird reason.

The article is somewhat light on content. There are standard protocols for HSM use. What is the reason you didn't use these? There are clear risks involved with inventing your own security related protocols.

We have a really long technical blog post coming tomorrow. Agree that inventing protocols is dangerous so we got iSEC Partners/Matasano to evaluate.

Are we reinventing Kerberos again?

isn't this completely missing the point, i.e. banks being able to say 'no third parties can see our clients identifying information/balances/etc?'

yes, the SSL key doesn't leave the bank, but everything it is protecting is..

It only protects one thing - server identity. The best ciphers use DHE for negotiating the key, so the conversation between bank and the client is secure anyway.

Under this scheme, the DHE is between the client and CloudFlare, not the client and the true end-point. CloudFlare still sees the full plaintext of the HTTPS session (as it must in order to do its magic). The encryption is not end-to-end.

Banks will typically do what the regulations require and then, what the cost of doing it is. Sometimes, that order flips ;)

It's a matter of trust. You do trust Intel corp to not place an RF transmitter inside your CPU, right?

> World-renowned security experts Jon Callas and Phil Zimmermann support CloudFlare's latest announcement sharing, “One of the core principles of computer security is to limit access to cryptographic keys to as few parties as possible, ideally only the endpoints. Application such as PGP, Silent Circle, and now Keyless SSL implement this principle and are correspondingly more secure.”

Ehh... I'd say Keyless SSL implements the opposite of that principle: encryption terminates with CloudFlare but authentication terminates in some bank.

So the problem is, how to get a cloud in the middle while keeping the green lock in the browser? Just yesterday I read Douglas Adams's phrase "technology's biggest success over itself."

Interesting, but what about the latency issues of having to always contact the key server?

There's a very technical blog post coming on this tomorrow, but that problem is addressed by session tickets and by the fact that most of the TLS handshake is occurring with a CloudFlare server typically nearer the web browser than before.

Do you guys have a "global" session ticket cache shared amongst endpoints? Or do you ensure the same user gets routed back to the same termination node? I'm quite curious about this.

Session Tickets are shared globally. Session IDs are shared intra-data center (so regionally). The former works with Chrome/Firefox. The latter with all other browsers.
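
A rough Python sketch of why shared session state cuts down on trips to the key server. Everything here is a simplification and an assumption: `CountingKeyServer`, `EdgeNode`, and the plain dict standing in for a data-center-wide session store are hypothetical, and real TLS session IDs and tickets carry more than a single secret.

```python
from typing import Dict


class CountingKeyServer:
    """Stand-in key server that counts how often it is asked for help."""

    def __init__(self) -> None:
        self.calls = 0

    def private_op(self, key_exchange: bytes) -> bytes:
        self.calls += 1
        return b"master-secret-for:" + key_exchange  # placeholder result


class EdgeNode:
    """Hypothetical edge node that consults a shared session store first."""

    def __init__(self, key_server: CountingKeyServer, sessions: Dict[str, bytes]) -> None:
        self._key_server = key_server
        self._sessions = sessions  # shared between nodes in this sketch

    def handshake(self, session_id: str, key_exchange: bytes) -> bytes:
        cached = self._sessions.get(session_id)
        if cached is not None:
            return cached  # resumed session: no round trip to the key server
        secret = self._key_server.private_op(key_exchange)
        self._sessions[session_id] = secret
        return secret
```

With two `EdgeNode` instances sharing one store, a client that did a full handshake on the first node can resume on the second without the key server being contacted again, which is the latency win described above.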

That's really cool, thanks for responding. I imagined that possibility but never heard of anyone actually doing it (not to say they don't). Maybe I'll need to experiment around with Golang's TLS package :)

Do you use per-site session ticket keys?

That is amazing. I can't wait to play with this code :D

How does this architecture address PFS? I'm guessing a future version would require the exchange of DH private key to make it work...

Nothing that complicated is required. When the server needs to sign the DH handshake, it sends the value to be signed over to the key server, and the key server replies with the signature.

Although the diagrams in the blog post show the non-PFS RSA handshake, I'm sure the architecture already supports the PFS DH handshake too.
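
For the curious, the "send the value to be signed, get a signature back" flow looks roughly like this in Python. It is a sketch under heavy assumptions, not the actual Keyless SSL protocol: the RSA key is a toy built from two Mersenne primes with no padding, the hash-then-reduce step is a simplification, and `KeyServer`, `edge_sign_dh_params`, and `client_verify` are hypothetical names.

```python
import hashlib

# Toy RSA key built from two Mersenne primes, with no padding.
# Illustration only; never use parameters like these for real signatures.
P, Q, E = 2**61 - 1, 2**89 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))


def digest_int(data: bytes) -> int:
    """Hash the handshake bytes and reduce them into the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


class KeyServer:
    """Hypothetical key server: the only place the private exponent lives."""

    def __init__(self, d: int, n: int) -> None:
        self._d = d
        self._n = n

    def sign(self, digest: int) -> int:
        return pow(digest, self._d, self._n)


def edge_sign_dh_params(
    key_server: KeyServer, client_random: bytes, server_random: bytes, dh_params: bytes
) -> int:
    """The edge picks its own ephemeral DH parameters, then asks for a signature over them."""
    return key_server.sign(digest_int(client_random + server_random + dh_params))


def client_verify(
    signature: int, client_random: bytes, server_random: bytes, dh_params: bytes
) -> bool:
    """The client checks the signature with the public key (E, N) from the certificate."""
    return pow(signature, E, N) == digest_int(client_random + server_random + dh_params)
```

The ephemeral DH secret never leaves the edge and the private exponent never leaves the key server; only a digest and a signature cross the wire. A signature over different parameters fails `client_verify`, so the edge cannot reuse one signature for another handshake.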

I don't like to sound hateful, but this is an obvious solution that any competent person knowing how TLS works would find. If someone tried to patent it, I suppose every smart card would be considered prior art. The only "novelty" is that the connection to the "smart card" is the network.

Not to say that it's not useful, but the article describes it as some grand invention.

Then again, do you know of anyone else who has actually implemented this and is using it commercially, at scale? They say it themselves in the article: a prototype is one thing, but actually making it into a commercially viable product is something else.

Well, Cloudflare can still read all the traffic. I thought that problem had been solved somehow.

Is this the free SSL announcement that CloudFlare said it was going to announce in October?


Is it related? ;)

not really -- only in that they are both SSL-related.

Free SSL is still in the works. More info soon-ish.

I, for one, can't wait. StartCom desperately needs some competition.

My money is on AOL as the CA[1].

[1] http://moderncrypto.org/mail-archive/messaging/2014/000618.h...

My question was more to the point of: will Keyless SSL work with Free SSL? :)

Not how you're likely thinking. That said, we will use the Keyless technology to expand our data center footprint into locations that we wouldn't feel comfortable storing customers' keys. That will end up benefiting even Free customers who will be faster around the world.

Wow, what a great read!

It seems that the correct title should have been "all your keys are belong to us".

After reading the beginning of the piece, I was expecting something more...profound. Some deep mathematical breakthrough or something.

Instead they separate the actual key signing, delegating it to the customer's device. That's nice and useful, but isn't quite what I was expecting.

"Tomorrow, we'll publish a full post on the nitty, gritty techical details of how, what has come to be called Keyless SSL™, works."

CloudFlare has learned their lesson from past experiences and now makes the big media announcement one day with a smaller, less publicized announcement later covering the technical details. This prevents reporters from stumbling across a HN thread tearing the "new internet saving technology" apart and reporting on weaknesses or flaws.

But it's already fairly obvious how it works. They essentially MITM with the keyserver to receive the SSL nonce. Of course, it's pretty silly to expect cloudflare to have some special mathematical revolution to solve the stated problem. In fact I figure if you could terminate SSL without an online private key, the encryption scheme is simply broken.

But it's already fairly obvious how it works.

It is obvious, and they effectively implemented a custom approach for PKCS11/ssh-agent. Yet the narrative implies some brilliant period of insight and innovation, when really it kind of isn't.

Which is where the "silly" notion that they must have done something novel came from -- their narrative claims it.

Innovation means different things to different people. To you, it seems to mean a mathematical or algorithmic breakthrough. To me, it also means getting an existing idea or technology, and deploying it in a new, real-world context, solving the UX / scalability / security / policy issues that arise in the new context, and making it commercially viable.

The fact that Cloudflare is the first global-level CDN to implement this kind of keyless SSL termination to me is innovation, even though it's based on pulling PKCS11 at the IP level. It's solving a real-world problem in their context, which nobody has solved before, and customers pay for it.

The approach is pretty obvious, and I instantly knew where they were going as soon as I read the phrase "session signing, the only part of the SSL handshake that requires the private key", but still, it's novel to take this existing concept and generalize it to solve the "I don't want to give CloudFlare my private keys" problem. It'll be especially cool if they establish an open standard for the keyserver protocol.

Yes - and it's nowhere near "keyless" - it's just that the key is somewhere else™.

Maybe they should have said "keyless termination"? I can see that as they are still terminating the connection and they don't have the key.

Indeed; what they are doing seems to be pretty much the same thing as any (RSA) hardware crypto device. Except that it's over the Internet instead.

Agreed. They could have just used PKCS11 and claimed the same result.

Sounds like Elliptic Curve Diffie-Hellman is used between client/server to establish a private key. Not sure how this is new.

At a glance, it appears that the non-ephemeral RSA signature is handled in the network, but the key exchange occurs at the endpoint.

What's new is the whole "edge calls home with a signing request" piece.

This is a discussion about cyberwarfare in a literal sense. The technical discussion shouldn't really be separated from the economic, political, social and human health concerns because all of those parts of the system interact deeply and directly.

A goal of total political cooperation or submission leads to economic sanctions leading to serious human health effects leading to defensive denial of service attacks. This accelerates the need to decentralize the financial network systems to make them more robust.

How can we imagine, though, that even after a complete transition to next generation systems that are ground-up distributed designs (not just stop-gap tweaks like this) we won't have new types of attacks to deal with?

The starting point is the belief system that provides such fertile ground for conflict. We have to promote the idea that human lives have value and that lethal force is not an acceptable way to resolve conflict.

As long as decision makers are living in a sort of 1960s James Bond fantasy world we will all be subject to the insecurity of that type of world. It's largely built upon a type of primitive Social Darwinism that is still much more prevalent than most will acknowledge.

It's much easier to accept a compartmentalization of these problems and focus on a narrow technical aspect, but that does not integrate nearly enough information.

Would be interested to hear from people who are burying my comment if they have any kind of explanation for why they are doing it, such as counterpoints to my statements. In case there is some insight that I might gain from them, since apparently there is a strong disagreement.

If it is not patently obvious to you that broad ramblings about how "human lives have value", "lethal force is not an acceptable way to resolve conflict", "James Bond fantasy" to "primitive Social Darwinism" (and on and on) have next to no relevance to a discussion about Cloudflare's Keyless SSL implementation, of all things, then I don't know how anybody could ever get through to you to help you understand.

GP comment is too generalized to be constructive. That is especially so in this discussion of a specific network security platform, which presumably has specific faults that may be discussed instead of generalities like "belief systems" and "human lives".

You're right, I'm sure none of my general concerns about completely false belief systems and human lives are worth being viewed by any readers in this thread. My comment also may be a little bit difficult to contextualize or disconcerting for readers. Best to keep downvoting it so that it disappears.

Passive-aggressiveness is not an effective persuasion technique.

Almost every sentence of your comment contains non-obvious yet unsupported conclusions. I don't think a HN comment is the appropriate venue for your thinking; maybe a 20-page paper would be better.

The people thinking about this topic are my audience. That they don't want to hear it or bother trying to understand it doesn't mean that it shouldn't be available for them to view.
