The MD5 hash function is broken, that is true. However, TLS doesn't use MD5 in its raw form; it uses variants of HMAC-MD5, which applies the hash function twice, with two different padding constants that have a high Hamming distance between them (put differently, it tries to synthesize two distinct hash functions, MD5-IPAD and MD5-OPAD, and apply them both). Nobody would recommend HMAC-MD5 for use in a new system, but it has not been broken.
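To make the "two keyed hash passes" description concrete, here is a minimal sketch of the generic HMAC construction (the RFC 2104 layout) built by hand on top of MD5. It illustrates plain HMAC-MD5 only, not the exact variants TLS uses, and the function name `hmac_md5` is just a label chosen for this example.

```python
import hashlib

BLOCK_SIZE = 64  # MD5 processes input in 64-byte blocks


def hmac_md5(key: bytes, message: bytes) -> bytes:
    """Hand-rolled HMAC-MD5, for illustration only (use the hmac module in real code)."""
    # Keys longer than one block are hashed first; shorter keys are zero-padded.
    if len(key) > BLOCK_SIZE:
        key = hashlib.md5(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")

    # The two padding constants: 0x36 for the inner pass, 0x5c for the outer pass.
    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5C for b in key)

    # "MD5-IPAD" over the message, then "MD5-OPAD" over that result.
    inner_digest = hashlib.md5(inner_pad + message).digest()
    return hashlib.md5(outer_pad + inner_digest).digest()
```

The result should match what Python's own `hmac.new(key, message, hashlib.md5).digest()` returns. The two constants differ in 4 of their 8 bits, which is the Hamming distance being referred to.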
RC4 is horribly broken, and is horribly broken in ways that are meaningful to TLS. But the magnitude of RC4's brokenness wasn't appreciated until last year, and up until then, RC4 was a common recommendation for resolving both the SSL3/TLS1.0 BEAST attack and the TLS "Lucky 13" M-t-E attack. That's because RC4 is the only widely-supported stream cipher in TLS. Moreover, RC4 was considered the most computationally efficient way to get TLS deployed, which 5-6 years ago might have been make-or-break for some TLS deployments.
You should worry about RC4 in TLS, but not that much: the attack is noisy and extremely time-consuming. You should not be alarmed by MD5 in TLS, although getting rid of it is one of many good reasons to drive adoption of TLS 1.2.
Many RC4 deprecation efforts have been rolled back in the face of issues like this, especially on hard-to-fix embedded devices (think TVs, cars, and phones) with comparatively weak CPUs.
Unfortunately, neither solution is easy: only the very latest chips have AES-NI instructions, and not many clients support Salsa20 yet (OpenSSL does not, for example, and it powers a lot of SSL stuff).
Either way, I don't think Salsa20 is a realistic suggestion for improving TLS performance.
Anyway, TLS is in a tough spot. It's such a widely adopted standard, with so many implementations, that making radical changes is exceptionally difficult. AES-NI leaves the standard mostly alone but requires new(ish) hardware; implementing newer, faster primitives (like Salsa20), on the other hand, essentially requires turning the massive boat that is TLS.
There are no easy solutions, at least as far as I can see.
The problem is getting the installed base up to TLS 1.2.
But even TLS 1.2 won't help because 1.2 doesn't include ciphers that are screaming fast without hardware acceleration. AES-GCM is faster than AES-128-CBC/HMAC-SHA1, but Salsa20-256/HMAC-SHA1 is still twice as fast on my machine. Now if the AES-NI instruction set is available, then AES-GCM handily beats everything by a large margin. (Of course, using hardware acceleration, AES-128-CBC/HMAC-SHA1 is marginally faster than Salsa20-256/HMAC-SHA1, again on my machine.)
The ultimate point is that, without the AES-NI instruction set, new ciphers are just about the only way to get really good TLS performance.
I don't know the answer offhand, but I would suspect that hardware-accelerated AES-GCM would win. It certainly does in single-threaded, "one-session"-esque tests, and the margin of its victory makes me think that hardware-accelerated GCM would be hard to beat by anything.
On my machine, a single thread/core running nothing but AES-GCM can encrypt/decrypt 8192-byte blocks of data at 1.32 GiB/s (this is using OpenSSL's benchmarking feature). Yes, that's gigabytes, not gigabits. It's literally faster than IO for my SSD. (Salsa20, without a MAC, can do the same at about 0.64 GiB/s.)
When I told OpenSSL to use four threads in parallel, it ranked at 5.01 GiB/s, which is absolutely crazy.
That said, beyond a general leaning towards AES-GCM (simply because it is so fast with hardware acceleration), I don't have any hard data on which would be the victor. But I may just construct some benchmarks to test that out, because it's an interesting question.
Note that the suggestion of using Salsa20 is to replace RC4 not only to get better performance, but because RC4 is broken (as you know).
Salsa20 (and ChaCha) can be implemented on constrained devices and reach RC4-like performance. On modern architectures, the algorithms' word-based operations make better use of the hardware than RC4 does, so they can reach better performance.
Yes, AES with HW support such as AES-NI can provide really good performance too. But then we _only_ have AES (and DES/3DES). Do we want to reduce SSL/TLS to a single symmetric encryption primitive? And no stream ciphers?
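As a rough illustration of what "word-based" means here, the sketch below implements the ChaCha quarter round as described in RFC 8439: nothing but 32-bit additions, XORs, and fixed rotations, with no key-dependent table lookups of the kind RC4's 256-byte state requires. The helper names are chosen for this example; Python is used only for readability and says nothing about real-world speed.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value inside a 32-bit word


def rotl32(value: int, shift: int) -> int:
    """Rotate a 32-bit word left by `shift` bits."""
    return ((value << shift) & MASK32) | (value >> (32 - shift))


def chacha_quarter_round(a: int, b: int, c: int, d: int):
    """One ChaCha quarter round on four 32-bit words (add, xor, rotate only)."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d
```

Each of these steps maps to a single instruction on most 32-bit and 64-bit CPUs, and four quarter rounds can run side by side in SIMD registers, which is the hardware-utilisation argument being made above.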
Sorry about my alarmist tone - from time to time I need to get rid of my conspiracy theories.
If the security of an algorithm is weakened, then it's important to evaluate the use of the algorithm and make efforts to implement stronger security now. You should feel fortunate that you even get the time to move to something better before all hell breaks loose.
This is the same kind of thinking I hear daily when people say things like, "Just use bcrypt" without thinking about the consequences.
The tendency for programmers to think of security in a nihilistic way continues to boggle my mind. I don't think the article suffers from an alarmist tone. I think it's correct to look at something shitty and call it shit.
Yours is the kind of comment anyone can write without knowing anything whatsoever about cryptography, so I'm wary of going into more detail.
When people try to implement security without actually thinking about what the system is doing, it creates weaknesses in the security, not due to algorithmic weaknesses, but because the organization and the engineering discipline for the future is compromised. Thus, while "just use bcrypt" or "just use HMAC-MD5" might work today, the organization doesn't have the mind to update it when it finally does break.
This is exactly what happened (and is still happening today) after MD5 was broken.
Bcrypt isn't broken or even weakened.
HMAC-MD5 isn't broken.
HMAC-MD5 and bcrypt are unrelated.
Nobody is ignoring the problem of MD5; in fact, suspicion about MD5 animates the very first secure SSL specification we have, from almost 20 years ago. Nobody is saying "just use HMAC-MD5".
They instead should learn that Y is also potentially broken in a given circumstance - and maybe that doesn't apply to my current situation but I need a review process to check that it still doesn't apply to me at a given time in the future.
For someone designing a cryptography application, this understanding should be very deep. I don't think it needs to be as deep for someone who is configuring their Apache server and just needs to know what ciphers to enable and which ones to prefer. In this case it is best to follow an industry best practice based on the type of data being sent over the wire and the compatibility/performance required by the clients/users. Then schedule an annual or quarterly review of those choices to make sure they don't go out of date and keep an eye on security bulletins in case one of them is severely broken.
The author took from TLS the opposite of the lesson it actually demonstrates, and the commenter above is harping on that broken lesson.
Thank you for your replies tptacek, I've learned much from this discussion. If I could edit my top comment, I would.
One particular case I remember was use of md5(md5(md5(unix_timestamp()))) to generate "secure" session tokens.
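A small sketch of why that token scheme fails, regardless of how many times MD5 is applied: the only unknown is the timestamp, so anyone who knows roughly when the session started can enumerate it. The function names and the one-hour search window are assumptions for this example, and hashing the hex string at each step is a guess at what the original code did.

```python
import hashlib
import secrets


def weak_token(timestamp: int) -> str:
    """md5(md5(md5(timestamp))), hashing the hex string at each step."""
    value = str(timestamp)
    for _ in range(3):
        value = hashlib.md5(value.encode()).hexdigest()
    return value


def recover_timestamp(token: str, approx_time: int, window: int = 3600):
    """Brute-force the timestamp by trying every second near approx_time."""
    for candidate in range(approx_time - window, approx_time + window + 1):
        if weak_token(candidate) == token:
            return candidate
    return None


def strong_token() -> str:
    """256 bits from the OS CSPRNG; there is no small input space to enumerate."""
    return secrets.token_hex(32)
```

A one-hour window on either side is only 7,201 candidates, so the search finishes almost instantly. Nesting the hash adds no entropy; it only multiplies the attacker's work by a small constant.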
Sorry to say, but "just use bcrypt" is currently the right three-word statement that you can use if anybody is asking "I'd like to hash a password, and I don't want to learn all of crypto before I do." Bcrypt is hard to break if used correctly, is widely deployed, has wide support in languages and frameworks, and is fairly simple to use. There's little room for major fuckups here.
There are algorithms that are harder to break (scrypt) or that are an official standard (PBKDF2), but seriously, bcrypt is currently good enough. Sure, it's always better to read and learn, but sometimes people just have to get things done and I'd rather see them use bcrypt than sha1 or unsalted md5.
tptacek appears to be too modest to say it himself, so I'll go ahead and point it out: He's not "just a programmer", he's a well-respected computer security and vulnerability researcher.
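For a sense of how little code "use a real password hash" takes, here is a minimal sketch using PBKDF2, since that is the one of the three available in Python's standard library (bcrypt needs a third-party package). The function names, the 16-byte salt, and the default iteration count are assumptions for this example, not recommendations taken from this thread.

```python
import hashlib
import hmac
import os
from typing import Tuple


def hash_password(password: str, iterations: int = 600_000) -> Tuple[bytes, int, bytes]:
    """Return (salt, iterations, digest); store all three alongside the user record."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute the digest with the stored parameters and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

Storing the iteration count with each hash is what makes it possible to raise the cost later without invalidating existing passwords.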
This isn't to say that you should ever simply take his word for stuff, but rather that you are on one hand preaching to the choir, and on the other that you are probably not considering practical effects on security design that he has to wrangle with all the time.
For instance, it's probably a bad idea to hop immediately from one weakened (not even broken) cryptosystem to The New Hotness just because flaws are uncovered, especially for those doing this without thinking of the consequences. For every theoretical security bug you may fix while doing the conversion, you may very well introduce two much more practical security bugs.
Cargo cults are bad wherever they are encountered, even when the cult involves something as seemingly innocuous as "Cryptosystem $FOO has been weakened, time to jump ship".
I'll say what everyone's thinking: What are the consequences?
An odd point for the GP to make.
Rather, the increased collision resistance comes from the fact that the 64-byte keyed padding puts the MD5 context in a state unknown to the attacker before any of the attacker's data touches the MD5 state. As long as the HMAC key has at least 128 bits of entropy, all possible values of the 128-bit MD5 internal state are nearly equally likely. This makes it much more difficult for an attacker to produce collisions.
With that said, I was under the impression that sites will need to support TLS 1.0 for a good long while, and if that is indeed the case, would they not be better off using RC4? From my understanding, the RC4 attacks seemed less practical than attacks against the implementation of CBC mode in SSL 3.0 / TLS 1.0?
Yes, the installed base is going to keep TLS 1.0 and the legacy SSL block cipher construction in deployment for a long time.
Yes, smart people (among them AGL) have said that the RC4 attack is less practical than the M-t-E timing attack on the SSL CBC ciphers. (By the way, it would be great if we could start putting the blame on M-t-E instead of CBC; the vulnerability isn't in CBC per se. CBC is fine; M-t-E is proven not to be.)
* The timing attack also has remediations (see AGL's famous NSS patch) which don't change the protocol.
* The timing attack is fundamentally unlikely to get more powerful; it's exploiting a very simple, well-understood problem.
* Work on exploiting the RC4 attack is in its infancy, and there are multiple ways the attack could get both fundamentally more powerful and more efficiently implemented.
* There are no software-only fixes to the RC4 problem that don't break the protocol; RC4 is fundamentally and irrevocably broken.
If we want to move away from MD5 and RC4, we must first start deprecating their usage wherever we can. Removing the suites in SSL/TLS that use them is a pretty simple step. Moving _from_ good suites _to_ these suites is totally the wrong way to go.
Disclosure: I work for Microsoft.