

Nate Lawson Breaks Google Keyczar's HMAC Crypto Signatures - tptacek
http://rdist.root.org/2009/05/28/timing-attack-in-google-keyczar-library/

======
tptacek
Smart Google people worked really hard to get this right. But they used the
wrong comparison function --- one that leaked timing information --- and so
attackers could forge messages.

Atwood really thinks we should be trying to educate developers about how to
build crypto?

~~~
cperciva
This is an incredibly dumb bug -- while Nate claims that "The lesson from this
is that crypto flaws can be very subtle", this isn't really a subtle or novel
flaw from a cryptographic perspective: Replace "HMAC" with "password", and
this attack was published decades ago and is probably the most commonly
mentioned side channel attack in the cryptographic literature.

 _Atwood really thinks we should be trying to educate developers about how to
build crypto?_

I'm not sure where Atwood comes into this; but I don't see how having non-
skilled developers stay away from cryptographic code would have helped here.
Yes, this is the sort of mistake which people with a little bit of
cryptographic knowledge ("just enough to be dangerous") can make easily -- of
course, that's as much an argument for giving them more knowledge as it is for
preventing them from having any at all -- but Stephen Weis is not such a
person. He did a PhD in cryptography, supervised by Rivest; no amount of
saying "crypto is hard, you should use existing libraries instead of writing
your own" would have kept him away from making this mistake.

~~~
smanek
There is obviously plenty of literature about side channel attacks,
particularly timing attacks (which you have contributed to ;-)).

But, among 'lay programmers' I've met, I would say that more than 9/10 have no
idea what a 'side channel attack' is. In my experience, its unusual (~ 3/10)
for a programmer to know what asymmetric encryption is. I don't expect most
programmers to know the particular math behind RSA (let alone ECC/etc), but
many don't even the general idea of having a function and its inverse with
properties A, B, and C.

~~~
tptacek
I suspect that the majority of people who "use" RSA don't feel like they need
to understand the math behind it, even though the math behind it has practical
security consequences.

~~~
brl
I think that plenty of people feel confident rolling RSA or DH into a protocol
design precisely because they _do_ understand the math. The structure of both
algorithms is very simple (In the case of DH maybe even trivial), but if you
don't understand dozens of (not-very-obvious) attacks against both algorithms
you're never going to be able to use them or implement them correctly.

------
crescendo
_This allows an attacker to iteratively try various HMAC values and see how
long it takes the server to respond. The longer it takes, the more characters
he has correct._

Can one of the security gurus here explain this in a bit more detail? I'm
guessing he would base this on an average over a sample of response times with
the same input. Otherwise, how would he account for other things happening on
the server that could affect the timing for a single request?

~~~
NateLawson
The important thing that probably isn't obvious is how few samples you
actually need. Jitter is essentially random, which means it is decorrelated
from the repeated pattern (1 compare takes longer, 255 take the same time)

With the added noise (jitter), all you need to do is distinguish the two
normal distributions. This takes less samples than you might guess at first.
Think of it as the same problem as distinguishing a biased coin (51% heads)
from an indentical-looking unbiased one, merely by flipping each. Given the
known bias percent and a desired confidence level (say 95%), calculate how
many flips you will need to make a decision.

------
pj
That is so sick. (in an awesome way)

It's like evolution in action. The defensive crypto guy's carapace has been
pierced by the hacker's slow methodical leverage.

~~~
tptacek
You want sick? Here's sick.

An AES block is 16 bytes long. You want to encrypt the lyrics to "Yellow
Submarine". You can't just repeat AES for each 16 byte block, because then you
can tell where the words "yellow submarine" --- 1 AES block --- occur: they'll
encrypt to the same block. So you use CBC mode, XOR'ing each plaintext block
to the previous ciphertext block, creating a unique ciphertext every time you
encrypt those lyrics. But "Yellow Submarine" is 902 characters long; the last
block is 6 bytes short. So you pad the message which a count of the pad
characters --- 06h 06h 06h 06h 06h 06h.

With me so far?

Decrypt the message. Check the padding: it should repeat as many times as the
value of the last byte. If not, you decrypted badly. Send an error to that
effect.

Hey guess what! I can decrypt parts of your message!

If I stick a random AES block in your message, it will scramble the padded
block. Every once in awhile, that scrambled message will end in 01h. 01h is
valid padding. The message will be gibberish, but you at least won't tell me
it had bad padding.

You wouldn't tell me if you decrypted to bad padding if I generated 02h 02h
either. But that's much less likely than 01h.

Now I know that your plaintext XOR my random block produces a last byte of
01h. Solve.

Now my point: this is actually way old news --- back to Bleichenbacher and
Vaudenay and probably before that (I'm not the resident expert) --- but
attacks like this have broken peer-reviewed versions of TLS, the most peer-
reviewed crypto protocol ever designed.

Don't build crypto.

~~~
NateLawson
This attack was found by Bodo Moeller in OpenSSL. I posted about this here,
including a link to the slides: <http://rdist.root.org/2008/01/25/tlsssl-mac-
security-flaw/>

~~~
DamienMiller
... and by Bellare, Kohno, and Namprempre for SSH:
<http://www.cs.washington.edu/homes/yoshi/papers/SSH/>

------
brl
I'm skeptical that this attack would work against the Java implementation.
Surely a language which rivals C for performance does not compare arrays of
primitives one byte at a time?

I've thought about locally attacking Unix password authentication in a similar
way for quite some time by throwing random passwords at something like 'su'
until I have guessed the first 3-4 bytes of the password hash (with timing)
and then performing a dictionary attack against the truncated hash. This could
never work though unless the hashes are compared bytewise instead of with
something like an optimized memcmp().

~~~
brl
I looked it up and I'm wrong. Copying an array is implemented with native
memmove() but comparing two arrays really is done one element at a time.

