

Review requested for Django string signing - ropiku
http://groups.google.com/group/django-developers/browse_thread/thread/297e8b22006f7f3a

======
tptacek
This appears to be timeable. Rack and Rails just fixed a similar bug. Over
LANs, attackers can figure out the correct MAC for any message.

Also, what the heck is a "salt" on an HMAC-SHA1 signature? If you lose the
secret key, you've lost the verifier. You shouldn't be using HMAC signatures
in so many places that keying them individually is a problem.

~~~
simonw
Thanks - this kind of feedback is exactly why we're putting this out for
review.

Can you provide pointers or a bit more information as to how the timeable
issue works, and what the workaround is?

Is there a more technically correct term we should be using instead of 'salt'?
And is that feature a total waste of time? Remember, we're building a
framework for other developers to use - so saying "you shouldn't be using HMAC
signatures in so many places that keying them individually is a problem"
assumes that developers using Django (who may have no crypto / security
knowledge at all) will understand that. Is this something we should cover in
the documentation?

~~~
tptacek
You appear to have the same vulnerability as Google Keyczar had:

[http://rdist.root.org/2009/05/28/timing-attack-in-google-
key...](http://rdist.root.org/2009/05/28/timing-attack-in-google-keyczar-
library/)

Which leads to the question: why aren't you simply using Keyczar?

~~~
simonw
We try to keep Django's dependencies down to the Python standard library + a
database adapter, so I'm pretty limited on which underlying crypto libraries I
can use.

How big a deal is that timing attack? Would it be appropriate to ignore it by
default, but provide people who are concerned about it the option to swap out
the sha1 signing for something using an external library that isn't affected
by it. Or even just drop an optional "time.sleep(random.random() * 0.01)" line
in to the code that generates the signatures?

~~~
simonw
Forget the dumb time.sleep() idea, I just looked at the keyczar implementation
of constant time string comparison. I'll roll that in to the library, thanks.

~~~
illumen
hello,

I just tested the keyczar routine, and it can still be found.

The sleep needs to be long enough, and random enough that it makes the other
variance statistically insignificant.

cheers,

~~~
tptacek
What version of Keyczar did you test?

Here's the current code:

[http://code.google.com/p/keyczar/source/browse/trunk/python/...](http://code.google.com/p/keyczar/source/browse/trunk/python/src/keyczar/keys.py)

From the code:

    
    
        correctMac = self.Sign(msg)
        if len(sig_bytes) != len(correctMac):
          return False
        result = 0
        for x, y in zip(correctMac, sig_bytes):
          result |= ord(x) ^ ord(y)
        return result == 0
    

This code performs the same ALU ops on every byte of the MAC, and does not
perform any bytewise comparisons if the MACs aren't the same size. It's not
timeable.

~~~
illumen
It is quite easy to time for yourself. Try timing it many times with a
sig_bytes off by one(only one character changed). Then compare that with the
correct one. You'll notice the (small) difference.

There are no tests for it in the keyczar that I could find. I doubt they
tested it.

The reasons are many... but this is python, so each operation is much more
complicated than C/asm. There are lots of branches, memory usage and cache
effects - so code is much more input dependant - especially the integer ops
used here. Take a look at this code to see one of the reasons why this python
code has variable timing depending on input(like most code does).

x,y = 3,3

id(x) == id(y) # python shares ints in different variables

~~~
tptacek
The only data-dependent comparison you've got to work with is a comparison
between two register-scale fixed-width integers. I think you're doing
something wrong.

Perhaps whatever you're doing to create the "off by one" case is disrupting
the cache, and you're just seeing the effects of that.

------
brown9-2
I just want to say that it is really refreshing and awesome to see a framework
developer/author/team reaching out to the community as a whole for this kind
of review, rather than requesting feedback from a small group of committed
contributors.

------
cstone
I can tell you're pushing for terseness in the signed text (every character
does count), but I'd definitely consider explicitly describing the
cryptosystem and version you're using in each cookie.

Also, is there some code elsewhere ensuring that the separator character is
always escaped during signing? I don't see anything explicit, but I could
definitely be missing something; I haven't messed with Django internals much..

~~~
tptacek
This is a pointless feature and Django shouldn't implement it.

If Django-signed messages are being persisted long-term in something other
than an HTTP cookie, Django developers are abusing the feature. The sliver of
code that's been published is not sufficient to protect long-term persisted
data.

If Django-signed messages are simply being used for cookies and token URLs,
then it doesn't matter whether they're forward/backward compatible. They have
no long-term value. Applications should fail gracefully when unintelligible
messages are presented. In 99.999% of proper cases, this simply means bouncing
the user back to the login prompt to get a new cookie.

This isn't just pedantry. A fair subset of all cryptosystems have failed in
later revisions because of negotiation vulnerabilities. Along with the
"automatic key rotation" misfeature, this idea is as likely to burn you as it
is to protect you.

~~~
cstone
You're right that it should just fail gracefully, and that people shouldn't
use the data for anything long-lived. I'm not so sure the latter won't happen
often enough for compatibility to become an expected feature, though.

What I'm afraid of is the future hypothetical case where the scheme changes in
the future and a Django developer decides to add in backward compatibility
anyway--by having the verifier check the presented text against both the old
scheme and the new one.

If your main point is "don't reinvent the wheel, use an established system,"
though, I totally agree.

~~~
tptacek
Negotiation in cryptosystems is usually a bad idea. It's a bad idea here.
Anything that would make that feature useful would be an abuse or a threat.

One of the worst ideas Simon is getting from Reddit right now is that he needs
to make this system _more sophisticated_. Version the cryptosystem! Use
truncated SHA256! Revoke messages on MAC failures! Use random sleeps!
Automatically expire keys! Look at this NIST standard I found!

What they need to do is _what every other web framework does_ , because every
other framework has been inspected already.

------
euroclydon
Anyone know the status of OWASP for Python?

Here is a link to their unimplemented encryptor.py class:

[http://code.google.com/p/owasp-esapi-
python/source/browse/es...](http://code.google.com/p/owasp-esapi-
python/source/browse/esapi/encryptor.py)

------
simonw
There's some really good discussion of this going on on programming.reddit:
[http://www.reddit.com/r/programming/comments/ald1m/calling_c...](http://www.reddit.com/r/programming/comments/ald1m/calling_crypto_security_experts_help_review_the/)

~~~
tptacek
OW MY BRAIN. No, don't truncate SHA256 to SHA1 sizes. No, don't use MD5 to
make your URLs shorter. No, DO NOT clear all of a user's signed cookies when
an HMAC fails --- these aren't passwords, they're crypto secrets.

