
Django-bcrypt - stevelosh
https://github.com/dwaiter/django-bcrypt/
======
stevelosh
Step 1: Install

Step 2: Add to INSTALLED_APPS

Step 3: Enjoy more secure password hashing.

...

Step 4: Set BCRYPT_ROUNDS to something higher than 12 when computers get
faster.
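
For anyone skimming, the steps above might look roughly like this in a project's settings file. This is only a sketch: the app label `django_bcrypt` and the other entries in `INSTALLED_APPS` are assumptions, so check the README for the exact names.

```python
# settings.py (fragment): a sketch, not copied from the project

INSTALLED_APPS = [
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django_bcrypt",  # assumed app label for django-bcrypt
]

# Work factor: each increment doubles the hashing cost.
BCRYPT_ROUNDS = 12
```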

~~~
nicksergeant
You sir, are a machine. Steve started working on this 3 feet behind me a
little less than 2 hours ago.

~~~
jezdez
It took him two hours to monkey patch a few methods?

~~~
stevelosh
Actually the code was already written -- see the bottom of the README.

It took about 30-40 minutes to:

* Add the BCRYPT_ROUNDS setting.

* Remember how to allow for an optional setting (`if foo in settings` and `settings.get(foo, default)` don't work).

* Add the setup.py file.

* Write a README.

* Test with a real Django site to make sure nothing was broken.

* Create and push to repos on BitBucket and GitHub.

~~~
Jasber
> Remember how to allow for an optional setting (`if foo in settings` and
> `settings.get(foo, default)` don't work).

Instead of:

    
    
        settings.get(foo, default)  
    

Try:

    
    
        getattr(settings, foo, default)
    

This looks useful. Thanks.
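
A slightly fuller sketch of the same idea, for anyone who wants to try it without a Django project. `SimpleNamespace` stands in for `django.conf.settings` here, and `DEFAULT_ROUNDS` and `get_rounds` are hypothetical names, not anything from the app itself:

```python
from types import SimpleNamespace

DEFAULT_ROUNDS = 12  # hypothetical default, matching the value mentioned above

def get_rounds(settings):
    # settings is an object with attributes, not a dict, so use getattr.
    return getattr(settings, "BCRYPT_ROUNDS", DEFAULT_ROUNDS)

# Stand-ins for django.conf.settings, so the sketch runs without Django.
without_setting = SimpleNamespace()
with_setting = SimpleNamespace(BCRYPT_ROUNDS=14)

print(get_rounds(without_setting))  # 12
print(get_rounds(with_setting))     # 14
```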

~~~
stevelosh
Good idea, pushed!

------
lamby
Is it just coincidence that I wrote almost-identical code about 7 hours before
this? (<https://github.com/playfire/django-bcrypt/>)

------
gst
Bcrypt (or scrypt) should actually be used as default for password hashing in
application servers. Using something like crypt/md5/sha-1 hashing per default
is just completely irresponsible.

~~~
cdavid
Why should bcrypt be used instead of SHA-1? (I am genuinely asking; I don't
know anything about security.)

~~~
Groxx
Essentially, because it's slow.

SHA is designed to be fast, so dictionary attacks can find huge amounts of
matches nigh-instantly, and rainbow tables (or GPU EC2 instances!) exist to
brute-force up to 8-12 characters, more than enough to crack most people's
passwords. You can salt the hash (basically: pre-pend a unique string to the
password) to defeat rainbow tables, but SHA doesn't do this by default. You
can SHA something more than once to slow things down, but SHA doesn't do this
by default (and rainbow tables still work - they're _based_ on repeated
hashings).

BCrypt is slow by design (ie, thousands of times slower). It's a much-harder
calculation, and it automatically _uniquely_ salts what it's hashing, and you
can make it more secure incrementally by simply running it through more steps
(it's designed to do this). The speed isn't noticeable to a user while logging
in because checking if their password is correct is still extremely quick, but
attempted-brute-forcers run up against a brick wall as their attempts are now
rainbow-table-proof and slower by orders of magnitude. Best of all: it does
all this _by default_ , and it stores all the necessary information in the
result so comparing values against it is fool-proof. You can't mis-use it.

SCrypt takes all the advantages of BCrypt and goes a step further: it
guarantees that a certain amount of memory (ie, large) is required to perform
the hashing function. So while SHA / BCrypt can still be attacked more quickly
by, say, performing a thousand operations at a time through custom hardware,
SCrypt can demand so much memory that it's simply infeasible to do so, so
you're stuck testing one. password. at. a. time. It's the ultimate death to
brute-forcing, basically.
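
To make the "unique salt, tunable cost, everything stored in the result" part concrete, here is a rough sketch. It uses PBKDF2 from Python's standard `hashlib` as a stand-in, not bcrypt itself, and the `name$iterations$salt$digest` layout and the function names are made up for illustration:

```python
import hashlib
import hmac
import os

def make_hash(password, iterations=100_000):
    # A fresh random salt per password defeats precomputed tables.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Store the parameters next to the digest, as bcrypt does in its output.
    return "pbkdf2_sha256${}${}${}".format(iterations, salt.hex(), digest.hex())

def check_password(password, stored):
    _name, iterations, salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(digest.hex(), digest_hex)

stored = make_hash("correct horse")
print(check_password("correct horse", stored))  # True
print(check_password("wrong guess", stored))    # False
```

Because the iteration count travels with each stored value, raising the default later does not break old hashes; they simply verify with the count they were created with.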

------
makeramen
bitbucket repo: <http://bitbucket.org/dwaiter/django-bcrypt>

~~~
ylem
I haven't done a diff, is the bitbucket or github repo "official"?

~~~
stevelosh
Technically BitBucket (the GitHub one has "mirror of django-bcrypt" as a
description), but I try to always push to both at the same time.

I'll take patches/pull-requests from either.

------
marcinw
Unfortunately, for a lot of us working in locked down environments, this code
makes use of the py-bcrypt module which uses the bcrypt C implementation. If
you're running on GAE, you're out of luck. I keep telling myself to port to
pure Python, but haven't had time. Anyone interested?

~~~
amalcon
There's a pure-Python implementation of PBKDF2. I know it's not as good as
bcrypt or scrypt, but at least it's better than MD5 or one of the SHA's.

~~~
koenigdavidmj
I saw a Twitter discussion with Colin a couple weeks ago where he pointed out
that Python no longer guarantees constant time for all of the basic binary
operations like addition (since fixed width integers can spill over to
arbitrary precision math if you overflow them, or breathe on them wrong).

Be careful.

~~~
tptacek
Side channels don't matter for bcrypt.

~~~
koenigdavidmj
Can you explain why this is? (Not doubting you; just curious.)

~~~
tptacek
The online operations exposed to an attacker in a bcrypt system are done
almost entirely on attacker-known data.

------
ylem
What were your timings like for bcrypt using 12 rounds? Also, while for the
SHA-x algorithms, there have been numerous tests--what about for the pycrypt
module?

~~~
stevelosh
Timings are going to be completely dependent on the server. Anything I tell
you will be wrong unless you're running on my server.

And I didn't test any SHA-x algorithms. Any algorithm that has a time you
can't easily increase by tweaking a number (BCRYPT_ROUNDS) will eventually
become insecure as computers get faster.

~~~
JoachimSchipper
It's not straight SHA-x, I'd wager; glibc includes a version of phk's
MD5-based crypt() based on SHA2 (with some bonus insanity thrown in; it's a
spawn of MD5 crypt and Drepper, after all.)

Like md5crypt, this new crypt() is not based on established cryptographic
principles.

------
kirubakaran
Does anyone know how to use bcrypt on App Engine? py-bcrypt is not pure python
and hence can't be used.

~~~
pjscott
I don't know of any pure-Python bcrypt implementation, but App Engine will let
you calculate SHA-1 hashes with the built-in hashlib module. To make this slow
enough, you'll need to iterate it a bunch of times (at least 1000 is the usual
recommendation). Something like this:

    
    
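        import hashlib
        
        def slow_hash(password, salt, iterations=2000):
            # password and salt must be byte strings (bytes on Python 3).
            h = hashlib.sha1()
            h.update(password)
            h.update(salt)
            # Feed each digest back into the running hash to stretch the work.
            for _ in range(iterations):
                h.update(h.digest())
            return h.digest()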
    

I haven't tested this code, but hopefully it illustrates the general idea.
Repeatedly run the hash function on the output of the previous iteration. You
may need to bump up the number of iterations later as computers get faster.
It's not as good as bcrypt, but it also doesn't suck, and it should run just
fine on App Engine.
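
If you want to pick an iteration count, a rough way is to time it on the machine that will actually run it. The sketch below repeats the function so it runs on its own; the password, the salt size, and the iteration counts are arbitrary choices for illustration:

```python
import hashlib
import os
import time

def slow_hash(password, salt, iterations=2000):
    # Same idea as the snippet above; inputs are byte strings.
    h = hashlib.sha1()
    h.update(password)
    h.update(salt)
    for _ in range(iterations):
        h.update(h.digest())
    return h.digest()

salt = os.urandom(16)  # one random salt per user, stored with the hash
for iterations in (1000, 10000, 100000):
    start = time.perf_counter()
    slow_hash(b"hunter2", salt, iterations)
    elapsed = time.perf_counter() - start
    print(iterations, "iterations:", round(elapsed * 1000, 2), "ms")
```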

------
cavilling_elite
I thought I remembered a post about this a while back, but:

What is the cost (time) of using bcrypt instead of the default?

~~~
stevelosh
Pretty much anything you want it to be. The higher the work factor
(BCRYPT_ROUNDS in this app), the more time it takes. That's the beauty of
bcrypt.

~~~
JoachimSchipper
That said, you might want to scale it back a little. 2^12 rounds takes 1-2 CPU
seconds (on a different implementation, but still.)
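
To put rough numbers on scaling it back: the bcrypt cost parameter is a base-2 logarithm, so every step down halves the work. A small sketch of the arithmetic, where the 1.5 second baseline is just an assumed midpoint of the figure above and the function names are hypothetical:

```python
def relative_cost(rounds, baseline=12):
    # bcrypt's cost parameter is a base-2 logarithm, so each step doubles the work.
    return 2 ** (rounds - baseline)

def estimated_seconds(rounds, baseline_seconds, baseline=12):
    # baseline_seconds is whatever you measured at the baseline on your own server.
    return baseline_seconds * relative_cost(rounds, baseline)

print(relative_cost(10))            # 0.25
print(relative_cost(14))            # 4
print(estimated_seconds(10, 1.5))   # 0.375
```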

------
svlla
next step is to make this the default. unfortunately the django overlords
think it's not important enough to be default.

~~~
jacobian
A few things:

First, there's no such thing as "the django overlords". We're an open source
community of thousands. Hundreds contribute. Dozens have commit access.

Very few among these contributors, and none (that I know of) on the commit
team, think that bcrypt support "is not important." If you'd like to prove me
wrong, please cite your source.

Now, this has been proposed a few times
(<http://code.djangoproject.com/ticket/5787>,
<http://code.djangoproject.com/ticket/5600>). Each time, substantive problems
with the approach have been found. A few choice quotes from the discussion
should illustrate the issues:

Me, on #5600: "[T]here's a problem with supporting any hash schemes not in
Python 2.3 (our lowest supported Python version): it means databases created
with a different version of Python break when used under a lower one." (Now
it's Python 2.4, and 2.5 soon, but the problem still holds.)

Malcolm, on #5787: "As soon as you start generating passwords that are only
computable based on an optional model, the database can only be used with
Django installations that have that model available. This removes the ability
to move the database around easily. Django operates on a "batteries included"
philosophy for exactly that reason: runs anywhere without lots of extra
dependencies."

Worse, this third-party module (i.e. Python bcrypt module), last I checked,
failed to build on Windows. As much as I hate supporting Windows, I recognize
why we have to.

What I'm trying to say is that the issue is technical, not personal or
political as you seem to think. If the technical issues can be overcome, I
don't see why bcrypt support couldn't be the default.

Finally, I'll close with the obligatory note that open source projects get
driven forward by people scratching their own itch. If this bothers you, fix
it. If your fix is rejected, try again. If you think we're a bunch of asshats,
fork the project. Do any of those things, and I'll respect you. Complain and
sling personal attacks and I won't.

~~~
burgerbrain
If you can't hash passwords properly, you shouldn't be dealing with passwords
at all. It's just damned irresponsible.

 _"the django overlords"_ _"Dozens have commit access."_

I suspect that's what he meant.

~~~
jacobian
> It's just damned irresponsible.

Since you feel that strongly I can expect to see a patch from you fixing the
technical issues I mentioned, right? You write it, I'll commit it. Go.

~~~
stevelosh
Look, I love Django, but to be fair this is kind of a bullshit response.

There are things I hate about Git, and I could fix them myself, but I don't
submit patches to Git. I just use Mercurial instead.

"Send patches" isn't a be-all, end-all response to any criticism of an open
source project.

~~~
jacobian
You're right. I get kinda pissed when people call me "damned irresponsible"
because I can't find the time to solve some technical problems. It implies a
sense of entitlement to my time that rubs me the wrong way.

You did exactly the right thing: figured out how to solve the problem
regardless. _That's_ something that'll motivate me; calling me stupid or
irresponsible won't.

So yeah, you're right, it is bullshit, and so is the attitude I responded to.
Garbage in, garbage out, I suppose.

~~~
stevelosh
Having several open source projects of my own, and contributing to a few more,
I definitely know how you feel.

Here's what I want to know about this in a nutshell:

Does Django (the project as a whole) want to provide the best possible
security, within reason, for its contrib.auth module?

If not, why not, and why isn't it stated prominently in the documentation?

Is bcrypt not the best possible security, or reasonably close to it?

If not, why not? I'm not even remotely close to a cryptography expert, so
although bcrypt's support for arbitrary work factors seems to provide very
good security to me I know I could very well be horribly wrong in this
thinking.

Is providing bcrypt hashing for passwords in contrib.auth not within the realm
of reasonable effort? This could mean rewriting bcrypt in pure Python and
including it in contrib, to support Windows users.

If not, why not? Perhaps rewriting bcrypt in Pure Python is not easy -- I
haven't tried it myself.

If bcrypt hashing is secure and reasonable to implement, and Django wants to
provide the best security possible (within reason), why is this not a blocking
issue for Django 1.3?

I genuinely don't know the answers to any of these questions, so I'd really
love to know.

~~~
jacobian
> I genuinely don't know the answers to any of these questions, so I'd really
> love to know.

Well, I don't speak for the project as a whole, so I'm going to just answer
personally. I'll try to channel the rest of the core team as best I can, but
please don't take any of the below as any sort of "official" thing. I may very
well have a different point of view or be in the minority -- I often am,
actually.

> Does Django (the project as a whole) want to provide the best possible
> security, within reason, for its contrib.auth module?

I certainly do, and I'm sure the rest of the team feels similarly. We take
security issues very seriously and I'm disappointed we've not been able to
demonstrate that through our past actions (e.g. built-in XSS and CSRF
protection, our security releases, etc.) This indicates to me that we haven't
done a good job being clear about our goals with regard to security. So that's
something to work on.

I think, though, that reasonable people can -- and do -- disagree about what
"within reason" means. I mean, are we building Django to protect against
script kiddies? Malicious employees? Corporate espionage? Government agencies?

Me, I suspect I'd choose to fold in the face of a lawsuit or subpoena, so I
don't particularly care if my passwords are safe against the NSA or something.
But that's because I'm a spoiled comfortable middle class yutz.

> Is bcrypt not the best possible security, or reasonably close to it?

Personally I have no idea. I'm not a security expert, nor am I even a well-
informed amateur. I've read (here and on Reddit, mostly) that bcrypt is the
best there is. I've read that bcrypt is for lamers and scrypt is better. I've
also been told that salts & sha1 is fine. I've also been told that sha1 will
eat my children and burn my house down. I'm honestly not qualified to judge.

Given what I know, my feeling is that bcrypt/scrypt is certainly an
improvement over sha1, and probably an improvement over any sha version. I'm
not convinced that it's an improvement over, say, multiple rounds of a sha
algorithm.

More importantly, I'm not clear on exactly how big a deal this is. There's a
spectrum: at one end, we activate our security policy, halt everything, and
release new versions, damn the backwards-compatibility concerns. At the
other end of the spectrum we do nothing. I really don't know where on this
spectrum the issue falls. I suspect that it's somewhere a bit more serious
than the potential timing attacks we just fixed in trunk, but maybe a bit less
serious than the DOS attack our last security release fixed.

> Is providing bcrypt hashing for passwords in contrib.auth not within the
> realm of reasonable effort? This could mean rewriting bcrypt in pure Python
> and including it in contrib, to support Windows users.

Of course it's possible, but the devil as they say is in the details. I think
it should be possible to support bcrypt if it's installed, but the concerns
about data portability need to be addressed in some way. At the very least
there should be some "I don't care about data portability" flag you can set to
turn on bcrypt support.
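
Something along these lines, purely as a sketch of the idea. The flag name, the function, and the returned labels are all hypothetical; nothing like this exists in Django as far as this discussion goes:

```python
try:
    import bcrypt  # third-party C extension; may not be installed
except ImportError:
    bcrypt = None

def choose_algorithm(allow_nonportable_hashes, bcrypt_available=bcrypt is not None):
    # Only opt in to bcrypt when it is importable AND the site owner has
    # explicitly accepted that the database will need bcrypt everywhere.
    if allow_nonportable_hashes and bcrypt_available:
        return "bcrypt"
    return "sha1"  # portable default that works on a bare Python install

print(choose_algorithm(False, bcrypt_available=True))   # sha1
print(choose_algorithm(True, bcrypt_available=True))    # bcrypt
print(choose_algorithm(True, bcrypt_available=False))   # sha1
```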

A pure-Python bcrypt implementation would certainly help. But _I_ certainly am
not going to rewrite bcrypt -- I know enough about crypto to know that I
shouldn't be allowed within a thousand miles of writing an algorithm by hand.
And frankly there aren't any active committers I'd trust to write such an
implementation. It would have to come from a pretty unimpeachable source,
wouldn't you agree?

> If bcrypt hashing is secure and reasonable to implement, and Django wants to
> provide the best security possible (within reason), why is this not a
> blocking issue for Django 1.3?

Because nobody proposed it and we're well past feature freeze for 1.3 and very
close to cutting a release candidate. Also because there's a great third-party
app that provides this feature in a very easy-to-use way :)

But if a majority of the community wanted to make bcrypt (or whatever) a
blocking feature for 1.3 then I'd go along with it. I'd argue against it, but
again I'm just one voice. A loud one, maybe, but I'd like to think I can take
being wrong graciously. I was against template auto-escaping originally, for
example, so clearly I've already got a good track record of being wrong about
security.

I hope that helps; it's late and I've had a long day. Please ask if I'm not
being clear.

~~~
stevelosh

> I think, though, that reasonable people can -- and do -- disagree about what
> "within reason" means.
    

Absolutely, which is where my next questions come from.

    
    
> I'm not convinced that it's an improvement over, say, multiple rounds of a
> sha algorithm.
    

Sure, for this conversation feel free to replace "bcrypt" with "configurable
rounds of SHA1".

    
    
> More importantly, I'm not clear on exactly how big a deal this is.
    

I agree here.

Yes, bcrypt is better.

Is it "better enough" to warrant backwards-incompatible changes? Maybe not.

I'm not clear on Django's database-compatibility policy though. Are databases
created with Django 1.X guaranteed to work with Django 1.Y (where Y < X)? If
not, then there are no problems. If so, then you're right, a backwards
incompatible change like this is not trivial.

Maybe I completely missed this in the docs.

    
    
> At the very least there should be some "I don't care about data portability"
> flag you can set to turn on bcrypt support.
    

This goes back to my question about databases working with older versions of
Django. Did I miss an important part of the docs?

    
    
> A pure-Python bcrypt implementation would certainly help. But I certainly am
> not going to rewrite bcrypt -- I know enough about crypto to know that I
> shouldn't be allowed within a thousand miles of writing an algorithm by hand.
> And frankly there aren't any active committers I'd trust to write such an
> implementation. It would have to come from a pretty unimpeachable source,
> wouldn't you agree?
    

This is the first argument that really convinces me. If you need to support
Windows and don't have anyone you trust to write a real crypt implementation
in pure Python, that kind of kills the idea in its tracks.

A third-party app usable by non-Windows-users seems like the best option.

    
    
> I hope that helps; it's late and I've had a long day. Please ask if I'm not
> being clear.
    

Definitely. Thanks for taking the time to answer.

