
How To Safely Store A Password - IgorPartola
http://codahale.com/how-to-safely-store-a-password/#
======
timtadh
bcrypt and scrypt are great libraries to use to solve this problem. However,
the poor man's approach is as follows, with HASH being your favorite hash
function:

    
    
        def strengthen(password, salt, X):
            h = HASH.new()           # HASH is whichever hash module you picked
            h.update(password)
            h.update(salt)
            for x in range(X):       # X extra rounds of re-hashing
                h.update(h.digest())
            return h.digest()
    

This approach "strengthens" the hash by forcing you to calculate it over and
over again. You should set X to be the number of rounds you want to conduct,
i.e. how slow you want your server to respond to an individual request. It is
always a trade-off between server slowness for individual requests and
"security" of the hash function. The goal is to make dictionary attacks take
longer than is feasible for your attackers to conduct.

[Note: you should absolutely have a different salt for each password with this
approach.]
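
For comparison, the same idea (a per-password salt plus a tunable iteration count) exists in standardised form as PBKDF2, which Python's standard library exposes as `hashlib.pbkdf2_hmac`. This is a minimal sketch and not the code above: the function names, the 16-byte salt and the default iteration count are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os

def hash_password(password, iterations=100_000):
    """Return (salt, iterations, digest) for a new password."""
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest

def check_password(password, salt, iterations, digest):
    """Recompute with the stored salt and count, then compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)
```

Storing the iteration count next to the salt and digest is what lets you raise it later without invalidating older entries.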

~~~
brown9-2
Why would you use this "poor man's approach" over bcrypt or scrypt? My
understanding is that these two work on a very similar concept (work factor)
and are free to use.

~~~
ams6110
Some projects require FIPS 140-2 compliance. I've not been able to find that
blowfish or bcrypt are certified. See
[http://csrc.nist.gov/publications/fips/fips140-2/fips1402ann...](http://csrc.nist.gov/publications/fips/fips140-2/fips1402annexa.pdf)

~~~
burgerbrain
If a randomly cobbled together and unvetted system is compliant, but bcrypt
isn't, that just goes to show how little FIPS140-2 compliance actually means.
(as if everybody didn't already know it's worthless)

~~~
ams6110
Nonetheless, some projects mandate use of FIPS 140-2 hashing algorithms, and
afaik bcrypt is not one. So if you find yourself on such a project, bcrypt is
not an option on the table.

I'd be happy to find out I'm wrong.

------
Cushman
I wonder, is it easy to use bcrypt with a variable work factor per-password?

I'm thinking you could take your entropy analysis of the user's password and
set it so that "weaker" passwords use a higher work factor. This analysis
could be easily done before hashing every time the password is input, so an
attacker wouldn't be able to single out weak passwords from the hash file.

Theoretically, you should be able to tailor the numbers so that cracking a
weak password isn't any faster than cracking a strong one, right? Plus it will
encourage your users to use a stronger password in the first place, since it'd
make login slightly faster.

Any obvious problems with this plan?

~~~
gcr
Hm. I don't know about bcrypt at all -- is it possible, given the ciphertext,
to know roughly how much work is required to test a password?

E.g. an attacker can go "Oh! this password will take FIVE SECONDS to test, so
I know it must be a simple password." or "Hey, check this out; this password
can be tested in 0.1 seconds. It must be pretty complex."

In general, I'd guess that these kinds of information leaks are pretty bad
because if an attacker can see how hard a password is to test, he now knows
something about the password.

It may be better if, given a single unchanging hash, it took a variable
amount of time to test a given password against this hash, though that might
have its own can of worms.

~~~
gxti
> is it possible, given the ciphertext, to know roughly how much work is
> required to test a password?

The work factor is an input to the digest function, both when creating and
when validating the password. Normally it should be stored alongside the
digest itself so you can increase the work factor over time without disrupting
existing passwords. So you are correct. It might theoretically be possible to
correctly balance the work factor to counter variation in password info
entropy so that all passwords take about the same time to crack, and this
would be very cool and impress members of the opposite sex, but it would not
improve security at all.

Making a probabilistic password checker is also a superficially interesting
idea. Maybe my mind is too small to explore it completely, but it seems that
at best it would be no better than just increasing the work factor.
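
To make the "stored alongside the digest" point concrete: bcrypt's usual output is a single string of the form `$2a$<cost>$<22-char salt><31-char digest>`, so the cost is readable by anyone who holds the hash. A rough sketch follows; `parse_bcrypt` is a hypothetical helper, and the sample string is a made-up placeholder with the right shape, not a real hash.

```python
def parse_bcrypt(hash_string):
    """Split a bcrypt-style hash string into version, cost, salt and digest."""
    _, version, cost, rest = hash_string.split("$")
    return {
        "version": version,
        "cost": int(cost),      # log2 of the number of key-expansion rounds
        "salt": rest[:22],      # 22 encoded characters
        "digest": rest[22:],    # remaining 31 characters
    }

# Placeholder with the right shape only; it does not correspond to any password.
sample = "$2a$12$" + "S" * 22 + "D" * 31
info = parse_bcrypt(sample)
print(info["cost"], 2 ** info["cost"])
```

With a per-password cost, this field is exactly what would tell an attacker which entries were judged weak.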

------
jmillikin
"It’s important to note that salts are useless for preventing dictionary
attacks or brute force attacks. You can use huge salts or many salts or hand-
harvested, shade-grown, organic Himalayan pink salt. It doesn’t affect how
fast an attacker can try a candidate password, given the hash and the salt
from your database."

Why are you storing the whole salt in your database? Isn't it much more common
to keep half of it in a configuration file? I know Django has a SECRET_KEY
parameter for this sort of thing, and hopefully other frameworks do also.

For that matter, why is authentication being handled by the web server? If
you've got data worth stealing (billing, emails, medical), you can afford to
spring the extra few hundred for a proper authentication server.

Gawker's password handling (7-bit salt, in the database, digested with crypt)
seems like the worst possible implementation of secure password storage.

~~~
izak30
I may be mistaken, as I was only browsing. I was looking at Django's auth
system last night (as all of this had me curious about the backend). I don't
think that SECRET_KEY is used for part of the salt at all, I think it's
primarily used for validation of site-generated data, signing requests, and
cookie encoding/decoding.

~~~
StavrosK
You are correct, Django's hashes are stored as salt$digest, or maybe
salt$function$digest nowadays. It wouldn't be very good if all your password
hashes became invalid because you changed the secret key.

------
niels_olson
Fun fact: this has been posted to Hacker News before, when it first came out.
I think most would agree this is a great example of why re-posting should be
allowed.

<http://news.ycombinator.com/item?id=1091104>

~~~
bigiain
+1

Thanks to codahale for writing this, r11t the HN user who originally posted
this early this year, and the discussion by everyone on that HN thread which
convinced me of the correctness of the bcrypt approach.

I've already shipped one project which I'm sure at least part of the reason we
successfully pitched was our demonstrated in-depth understanding of password
security requirements.

(I also realised in retrospect that a project I'd designed and specced before
reading that article/discussion was going to be wrong in its password
handling, I'm disappointed we never got to build that product, but I'm kinda
glad I'm not sitting here thinking "Fuck, what am I gonna do if $project's
database ever gets compromised? It'll be just as bad as Gawker...")

------
bostonvaulter2
If you have a 5-year old database using a given bcrypt work factor, how
difficult is it to transition to a new, higher work factor?

~~~
beej71
This is a good practical question. My read of the algorithm is that you must
force each user to enter a new password, and encrypt that at the new higher
cost.

If you wanted to "upgrade" the passwords to the higher cost key schedule,
you'd just continue the key schedule where it left off--but this would require
knowing the original password! So that's not really an option.

~~~
miGlanz
What about performing the hashing again, on the existing hash (of course it
would be simplest to encrypt from the beginning when user logs in again, but
just for discussion's sake). Let's say we have a legacy db with hashes created
with small work factor. We could simply perform the hashing on these previous
hashes (increasing work factors as appropriate). We could simply annotate the
fact that one has to perform the hashing two times.

Of course we're 'overthinking' it again, but is the above solution viable?

------
city41
My issue is asking a lot of people to change their password because I've
decided to change my encryption algorithm. Is there a best practice for
upgrading encryption without forcing users to do that? Something like when the
user logs in, hold onto their plaintext password for a bit, confirm it's
correct against your current algorithm, and then re-encrypt it with bcrypt?

~~~
niktech
Why not do it behind the scenes?

Simply use the existing MD5/SHA1 hash as input to bcrypt and update all
password hashes in your database in one go. Then, whenever the user logs in
you first apply the old hash function followed by bcrypt before comparing with
what you have in the DB.
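
A sketch of that migration, under some loudly stated assumptions: the legacy column holds unsalted MD5 hex digests, and PBKDF2 from the standard library stands in for bcrypt so the example runs without third-party packages. The names `slow_hash`, `migrate_row` and `verify` are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for the stand-in slow hash

def slow_hash(data, salt):
    # Stand-in for bcrypt; PBKDF2 keeps this sketch inside the standard library.
    return hashlib.pbkdf2_hmac("sha256", data, salt, ITERATIONS)

def migrate_row(legacy_md5_hex):
    """One-off batch step: wrap the stored MD5 hex digest, no password needed."""
    salt = os.urandom(16)
    return salt, slow_hash(legacy_md5_hex.encode("ascii"), salt)

def verify(password, salt, wrapped):
    """Login: apply the old hash first, then the slow hash, then compare."""
    legacy = hashlib.md5(password.encode("utf-8")).hexdigest()
    return hmac.compare_digest(slow_hash(legacy.encode("ascii"), salt), wrapped)
```

The batch step only ever sees the old digests, which is why it can run over the whole table without waiting for anyone to log in.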

~~~
city41
Nice, thanks. This is why I asked. That approach had not occurred to me.

------
tocomment
HN, is this the consensus? I thought certain hashes worked fine when properly
salted?

~~~
ax0n
In a case like the Gawker incident, it isn't safe. If the source code to the
password hashing algo is compromised (it was) then the salt becomes useless.
In short, just use bcrypt.

~~~
Xk
No. No no no.

The source code to the hashing algorithm means _nothing_. It is already open
source!

The reason that the salt is there is to prevent against rainbow tables.

The salting did NOT become useless. If they had not salted passwords, then
many _many_ more passwords would have been broken because instead of having to
brute force each and every one, you'd just look it up in a massively large
hash table.

~~~
jrockway
No, he thinks the salt is stored secretly in the source code. Of course, it's
not; you store it right next to the hash so that you don't have to use the
same salt for every password.

------
16s
Microsoft Active Directory stores what it calls an NT Hash in NTDS.DIT files on
domain controllers.

 _These are unsalted, md4 hashed, unicode strings._

They are fast and easy to crack if you ever do get your hands on them. The
point is that many big companies don't do passwords right, so why expect
Gawker to do so?

Edit: md4 is _not_ a typo. They use md4 not md5... OK.

~~~
count
And that's not the only place they're stored (assuming the default cached
credentials setting is still in place). Also, with the advent of pass-the-
hash, you don't need to crack Windows AD passwords to use them anymore. Just
having the hash is enough for loads of fun.

------
shabble
The part that I don't quite understand about bcrypt and the "scales with
hardware speed" claim, is that, as I understand it, validating a password
requires 3 parameters; password string, salt, and cost factor.

If you start out now with some given cost factor that is infeasible to break
with modern hardware, that factor will remain stored somewhere, presumably in
the database. Once computing hardware speeds up to the point that your factor
is now practical to break, you can increase the cost factor for new passwords,
but older ones remain susceptible to cracking. The only option would be to re-
encode them periodically with the new factor to keep them secure.

Can anyone clarify if I've understood this correctly, or if I'm missing
something fundamental about bcrypt? I've looked over the usenix paper, but I
can't see anything obvious to confirm one way or the other.

~~~
Strilanc
If you're using bcrypt then the client has transmitted the password to you.
You can store a higher cost hash after verifying the password against the old
hash.
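
Roughly, the upgrade-on-login flow could look like the sketch below. It assumes each record stores its own cost, and it uses PBKDF2 from the standard library as a stand-in for bcrypt; `make_record`, `login` and the iteration numbers are illustrative, not taken from any particular library.

```python
import hashlib
import hmac
import os

CURRENT_ITERATIONS = 200_000  # assumed target work factor

def make_record(password, iterations=CURRENT_ITERATIONS):
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return {"salt": salt, "iterations": iterations, "digest": digest}

def login(password, record):
    """Verify against the stored cost; on success, re-hash if the cost is stale.

    Returns (ok, record), where record may be a replacement to save.
    """
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), record["salt"], record["iterations"]
    )
    if not hmac.compare_digest(candidate, record["digest"]):
        return False, record
    if record["iterations"] < CURRENT_ITERATIONS:
        record = make_record(password)  # plaintext is available only right now
    return True, record
```

Accounts that never log in again keep their old cost, so this narrows the concern raised above without removing it.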

------
iana
If an attacker steals a DB of passwords, it is only a matter of time before
computing power catches up and is able to crack the list, regardless of the
methods used to hash the values. Advances in cryptography are rare, advances
in processor capabilities are not. bCrypt may be the best we can do at this
point to delay this inevitability, but programmers shouldn't come away from
this thinking that using bCrypt "solves" this problem. An interesting question
related to this is how long a time period is considered "safe" enough to
protect a stolen list? If the list is protected from brute-force cracking for
3 years after theft, is that enough time to render the passwords unusable? 5
years? It seems like the answer to this question would be used to calculate an
appropriate value for bCrypt "speed".

------
mcgin
I made an attempt to implement bCrypt on the last web app I built. The problem
I found with it is that there was no robust implementation of the algorithm for
the tech stack I was using (J2EE) - I'm not sure whether that is the case
outside of Java. jbCrypt was the only thing I could find, and if you look at
the source, it really is a poor implementation. I could have gone ahead and
rolled my own implementation, however I'm no security expert, and much prefer
to rely on something more robust than my own implementation which in all
likelihood would have bugs in it.

~~~
jbellis
I googled [bcrypt api] to see how hard it would be to wrap with JNA (my guess:
not very), but the first hit was "jBCrypt - strong password hashing for Java."
So I stopped looking.

------
pmjordan
The Java implementation linked from the article, jBCrypt,
<http://www.mindrot.org/files/jBCrypt/jBCrypt-0.2-doc/> uses
java.lang.String's compareTo() function (lexicographical comparison) to
validate the hashes of passwords. Is there cause for concern about this in
practice, considering it will probably have varying runtimes for different
match lengths? (I realise that running in a JITted VM means I can't predict
the assembly language produced anyway)
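
The concern can be illustrated in Python rather than Java. The first function below is an early-exit comparison of the kind in question; the second uses `hmac.compare_digest`, which the standard library provides for this purpose. The function names are made up for the sketch, and whether the timing difference is exploitable in any given deployment is a separate question.

```python
import hmac

def naive_equal(a, b):
    """Early-exit comparison: running time depends on the matching prefix."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # returns sooner the earlier the mismatch
    return True

def constant_time_equal(a, b):
    """Standard-library comparison designed to avoid that early exit."""
    return hmac.compare_digest(a, b)
```

Both return the same answers; only the amount of work done before answering differs.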

------
Seth_Kriticos
What the article tries to say is, that your password hashing function needs
three attributes:

* it has to be dog slow (to make brute forcing hard)

* it should be complicated enough to avoid collisions (this really applies to most hash functions)

* it should be suitably salted, to avoid rainbow tables

-> bcrypt is designed like this, if in doubt, use it

------
tocomment
So everyone is saying how easy it is to crack md5. Say I have an md5 hash
and I know it was created from a 1000 byte string. How long would it take to
enumerate all 1000 byte strings that could create that hash?

~~~
gxti
Actually you can (almost surely) stop cracking at 16 bytes, because that's how
long MD5 digests are and any more bits than that are going to give you hashes
you've already seen. You won't get back the original string but you don't
_need_ the original string.

~~~
seunosewa
Actually, you have to go a bit beyond 16 bytes to be sure.
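
The counting argument behind this can be checked with a few lines of arithmetic. The even-spread assumption in the comment is an idealisation of MD5, not a proven property.

```python
import hashlib

digest_bits = hashlib.md5().digest_size * 8   # 128
input_bits = 1000 * 8                          # a 1000-byte string

possible_digests = 2 ** digest_bits
possible_inputs = 2 ** input_bits

# If digests were spread evenly, each one would have this many 1000-byte preimages.
preimages_per_digest = possible_inputs // possible_digests

print(digest_bits, preimages_per_digest == 2 ** 7872)
```

So there are vastly more 1000-byte strings than digests, and a search over short inputs is expected to hit a matching one long before reaching 1000 bytes.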

------
boscomutunga
I think sooner or later computing power is going to increase to the point of
making bcrypt look like md5 or lesser. I think what matters is protecting the
hash.

------
asdfor
It's really simple: AS LONG AS the user uses a weak password, using bcrypt or
not won't protect him. Why? Well, instead of brute forcing the hashed password
I'll directly try to brute-force using the normal login method of your site
(even if you rate limit my login attempts it won't take that much time (see
proxies); if you are thinking about rate limiting per username etc. you suck).

If you need your users' accounts to be safe, just force them to use a strong
enough password.

hashed(password + salt) = epic win

~~~
Raphael
I think rate limiting per user is perfect. And if the real person wants to log
in while someone else used up their attempts, do a quick email confirmation.
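
A minimal in-memory sketch of per-username limiting, to make the idea concrete. The class name, the limits and the `now` parameter (there only to make the behaviour easy to test) are all assumptions; a real site would keep this state somewhere shared between servers.

```python
import time
from collections import defaultdict, deque

class LoginThrottle:
    """Allow at most max_attempts failed logins per username per window."""

    def __init__(self, max_attempts=5, window_seconds=900):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.failures = defaultdict(deque)  # username -> failure timestamps

    def allowed(self, username, now=None):
        now = time.time() if now is None else now
        recent = self.failures[username]
        while recent and now - recent[0] > self.window_seconds:
            recent.popleft()  # forget failures that fell out of the window
        return len(recent) < self.max_attempts

    def record_failure(self, username, now=None):
        self.failures[username].append(time.time() if now is None else now)
```

What happens when `allowed` returns False (an email confirmation, a captcha, a plain delay) is a separate policy choice.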

~~~
asdfor
"do a quick email confirmation". And what happens if one or more of your users
gets targeted for a long period of time? You will force them to open their
inbox every time they want to log in to your site? And this gets even better if
they target your site generally; it will be a lot of fun for the majority of
your userbase to have to do that "open inbox" step, bet users will love it :)

Sorry mate but your method sounds easily exploitable ... heck using reCaptcha
would be less punishing for the user than your approach.

------
adulau
When seeing the title, I was expecting an article about CMAC or HMAC or even
PBKDF2-like function but it's the "old" bcrypt.

~~~
mfukar
You say that as if it's a bad thing.

EDIT: Oh, you mean it was similar to other submissions. Fair enough.

~~~
adulau
Not really. It's just that this was already mentioned some time ago, regarding
the fixed cost function in bcrypt:

<http://news.ycombinator.com/item?id=762708>

------
joeguilmette
I use a different password for every website. It is easy to remember, I just
have a portion of my password that changes for every website. For example:

dg76fb23S for Facebook, dg76fb23S for hacker news

Been working great for years :)

~~~
Raphael
>dg76fb23S for Facebook, dg76fb23S for hacker news

So you use the same password for everything.

------
easysecured
What if there are no passwords to start with. If users do not enter a password
and instead a hashed key is generated for the device used by the user which
then encrypts everything before it is stored on the server.

This key could be generated in real time and would not be displayed anywhere
on the form and will be transferred in stealth mode to the server.

With no passwords to enter or transmit, there will be nothing to hack.... If
the key generated in origin is itself a strong key decrypting information
stored on the server will require first hacking the key which if not stored on
the server in the first place will make life hell for hackers as they will
require access to the individual devices as well.

Cheers, gurudatt

------
jimwise
I'm not so sure. The other side of the coin is that ``{MD5, SHA1, SHA256,
SHA512, SHA-3, etc}'' have had extensive peer review in the cryptography
world.

There's a long history of ``clever new ways'' to use existing crypto
algorithms turning out to have serious flaws -- a good example is early
attempts to improve the strength of (56-bit key) DES by encrypting three times
with three keys. This turns out to introduce enough non-randomness to make the
result much weaker than one might expect; standard 3DES works by _encrypting_
with the first key, _decrypting_ with the second, and _encrypting_ with the
third, which results in very different properties of the output cyphertext.

I'm not saying BCrypt has the same sort of issues, but I'd like to see some
cryptanalysis of this before trusting my users' data with it. Notably, there
seem to be no links to such analysis on the BCrypt home page -- not even an
argument from the author as to why this code _should_ be cryptographically
sound.

~~~
Xk
No. Use Bcrypt. _Always._

Bcrypt is backed by Blowfish, designed by Bruce Schneier. Go look it/him up.
It's secure.

MD5/SHA1/etc are not weak because they are cryptographically weak (though some
are); they are weak because they are fast. SHA3, when it is picked, will still be
a very bad choice because it too will be fast.

So, why Bcrypt? Well, it uses Blowfish. Blowfish has a very slow key
scheduling algorithm which basically involves a lot of hashing to get the
round subkeys. Bcrypt makes this even slower. So what? Well, with Bcrypt you
could set it up to take .3 _seconds_ to verify a password. Try bruteforcing on
that.
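
Back-of-the-envelope arithmetic for that claim, as a sketch. The password shape (8 lowercase letters) and the single-threaded attacker are assumptions picked for illustration; only the 0.3 seconds comes from the comment.

```python
SECONDS_PER_GUESS = 0.3             # the figure quoted above
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_to_exhaust(alphabet_size, length, seconds_per_guess=SECONDS_PER_GUESS):
    """Worst-case time to try every password of the given shape, one guess at a time."""
    candidates = alphabet_size ** length
    return candidates * seconds_per_guess / SECONDS_PER_YEAR

# Assumed example: 8 lowercase letters, a single attacker core.
print(round(years_to_exhaust(26, 8)))
```

That comes out at just under 2,000 years for the full space. Spreading the work over N machines divides the figure by N, and a dictionary attack tries far fewer candidates, so this is an upper bound on one narrow scenario and not a guarantee.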

~~~
jimwise
The question is not whether BCrypt is backed by Blowfish, the question is
whether BCrypt uses Blowfish in a way which is cryptographically sound. If it
does not, then an attacker may not _need_ to use brute force. Assuming the
author has read more of Mr. Schneier's book than the quoted preamble, he
should know this -- Schneier discusses this at length in both editions of
_Applied Cryptography_.

Again, an example of this is the 3DES encrypt/decrypt/encrypt process vs. a
more naive encrypt/encrypt/encrypt process. One is substantially stronger than
the other. One is a secure way to use DES, and one is not.

~~~
tedunangst
You're speculating about the existence of a flaw which there is no evidence to
believe exists. Just because some crypto constructions are not as sound
as naive theory suggests does not mean _all_ constructions are flawed, or even
capable of being flawed.

bcrypt is not an encryption algorithm. It doesn't "protect" your users' data,
so there's no reason to trust or not trust it with their data.

~~~
Xk
Yes there is. A hash function does protect a user's password.

If you don't believe me, consider this hash function

H(A) = (A>>1)&0xFFFFFFFF

There. Hash function. It sends any input to a 32 bit value. Would you use it
for your password though? No. I certainly would not.

Granted, that is an _incredibly_ weak example, but it's one that's easy to see
why it's weak, and thus why a hash function does protect a user's data.
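
The weakness is easy to demonstrate by treating the input as an integer, which is an assumption about what `A` is meant to be:

```python
def H(a):
    """The toy hash from the comment: drop the low bit, keep 32 bits."""
    return (a >> 1) & 0xFFFFFFFF

# Neighbouring inputs collide because the low bit is discarded...
print(H(2), H(3))
# ...and so do inputs that differ only above bit 32.
print(H(10), H(10 + 2 ** 33))
# Finding an input for a given output is a single shift.
print(H(5 << 1))
```

Collisions are trivial to produce and the function is trivially invertible, which is the opposite of what a password hash needs.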

~~~
tedunangst
ok, that's a fair point. i guess i just believe bcrypt does a better job than
that. :)

