
How To Safely Store A Password - r11t
http://codahale.com/how-to-safely-store-a-password/
======
sordidarray
I disagree wholeheartedly. The article is based around the fallacy we've seen
time and time again, of throwing more cryptography at a problem that
cryptography alone cannot solve. In the end, a crappy password is a crappy
password. Successfully discouraging your users from using a crappy password
has much better repercussions (for the user, for you, and for the web in
general) than switching from a hash-based authentication system to bcrypt. Use
bcrypt for additional security, but do not use it in an attempt to solve the
problem of crappy passwords that the article falsely claims bcrypt solves.

To be more technical, bcrypt, like PBKDF2 and other schemes of that nature,
adds a significant amount of additional computation by iteratively applying a
primitive, but still maintains a relatively small circuit size (in comparison
to the primitive itself, which is usually designed to fit onto smart cards and
the like). Creating and using algorithms which are "memory-hard" or require
larger circuits reduces the number of circuits one can place on some area of
silicon and drives up the cost of an attack. In other words, mounting an
attack on bcrypt or PBKDF2 is still cheaper and potentially much faster than
we'd like it to be (which is the reason those algorithms are "tunable"--you
scale the number of iterations up as computers become faster). This, along
with some example memory-hard functions, was the topic of Percival's paper,
"Stronger Key Derivation via Sequential Memory-Hard Functions," which
correctly cites Bernstein as the source of emphasis on practicality in
measuring an attack not by computational complexity, but the cost of launching
the attack. See
<http://www.bsdcan.org/2009/schedule/attachments/87_scrypt.pdf>
if you're interested in reading further.
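
For anyone who wants to see what "tunable" and "memory-hard" look like in practice, here is a minimal sketch using Python's standard-library `hashlib.scrypt` (it needs a Python build linked against OpenSSL 1.1 or newer). The function name, password, salt length, and cost parameters are illustrative assumptions, not recommendations taken from Percival's paper.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, n: int = 2**14, r: int = 8, p: int = 1) -> bytes:
    """Derive a 32-byte key with scrypt.

    n is the CPU/memory cost and r the block size; the working memory is
    roughly 128 * n * r bytes, which is about 16 MiB with these defaults.
    """
    return hashlib.scrypt(password.encode("utf-8"), salt=salt, n=n, r=r, p=p, dklen=32)


salt = os.urandom(16)           # random per-password salt (assumed length)
key = derive_key("hi,mom", salt)
print(key.hex())
```

Raising `n` makes each guess cost more time and more memory at once, which is the property that separates this family from purely iterated schemes.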

~~~
codahale
I'm familiar with cperciva's scrypt, and I think it's a great solution. That
said, I've only seen C and Ruby bindings for it.

I'd love to recommend its use over bcrypt, as memory constraints are much more
expensive than computational constraints, but recommending something most
developers don't have access to would result in more weak hashing schemes in
production.

~~~
dfranke
In any language with sane FFI, scrypt is pretty easy to write new bindings
for. At my previous job I wrote a complete scrypt plugin for Openfire in a
weekend, and only a couple hours of that was for writing the JNI bindings.

~~~
codahale
Toss it up on GitHub, man.

~~~
dfranke
I wasn't able to convince my boss to let me open-source it.

~~~
codahale
Awww. :(

------
tptacek
But wait. What if I use a _64 bit salt_ and then AES-encrypt the password with
a key I store half on my server and half in a cookie I send to the user and
then they'd have to _break AES_ to get my passwords? How about _that_, Coda
Hale?

~~~
jrockway
Only experts should be allowed to innovate in the security domain. Passwords
on lolcats are serious business!

We certainly don't want people in the software engineering industry to come up
with new ideas, implement them, and see how they work in real life. That could
lead to advancement in the field, and that would be bad, because I might have
to learn something new. Shudder.

~~~
tptacek
For a significant portion of users, the lolcats password is equivalent to the
gmail password, which is game over. But don't let that get in the way of your
learning experience, which is simply going to converge on bcrypt anyways.

~~~
jrockway
I know. Don't make anything cool, because someone might misuse it.

------
peterwwillis
I am a crypto noob, so please excuse me if I sound stupid. But why not force
your users to pick a password that is at least 8 characters and contains at
least one digit, one capitalized letter, one lowercase letter and one special
character, then gen a 16-byte salt and run it through SHA-512? The examples
given in the article are for 6 lowercase alpha-only characters, which has
always been relatively trivial to crack.

This brief article I found from 30 seconds in google shows my example at the
bottom ('96 characters'): <http://www.lockdown.co.uk/?pg=combi>

I acknowledge that bcrypt seems like a simple safeguard against password
attacks, but I don't doubt a hacker's ingenuity in developing a method to make
attacks against it reasonable - especially if it becomes widely adopted. And I
think once you've lost your password database you're already completely
fucked, in one way or another.

~~~
tptacek
Once you go down the road of "not doubting a hacker's ingenuity", you might as
well give up and stop applying OS patches. In any case, with "16 byte salts
and SHA-512", you're pitting hackers against the folk wisdom of PHP
developers. With bcrypt, you're pitting hackers against the IACR. You can't
know everything, but you can pick the right battles.

~~~
peterwwillis
Folk wisdom of PHP developers? I think the creators of crypt() are _probably_
smarter than this.

Password hashes are part of a way to mitigate a particular situation:
deciphering a login credential when the password database has been exposed.
Shadow files are locked down to root-only because you should never trust
people with your password hashes. If somebody has the hash, it's just a matter
of time. I don't assume time is on my side.

If bcrypt() somehow makes it near-impossible to brute force password hashes in
a "reasonable amount of time", we'd no longer need our shadow files to be
root-only. We could give our bcrypt hashes out to the world and say, "Ha! TRY
to crack this!" But you don't do that. Because you know deep down in your
heart, once that hash is out in the open, the game is over. It's possible
someone will crack it so you cannot trust it.

And if someone got access to your locked-down password database, someone is
deep enough in your system to have at least read-only access to your password
database. Whether or not they can decode the password they probably have other
ways of getting whatever it is they really want, which is rarely just an
account credential.

I will, however, agree that bcrypt appears to buy you much more time in the
event that your password database is exposed. I am skeptical of how much more
time that would be under the right circumstances, though, and the potential
consequences of CPU exhaustion in the event of some kind of small DoS on the
authentication layer.

Also: can you please comment on the original point of my post, which was using
a complex password in addition to a fast salted hash? When compared to bcrypt
and its purpose/results, is this still an inadequate technical solution, and
why?

(edited to make 3rd paragraph not retarded and add request for clarification)

~~~
codahale
You're asking about the relative merit of two hash functions, one of which
allows your attacker to test a candidate password in 1 _millisecond_, the
other of which allows the same in less than 1 _nanosecond_. Using the latter
provides your attacker with a 1,000,000x productivity boost. Personally, I'm
not that generous.

As far as CPU exhaustion, there are some _huge_ sites which use bcrypt (like,
in the top 10)[1]. It is not a problem, provided you choose your work factor
carefully.

As far as complex passwords, use them. Try to get your users to use them. But
it's an orthogonal concern: your attacker will still be able to work their way
through the gigantic keyspace of your monster passwords at 1,000,000x the rate
they could manage if you had used bcrypt.

[1] Just looked this up. Not the top 10, but the top 15 for sure.
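
To put rough numbers on that ratio, here is a back-of-the-envelope sketch. The two guess rates come from the 1 ns and 1 ms figures above; the keyspace (8 characters drawn from the 95 printable ASCII characters) is an assumption chosen to match the "complex password" proposal earlier in the thread, and the function name is made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_exhaust(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Time to try every password of exactly `length` characters, in years."""
    keyspace = alphabet_size ** length
    return keyspace / guesses_per_second / SECONDS_PER_YEAR


fast = years_to_exhaust(95, 8, 1e9)  # about 1 ns per guess (fast salted hash)
slow = years_to_exhaust(95, 8, 1e3)  # about 1 ms per guess (tuned bcrypt)

print(f"fast hash: {fast:.2f} years")
print(f"slow hash: {slow:,.0f} years")
```

Under these assumptions the fast hash falls in a couple of months of exhaustive search, while the slow one takes on the order of two hundred thousand years on the same hardware. Password complexity changes the keyspace; the hash function changes the divisor.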

~~~
assemble
Where did you get a list of how the top 15 sites on the internet
hash/bcrypt/whatever their passwords? That sounds like some interesting
information.

------
alextgordon
So say we tone it down to a more feasible 0.01 seconds per hash, or to put it
another way, 100 requests per second. That's only 4.8 months to crack that
password. There'll certainly be future hardware advances, plus GPUs could be
used to bring that down. And you can be sure someone used "password" or
"123456".

Moral of the story: use long and multiple passwords.

~~~
tptacek
I'm not even checking your math, because that's 4.8 months to crack _one
user's password_. A spectacular win compared to "salted hashes".

~~~
sorbus
I'm sorry if I'm completely misunderstanding, but couldn't a hacker generate a
rainbow table once - of all the common passwords, say - and then, once they
get a copy of the database in which the password hashes are stored, simply
compare them? In the same way that md5 rainbow tables exist, which greatly
reduce the computations required when looking for passwords?

Unless, of course, you salt them (which would turn it into 4.8 months to break
all the common passwords in a table, building a rainbow table with the salt
taken into account; the only situation in which you're spending 4.8 months per
user is if they're each salted differently, for example with the username),
but the article doesn't suggest salting bcrypt, so I'm ignoring that.

~~~
tptacek
No. Bcrypted passwords have nonces in them, just like "salted" hashes. You
don't have to think about it. You just use bcrypt.

~~~
boundlessdreamz
Complete noob here. Can you explain this more? If you bcrypt(12345), isn't the
result always the same? If the hash varies, then what does the built-in nonce
depend on?

~~~
tptacek

       >> BCrypt::Password.create("hi,mom", :cost => 10)
       => "$2a$10$L/c.1uoZSh3oaU1fLrnYK.yyU4PiJXsIAzN22qnbU41liyLn5/of2"
       >> BCrypt::Password.create("hi,mom", :cost => 10)
       => "$2a$10$3F0BVk5t8/aoS.3ddaB3l.fxg5qvafQ9NybxcpXLzMeAt.nVWn.NO"
       >> BCrypt::Password.create("hi,mom", :cost => 10)
       => "$2a$10$VEVmGHy4F4XQMJ3eOZJAUeb.MedU0W10pTPCuf53eHdKJPiSE8sMK"
    

Despite the pages and pages and pages and pages of conversation about how to
select and format "salts", adding random nonces to hashes is not one of the
world's great CS problems.

~~~
boundlessdreamz
I still don't understand it. The cost factor and the string remain the same,
but different hashes are produced. So if a user inputs "hi,mom" and the app
generates a hash, how is it going to be the same as the one in the DB?

Or did you make a mistake, and the cost factors are supposed to be different?

~~~
tptacek
Just do what the library tells you to do to check passwords and trust me that
bcrypt hashes already have nonces built into them.
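
For the curious, the reason the check still works is that the random salt is stored inside the hash string itself. To verify a login, the library reads the cost and salt back out of the stored string, reruns bcrypt on the submitted password with those values, and compares the result. Here is a small sketch that only splits one of the strings above into its fields; it does not compute bcrypt, and the function and class names are made up for illustration.

```python
from typing import NamedTuple


class BcryptParts(NamedTuple):
    version: str
    cost: int
    salt: str    # 22 characters of bcrypt-flavoured base64
    digest: str  # 31 characters of bcrypt-flavoured base64


def split_bcrypt_hash(encoded: str) -> BcryptParts:
    """Split "$<version>$<cost>$<salt><digest>" into its four fields."""
    _, version, cost, rest = encoded.split("$")
    if len(rest) != 53:
        raise ValueError("expected 22 salt characters followed by 31 digest characters")
    return BcryptParts(version, int(cost), rest[:22], rest[22:])


parts = split_bcrypt_hash("$2a$10$3F0BVk5t8/aoS.3ddaB3l.fxg5qvafQ9NybxcpXLzMeAt.nVWn.NO")
print(parts)
```

The three outputs above differ because each call picked a fresh salt, and each salt travels with its own hash.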

------
peterwwillis
I haven't seen this link on this thread yet so here we go:

<http://people.redhat.com/drepper/sha-crypt.html> Ulrich Drepper's
implementation of a work factor in SHA password hashing. He addresses a
popular article on bcrypt() and how using SHA allows one to follow NIST
guidelines and still have the advantage of a time-consuming algorithm.

    
    
      me@myhost ~/ :) time perl -le'print crypt("some-password","\$6\$rounds=900000\$myrandomsalt")'
      $6$rounds=900000$myrandomsalt$O4u/Z5FRNBi3fw6YhAM1V1hC1LAawq9Ri65Kx77GchzOWieXeRs6w83bYqotyqBcz.WE29NygNli93dBDAbpt/
      
      real    0m3.996s
      user    0m3.990s
      sys     0m0.005s
      
    

If someone could comment on this in comparison to bcrypt I would appreciate
it. Why should we use bcrypt if this exists in modern crypt() implementations?

------
shin_lao
A nice post, but it could use some more explanation about designing password
hashes.

For example salt + iterative hash works extremely well too, that's what is
done in the OpenPGP format (<http://www.ietf.org/rfc/rfc4880.txt> - 3.7.1.3
Iterated and Salted S2K).

------
revelate
We implemented http digest authentication
(<http://en.wikipedia.org/wiki/Digest_access_authentication>) in our
application for the following reasons:

- We needed to securely authenticate users connecting with browsers and rss
  readers without ssl.
- We needed an encryption algo that we could implement in javascript so as to
  provide reasonably secure login without ssl.

Is there any way to do the above while storing bcrypt passwords on the server?
Does using bcrypt in these scenarios force you to use https and pass full text
passwords to the server?

------
runn1ng
What is the difference, then, between using bcrypt and iterating MD5 10^6
times?

~~~
kelnos
Presumably, bcrypt is something that has been vetted by crypto experts,
whereas your idea is just some random thing you thought up. Maybe you
_personally_ are a crypto expert and know for a fact that your plan will work,
but the vast majority of people aren't. A packaged solution like bcrypt that
doesn't give the developer enough rope to hang themselves (and their users) is
a much better idea.

The idea is a developer will just do:

    
    
      passwd_str = bcrypt_string_from_password(passwd);
      db_store_password(userid, passwd_str);
    

and just be done with it, rather than having to decide between ROT13
(kidding), hashing, salted hashing, HMAC, etc. A couple years ago I thought
applying SHA1 to a password before storing it in the DB was perfectly fine.
Later I discovered that it was better to use a salt. More recently I've
discovered that both of those approaches are flawed. Fortunately I've never
written a large-scale application that deals with storing passwords, but I'm
sure there are many application developers like me who would write much safer
password storage routines if they didn't have to be the one to select an
algorithm.

Now, I'm not saying bcrypt is the answer (I know nothing about it beyond what
I've read on HN), but that's the general problem with your hypothetical
proposal.

~~~
ErrantX
_whereas your idea is just some random thing you thought up_

He might also have got the idea from one of tptacek's suggestions not long
ago.

The difference between doing the iterations and bcrypt is.. well.. not massive
in a practical sense. You could do either (if you're already MD5'ing passwords
once [hmmmm], for example, then it is a quicker solution).

~~~
tptacek
There's a difference: bcrypt (and moreso scrypt) were designed to be hard to
speed up, while SHA1 was designed at least in part to be easy to speed up.

But in practical terms, "stretched" SHA1 is fine. I won't bitch at anyone for
using it.
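
As a rough sketch of what "stretched" SHA1 can look like, the example below uses PBKDF2-HMAC-SHA1 from Python's standard library, which is one standardized way to iterate a hash rather than a hand-rolled loop. The iteration count, salt length, and function names are assumptions for illustration, not values anyone in this thread proposed.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune it to your own hardware


def stretch(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Salted, iterated SHA1 via PBKDF2-HMAC-SHA1 (20-byte output)."""
    return hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt, iterations)


def verify(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute the stretched hash and compare in constant time."""
    return hmac.compare_digest(stretch(password, salt, iterations), expected)


salt = os.urandom(16)
stored = stretch("hi,mom", salt)  # store salt, iterations, and this value together
print(verify("hi,mom", salt, ITERATIONS, stored))
```

The iteration count has to be stored next to the salt so that it can be raised later without invalidating existing records.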

------
skjlnSAjd
The problem with using a scheme designed to be computationally intensive is
that it doesn't scale. How can I justify 0.3 seconds of computation time if
I'm dealing with tens of thousands of connections per second?

~~~
derefr
Make the client's computer do the work.

~~~
ErrantX
That defeats the point of the hashing :)

~~~
cperciva
Actually, it doesn't.

~~~
KirinDave
In this case, it does. If the client's job is to hash the password then send
it to the server for comparison, you've basically got plaintext passwords.
They may be quite long, but we're at that place where things don't scale
anymore.

~~~
cperciva
Derive a 256-bit key from your password, then send that 256-bit key to the
server. Nobody is ever going to perform a brute-force search against the
256-bit derived keyspace, so any attack will need to run the KDF against a
list of likely passwords in order to get a list of likely keys.
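
A minimal sketch of that split, under some assumptions the comment does not spell out: the client uses scrypt as its KDF, the server hands the client a per-user salt, and the server stores a single fast SHA-256 of the derived key rather than the key itself. All function names are hypothetical.

```python
import hashlib
import hmac


def client_derive_key(password: str, salt: bytes) -> bytes:
    """Slow step, run on the client: password -> 256-bit key."""
    return hashlib.scrypt(password.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32)


def server_store(derived_key: bytes) -> bytes:
    """Cheap step, run on the server at signup (assumed design): hash the key once."""
    return hashlib.sha256(derived_key).digest()


def server_check(derived_key: bytes, stored: bytes) -> bool:
    """Cheap step, run on the server at login: one SHA-256 and a constant-time compare."""
    return hmac.compare_digest(hashlib.sha256(derived_key).digest(), stored)


salt = b"per-user-salt-16"  # assumed to be issued by the server per user
stored = server_store(client_derive_key("hi,mom", salt))
print(server_check(client_derive_key("hi,mom", salt), stored))
```

In this sketch an attacker holding the database cannot search the 256-bit key space directly, so each password guess still has to pay for the slow KDF, while the server only pays for one fast hash per login.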

~~~
KirinDave
"Never", huh? Seems to me like an extremely strong statement for a security
officer for a major operating system to be bandying about.

Your proposal also treats a 2-way cryptographic function as a 1-way
cryptographic function. That is, optimistically, a questionable thing to do.

~~~
cperciva
I am perfectly willing to stake FreeBSD's security on the belief that nobody
will ever perform a brute-force search against a 256-bit keyspace.

For that matter, Microsoft, Apple, RedHat, Debian, and Ubuntu all make the
same or (usually) even weaker assumptions in their respective packaging and/or
updating systems.

------
sordina
It probably still makes sense to salt before hashing. It might take months to
crack a password using bcrypt, but is there something preventing a database of
bcrypt hashes being built? (serious question)

~~~
codahale
The bcrypt algorithm has salting built-in.

------
jmillikin
The article is using a definition of "store" of which I was previously
unaware. If you'd like to store passwords (eg, in a keyring or password
manager), use GPG.

~~~
necubi
The article is talking about the problem of storing passwords in a database in
order to authenticate logins. This is for people writing web-apps, not users
looking to secure their own passwords.

~~~
jmillikin
The passwords aren't being stored, though; a value _derived from them_ is. In
a properly designed system, it is computationally infeasible for an attacker
to obtain the original password given the derived value.

~~~
khafra
> In a properly designed system

Proper design of such a system is the subject of the post. If you're a
developer who doesn't want to think about the design of such systems, that's
fantastic! Just use bcrypt.

------
caustic
This is one of the most useful posts on HN in my opinion. It is both very
practical and informative.

------
dschobel
from the bcrypt page: _the other [bit of terrible advice] recommends
reversible encryption, which is rarely needed and should only be used as a
last resort._

Why is reversible encryption a terrible idea for passwords?

~~~
pudquick
Because it means with <single piece of secret info>, your entire customer
base's worth of passwords is now decrypted.

You've now gone from "Even if we get attacked, we've chosen a provably
difficult / secure storage method. The damage should be negligible." to "If we
get attacked - I hope like hell they don't get <single piece of secret info>."

An attack is an attack. If they can get in and steal your password database,
what's to say that they can't get access to the key that decrypts your
reversible database?

------
oozcitak
_How? Basically, it’s slow as hell._

Is that it? bcrypt is good only because it is slow? In that case, why don't we
use whatever we like (MD5, SHA1, SHA256, SHA512, SHA-3, etc) and add a
_sleep(500);_ just before the call? It is guaranteed to keep up with Moore's
law as well!

~~~
rimantas
Did you read the article? The problem is that someone trying to bruteforce
your MD5 hash won't add that sleep(500). Bcrypt is slow by design.

~~~
oozcitak
I thought the attack was against the application. The article talks about the
security of the raw data in the database. I missed that on my first read.
Sorry for that.

------
huhtenberg
I really don't want to use this word, but this is _retarded_. Something like
RFC 2898 would be a good starting point for explaining why.
<http://www.ietf.org/rfc/rfc2898.txt>

~~~
codahale
I'm guessing you haven't read Provos and Mazières' USENIX paper. In the design
considerations, they say:

 _In general, a password algorithm, whatever its cost, should execute with
near optimal efficiency in any setting in which it sees legitimate use, while
offering little opportunity for speedup in other contexts._

PBKDF1 and PBKDF2 both rely upon general-purpose hash functions, which are
_much_ faster in dedicated computing environments such as FPGA or GPU
clusters. The Eksblowfish algorithm at the heart of bcrypt is _extremely_
resistant to optimization.

This is crucial for password storage, since if a CUDA implementation is 3-4
orders of magnitude faster than the implementation your application uses,
you've just chipped off a huge chunk of the advantage offered by your hash
function.

