I'm going to try to save tptacek the trouble of typing it out again:
1. Salts must be unique per user. Do not reuse salts.
2. Do not use MD5 or SHA-* as-is, because they are too fast to compute and thus too easy to brute-force. Instead use a deliberately slow hash function, as slow as you can possibly tolerate in your app, e.g. bcrypt, scrypt, or PBKDF2. These algorithms let you tune their slowness.
3. As computers get faster, you will need to make your hash functions slower. Have a plan in place that will let you recompute hashes to increase their slowness.
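For what it's worth, all three rules can be sketched with nothing but Python's standard library (a sketch, not a definitive implementation -- the iteration count below is a placeholder you would tune to your own hardware):

```python
import hashlib
import hmac
import os

def store_password(password: str, iterations: int = 200_000):
    # Rule 1: a fresh random salt per user, never reused.
    salt = os.urandom(16)
    # Rule 2: PBKDF2 is deliberately slow; raise iterations as high as you can tolerate.
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    # Rule 3: store the iteration count alongside, so it can be raised later.
    return salt, digest, iterations

def verify_password(password: str, salt: bytes, digest: bytes, iterations: int) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    # Constant-time comparison avoids leaking match length via timing.
    return hmac.compare_digest(candidate, digest)
```

Because the iteration count is stored per record, old hashes keep verifying at their old cost while new ones get the higher setting.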
I get so tired of this. Almost ALL of you are thinking with blinkers on. The password field of your website is not the only vulnerability! Most of the time, an even bigger vulnerability is PEOPLE. People at your company, for example, who have access to the database.
Listen up: rainbow tables CAN BE and ARE a concern if you run a website, because unless you use a salt, anybody with access to your database can (speaking figuratively) rainbow-table it straight into their bank account.
Stop thinking about JUST web pages and password fields. That is a mistake some banks and other corporations made, and their databases are now -- remember? -- available for download online.
I'm a bit tired of those "just never ever store passwords in plain text" statements being applied far too universally. Sure, for websites, most of the time, bcrypt+HTTPS is one of the best choices. But the article in question never said it was only about websites.
One fact: unless your ISP uses some sort of digital certificates, or authenticates you based on physical connectivity (like the port number on a switch, DSLAM or whatever), your ISP stores your password in plain text. And that - controversially - is a Good Thing.
It's because your ISP has decided that connection-level security is far weaker than password-storage security. The cables may run all over the city and are harder to protect. Encryption takes place every time you authenticate your connection, so nobody with a network sniffer can easily obtain your password (active attacks are still possible, but they're harder).
The real rules are much more complex, based on the environment your system's working in — the connection medium, geographical diversity, supported authentication schemes and so on. The only absolutely universally true rule is "secure the whole system as much as you can, starting from the weakest link in the chain (but don't invent your own schemes)".
(Sure, there are public-key crypto, zero-knowledge password proof schemes and so on - and they should be used if possible - but, for example, one of the best security options a typical home router can offer for PPPoE is CHAP with MD5 hashing as defined in the ancient RFC 1994.)
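For the curious, the RFC 1994 CHAP exchange boils down to a single MD5 call: the response is MD5 over the identifier byte, the shared secret, and the challenge. A minimal sketch (the identifier and secret values here are made up):

```python
import hashlib
import os

def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    # RFC 1994: Response = MD5(Identifier || secret || Challenge)
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()

# Server issues a fresh random challenge for every authentication.
challenge = os.urandom(16)
# Client proves knowledge of the secret without ever sending it on the wire.
resp = chap_response(1, b"my-isp-password", challenge)
# The server recomputes and compares -- which is exactly why it must keep
# the password in plain text (or at least reversibly) on its side.
assert resp == chap_response(1, b"my-isp-password", challenge)
```

Note the trade-off the comment describes: the wire is protected per-challenge, at the cost of plaintext storage on the server.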
I see many recommendations here, and in prior discussions, to use bcrypt. Having finally actually looked up how bcrypt works, I have two concerns.
1. It uses Blowfish, but it modifies how the key schedule and S-boxes are set up. Doesn't that essentially make it no longer Blowfish, but rather Blowfish-inspired? Blowfish has a long history and has been well studied. By fiddling with it, don't they throw all that out?
2. Suppose you are storing passwords encrypted using bcrypt with a cost factor of n. You decide to increase that to m. How do you compute the new hashes for existing passwords? From the description of the algorithm in their paper, it looks to me like you can only do this by having the original, plaintext password. Thus you can only strengthen a stored password when the user logs in, so you can compute the new stored password.
Both of the above issues could be addressed by doing something like the following. What advantages does bcrypt have over something like this?
import hashlib

def hash_password(start, cost, salt, password):
    h = password
    for i in range(start, cost + 1):
        h = hashlib.sha256(f"{i}:{salt}:{h}".encode()).hexdigest()
    return h
To store password p1, with salt s1, and cost c1, compute h1=hash_password(1,c1,s1,p1). Store (c1,s1,h1).
To later update this to a higher cost, c2, compute h2=hash_password(c1+1,c2,s1,h1), and store (c2,s1,h2).
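The nice property of this scheme -- that a stored hash can be strengthened without the plaintext -- is easy to sanity-check. A self-contained Python version of the scheme above (cost values here are arbitrary examples):

```python
import hashlib

def hash_password(start, cost, salt, password):
    # Iterated SHA-256 chain: each round folds in the round index and salt.
    h = password
    for i in range(start, cost + 1):
        h = hashlib.sha256(f"{i}:{salt}:{h}".encode()).hexdigest()
    return h

c1, c2, salt, pw = 1000, 5000, "s1", "hunter2"
# Initial storage at cost c1.
h1 = hash_password(1, c1, salt, pw)
# Later upgrade to cost c2 using only the stored hash -- no plaintext needed.
h2 = hash_password(c1 + 1, c2, salt, h1)
# The result is identical to hashing the plaintext at the higher cost from scratch.
assert h2 == hash_password(1, c2, salt, pw)
```

This is exactly the property the comment says bcrypt appears to lack: bcrypt's cost feeds the key schedule, so re-costing seems to require the original password.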
Even better than this is to generate not a plain hash but an HMAC, with the HMAC "secret" being some kind of site key, and salting the input. An HMAC can't be hit by length-extension ("continuation") attacks, to which simply prepending an unchanging salt is completely vulnerable.
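Something along these lines, using Python's hmac module (a sketch -- the site key is a hypothetical app-level secret that would live in config, never in the database):

```python
import hashlib
import hmac
import os

SITE_KEY = b"keep-me-out-of-the-database"  # hypothetical app-level secret

def hmac_password(salt: bytes, password: str) -> bytes:
    # Keying the hash defeats length-extension ("continuation") attacks
    # that plain H(secret || message) constructions can suffer from.
    return hmac.new(SITE_KEY, salt + password.encode(), hashlib.sha256).digest()

salt = os.urandom(16)
stored = hmac_password(salt, "hunter2")
assert hmac.compare_digest(stored, hmac_password(salt, "hunter2"))
```

An attacker with only the database (and not the site key) can't even start a dictionary attack, though as the next comment notes, HMAC-SHA256 is still fast, so the key must actually stay secret.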
I was under the impression that even these are vulnerable, because both MD5 and SHA-1 are computationally efficient, meaning you really should be iterating the hashing process many times over, or using something like Blowfish. Also, I think MD5 has known collision problems?
Which really just brings us to our final solution, which is to use a system that was designed by others and has been thoroughly tested and trusted.
Folks, for most implementations here there are basically 2 issues to consider (if we rule out man-in-the-middle attacks):
(1) If your attack is from inside (i.e., someone has your database), you still need a salt. Don't let the rhetoric fool you. Even bcrypt uses a salt, although I have seen diagrams of its operation that never mention that little fact. The only real difference is that the salt and the encrypted password are stored, concatenated, in one database field.
(2) If you are ONLY concerned with attack from outside (i.e., an automated dictionary attack against your password field), then you get just as much functionality -- in some ways more -- by introducing a small server-side delay, and using your old hashing scheme... SHA or whatever. You actually have an advantage here because unlike bcrypt, you never have to increase some "work factor"... you just keep using the same delay value. And note that a password hashed with bcrypt at one work factor is not compatible with bcrypt running at a different work factor. They would have to be converted. (There's a "version" field in the encrypted password that is supposed to alleviate this problem somehow, but I am not sure how because that number does not seem to be related to the "work factor" at all. Work factors are user-adjustable. Version numbers are not.)
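A minimal sketch of that server-side delay idea (helper names are hypothetical; note the caveat from point (1) -- the sleep throttles online guessing only, and does nothing once an attacker has a copy of the database):

```python
import hashlib
import hmac
import time

def check_login(supplied: str, salt: bytes, stored: bytes, delay: float = 0.5) -> bool:
    # Old-style fast hash, as the comment suggests: SHA-256 over salt + password.
    candidate = hashlib.sha256(salt + supplied.encode()).digest()
    ok = hmac.compare_digest(candidate, stored)
    # Fixed pause on every attempt, success or failure, caps an online
    # dictionary attack at roughly 1/delay guesses per second per connection.
    time.sleep(delay)
    return ok
```

Unlike a tunable work factor, the delay costs the server almost nothing (the request just sleeps), which is the "advantage" the comment is pointing at.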
Further, I would like to add that I am not convinced that this "work factor" is cryptographically sound. What it does is cause the key setup stage to be run 2^N times, where N is your work factor. However, Blowfish was designed with an ideal number of rounds built into the key setup, in order to achieve full coverage of the key bits with the S-boxes. It is not only possible but maybe even likely that increasing the rounds beyond that number adds nothing cryptographically. In fact there is a good likelihood that it reduces the cryptographic strength of Blowfish.
For a very simplistic example: "encrypting" a message with Rot-13 might keep your kid sister from reading it easily. But encrypt it beyond the ideal number of times -- do it twice, for example -- and the message is readable again! You can say "big deal, this isn't Rot-13", but the same idea applies... or at least could apply. Until I see a mathematically rigorous proof that extra key setup rounds do not regress the cryptographic strength, I will remain unconvinced. I read the Niels Provos and David Mazieres paper (I have it right here), and they do not even remotely show that cryptographic strength is any better with more rounds (or, for that matter, that it is not worse). They merely state that they "hope" that is the case. (Quote: "We hope that the unpredictable and changing content of the P-array and S-Boxes will reduce the applicability of yet unknown optimizations.")
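The Rot-13 point is easy to demonstrate with Python's codecs module -- applying it an even number of times is the identity:

```python
import codecs

msg = "attack at dawn"
once = codecs.encode(msg, "rot_13")    # "nggnpx ng qnja" -- obscured
twice = codecs.encode(once, "rot_13")  # a second pass undoes the first
assert twice == msg
```

Whether anything analogous can happen to Blowfish's key schedule is, of course, exactly the open question the comment is raising, not something this toy example settles.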
Not very encouraging for something dealing with cryptography!
Sure... it will take more time with a higher work factor. But until the method is proven (and it's not), you run the risk of weakening the strength of your encryption.
Cryptography is a complex subject. You can't just willy-nilly toss something together like this, say you "hope" it works (as Niels Provos and David Mazieres did), and call it good. And anybody who places their trust in it without better analysis deserves what they get if it comes crashing down later.