Hacker News new | past | comments | ask | show | jobs | submit login
How To Safely Store A Password (codahale.com)
325 points by shawndumas on July 1, 2011 | hide | past | favorite | 209 comments



Many developers need to read and understand this. It is far from mainstream knowledge...

The other day, I saw the following post about password hashing in my RSS feed: http://isc.sans.org/diary.html?storyid=11110

- No mention of bcrypt (though the posts mentions key stretching using SHA1)

- "When selecting an algorithm to hash passwords, it is important to select carefully as it is difficult to change the algorithm later. You will have to ask users to change their password if you do as you no longer know what password they picked." --> seriously? you can update the password the next time they log in...

- "You could also add a secret, in addition to the salt. If the secret is not stored in the database, it would not be easily reachable via a SQL injection exploit (yes, you can use them to read files, but it requires sufficient privileges)." --> security through obscurity, nice

- "For the paranoid, you may want to do the hashing on the client side (javascript) . This way, the server never receives the plain text password. We do this here for the ISC website on our login form [2]." --> oh noes...

Note that the blog has ~15k subscribers according to Google Reader...

---------

I also launched a debate on StackOverflow the other day. A self-proclaimed "security expert" (he later edited his post to remove this part) was advising against using bcrypt, arguing that it would facilitate DOS attacks against the login page... He prefers security through obscurity, using a secret salt:

http://security.stackexchange.com/questions/4781/do-any-secu...

I thought it was a bad idea to leave this answer unchallenged, so I tried arguing with him. I was met with arguments of authority such as "Wow you are out of your element and could not be more misguided on this topic" or "you disagree because you don't understand. Show me an exploit you have written, then I'll pay attention to you.". Happily, some (more experienced) people got in on the debate. I hope this will help developers make an informed choice, if they stumble upon his answer...

(edit: list formatting)


What about Debian's reason for not using bcrypt, claiming the time it takes to hash is not a weak point in security of /etc/shadow?

http://groups.google.com/group/linux.debian.user/browse_thre...


I think they use this:

http://www.akkadia.org/drepper/sha-crypt.html

which the version of crypt(3) in glibc. It is almost the same as bcrypt. One difference is that it uses SHA256 or SHA512 instead of Blowfish. But that implementation has the same ability to expand the work factor like bcrypt.

So I'd say that the algorithm they're using is slow and is designed to be slow


DISCLAIMER: I'm a web developer, not a cryptographer. I've had a passing interest in cryptography for the last few months, but I've never worked in the field. I learned about it by reading Cryptography Engineering (great book) and various HN comments and blog posts. So take the following with a grain of salt... and I hope someone knowledgeable can chime in. That being said, let's roll.

> What about Debian's reason for not using bcrypt, claiming the time it takes to hash is not a weak point in security of /etc/shadow?

That's a good question, but I'm not sure the discussion you linked to really explains why Debian does not use bcrypt. The first post is, ironically, the most interesting one: it links to Coda Hale's great article, and to a StackExchange question where we learn that bcrypt has been OpenBSD's default password scheme since OpenBSD 2.1 (http://www.openbsd.org/papers/bcrypt-paper.pdf). OpenBSD being well-known for its code quality and security focus, this would indicate that bcrypt is actually quite adapted for OS password hashing.

Some critics:

First, if you don't have the salt, but you do have the hash, then a rainbow table attack is completely pointless. Reason being is rainbow tables store hashes with a 1:1 ration to text. How the table is traversed is another story, but the fact remains that one hash will lead you to one piece of text. Now add a salt. If the salt is unknown, the length of the salt is 8 characters, and the characters used in the salt are [A-Za-z0-9./], or 64 characters, then there are effectively 64^8 possible hashes for one password. That's 281474976710656 hashes. Even moving at 700,000,000 passwords per second, you have to generate that many hashes per password. Point is, you have one massive keyspace to search through. Good luck.

Here, Aaron says that salts defeat rainbow tables because they are secret, thus increasing the keyspace. While not knowing the salt does complicate password cracking, this is not the reason we use salts. Rainbow tables let crackers trade time for space: they pre-compute password hashes once, and then use this table to avoid brute-forcing each password separately. A salt defeats this because it forces the cracker to either generate a very big rainbow table, which is impractical, or to generate a different rainbow table for each salt, which is the same as brute-forcing... This has nothing to do with the salt being secret. Your only goal here is to have a random salt (a nonce, ideally).

Lastly, the SHA1 and SHA2 algorithms were designed with security in mind. Sure, they're fast, but that's the point. If you're concerned about knocking a login prompt, you shouldn't be considering the speed of the algorithm. Instead, you should be spending your time learning PAM. If you're concerned about someone brute forcing an unshadow file, bcrypt isn't going to help you if the password is low in entropy (he gives an example of a 6-character password- seriously???). If your password is high in entropy, as it should be, then even if SHA1 could churn through 400GBps, it's not going to find it. Case in point, consider http://distributed.net hacking the 72-bit RSA key. 72-bits of entropy, and it would take them 1,100 years at their current rate to exhaust the keyspace entirely. That's only an 11-character password with [A-Za-z0-9] and [:punct:] as the possible characters. 1,100 years for an 11-character password.

Here, the argument is that there is no need to use bcrypt if the user chooses a password with a high entropy. That may be true, but then, what's the problem with making brute force attacks even slower by using bcrypt (or another slow hash function)? Low-entropy passwords would be harder to guess. High-entropy passwords would be impossible to guess.

The rest of the discussion focuses on the use of salts, and on how Debian stores these salts in the /etc/shadow file. At no point do they really talk about key stretching / bcrypt.

------------------------------------------------

Now, I decided to look at what Debian currently uses for password hashing. In the latest login.defs file ( http://anonscm.debian.org/viewvc/pkg-shadow/debian/trunk/deb... ), you can see:

  # If set to MD5 , MD5-based algorithm will be used for encrypting password
  # If set to SHA256, SHA256-based algorithm will be used for encrypting password
  # If set to SHA512, SHA512-based algorithm will be used for encrypting password
  # If set to DES, DES-based algorithm will be used for encrypting password (default)
  # Overrides the MD5_CRYPT_ENAB option
  #
  # Note: It is recommended to use a value consistent with
  # the PAM modules configuration.
  #
  #ENCRYPT_METHOD DES
  #
  # Only used if ENCRYPT_METHOD is set to SHA256 or SHA512.
  #
  # Define the number of SHA rounds.
  # With a lot of rounds, it is more difficult to brute forcing the password.
  # But note also that it more CPU resources will be needed to authenticate
  # users.
  #
  # If not specified, the libc will choose the default number of rounds (5000).
  # The values must be inside the 1000-999999999 range.
  # If only one of the MIN or MAX values is set, then this value will be used.
  # If MIN > MAX, the highest value will be used.
  #
  # SHA_CRYPT_MIN_ROUNDS 5000
  # SHA_CRYPT_MAX_ROUNDS 5000
In my Ubuntu 11.04 install, ENCRYPT_METHOD was set to SHA512 (in /etc/login.defs). SHA_CRYPT_* is not specified, so the default number of rounds is used (5000). I think this means that, by default, Ubuntu hashes password by salting them and encrypting them 5000 times with SHA512. So, Debian uses salted stretched SHA512, which does slow down brute forcing a little... But is still not ideal when compared to bcrypt.

I see no real reason to avoid bcrypt when hashing OS passwords. Maybe they avoid it for backward compatibility? Maybe because it is not standard enough yet? Maybe they want to comply with FIPS 140-2?

Other great discussions on bcrypt / password hashing:

http://news.ycombinator.com/item?id=995634

http://news.ycombinator.com/item?id=1592007

http://news.ycombinator.com/item?id=266266

http://news.ycombinator.com/item?id=1091104


Thank you, that is an excellent reply, covering all of the main points raised in the link.

My impression is that SHA512 with 5000 rounds is competitive with bcrypt (per Schneier in February, at least). But ... this is not the default on Debian, DES is, which I understand is not remotely competitive with bcrypt.

I checked my server, and it uses the default DES... time to harden the passwords.

Refs

1. SHA512 vs. Blowfish and Bcrypt - http://stackoverflow.com/questions/1561174/sha512-vs-blowfis...

2. Schneier on SHA-512 variants http://www.schneier.com/blog/archives/2011/02/nist_defines_n...


I just thought about another reason why bcrypt might be more useful for a web app than for hashing OS passwords:

- when a webapp loses its user table, the devs will be responsible for compromising thousands of password (which user often re-use elsewhere). This is what happened with the Gawker hack and the recent lulzsec database dumps.

- when somebody accesses your /etc/shadow file, he likely already has root access to your machine, so you are pretty much owned anyway. Protecting your password might help a little, but it's less useful than in the above case.

tptacek talked a little about this in one of the posts I linked above: http://news.ycombinator.com/item?id=1091445

------------------------

Now, to address your point: stretched SHA512 (SHA512 iterated 5000 times) is indeed a lot better than calling SHA512 once. But bcrypt still has some advantages over it by design, which makes it even better for password hashing. Some links:

http://news.ycombinator.com/item?id=1091465

http://news.ycombinator.com/item?id=995685

http://chargen.matasano.com/chargen/2007/9/7/enough-with-the...

------------------------

Regarding your second link (Schneier on SHA-512 variants), I don't see anything about bcrypt or password hashing there. He talks about the advances of hash functions, but those are fast hash functions. Fast hash functions are very useful in cryptography, but they are the opposite of what you want for a password hashing scheme.


why is multi-iteration sha-2 less ideal than bcrypt, even if the work factors between the two are calibrated to require the same amount of computation?

the only argument i've heard is that sha-2 can be implemented in specialized hardware (well-funded attacker) that is much much faster than CPU/GPU, whereas this is not as easy for bcrypt?

however, with multi-iteration hashing, you can strengthen the hash further w/o the user having to log in again.


Regarding your question on the differences between bcrypt and iterated SHA-2, I think it's better if I link to comments by crypto / security experts:

http://news.ycombinator.com/item?id=1091465

> the only argument i've heard is that sha-2 can be implemented in specialized hardware (well-funded attacker) that is much much faster than CPU/GPU, whereas this is not as easy for bcrypt?

Yep. I think that's one of the main reasons. As tptacek said in the above link:

"There's a difference: bcrypt (and moreso scrypt) were designed to be hard to speed up, while SHA1 was designed at least in part to be easy to speed up."

In my opinion, the main advantage of bcrypt is the simplicity. It handles everything for you: salting, "slow" hashing, password checking. You simply have to specify a work factor and you are good to go. No need to implement any crypto yourself.

> however, with multi-iteration hashing, you can strengthen the hash further w/o the user having to log in again.

That's a very good point, I hadn't thought about that. I wonder if we can do this with bcrypt too.

IMO, as long as you use either bcrypt, scrypt, PBKDF2, or password stretching, you are good to go...


I know very little about security, but is there something wrong with using a hash in the client side and then using bcrypt on the server so that you never receive the plain text password?


If the client sends hash(password) to the server, hash(password), for all intents and purposes, _is_ the password.

After all, an attacker does not need to recover the password from the hash, all he has to do is capture the hash and replay it.



On the upside, it is an improvement for the use-the-same-password-on-many-services users, assuming the hash is not trivial to reverse like md5.

At this point, I'm less worried about somebody getting access to some random startup I signed up for than I am about them using the leaked password to gain access to my other accounts.

(I've recently changed my password scheme to using a different one for every service, but this just puts me within a minority.)


I'm no crypto expert but wouldn't using a nonce allow this approach to work?


How would you check the nonce? All your server gets is a black box hash.


Not a security expert here.

Doesn't http://en.wikipedia.org/wiki/Digest_access_authentication do this? Also the server could store the nonce in a session.


Hashing passwords on the client indicates that the salt is available to the client. In the event of a database compromise it's guaranteed then that the attacker will have the salt and be able to crack your passwords.

If the salt is stored on the server, in a login.php script for example, and there is (just) a database compromise then an attacker will be at a disadvantage because they will need to figure out the salt first before cracking any of the passwords.

In fact, this is something being ignored here quite a bit. A number of potential attacks only lead to the database - or part of it - being exposed. If you have a large enough salt stored in your unaffected code base then an attacker is going to have a hard time.


Salts are suppose to be considered public. For the most part, they are defenses against rainbow tables and to make an attacker have crack each password individually.


Exactly; Having a secret hash is just another example of futile security through obscurity. You want an attacker to be able to know as many parts of the puzzle as possible and still be thwarted.


Agreed, but an important part of the article was that even public salted md5 passwords are ineffective.


I'm no security expert either, but I'm guessing the real solution is SSL. If you hash passwords in javascript, a middleman could easily just look up the source of your application and decrypt the hashed password coming over the wire.


>If you hash passwords in javascript, a middleman could easily just look up the source of your application and decrypt the hashed password coming over the wire.

Hashing ≠ Encryption.

Unlike encryption, where an encrypted message is intended to be decrypted by a trusted party, a hashed password is never intended to be 'unhashed'. Rather, the hash of the password supplied by the user is compared against the hash value stored in the database. Given a sufficiently strong password P, it should be computationally infeasible for anyone to compute the value of P from hash(P).


Of course, if you're doing the hashing on the client, there's no need to reverse the hash anyway. An attacker can just replay the hash they sniffed. You can do some additional things to avoid this (like using a nonce as part of the hash), but the best solution is still to just use SSL.


An attacker in that position could also capture the password coming over the wire. Both are of equal security in that situation. SSL is also middleman-able.


not storing the password in sql has nothing to do with obscurity. In fact, it's smart. You do not understand what "security through obscurity" means.

Following your train of though, we should display the salt and the hash publicly. That is extremely dumb.

Put your passwords in your post, otherwise you'll be doing security through obscurity! oh snap?


That's not what I said. Some amount of secrecy might better than none. But, you should not depend on it.

If you use a "clever" scheme to store / generate your salt, you make your system more complex in an attempt to hide something that does not really need to be hidden. You might also end up introducing implementation flaws.

That's why bcrypt uses nonces for its salts, and simply appends them to the encrypted passwords.

Also see http://en.wikipedia.org/wiki/Kerckhoffs%27s_Principle


Many developers need to read and understand this.

I note that the question was migrated from StackOverflow to security.stackexchange.com. After all, this is an issue that the security nerds care about. Why bother ordinary developers with it? </sarcasm>


I'd like to see sites offer the option of not using password-based authentication. Instead I'd like to see public key based authentication as an option.

Basically, the site would have a copy of my public key (say my GPG key or an ssh key), and to authenticate I prove that I have access to the corresponding private key.


You've just described SSL with client certificates. Works perfectly well, is extremely secure, and has an extremely bad GUI in pretty much every browser ever. (It's somewhat difficult to use "on the road", but that's arguably a security feature.)


I ponder the quality of the GUI is a testament to the frequency of the mechanism's use.

But the worse GUI is arguably better than no GUI at all:

http://code.google.com/p/android/issues/detail?id=8196 and http://stackoverflow.com/questions/357491/iphone-client-cert...

So, for the purposes of the growing mobile browser market (where having this would be the biggest benefit IMO), this is not applicable.

Which is a bit unfortunate.


The only website I log in with using client certificates is https://www.startssl.com/ and it seems to work quite well. The UI is pretty rubbish in Firefox though I agree. Not tried other browsers.

There's an addon for Firefox called Enigform (I've not tried it out yet) which uses PGP for web authentication:

http://enigform.mozdev.org/


Easy to use on the road - don't use soft certs, use a smartcard! Works for every DOD network user around the world today...


With smartphones this is increasingly feasible. Lots of sites already use smartphone based two factor authentication (similar to rsa keys). There's no reason why a challenge / response system couldn't be set up using smartphones. For example, a website gives you a string of numbers, you input those in your smartphone app and get the response which yiu then input back to the site, the site can't determine the response ahead of time but it can validate it.


This is how blizzard (World of Warcraft) authenticators work. It's quite ironic how an online game has strong security mechanisms, yet many tools that people use just as often or more (gmail, facebook, pretty much all SaaS tools, including business) and that are are certainly more 'important' (in an objective sense, I understand that people are more attached to their WoW character than to their customer database) don't.


gmail at least has two-factor authentication

http://googleblog.blogspot.com/2011/02/advanced-sign-in-secu...

yet my bank's web site, not so much.


WoW and gmail and other 2-factor authentication use the simpler method, 2-factor authentication using a shared secret that isn't passed over the wire (the seed for a PRNG). However, this method is still vulnerable if a hacker gains access to the seeds on the host computer. This is what happened to RSA recently and it's a very bad thing when it happens.

What I was talking about was something different. Public/private key encryption where only the public key is stored on the server.

Here's an example scenario: the server generates a random pass-phrase then encrypts it using the public key. The end-user then uses their smartphone where the private key is stored to decrypt the message and return the original pass-phrase back to the server, proving they have the private key. There are similar ways of achieving a similar result that are less cumbersome and awkward. The advantage is that if the public key is leaked it's not a big deal, it can't be used to gain access to the system.


Facebook also has this feature. It'll send a SMS with the security code and you'll type in the web UI. Since it's SMS and not an app, it also works on non-smart phones. You need to link a mobile to your Facebook account. https://www.facebook.com/help/?page=132501803490562


That's already possible with SSL sites - sign up for an account at www.startssl.com if you want to see a real-life example.


Oops! StartSSL was compromised this month, although they say the issue is remediated now. http://news.netcraft.com/archives/2011/06/22/startssl-suspen...


Yes, using client certificates for authentication does not make your service immune from security problems.


This is how we use github among other things. AFAIK there isn't standard support for this sort of thing in the browsers though.


I'd prefer they'd use OpenID. It would be easier for them to implement, and it'd let you use PKI with providers like https://certifi.ca/


certifi.ca's very own cert is expired. That doesn't make me trust them much.


I don't use them either (since they don't support CNAMEing your own domain to them like MyOpenID), but it was just an example, there are more, and you can even install your own.


What do you do when you loose the keys? And the default behavior or enabling keys whenever your computer is open? One click bank account access?


    What do you do when you loose the keys?
You don't. And ideally you have a single key. If you can be trusted to keep a social security card and a passport, you can just as easily keep a key safe. Print it out. Store it in a safe place. We've been doing that for centuries.

    And the default behavior or enabling keys whenever your computer is open?
You can password protect your keys.


Passports and social security cards are physical items. Because of this, they have the inherent property of not being capable of ubiquity.

If someone steals your passport, you'll know: you won't be able to find it.

Digital keys have no such properties. If someone steals your private key, you will have no idea until you see them steal all your money and accounts.

Physical items also need to be carried to a destination to be used. If someone steals your passport, they may be able to take over your bank accounts, etc. For that, they have to actually go in person to meet a bank manager and pretend to be you. People do that fairly successfully, but it's hardly an efficient process.

A digital passport/key, on the other hand, could be abused immediately after it's been stolen, and could be used across all your accounts within the hour, before you've even realised you've been robbed.

Finally, you can only use one passport at a time. Not only it takes time, but you need a career criminal dedicated to each "process".

A digital key robbery, however, requires no human element, and thus can be done in parallel at a large scale. One could use Trojans to capture the keys of a large number of people and steal their money in an automated manner without ever showing up at a bank.

All those problems can perhaps be solved, but they are not easy and they have nothing to do with keeping the key safe - more to do with keeping the process of using the key as inefficient as possible. The best way to ensure your digital key cannot be abused is to make sure that you can only use it in person in front of other human beings, on authorised hardware.

So, pretty much like the way our bank cards work, then.


I think we have to distinguish between stealing and cloning here. If someone steals the scrap of paper you wrote your key on, then you will know. And passports do get cloned; the issue is convenience.

1. http://www.independent.co.uk/news/uk/home-news/clone-wars-mo...

2. http://www.expatsvoice.org/forum/showthread.php?t=7001

I guess that Mossad can gather the information needed to clone a passport in under half a minute.

So while I agree that the analogy between key reminder and password is not perfect, the point is basically sound.

(On rereading your post, your point [p]hysical items also need to be carried to a destination to be used made me realise that I might have misunderstood the point you were making, but also that you may have misunderstood the point ihodes was making).


The Brazilian government has been trying to popularize digital certificates stored in smart cards and tamper-proof USB tokens that seem to be a solution to the problems you pointed at. The main barrier to adoption right now is cost, mainly because certificate issuers have little competition and a good chunk of money between infrastructure and concession fees to recoup.


Here in Portugal our new national ID cards all have a public/private key pair and the card itself can sign stuff without copying the private key to the machine, making the process very safe even when using it on a public machine.

In a couple of years everyone will have on of those cards, the problem is that nobody has readers.


Here in Belgium we have the same cards, and the readers are cheap. The reason they will be wide-spread is because you can use them to file your taxes online, which many people are starting to prefer over the paper version. I think it's a good incentive for people to start buying these things. I just hope they will become standard on computers soon, but I fear not because the dominant markets (US, really) don't have such systems.


Yeah, if they start requiring the cards for delivering the taxes online I think we'll have millions of PCs with readers in a year or two.


The US military uses smart cards with X.509 certificates for authentication: http://en.wikipedia.org/wiki/Common_Access_Card


I have read this article and read the Wikipedia entry on bcrypt and I still cannot understand something. In this article it states that : "As computers get faster you can increase the work factor and the hash will get slower.". How can you make the algorithm slower over time and still be able to validate user passwords that were stored before you changed the speed? Could anyone enlighten me on this?


I believe that the work factor is encoded with the hash, so bcrypt can identify which work factor to use.


Thanks for your answer. At first I though it didn't make sense because the hash should be one-way and bcrypt should not decrypt the hash or mess around with it other than comparing it with what the user entered but I have searched a bit more and found this post on stackoverflow that explains in detail : http://stackoverflow.com/questions/4443476/optimal-bcrypt-wo...

Your answer was right, it stores the work factor, the salt and the hash so that at any given time you can change the work factor and it will adjust in the database!


The easy way is to support the previous work factors and increase it next time the user logs in (since you'll have the password in plaintext in memory). Either way, bcrypt with a paltry work factor of 7 or 8 is orders of magnitude slower than md5 and sha. Jack that up to 12 or 13 and you're pretty much good to go for years, the only problem is the near 1-second processing time on semi-current hardware.


It's a hash function and, at least over an insecure connection, you should not be transferring the plaintext password from client to server so it's not guaranteed that you will have the plaintext password.

Assuming you're using a secure connection and you're willing to send plaintext passwords over it then yes, you could re-hash the password when a user logs in.


In any kind of scheme where all the server stores is some kind of hash of the password, how are you going to verify a password without sending the plaintext password to the server, so the server can compute the hash and compare against the stored hash?


You could do that with something called a Zero-knowledge password proof[1], of which SRP[2] is an example. As far as I know, SRP is patented. It is extremely clever, though.

[1] http://en.wikipedia.org/wiki/Zero-knowledge_password_proof [2] http://en.wikipedia.org/wiki/Secure_remote_password_protocol


To the best of my knowledge the SRP patents are not a big deal, and SRP is used in a number of commercial products. The bigger problem with SRP is that it's dangerous; it is difficult to screw bcrypt up, but it's very easy to screw SRP up; if you have to implement it yourself, dollars/donuts you're going to end up with an implementation that allows me to log in without a password.


I agree. I should just clarify that I'm all for bcrypt, having used it and recommended to friends/clients/employers. When I found out about SRP, I was researching on ways to do exactly what tzs asked, i.e., verifying a password without sending the plaintext (or plaintext-equivalent). In my case, the client had a login form that couldn't be placed behind SSL (don't ask why). Do you know of any solutions to this problem?


One possible option is challenge-response authentication. There a number of variations, but the simplest example is to hash both the hashed password and a nonce on both ends and check the resulting hash[1]. This means that the server only requires that the hashed password is stored, and the client never sends the password in plaintext as it generates the hash itself. As long as the nonce is unique each time, we can prevent replay attacks.

Various schemes introduce a client nonce and server nonce (to prevent man-in-the-middle attacks) or generate a shared secret nonce through methods similar to the Diffie-Hellman key exchange (so attackers can't work out the nonce).

[1] Ensure that h(h(passwd)+nonce) is equal on both ends where h(passwd) is either stored (server) or created by the user providing the plaintext password.

[2] http://en.wikipedia.org/wiki/Challenge-response_authenticati...


The goal of hashing the passwords is to not store them in plaintext in the database, in case the server is compromised. In your case, an attacker doesn't need the password, the hash is enough to login.


I suppose you mean "goal <...> is to not store them in plaintext"? I was confused there for a second.


Indeed, thanks. I fixed it.


It's rather obvious that we're talking about a system in which the client submits the clear-text password to the server, and of course that should be done over SSL.


The vast majority of site authentications (including many very large sites) use http, so their authentication code should be hashing and salting the password on the client-side and sending the result to the server for authentication. In this model, the site never actually has a plaintext copy of the password.

In this model, the server would receive the (unsalted) hash of the password when the password is set, and a salted hash for each login. At no point will the server ever have the plaintext password.

I describe the implementation details here:

http://loewald.com/blog/?p=1374


This is not a secure system. It is completely broken. The hash simply becomes the password when you do this. All an attacker needs to do is replay the hash and they're in. The "plaintext" is irrelevant, because an attacker doesn't need it.

Also, if you're storing the unsalted hash in your DB, your security is doubly broken. Someone who accesses your DB can more easily brute force the passwords (it looks like you're using a single salt rather than a per-user salt), and your scheme is vulnerable to rainbow attacks as well.


If you know the previous work factor of a hash, you could of course re-hash it New-Old factor times.


Here is an example:

https://gist.github.com/1051238



So does Devise, one of the most popular authentication Ruby gems.


The article claims it takes bcrypt 0.3 seconds to hash a 4 char password on a laptop.

How does a server authenticate users in high volume with bcrypt? ~0.25 secs per auth request might warrant having a separate server just for authentication.


Several of the largest sites in the world have scaled bcrypt. There are longer answers as to why this is not a big deal, but if it helps to kill the red herring that bcrypt might not scale: Twitter uses it.


I'm really interested into the longer answer!

I'm sold on bcrypt but like to hear scaling stories :)


One possible, slightly longer, answer would be that "login" is a relatively rare operation in the grand scheme of things.

Users (usually) need to login only once per session. And on sites that have the little "Remember me"-tickbox a session may very well last for days or even months.


Please, please, share the long answer! (or one of them, at least...)

I would think that using bcrypt for authentication would not only consume more server CPU power, but would also open the door to denial-of-service attacks (if crackers attempted brute-force attacks on the log-in system). Clearly "use bcrypt" is just part of the answer...

I had some thoughts on a way to deal with this:

http://news.ycombinator.com/item?id=2705915

(but my ideas have their own set of problems.)


It depends on the work factor. On my dev server (a pretty old machine), with a work factor of 7, it's about 300 time slower than md5 (about 10ms per bcrypt hash). That's still plenty fast, and much more secure. Bump it to a work factor of 9, and I'm looking at about 1000 times slower (or getting close to 40 ms per hash).

Part of its beauty is you can adjust the work factor to match your hardware speed requirements.

If you really want ultra-secure, a work factor is 13 is getting pretty slow (about half a second per hash on my crappy machine), and yeah, that might justify an authentication server.

(note: my tests were not exhaustive)


Could you design a system where the hard work happens on the client side? For ex, the server sends the client a secret encrypted with the bcrypt hash.. and then the client machine spends a couple seconds working on it and sends back the response. It might allow for a huge work factor without impacting the server.


No, you couldn't. Nothing prevents malicious client from "letting itself in", no matter whether password was correct or not.


bcrypt at 10ms only being 300 times slower would indicate you can only try 30,000 MD5 hashes per second -- which is incorrect. Using a GPU based cracker on a current generation top-end video card you can calculate over 300 million per second.

Bcrypt at 10ms is 3 million times slower, not 300 times.


Your comparison is flawed - you're comparing GPU evaluation times for MD5 against CPU evaluation of bcrypt.


There is no such thing as a GPU based bcrypt hasher because it woudln't be - from my understanding - any faster then a CPU based one. Not all calculations are able to be sped up through the use of a GPU.

So I'm comparing the fastest you can do bcrypt hashes (100 a sec at that work factor) vs the fastest you can do MD5 hashes (250 million per second)


Most of your traffic should be coming from people with logged in cookies. If you make people enter their password on every page, you won't have to worry about high volume. :)

Note that bcrypt is basically insensitive to length of password. 4 chars, 8 chars, 32 chars, all take the same amount of time. And it's tunable. You could go down to only 0.01s per hash and still be a million times slower than plain MD5.


I don't think he's talking about users... I think he's talking about high volume web services.


Webservices usually use signed requested[1][2] for security, not username/password pairs.

[1] http://oauth.net/core/1.0/#signing_process

[2] http://en.wikipedia.org/wiki/WS-Security#Features


I'm in the process of writing some code that is hitting these very scenarios and characteristics right now. The metrics we put on the API call showed a remarkable spike when we switched from plaintext (in development mode) to bcrypt hashes.

However, after logging in and establishing a session, there's really no impact to the user. We were content to trade the speed of lesser hashing algorithms for the security offered using bcrypt. The only noticeable difference is that our login process went from near-instantaneous to a marginal, but noticeable delay, on login.

(The same delay can be noted when setting new passwords.)


I guess by the time there are several users per second just authenticating, the $200 for a GPU that can boost hash performance 100x won't be the problem any more.


An idea would be to use a JavaScript implementation of bcrypt to let clients do the math themselves and just send the encrypted password via HTTPS. This has several advantages. One being that the server-side performance is reduced heavily. Another one is that a password never shows up on the server in plaintext ever.

Disadvantage: Clients need to have JavaScript enabled.


If you are doing this, then the hash IS the password. That means if someone steals your database, the could log in as all of your users just by tweaking the client-side JavaScript with grease-monkey or similar.


What if you use a cheaper hash function (e.g. bcrypt with a lower work factor) on the server? The bcrypt hash is the password, but a very long one compared to the password that the user entered, so presumably it is very expensive to brute force even with a relatively cheap hash function.


Right. I just realize my approach is flawed ;)


That doesn't work. Among other issues, Javascript is so much slower than C or CUDA at such tasks (no native integers!) that the client is going to take forever to calculate a sufficiently-hard hash.


That might be a valid point. I just checked this: http://javascript-bcrypt.googlecode.com/hg/test.html The initialization phase takes very long.


I think the most significant issue is that JS on the client is an absolute minefield for cryptography. Apart from the speed issues, it's impossible to secure...


People keep posting this here. But I think http://www.tarsnap.com/scrypt.html should probably be considered the best way to do this today. Google is even using it in ChromeOS.


It is true that scrypt is better than bcrypt, but the transition from salt+SHA-1 to bcrypt is significatnly better than from bcrypt to scrypt, and scrypt doesn't have nearly as nice of an interface as bcrypt does.


The easy interface and wide availability of language bindings to bcrypt is a big deal. The easier it is to use, the more likely people are to use it, and the less likely they are to mess it up. Scrypt isn't quite there yet, and bcrypt is still very good.


I am surprised nobody had already talked about scrypt yet. Here is an old hackernews entry about it.

http://news.ycombinator.com/item?id=601408

I would have loved to use scrypt, but there is only a C implementation. I would had loved to have at least a javascript one.


Javascript? Have fun securing that one against timing attacks. Also, it's so much slower than the C implementation (remember that crypto algorithms are very carefully designed for 32- or 64-bit words or arbitrary-length integers; Javascript's doubles are a very poor match.)


Timing attacks are not a problem if you're using it to hash passwords. Also, Javascript implementations don't use doubles for pure integer arithmetic.


If I can observe your (SSL-encrypted) login session, I can time how long the server takes to process your (presumably real) password. What makes you think password hashes don't need to worry about this?


Would you really be able to do timing attacks on a server from the client side? I thought you needed a much greater accuracy of measurement, sub-millisecond, than you would be able to achieve with network latency in the mix.



But it would only reveal the size of the password, and you'd need to know the work factor, or the other way around, right? Or am I understanding it wrong?


Doh! To answer your question, probably the fact that it was late, or that I'm an idiot.


Unless the user is logging in 10000 times in a row, how do you plan to get enough data?


I have to point out the slowness of JavaScript is real in this case. PBKDF2-HMAC-SHA256 with 5000 iterations (and scrypt uses PBKDF2 on input and output of a memory-hard function) takes ~2 sec. on my machine in Chrome. Even new typed arrays are not very helpful.

This means that you essentially have to make your KDF's workload much weaker than if you used C implementation to get an acceptable performance.


It's good to raise awareness of this issue. When more devs began using bcrypt or scrypt, offline password cracking will be much, much more difficult.

The only reason GPUs are cited as testing 600 million hashes a second is that the underlying hashes came from a Microsoft Windows Active Directory where they were simply MD4 encoded. That speed is not possible with bcrypt. Devs need to understand this.

Edit: Yes, that's MD4 not MD5. Microsoft Windows NT hashes are simply Unicode strings that are MD4'ed. This includes Windows 7 and Windows 2008 server.


And note if you already have this hash you can use it to login directly anyway as most Windows network protocols take this hash directly. The real important thing IMO is NTLM challenge/responses based on the hash, which unfortunately is not much better. In case of NTLMv1/MS-CHAP it is three 56-bit DES operations on separate parts of the 128-bit hash (the third being only 2^16 so it is easy to precompute, as shown by asleap). NTLMv2's HMAC-MD5 is fast too.


Honest question. Why would someone who handles more than a couple-dozen login attempts per second choose to use bcrypt? It would seem that the computational overhead of supporting bcrypt at scale would not make a lot of financial sense.


You can change the work factor to scale the amount of time it takes to do the processing. Beyond that, there's a cost to security. If you actually need security, you may need to pay for it. How much is it going to cost you to buy an extra server vs the loss of business due to a serious security breach?


In the court of public image, simply loosing the data in the first place will cost you business, now matter how secure that data may be. The trust to secure your data will be lost, regardless of whether you used bcrypt or sha or plaintext.

If you consider the loss of business to be pretty much constant in the event of a loss of control of the customer's data, what is the value of extra server(s) (plus the cost of maintenance) for one of the least valuable pieces of information a customer can give you?

As an example, look at Instapaper. How many people are more concerned that the government has access to their sha1 encrypted passwords as opposed being concerned that the government has free access to their reading history?


For sure there's going to be loss of confidence either way. You're probably going to have a bigger loss if the security experts are pointing out that you couldn't even get password storage right, though.

Obviously they It's up to you whether you care enough about your customers to try to get security right. You gave a reasonable argument for why using bcrypt instead of md5 makes little financial sense. However, that same argument applies to plaintext vs hashed. Or hashed vs hashed with salt. In the end, you need to decide how much you care about security, and also weigh how much security you are morally obligated to provide.


Plaintest vs. hash, or salted hash vs. salted hash do not incur the same performance penalty.

And while I may feel morally obligated to provide great security, the people I report to would not be as impressed with that in a cost-benefit analysis.

tl;dr It's a great idea, but a tough sell, particularly at scale.


My hashed/unhashed comparison was not a performance statement, but a trust statement. If a breach will lose the same amount of confidence regardless of password storage, why bother with even hashing?

I can't imagine how "good security - scalable enough for Twitter" can be a hard sell. Your customers certainly won't by sympathetic if they learn that you willfully chose weaker security to save a buck.


Depends. Assuming you're only hashing the password once per session, I doubt bcrypt would make a huge dent in your overall CPU time.

With a work factor of 7, hashing might take around 10ms. Which is not a whole lot, but much, much more secure than MD5 and SHA1.


Can someone help me understand what a password cracker does with a list of salted/hashed passwords? How do they know they've figured out the right plain text passwords without bouncing them against the authentication logic?


Typically attackers do this during an offline brute-force attack (where they have a copy of the hashed password, probably along with other user information from the database).

The attack is done by getting a list of plain text passwords and then running them through the same hashing algorithm as was used originally (adding the salt value if present). Once the attacker has done that they just compare the hashed strings. If they match then it'll be the same password.

Commonly the attacker would start with a dictionary of common passwords and submit them along with common variants (eg, password, Password password1 passw0rd).

If that's not successful they can move onto pure brute force (eg, a , ab, ac, etc)


i suspect you don't know that the salt is stored with the password [edit: when using standard libraries like crypt and bcrypt - please don't invent your own scheme]. so when someone steals the password list they get the salt too.

the salt is not "secret" - it is stored in plain text for each password. it does not need to be secret to do its job (defend against rainbow tables).


Sometimes the salt is stored with the password. I sometimes see having a hardcoded salt referred to as 'security by obscurity', but considering that many attacks result in just a database dump, not giving attackers access to (at least part of) the hash is a useful security layer.


The real question should always be how you detect and handle these attacks. Allowing someone to attack your service for 12 years and eating up your resources in the meanwhile just sounds too passive a solution.


That's not the attack you worry about: instead, consider the case where someone somehow obtains the database and can do an offline attack on it. Be it a SQL injection or account compromise (or sheer negligence and publishing the database), once that happens you'd better handle passwords reasonably well.

If the only attack situation you're worried about is a online guessing attack, then there's no need to even hash passwords.


Sounds to be the case, thanks for the tips.


What always annoys me with the discussion of passwords is that everybody here focus on a technical solution that allows the user to continue to use insecure passwords.

That isn't the problem. Reuse is. And the best way around that is to not let the user select the password, just generate it server side. Technically this is easier to get right than some complex password generation scheme and the end result is properly better too.


Reuse is a problem, but weak passwords are the biggest problem. If you have a strong password that never gets cracked/leaked/intercepted but you use it everywhere you're still fine, but that's not recommended.

Generating server-side passwords is horrible as well since you're counting on the user to write it down and/or let the browser's password manager save it. What if the user wants to log in with a different machine/browser?

I think the best solution is to, first, forbid passwords under 10 characters. 8 is still relatively secure, but not really, 10 at least gives you a fighting chance. Second, run your own dictionary attack (you can load /usr/share/dict/words into a hashtable and check for membership in sub-second times..in fact, here:

    import sys
    f = open('/usr/share/dict/words')
    words = {}
    for line in f:
      words[line.replace('\n','')] = None
    
    if sys.argv[1] in words:
      print 'You fail'
) and alert the user if their password sucks and to pick a new one. (You might want to check for membership of their password with an 's' or a '1' or something added on as well.) You might also want to do some sort of complexity analysis as well that requires e.g. at least one uppercase, one lowercase, one number.

The infrastructure problem is places like Banks saying "Enter a password between 6 and 12 characters and only use alphanumeric characters!"


You can also initialize the words in more pythonic way:

  with open('/usr/share/dict/words') as wordlist:
      words = set(line[:-1] for line in wordlist)
Or, for one time check you can stop reading on match:

  import sys
  lookup = '%s\n' % sys.argv[1]
  with open('/usr/share/dict/words') as wordlist:
      [ sys.stdout.write('Dictionary password: %s' % lookup) or sys.exit(1) for line in wordlist if line == lookup ]
As for the objective, this checks only exact dictionary matches. In real world, you'd use cracklib to check any dictionary based or weak passwords rather than reinvent the wheel. See http://gdub.wordpress.com/2006/08/26/using-cracklib-to-requi...


While I agree that the first code is pythonic and nice, the second code is quite the opposite of that. Why forcing imperative code into a list comprehension? It is a lot easier to read as nested loop, as it doesn't build a dummy list with some strange "or" operation:

  import sys
  lookup = '%s\n' % sys.argv[1]
  with open('/usr/share/dict/words') as wordlist:
      for line in wordlist:
          if line == lookup:
              sys.stdout.write('Dictionary password: %s' % lookup)
              sys.exit(1)


better to pickle a frozenset and use that.

  import pickle

  # initialize:
  wordset = frozenset(line.lower().rstrip() for line in open('/usr/share/dict/words'))
  pickle.dump(wordset, open('/tmp/wordset.pkl', 'wb', -1))

  # when you want to use it:
  wordset = pickle.load(open('/tmp/wordset.pkl', 'rb'))

  'bear' in wordset # == True


Thanks for the frozenset. Yet pickle makes the file 2 times bigger and 20 times slower to parse on my machine:

  >python wordlist.py -w wordlist -p wordlist.pkl
  >python wordlist.py -T wordlist wordlist.pkl 10 3
  2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)]
  words(wordfile): 142.50 best msec/loop
  words_slower(wordfile): 161.00 best msec/loop
  words_normalized(wordfile): 295.84 best msec/loop
  words_pickled(pickled): 2943.66 best msec/loop
The winner is:

  with open(wordfile) as wordlist:
      return frozenset(wordlist.read().split('\n'))
Source: https://gist.github.com/1059725

Wordlist: http://www.freebsd.org/cgi/cvsweb.cgi/src/share/dict/web2?re...

Do you have different results?


right, the space saving harmed readability indeed, thanks


" If you have a strong password that never gets cracked/leaked/intercepted but you use it everywhere you're still fine, but that's not recommended."

That unfortunately puts a lot of confidence in the security of the sites that you give your password to. Your super safe password is not that safe if a site stores it as plaintext or MD5 and it gets hacked.

Otherwise I agree with your comment.


That's a bit too simple - it won't catch password1 or iloveyou, both of which will definitely be tried by any decent cracker. Try http://www.openwall.com/passwdqc/, by Solar Designer (who has, among many other things, written the John the Ripper password cracker.)


Another option is to test password complexity in javascript on client (ála Google and others), which provides immediate feedback to the user. This is prone to a dictionary attack only if the user uses well known character replacements like 'p@$sw0rd', which pass these tests undetected.

https://github.com/search?q=password+strength https://sun.athnic.net/password-test.html http://www.geekwisdom.com/dyn/passwdmeter http://cafewebmaster.com/check-password-safety-javascript-wh...


A middle ground would be to generate password suggestions, but give the user the option to change his/her own password.

I used a method a few years ago to generate passwords for users on an intranet. They were free to replace them, but a poll showed that most users went with the suggested password. I focused on generating passwords that users could pronounce (and therefore remember), like gekava86, lofesu75, 23wopeni, etc.

At the time, I didn't use BCrypt so they were a bit weak (though certainly stronger than whatever the users' regular password was). If you use BCrypt, you don't have to worry as much about the password being "strong", the focus comes back to it being unique.


Eight characters for a password seems plenty long to me, if you're truly using a random password. You have [A-Za-z] = 52 chars, [0-9] = 10, [!@#$%^&()_-+=;':",.<>/?\|[]{}] = 30, for a total of 92 possibilities. Then you have an 8 character password. Even if the attacker knows you're using 8 characters, you've got ((92^8)0.3)/60/60/60/24/365/100 = 271 centuries if it takes 0.01 seconds a try. And we're really talking over 8 thousand centuries if we're having bcrypt take 0.3 seconds.

So the problem is that passwords tend not to be random. This greatly decreases the problem-space. But b/scrypt solves the problem of length, within reason.


With $10k of GPUs, you can compute an MD5 hash for every 8-character ASCII printable password (yes, all 95^8 of them) in under a day.

This is why MD5 is not a good password-based key derivation function. :-)


Yet with just [A-Za-z] and 10 random chars you're at 52^10 ~ 1E17 while your wider choice set gives 92^8 ~ 1E15, so ten random alpha chars is really better than ten alpha+number+punctuation. Suddenly it takes a paltry 229 thousand centuries (or 22 million years) with bcrypt at 0.3s.

Using punctuation does not make a password "more random", it only makes the random choice set slightly larger. Bumping the length is much more significant than bumping the set as it's exponential.

The debate is whether it's easier for a user to both pick and remember a "random-enough" - i.e not easily bruteforce-able, regardless of hashing technology - [A-Za-z]{10} password or a [A-Za-z0-9!@#$%^&()_-+=;':",.<>/?\|[\]{}]{8} password.

For the record, Google's 2-factor auth generated app-specific passwords are [a-z]{16}, which gives 1E22, or 230 billion years with bcrypt at 0.3s.


I fully agree, but I'd focus more on the length than on the character set. It is not a problem if the password contains only alphanum characters, as long as those are random.

For instance, a plain alphanum password of length 9 is stronger than a password of length 8 that allows for special characters. (because 62^9 > 94^8)

In other words, adding one character to your password is as good as choosing from a bigger character set.

Okay, beginning at length 10, you'd need to add two characters, but that's really all you need to do up to length 20. (which is an insane length for a random password anyway)


C^x grows a lot faster than x^C, except for specific x's and C's you're almost always much better off adding to the length. And 20's nuttin'. At the risk of publishing more hastily written code for critiquing, here's a shell script password generator I use sometimes. https://gist.github.com/1058565


I personally hate it when sites do this. That's probably the best way to make sure I don't come back, as I'm not likely to remember the generated password. Worse, how do you give them the password? Email?


Also you have to trust the website: They shouldn't know your password, but if they just gave it to you then how do you know they're not keeping a copy themselves somewhere?


Can you stop a website that wants to save your password?


There are a lot of misconceptions on what makes a "secure" password—fluffy is puffy takes longer to crack than J4fS<2.

http://www.baekdal.com/tips/password-security-usability


"fluffy is puffy takes longer to crack than J4fS<2."

Assuming the attacker brute-forces all 3 word combinations, maybe. But "fluffy is puffy" is a valid English sentence and language is a rather poor source of entropy.


The problem with that approach is that people are very bad at remembering random strings of text, so they're likely to write this down or copy it somewhere on their computer.

Depending on the attack scenarios, that may not actually be more risky than allowing users to choose their own password..

I'd say that the best compromise of these approaches would be to encourage the use of password safe programs which at least store the password list in an encrypted format on local machines (of course this still relies on the user choosing a strong password for the password safe..)


Let's say I use bcrypt and want to increase the work factor every year. Is it possible to determine the work factor of an existing hash, or will I need to maintain this information alongside the hash? The java and python bcrypt APIs don't have any function that returns the work factor.

Can I increase the work factor of an existing hash, or must I wait until the user logs in and then use the plaintext password to generate a new hash?

Is there a bcrypt API that provides a hash comparison function that addresses timing attacks? The py-bcrypt example code uses the '==' operator to compare hash strings, leaking timing information:

    # Check that an unencrypted password matches one that has
    # previously been hashed
    if bcrypt.hashpw(password, hashed) == hashed:
            print "It matches"
    else:
            print "It does not match"
(from http://www.mindrot.org/projects/py-bcrypt/)


To answer my first question, the work factor can be extracted from the hash. It is stored as a 2-digit decimal number in the fifth and sixth bytes of the hash. See http://code.google.com/p/py-bcrypt/source/browse/bcrypt/bcry...


I recently put together a commentary + code on how to upgrade passwords hashed insecurely in your DB without user intervention. This was in response to MtGox talking about "slowly migrating users", which presumably means upgrading passwords at the point of login:

https://gist.github.com/1051238


If I'm not misunderstanding your purpose, why not just:

1) Take those md5 unsalted passwords and 2) bcrypt them

Then upgrade the users to non-pre-hashed bcrypt as they log in?

?


Why do you think that isn't what this is doing, only with fault tolerance to allow more than one scheme upgrade?


Because the juicy bits got cut off due to the way github is displayed on my phone. Sorry I couldn't see that portion!


In security the most important thing you can teach people is to be aware when things are harder than they look so they'll take a bit of extra time educating themselves and talking to other people.

Learning how to separate good from bad advice is a skill that needs to be maintained. Also, in all software that is supposed to provide some form of security, one has to be prepared for the eventuality that it probably contains errors.

I used to read a lot of books on cryptography and the use of cryptography. I've forgotten most of it by today, and to be quite frank: the more I know, the less I want to write crypto software. There is something deeply unsatisifying about work where you know that what will in all likelihood trip you up is some trivial, stupid mistake.

It isn't hard because of the crypto itself. Sure, certain cryptographic libraries can be extremely awkward to use (which in itself is a security risk), but the problem usually comes from where you aren't looking for them.

I cringe a bit when people advertise software as being secure because it uses this or that encryption scheme. I also cringe when people claim that they "encrypt databases" and their systems are therefore secure -- because, for most usage scenarios, I can't think of any really secure way of doing this. Not without compromises anyway. And while I honestly know that I have just a rudimentary grasp on cryptography, I know a lot more about it than most people who make products that hinge upon correct application of crypto.

Just the other day I was trying to determine how many rounds I wanted to use in bcrypt for storing passwords for a given system. I think I spent most of the day pondering this question, writing benchmarks and reading up on what other people said on the topic. A couple of days later a friend of mine emailed me a code snippet that implemented the password hashing scheme of a commercial product they use that makes shameless claims about being secure. (I think he probably just looked at the hashed passwords and made a guess about the method they had used. I don't think he had a look at the original source code). If memory serves the product uses unsalted SHA1 hashing. In other words, the vendor didn't even bother thinking about the problem.

What scares me a bit is that even I thought "well, if they claim to have well thought-out password handling and they are in the business of selling security systems, I suppose they have probably given this a lot of thought" the first time I visited their website. After all, large companies give them millions. Right?

I wonder what other things they are doing equally badly.


Does anyone know how Djangos built-in password hashing fares?


Looks like the built in stuff uses either sha1, md5, or crypt: https://docs.djangoproject.com/en/dev/topics/auth/


http://code.playfire.com/django-bcrypt/ Backwords-compatible bcrypt support.



To see if I understand bcrypt correctly, would this pseudo code achieve a similar adaptable computational cost?

  function(uniqueSalt128Bit, password, rounds) {
    var hash = SHA1(uniqueSalt128Bit + password);
    var length = Math.pow(2, rounds);
    while (length--) hash = SHA1(hash);
    return hash;
  }



Or a stretched newer algorithm like Whirlpool. Or stretched SHA-256? What advantages does bcrypt have over them?


It's been around for years and remained solid. It's implemented in every language under the sun. It can be tuned to be plenty slow, even today.

That said, there's nothing wrong with choosing another password derivation scheme. glibc's crypt() provides three: PHK's MD5 scheme, and Ulrich Drepper's stretched SHA-{256,512}. Just don't roll your own and you'll be ahead of the majority of programmers.


What properties of bcrypt make it better than SHA-512 with more iterations to make the time taken to process it equivalent? Is there something about the algorithm that makes it more difficult to speed up via GPU/hardware?


Why not PBKDF2?


PBKDF2 is fine. bcrypt has better library support. I'd rather you used a bcrypt library than try to roll your own PBKDF2.


bcrypt is also slightly stronger than PBKDF2, since it requires a larger circuit to compute.

The only reason you would want to use PBKDF2 is if (a) you already have a hash and don't want to add blowfish to your code/circuit, or (b) you want to be standards-compliant.


According to the scrypt website, PBKDF2 is weaker as compared to bcrypt and scrypt: "We estimate that on modern (2009) hardware, if 5 seconds are spent computing a derived key, the cost of a hardware brute-force attack against scrypt is roughly 4000 times greater than the cost of a similar attack against bcrypt (to find the same password), and 20000 times greater than a similar attack against PBKDF2."


On .Net this is definitely the way to go.

http://msdn.microsoft.com/en-us/library/system.security.cryp...

Make sure to use the defaults or better in terms of rounds. The salt provided when you don't supply one is cryptographically random.


There is a .NET implementation of BCrypt:

http://bcrypt.codeplex.com/


Now, how do I explain to my girlfriend that picking longer, more complex passwords is a realistic security precaution and that I'm not just a paranoid nerd?


For PHP users, phpass is a really nice password library. http://www.openwall.com/phpass/


Shameless plug for PHP devs: https://github.com/SaltwaterC/PasswordHash2 - you may like this as well.


What work factor do you recommend to make it reasonably fast and still bruteforce-proof with the current HW?


So I assume if you use bcrypt, you only verify the password once upon login, and then store a cookie that verifies the user is logged in? That brings other security risks.

But the alternative seems to be to make every request very slow, because it would require bcrypting the password with every request.


If you were to re-authenticate the user on every request, how do you have the client send the password to the server on every request without making the user enter it every time?


By storing it in a cookie - oh :-)

I suppose you could come up with some scheme where you create a new cookie with every request, a kind of one-time cookie to prevent session hijacking. Probably not worth it and not 100% reliable, though.

I guess just trusting in cookies is the only real option.

Or HTTP BasicAuth, it sends the password with every request I think (unencrypted, I know). In either case in theory you need HTTPS.


Oh man, please tell me I'm misreading you, and you don't actually store your users' usernames and passwords in a cookie.


No of course not. I mean yes, you misread me.


http://www.openwall.com/lists/announce/2011/06/21/1 - some bcrypt implementations out there are still broken if they used this code. You may give the SHA-based scheme a chance.


bcrypt is so flawed...


If you want to stand by that assertion, please provide some information to back it up. You'd become very popular in this community (and the security community at large) if you could provide a good basis for that statement.

Otherwise, you're just trolling, and I've committed a cardinal sin.


and/or stop using a single word and use sentences (phrase)

eight characters is easy - try cracking 50


> and/or stop using a single word and use sentences (phrase)

True.

This isn't meant for the user, though. This is meant for the developers. No matter how hard the developers try, users will always pick bad passwords. If you use salted SHA-1, then if the database gets compromised there goes 50% of the passwords. If, on the other hand, you use bcrypt, maybe only 5% of the passwords get cracked.

Bcrypt turns a massive news event (database leaked; thousands of passwords lost!) in to something much less newsworthy (database leaked; twenty passwords lost).


Not too difficult if it's composed of dictionary words. Why not automatically generate random strings to use? No pattern at all.


Assuming 5 words selected from a dictionary of 234979, there are 716382975036689591261090899 possibilities. If you have 1000 computers each cracking 10 billion attempts per second, you're looking at about 1135817 years. I'm going to call that difficult.


with cuda and the cloud it's quicker to try all possible passwords (starting with aaaaaaa...) than to search a rainbow table of dictionary words so ":Hy6&z@z" is now no more secure than "password"


Rainbow tables don't apply to salted passwords.


But you can still do a brute force attack--that's part of why SHA1 has some issues--you can beat it with brute force and enough GPUs.


You can add the configurable slowness factor to any secure hash. Just use salts but leave out k bits from each when storing them. The more bits you leave out, the more work it takes to verify a password as the verification procedure needs to brute-force the missing k bits.


You can also write a new cipher, discover a new key derivation function, and invent a new hash function. But that's not the point. Use bcrypt or scrypt.


Any analogy between using a standard technique with a standard hash and going for DYI cryptography is a bit stretched.


Really? You just described a KDF, and advised people to use it without providing any references to implementations. "Just leave out k bits of salt and brute-force it" -- do you really expect non-cryptographers to be able to correctly implement this algorithm? If you insist on using hashes, at least you could tell people to use PBKDF2 to turn a standard hash into KDF.


If you leave out bits, any password whose hash matches the rest will work, even if it doesn't match the 'lost bits' (since your login server won't be able to verify that, obviously). The only thing you've accomplished is to make the cracker's life easier.


You don't leave out bits from the hash but from the salt. The hash is stored whole and (obviously) needs to match whole.


Even so this doesn't account for another advantage of bcrypt: its "de-optimized" key scheduler, which adds to the difficulty of cracking the hash.


The key scheduler in bcrypt is a red herring. I don't know what was going through their minds when they wrote that. They could have used a larger number of iterations and gotten the same result.

The only reason why bcrypt is better than PBKDF2 is that blowfish needs a large circuit to compute.


Another thread where seeing Karma would really help...

Maybe it should be optional?


If karma is needed to judge the quality of someone's post, then the karma counts are probably inaccurate anyway. At best they'd be interspersed with votes by people who just thought the answer sounded good without really knowing how accurate and informed it is. At worst, the majority of the votes would consist of that type.

HN is better without karma, just think of it as a place to find out about concepts you might not have been aware, then drill down on it with some more trusted source.


the weaknews is actually storing the salt or code or key with the password. using bcrypt helps a bit but doesnt truly solve the issue. i think thats important to point out. sha1 with a salt u cant find beats bcrypt with a key u know any day


No. Not at all.

If someone managed to break in to your website and get the password hashes, chances are they also have your "secret" salt. There is no reason to separate the salt from the hash, and, in fact, there are no implementations which do that.

However, if I can't convince you of that, then if you ever make a website that takes passwords, please use bcrypt. You can do your super-special-salt-separation-scheme, but just use bcrypt instead of SHA-1.


"If someone managed to break in to your website and get the password hashes, chances are they also have your "secret" salt."

I believe the GP was referring to the scheme where you don't store the salts at all (or only store some bits of each salt.) The verification needs to brute-force the salt (or the missing bits) each time it verifies the password.

The missing bits are quite similar to the bcrypt workfactor - the more you leave out the harder it is for both the attacker and the legitimate verification to verify passwords.


Then how in the world does your code know what salt to use when the user presents his or her password?

If it is derived in code from some other piece of user data then it is still "known" if your DB leaks - you have to assume someone who stole your database also stole your code.


"you have to assume someone who stole your database also stole your code."

Maybe, but that doesn't mean that a separate salt, not in the database, will prevent certain attacks, and as such is a viable option. Security is about layering, not about 'xyz isn't 100% secure in 100% of the cases, forget about it'.


That's fair, but it doesn't change the insanely wrong statement that triggered this comment chain:

"sha1 with a salt u cant find beats bcrypt with a key u know any day"

This is fractally wrong.


OK then we agree there :)


"Then how in the world does your code know what salt to use when the user presents his or her password?"

It brute forces the missing part. Obviously you may only leave out a number of bits that keeps it feasible.


Or you could use something like bcrypt with a configurable 'cost' and stop making up things that you think will secure your passwords.

Anything you can brute force with your hardware can be brute forced on someone else's hardware.


"Anything you can brute force with your hardware can be brute forced on someone else's hardware."

How doesn't this argument apply to increasing work factor in bcrypt?


Let me back up here and try to be less smarmy and more straight with the facts of the matter.

The primary reason I think your idea isn't favorable to using bcrypt is because of the weight of experience and research. Dropping bits from your salt might be a perfectly valid crypto protection technique, but when it comes to crypto, I'm inclined to go with research over cleverness.

If dropped bits from the salt were a viable means of securing passwords, I'd imagine that someone would have implemented something like this already. It's one of those "oh, that's clever" ideas, but the cleverness trap is a dangerous thing. Just because something is clever doesn't make it good.

This is one of the most oft repeated warnings when it comes to crypto: don't build your own, you'll do it wrong.


The difference is that bcrypt's cost factor is an intentional implementation with well understood results. Truncating part of your key and intentionally brute forcing it is a kludge.




Applications are open for YC Summer 2023

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: