Don't use bcrypt (2012) (unlimitednovelty.com)
66 points by dsr12 on Aug 26, 2013 | 42 comments

As a quick summary of the previous discussion:

  - use PBKDF2 if you deal with a pointy haired boss who needs standards and accreditations
  - use scrypt if a reliable implementation is available for your platform of choice
  - use bcrypt otherwise

PBKDF2 is worse than bcrypt (by roughly a factor of 5) and depends entirely on the security of hash functions; hash functions are more poorly studied than block ciphers.

Scrypt is better than bcrypt (by roughly a factor of 4000 according to cperciva). It's memory-hard, which makes it much harder to build an effective system that hashes as fast as possible.

Regardless, using adaptive hashing is what's important, and you'll be fine with any of the three alternatives.

Thanks to the comments from cperciva, moxie and tptacek. All errors in this summary are mine.

> Scrypt is ... memory hard

Haven't seen the answer to this: does scrypt being "memory hard" imply that you'll need a ton of RAM for your server if you get lots of users logging in concurrently?

Not really. In the scrypt paper, cperciva recommends parameters that allow it to do a single password hash in 64 ms, and in the process use only 4k of RAM. Nowadays, as processors are faster and RAM is cheaper, you'd probably want to up the parameters a bit, so that the hashing time will be similar (between 50 and 100 ms) on modern hardware, and the RAM usage will scale proportionally (probably 128k, based on the ratio of RAM prices between 2002 and now). That means that you would be able to handle 10-20 logins per second per core, using no more than 2.5 MB in the process. That assumes they were all run in parallel on separate threads, so that all of that memory was needed at once; that's obviously a worst case, and most likely they would be scheduled more sequentially and use proportionally less RAM, as you would free the memory after finishing a hash. On a 12 core machine, you might wind up using 30 MB at a time, to support 240 logins per second.

If you need to handle 240 logins per second (which presumably means many many more requests per second, as there are usually many requests per login), 30 MB of RAM is the least of your worries.

Note that these numbers are simply extrapolations from the ones in the original scrypt paper, I haven't actually tried this on modern hardware. Regardless, there are parameters that allow you to trade off CPU time and RAM usage, so you can tweak it to provide the level of performance that you require (with the understanding that the faster it is and less RAM it uses, the faster cracking it would be).
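To make the arithmetic above easier to check, here is a rough Python sketch. The `128 * N * r` bytes figure is the usual approximation for scrypt's main working buffer; the per-hash memory, per-hash time and core count are just the extrapolated numbers from this comment, not measurements, and the function names are made up for illustration.

```python
def scrypt_memory_bytes(n: int, r: int) -> int:
    """Approximate size of scrypt's main buffer: 128 * N * r bytes."""
    return 128 * n * r


def worst_case_load(mem_per_hash: int, seconds_per_hash: float, cores: int):
    """Return (logins per second, worst-case bytes in use).

    Worst case assumes every login handled within one second holds its
    memory at the same time, as in the comment above.
    """
    logins_per_core = round(1 / seconds_per_hash)
    logins_per_second = logins_per_core * cores
    return logins_per_second, logins_per_second * mem_per_hash


# Numbers taken from the comment: 128 KiB per hash, 50 ms per hash, 12 cores.
logins, peak = worst_case_load(128 * 1024, 0.050, 12)
print(logins, "logins/s,", peak / 2**20, "MiB worst case")

# Assumption for comparison: N=2**14, r=8, a commonly cited interactive setting.
print(scrypt_memory_bytes(2**14, 8) / 2**20, "MiB per hash")
```

With N=2**14 and r=8 the same formula gives 16 MiB per hash, so the totals scale up accordingly if you pick parameters in that range.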

> If you need to handle 240 logins per second (which presumably means many many more requests per second, as there are usually many requests per login), 30 MB of RAM is the least of your worries.

Well, if you have a large system of nodes, and until now you arranged things as an SOA where there's one node-cluster in particular that serves as an authentication service, doing all the hashing and table-lookup and comparison, and handing out session tokens for your other services to use--then with scrypt, that auth service will now bear the combined "memory hard"ness of every user's login requests for your entire system.

With bcrypt, that level of CPU+memory load isn't so bad--but with scrypt, I'd consider pushing the hashing step into the client library part of that service's API, before the RPC call, so that the load gets distributed throughout the system back to where it's being "spent."

In this case, it means that lots of memory accesses are required in mostly random order. This makes parallelizing on GPU hardware much more difficult than something like an MD5 or SHA hash, which don't require much memory access during the computation.

This also boosts the cost of making dedicated cracking hardware (ASIC or FPGA type hardware)

I'm having a hard time finding actual memory usage of a single password verification though.

After a bit more reading just now, it looks like you can also adjust the memory work factor of scrypt, which doesn't appear to be possible for bcrypt (very unsure on this one though, please don't take my word on it).

Does anybody out there have a good side-by-side comparison of practicalities like "memory usage", "flexibility", "can it be parallelized on CPU/GPU/FPGA/ASIC" and so on across all the popular hashing setups?


How can algorithms with an adjustable work factor be better or worse than each other?


They can't be. They each provide an arbitrary amount of work (or entropy if you want to look at it that way) to a non-ideal key (e.g. a password).

Moreover, any discussion of 'how much is enough' is a guess. How powerful the system used to brute force your keys will be is unknown.

I've read things like 'adjust it until it takes 0.5 seconds on your computer'. That makes sense if and only if your key can only be attacked on that same machine, which is probably seldom the case.

The rule of thumb is that it should be as expensive as you can make it (weighing user experience, operating costs, etc) and that more is better than less. There is no definite level of security provided by these algorithms (or families of) and their relative value can't be easily compared.
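For what it's worth, the "adjust it until it takes N seconds" advice mentioned above can be sketched roughly like this in Python. It uses PBKDF2 from the standard library only because that ships with `hashlib`; the function name, starting count and target time are assumptions for illustration. As noted above, it only measures the defender's machine and says nothing about the attacker's hardware.

```python
import hashlib
import os
import time


def calibrate_pbkdf2(target_seconds: float, start_iterations: int = 1000) -> int:
    """Double the iteration count until one hash takes at least target_seconds."""
    salt = os.urandom(16)
    iterations = start_iterations
    while True:
        began = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", b"benchmark password", salt, iterations)
        if time.perf_counter() - began >= target_seconds:
            return iterations
        iterations *= 2


# A small target keeps the demo fast; a real deployment would pick its own budget.
print(calibrate_pbkdf2(0.01))
```

The result depends on the machine it runs on, so the number would need to be re-measured whenever the hardware changes.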

Scrypt is an exception as it can use an arbitrary amount of memory, which adds a dimension to the cost added.

Having said that, unless you're actually using these for what they were intended (deriving keys) and encrypting something then you're probably adding little actual security to your users' lives.

I would consider this topic firmly in the realm of security theater. There's entirely too much handwringing about hashing passwords and entirely too little about encrypting user information or providing anonymity (where appropriate).

I think we'd all be better off putting notices on signup pages that say "please don't reuse passwords" and spending more time on other more important security issues.

Better, where we can, we should move to zero-knowledge, bad-password tolerant protocols like SRP and bypass this issue entirely.

> They can't be.

That's not exactly right. What "better" or "worse" refer to here is how much advantage an attacker with specialized hardware has over you running on a standard server.

This advantage is largely dependent on how much memory the algorithm uses. The more memory, the better. In that regard the order is scrypt > bcrypt > PBKDF2.

More memory is better because this makes running on a graphics card or ASIC harder or nigh impossible.

> PBKDF2 is worse than bcrypt (by roughly a factor of 5)

Both bcrypt and PBKDF2 have security parameters. You can make them whatever ratio you want.

(also scrypt)

As part of security work I've had this article cited back to me a number of times by developers who, 5 minutes earlier, didn't even know what bcrypt was but Googled for a rebuttal to my recommendation to improve their existing [plain text|md5|sha] scheme.

Some might consider 'just use bcrypt' as a cargo cult, but it greatly simplifies selling a safe password scheme to the 90% of developers who aren't crypto or security aware. It is something as simple as 'check the padlock in your browser'

Having a high ranking post titled 'Don't use bcrypt' complicates and muddies the issue all over again.

Yeah, the message shouldn't be "don't use bcrypt". It should be "use bcrypt in preference to any small fixed number of iterations of SHA-n or MD5, and in preference to PBKDF unless you need to comply with some particular standard, and use scrypt in preference to bcrypt if it's available in your environment."

I don't know of any distros (BSD or Linux) which have PBKDF or scrypt available as crypt(3) hashes, but there are distros which have bcrypt available. So, for many use cases, bcrypt is the most secure hash that's easily available without adding extra dependencies.

Probably the same people that think hashing a message and a secret key together is a secure MAC (sans Keccak).
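A minimal Python sketch of the difference, assuming SHA-256 as the hash; the key and message are made-up placeholders. The naive `hash(key || message)` construction is open to length-extension attacks with Merkle-Damgard hashes such as MD5, SHA-1 and SHA-256 (Keccak is not affected, hence the "sans Keccak"), whereas HMAC is built to be a MAC.

```python
import hashlib
import hmac

key = b"hypothetical secret key"
message = b"amount=100&to=alice"

# Naive construction: hash(key || message). With Merkle-Damgard hashes such as
# SHA-256 this is open to length-extension attacks.
naive_tag = hashlib.sha256(key + message).hexdigest()

# HMAC wraps the hash in a construction designed to be a MAC.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, received_tag: str) -> bool:
    """Recompute the HMAC and compare in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received_tag)


print(verify(key, message, tag))
print(verify(key, message + b"&to=mallory", tag))
```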

30 votes on a hacker news post is hardly "high ranking" and if someone argued based on the quality and content of this blog post, I'd laugh all the way into making all his/her commits come from pull requests.

It's #5 on Google

Exactly. A Google search for 'bcrypt' displays this post at anywhere from 4th to 10th position [0]

[0] https://www.google.com/search?q=bcrypt

Be sure to read Moxie Marlinspike's comment (the first comment on the page) there.

The first figure he shows is from Colin Percival's scrypt paper; he should have cited his source.


> I write this post because I've noticed a sort of "JUST USE BCRYPT" cargo cult (thanks Coda Hale!) This is absolutely the wrong attitude to have about cryptography.

Is it? On the contrary, I think that believing you can decide for yourself which crypto algorithms to use, and how, is absolutely the wrong attitude to have about cryptography, UNLESS you have spent significant time studying crypto, which most of us developers have not.

Following the advice of those who have is absolutely the _right_ attitude to have about cryptography.

If the community is giving advice that could be better, then they should improve.

But likewise, cryptographers need to give simple, easy-to-follow, hard-to-mess-up advice; that is the only thing that's going to result in secure software. "Always use bcrypt for password hashing" is a great example: it HAS resulted in more secure software than when people made up their own hashing algorithms or didn't hash at all.

So: "just use bcrypt" should really be "just use a pluggable key derivation function". Then plug in something suitable. bcrypt is a start, but you might also consider PBKDF2 or scrypt.
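One way to read "pluggable" is to store the scheme name and cost alongside each hash, so a different function can be plugged in later without breaking old entries. A rough Python sketch under that assumption follows; the `scheme$cost$salt$digest` format, the registry and the function names are all made up for illustration, and only PBKDF2 and scrypt are shown because they ship with `hashlib` (bcrypt would need a third-party package).

```python
import base64
import hashlib
import hmac
import os

# Hypothetical registry: scheme name -> function(password, salt, cost) -> bytes.
SCHEMES = {
    "pbkdf2-sha256": lambda pw, salt, cost: hashlib.pbkdf2_hmac("sha256", pw, salt, cost),
    "scrypt": lambda pw, salt, cost: hashlib.scrypt(pw, salt=salt, n=cost, r=8, p=1),
}


def hash_password(password: str, scheme: str, cost: int) -> str:
    """Return 'scheme$cost$salt$digest' with a fresh random per-password salt."""
    salt = os.urandom(16)
    digest = SCHEMES[scheme](password.encode(), salt, cost)
    return "$".join([scheme, str(cost),
                     base64.b64encode(salt).decode(),
                     base64.b64encode(digest).decode()])


def verify_password(password: str, stored: str) -> bool:
    """Re-derive with the scheme and cost recorded in the stored string."""
    scheme, cost, salt_b64, digest_b64 = stored.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    actual = SCHEMES[scheme](password.encode(), salt, int(cost))
    return hmac.compare_digest(actual, expected)


old = hash_password("hunter2", "pbkdf2-sha256", 100_000)
new = hash_password("hunter2", "scrypt", 2**14)
print(verify_password("hunter2", old), verify_password("hunter2", new))
print(verify_password("wrong", new))
```

Because each stored string carries its own scheme and cost, old hashes keep verifying while new ones use whatever is currently plugged in.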

If you like numbers, try the ECRYPT II Yearly Report on Algorithms and Key Lengths (2012) http://www.ecrypt.eu.org/documents/D.SPA.20.pdf

Thanks for this!

I'm using BCrypt at the moment and am looking at others like SCrypt so it's great to see actual reports about this.

Reasons why you might need encrypted passwords:

  1. Your password database has been stolen
  2. Your password transmission is susceptible to viewing/sniffing by 3rd parties

Reasons why you might need password encryption schemes that are difficult to crack:

  1. You might not know when the password is stolen, so you don't know to change it
  2. You know the password was stolen, but are not sure if someone could feasibly crack it
  3. You reuse your passwords and one password being cracked means all of your accounts are owned
  4. Your password is not complex, or you don't know if it is complex
  5. You rely solely on passwords for authentication

Ways to mitigate requirements on password encryption:

  1. Don't let the password database get compromised
  2. Secure the transmission of passwords
  3. Require strong passwords
  4. Don't rely on just passwords for authentication

Note that this doesn't take into account things like phishing/social engineering, brute forcing, hijacking sessions, or exploiting the application. In general, strong password encryption is only useful to prevent bad public relations from too many users getting hacked at one time. For individual security, a strong password encryption system is effectively trivial to bypass.

Note that password hashes are used for more than just password databases; they are also used as key derivation functions, for example for deriving a key for encrypting a hard drive or document from a password that someone types in. Any full-disk encryption system needs to use a password-based key derivation function for security, and your security if the disk is stolen is solely based on the strength of your password and the strength of your key derivation function (unless you have tamper-proof hardware that is able to store your key and only allows interactive authentication, rather than reading the key out and brute-forcing the password out of it).

In fact, encrypting files using a key derivation function was the original use for scrypt, not storing password databases.
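A minimal sketch of that key-derivation use in Python, assuming `hashlib.scrypt` is available (it needs an OpenSSL build with scrypt support). The parameters and the `derive_key` name are illustrative choices, not taken from any particular disk or file encryption tool.

```python
import hashlib
import os


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from a passphrase using scrypt."""
    return hashlib.scrypt(passphrase.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)


salt = os.urandom(16)  # stored in the clear next to the encrypted data
key = derive_key("correct horse battery staple", salt)

# The same passphrase and salt always give the same key, so only the salt
# needs to be stored; the key itself is never written to disk.
assert key == derive_key("correct horse battery staple", salt)
print(len(key))
```

The derived bytes would then be handed to a cipher as the encryption key; that part is left out here.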

> In general, strong password encryption is only useful to prevent bad public relations from too many users getting hacked at one time.

It's also good for lowering the value of cracking your system. If everyone (or a large fraction of developers) does it, it lowers the expected value of breaking into a random system that has low-value accounts. Since people frequently reuse passwords, stealing the password database for a low-value system and cracking them can be used for breaking into higher value systems, as you can try reusing the username or email and password pairs for logging into other systems.

It doesn't matter what you use it for. Encrypted passwords are simply not a strong enough single factor to prevent a successful attack. If you use FDE you should also be using a keyfob, thumb drive, etc. There's too many attacks on passwords alone, for example the most effective one, where the police compel you to reveal it. (Compelling you to reveal the location of a keyfob is arguably more difficult for them to do)

To be honest, I find it completely useless to hypothesize about why someone would attack a system, much less for something as silly as shared passwords. It's much less work to just attack the one account on the one system than to attack two completely different systems on the hope that the password is shared. And botherding/phishing to compromise accounts is much more simple & effective than trying to compromise a password database.

> It doesn't matter what you use it for. Encrypted passwords are simply not a strong enough single factor to prevent a successful attack. If you use FDE you should also be using a keyfob, thumb drive, etc. There's too many attacks on passwords alone, for example the most effective one, where the police compel you to reveal it. (Compelling you to reveal the location of a keyfob is arguably more difficult for them to do)

Security is not an absolute. There are plenty of scenarios where a strong password and a strong key derivation function for FDE will prevent an attack. It plugs the improperly-erased-discarded-hard-drive hole, and it stops the thief on the train who opportunistically steals your laptop and looks through your data for anything valuable.

Yes, two factor is better, but it increases the complexity; and for many cases in which you want FDE, such as a laptop while travelling, there are far too many times where your keyfob and laptop will both be accessible to an attacker, thus negating the benefit of the separate factors.

It's better to have a good password and good KDF for FDE than it is to forgo FDE altogether.

> To be honest, I find it completely useless to hypothesize about why someone would attack a system, much less for something as silly as shared passwords. It's much less work to just attack the one account on the one system than to attack two completely different systems on the hope that the password is shared. And botherding/phishing to compromise accounts is much more simple & effective than trying to compromise a password database.

No, on an industry-wide scale it is not useless to hypothesize about motivations. If you decrease the expected value of an attack, you decrease the motivation for people to break into systems. Getting credit card numbers and easily cracked passwords out of databases that are accessible to front-end systems does a lot to help reduce the expected value of attacks, and thus reduce people's motivation to perform them in the first place.

Thus, recommendations for security should keep that in mind. If you make good, easy to implement recommendations for security, that help reduce the value of a successful attack, you can improve global security. For example, token based systems can help avoid credit cards being stolen; instead of each merchant storing CC numbers, if they store tokens that are only valid for them talking to their CC processor, then there's no valuable trove of CC numbers to be found by exploiting the database.

There's this really dangerous meme going around that if security isn't absolute it's worthless. That can be true if you assume a highly motivated attacker with government level resources at their disposal who is targeting you specifically, but that's not the attacker that actually causes most people problems. Instead, it's some bored Eastern European kid who doesn't have much in the way of job prospects and wants to make a quick buck, and figures that trying out some basic SQL injection exploits against a large number of sites will be likely to lead to some valuable information. If enough people remove that valuable information from their site or make it difficult enough to extract the contents by using good key derivation functions, there's less economic incentive to try that kind of wide-scale probing attack.

This is part of the principle of defense in depth. You are absolutely right, you should secure your password database so that it can't be stolen, and you should protect passwords in flight so that they can't be sniffed. But you should also have good, hard to break password encryption, so that even if you made a mistake and are vulnerable at that outer layer, your users are still protected from having their passwords revealed to attackers. Likewise, you should have both a firewall, and intrusion detection systems that live behind your firewall, and strong authentication and encryption for all services behind that firewall so that even if someone circumvents the firewall, they still can't get into any of your systems.

So yes. Prevent your password database from being stolen. Use strong passwords. Don't reuse passwords between sites. But also, use strong password hashing that's hard to crack so that your users are still secure even if all of the above suggestions fail.

I'm not disputing any of that :) My original post was pointing out all the other parts, to help people remember that there's more to it than choosing the perfect encryption algorithm.

If the password encryption is the last thing that protects your users, it should be the last thing you consider, not the first. Lazy people may ignore all the rest and assume strong password encryption will save their butts. It won't.

For the security experts in the audience, which of these statements are untrue?

1. bcrypt is preferable to salted sha1, md5, rot13, plain text, etc...

2. Because of the "cargo cult" effect, bcrypt is more accessible (more platforms, nicer wrappers, etc...) than scrypt or other "better" options.

2. Isn't really true. The fact that bcrypt is more widely available than scrypt is more due to age than a "cargo cult" effect. It was the best option available for many years, so it became widely available.

FWIW, I wrapped the C scrypt in Go with adaptive mem/CPU selection for the parameters N, r, p. The standard Go crypto scrypt library is not adaptive.


I wonder how many password leaks would have been a "non-issue" if we had adopted the habit of also adding a fixed salt to the passwords 15 years ago?

The point of using a fixed salt would be that quite often password leaks seem to be the result of SQL injection attacks, and therefore the attacker only gains access to the database, not the source code. The password hashes from the database would be much harder to crack if we were also using a fixed salt string that was only stored in code.

Technically, adding this kind of extra salt would be trivial on all platforms. I can't figure out any negative side effects related to it (obviously it is not a replacement for better password hashing schemes).

Your source code (or a running image of your program, that has the extra salt in memory) is rarely harder to get access to than your password database. Remember, all of your developers have access to it, and likely have it unencrypted on their machines, which may not be terribly secure. It would be a small enough extra value, and a large enough extra hassle (making it difficult to re-use the same encrypted password in other code, for instance use the same encrypted password for basic auth over HTTPS as you use for your webapp in a different area) that I don't think it's worth it.

I also worry that promoting this practice would make some developers think that they didn't need a random, per-password salt as well. It's hard enough to convince developers about the value of using strong key derivation functions and a per-password salt, and if you added a global salt I'd worry that some would think that the per-password salt is not necessary. Then once the fixed salt were compromised, brute forcing individual passwords would become trivial.

Also remember, the attacker could easily create an account with a known password, and they can get the normal salt from the password database. Now they just need to brute force the single "fixed salt", and you're back to where you started.

No, it's much better to encourage people to just use strong password hash functions, like bcrypt or scrypt, with good, random, per password salts, than to try to apply a little security through obscurity by adding yet another salt that is likely easy to find.

Determining the fixed salt by brute force is not feasible if a modern hash function is used. The fixed salt can easily be quite long; think about something like 200+ bits.

If they have access to your database then they most likely can also get access to your source code.

That wouldn't be the case in many scenarios I can think of:

1. DB backup is compromised (from an offsite backup storage, think S3).

2. Developer laptop lost, with a copy of the database locally. (Source code may be available, but the site-specific key probably wouldn't be, as that's a config var on the server.)

3. SQLi that dumps hashes to the page.

I'm a big fan of making the job harder for attackers. There are lots of attack vectors where a 'pepper' value would be useless. But not all of them. And it's nearly free to implement. Might as well.
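For illustration only, one way such a 'pepper' could be layered on top of a normal per-password salt is sketched below in Python. The environment variable name `APP_PEPPER`, the fallback value and the HMAC-then-scrypt arrangement are assumptions made for this sketch, not an established standard, and it keeps the random per-password salt rather than replacing it.

```python
import hashlib
import hmac
import os

# Hypothetical: the pepper lives in server config (here an environment
# variable named APP_PEPPER), never in the database.
PEPPER = os.environ.get("APP_PEPPER", "demo-pepper-not-for-production").encode()


def hash_with_pepper(password: str, salt: bytes, pepper: bytes = PEPPER) -> bytes:
    """HMAC the password with the pepper, then run a slow salted KDF."""
    peppered = hmac.new(pepper, password.encode(), hashlib.sha256).digest()
    return hashlib.scrypt(peppered, salt=salt, n=2**14, r=8, p=1, dklen=32)


salt = os.urandom(16)  # random per-password salt, stored with the hash
stored = hash_with_pepper("hunter2", salt)

print(hmac.compare_digest(stored, hash_with_pepper("hunter2", salt)))
print(hmac.compare_digest(stored, hash_with_pepper("hunter2", salt, b"other pepper")))
```

If the pepper leaks, this falls back to ordinary salted scrypt, which is the reason for keeping the slow KDF and the per-password salt underneath.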

No, any scheme by which the salt is a secret is a failure.

Agree, BUT people should not use absolute statements (like "just use" or "don't use") because sometimes (sadly) it easily leads to a religious approach.

The point of "just use" in this case is the implied, but often unstated, first part: don't learn crypto, just use _____. Because when you think you've learned "enough" crypto to know what to do (whether writing a library or just picking one), is exactly when you're the most dangerous. Either really learn crypto (i.e. the 10000 hours way)--and don't build any cryptosystems until you have--or just avoid the problem-space altogether, and do exactly what the experts tell you. Which, in this case, is "use bcrypt."

The experts don't say "use bcrypt" because it's the best pluggable key derivation function. They say it because it's a satisfactory key derivation function, with viable APIs for every language you can think of. It's the McDonalds of crypto packages: a known quantity, anywhere you want it. Which is exactly what you want to point at if you want people doing as they're told ("just use...") instead of trying to learn what a "pluggable key derivation function" is... and following that path until, inevitably, they become dangerous.

An analogy, to take that last idea further: imagine if there was a universe like that of Harry Potter, with young witches and wizards capable of learning powerful, dangerous magics. But these magics require no wands, nor any other implement or component that could be taken away from the magician; once the spells are learned, they are irrevocably in the magician's possession, even if the corresponding safety lessons for use of that spell are never absorbed. And then, imagine that most learning of magic occurred through autodidacty...

Dan did such a great job on the Stanford Crypto course, it's worth learning the basics to learn the important bits.


I just had a 30-second mind-trip in this dystopian universe and it was horrifying :)

If I were building a tiny number of auth servers relative to my frontend servers, I'd potentially just use an HSM and symmetric crypto vs. a computationally hard KDF. Less subject to resource consumption DoS, somewhat more flexible for even regular authentication, and allows you to do things like hold credentials for services which don't support oauth using the same system.

I especially like it when "don't use MD5!!!1!!" cargo culting causes someone to switch off _md5crypt_ to plain, unsalted, sha256 as a password hash or kdf.

So basically this post is dumb and wrong, and the author is clueless.
