Please stop hashing passwords (tjll.net)
68 points by DanielRibeiro on April 22, 2014 | 95 comments



The tl;dr from the article:

> Stop associating “hashed” with “secure” when it comes to passwords. If you’re storing user credentials MD5-ed without salt, you’ve put a thumb tack in front of a steamroller - just a minor annoyance that won’t offer you any real security. You have to do it right or hashing means nothing.

> Start relying on key derivation functions for password storage.[1] Are they perfect? Will they be the best practice in five years? Probably not, but they’re your best option when rolling your own user credential storage.

> Don’t sacrifice security for performance. A slightly slower malloc in OpenSSL would have been far preferable, in my opinion, to the disaster that is Heartbleed.

- [1] There’s a fair bit of dispute over the ideal KDF for storing passwords, but generally speaking, any of the three covered in this topic are popularly accepted options.
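For anyone skimming: "use a key derivation function" in practice looks something like this minimal Python sketch (scrypt via the standard library; the cost parameters are illustrative, not a recommendation):

    import hashlib, hmac, os

    # Illustrative scrypt cost parameters - tune to your own hardware budget.
    N, R, P = 2**14, 8, 1

    def hash_password(password):
        salt = os.urandom(16)                       # per-user random salt
        key = hashlib.scrypt(password.encode(), salt=salt, n=N, r=R, p=P, dklen=32)
        return salt + key                           # store the salt next to the derived key

    def verify_password(password, stored):
        salt, expected = stored[:16], stored[16:]
        key = hashlib.scrypt(password.encode(), salt=salt, n=N, r=R, p=P, dklen=32)
        return hmac.compare_digest(key, expected)   # constant-time comparison

bcrypt or PBKDF2 (with a sane work factor) slot into the same shape.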


Re: slower malloc, there is no actual proof that, say, modern glibc's malloc implementation is slower by any useful measure. Heartbleed, however, didn't happen because someone failed benchmarking 101; it happened because someone failed bounds checking 101.


The point that I and others want to make is that, regardless of the ultimate cause of failure, we want to criticize the whole design approach. Security in layers needs a conservative coding style, safe and sound internal APIs, bounds checking, and an overly cautious memory allocator.

It's not just about finding and fixing bugs, but how to improve the whole process.


It is frustrating that the confusion around heartbleed and the OpenSSL free lists persists: http://blog.leafsr.com/2014/04/11/my-heart-is-ok-but-my-eyes...


I agree with the basic idea here ("Stop associating 'hashed' with 'secure' when it comes to passwords"), and I agree with the recommendation to use scrypt or bcrypt when necessary because "they’re your best option when rolling your own user credential storage."

But let's go for a part 3: please try not to store user credentials at all, if you can avoid it! It's not a great idea for every random Web site to create its own notion of identity for every user -- which requires a means of proving that identity, usually a password, and a way for that means to get compromised.

When it's feasible, better would be to outsource the job of verifying user credentials to some third party and only accept online cryptographic proofs of identity, whether that's via:

* an SSH client keypair (e.g. git push to GitHub)

* requiring requests be signed with a PGP keypair (e.g. uploading to Debian/Ubuntu)

* Mozilla Persona/BrowserID

* a TLS client certificate

* Facebook Connect

* Google Accounts

* OpenID

There's another benefit here -- no matter how many times you iterate a KDF, your website is still going to be vulnerable to an online compromise where the bad guy just grabs passwords as users log in. (I understand systems like Meteor are nudging sites into using SRP-in-JavaScript, which is great, but in an online compromise the attacker can just change that JavaScript.)

And while it may be good to use bcrypt vs. SHA-1 (after all, it's the users whose interests are ultimately at stake if the password DB is revealed), the users generally have no way of knowing (cryptographically) that any given site really is using bcrypt. That suggests that the security dust is being applied in the wrong place -- the party that actually cares should be able to verify that the right thing is happening.

It would have been great if Mozilla had had more resources to stick with Persona (and of course if Facebook and Google could have afforded to endorse it).

Unfortunately even sites that are founded by ex-Facebook employees (like Quora) don't actually trust Facebook Connect enough to rely on it -- Quora uses FB Connect to let you "Sign Up with Facebook" but then requires you to establish your own Quora password just in case Facebook screws them over someday. (Then they store the password via unknown means...)


If a website's only option for credentialing is Facebook Connect, I won't be using that website. Ditto for Google Accounts.

Maybe it's just me. But if many others on HN feel the same way, then that could be relevant for a new startup that wants to gain traction among a core group of initial users. E.g. I never would have used Dropbox or Airbnb if their only login option had been Facebook.


Let's talk about why -- do you place greater trust in AirBnB (or random websites X, Y and Z) to keep your credentials safe than you do in Google?

Do you not want the identity provider (e.g. Google) to know every place you log in? (This was something Persona solved, alas...)

Do you not want to be reliant on a megacompany like Google whose spam filters might someday hit a false positive, causing Google to ban your account, and there's nobody to call and you're locked out of everywhere?

Or is it about separation, i.e. you don't want any single notion of your identity to have too much power if compromised (and you're careful to use unrelated credentials, e.g. distinct passwords, on every website)?

Or something else?

Are you ok with the way that GitHub and Ubuntu outsource the storing of credentials to authenticate a "push" by checking against a public key, for which only the user holds the private key? What if more services worked this way?


> Do you not want the identity provider (e.g. Google) to know every place you log in? (This was something Persona solved, alas...)

> Do you not want to be reliant on a megacompany like Google whose spam filters might someday hit a false positive, causing Google to ban your account, and there's nobody to call and you're locked out of everywhere?

> Or is it about separation, i.e. you don't want any single notion of your identity to have too much power if compromised (and you're careful to use unrelated credentials, e.g. distinct passwords, on every website)?

Indeed, those are some excellent reasons to avoid any centralized login system. :) Most people won't care, but early adopters might. Startups don't need to care about early adopters after the 'early' stage, but the early stage is critical, so it's just something to keep in mind.

> Are you ok with the way that GitHub and Ubuntu outsource the storing of credentials to authenticate a "push" by checking against a public key, for which only the user holds the private key? What if more services worked this way?

That'd be lovely. Unfortunately the key management problem hasn't really been solved: there's no way to make it easy for average users to create a key and use it on a bunch of different devices. "What you know" (a password) is still way more convenient than "what you have" (a keyfile), unfortunately.

I don't think there's any way to solve that without using a third party to sync keys across your devices. Something like that might be able to be done securely, but it'd require a lot of thought and care. (Ultimately we have to trust the service provider with our credentials anyway, so trusting them to sync keys doesn't seem like too far of a stretch.)


I feel like you already answered your own question. These reasons may not apply to you, but they certainly apply to me.

No, I do not trust megacorps to keep my credentials safe.

I do not wish to use the same credentials for every site, for several reasons: in particular, worse consequences of account compromise; worse consequences of terms of service changes.

No, I definitely do not wish a megacorp identity provider to be able to link information from my accounts across different websites.

No, I definitely do not want to be reliant on a megacorp. Amongst other reasons, I do not believe that a single entity can provide for all cases and competition in solutions is a good thing.

I don't want my account at random website X to be dependent on external services. Random website X will be slower and less reliable, and I will be less able to access support.

I do not want any single notion of my identity to have too much power if compromised.

I would rather accept the risk that random website Y is storing my password in plaintext than accept the risk of using a megacorp (or other external service).


It's about not being at the whims of another entity that doesn't give a flying fuck about you, and that will (often with little or no warning) change its API, have 'maintenance' (either 'planned' or 'unplanned'), or just call it a day and close up shop (of their external sign-in, that is - or maybe even altogether). So then you need to provide a fallback of your own anyway, and now you have two systems to maintain - so better to just use only your own.


> Google, Facebook

While you list some of the reasons against using those businesses, it's important to remember that some of us don't have accounts with them. Never have, and never will. So requiring FB or G accounts to participate ends up being a rather offensive "we don't serve your kind here" door-slam.

You can offer them as an alternative, but requiring a troublesome 3rd party like that is only going to cause problems.

> do you place greater trust in AirBnB (or random websites X, Y and Z) to keep your credentials safe than you do in Google?

YES. By a wide margin. Some random website I can give a unique email and password, so if they do bad things, the damage is limited. I wildcard all the mail at my domain to the same place, so if I see spam addressed to e.g. "airbnb.domain@mydomain.net" I know who sold the address to spammers and have an easy regex to filter on, if necessary.

With google/etc, I wouldn't have that kind of granularity. Of course, there are all the usual reasons like not wanting google to know every login I make, but that topic is already covered.

> checking against a public key

That would be outstanding. Yes, we have not solved the key-management problem yet, but that's no reason not to try. With a problem this complex, it is going to take a number of attempts anyway before we get it right. Waiting for a "perfect" solution, on the other hand (i.e. "Web-of-Trust is too complicated!") will just leave us with the current mess.


I use KeePass synced via a Truecrypt encrypted volume on Dropbox, with a copy backed up elsewhere. Separate password for every account.

I have a couple of problems with identity providers. One is that they are, by necessity, a single point of failure. And because of that, they are juicier targets. When someone exploits the provider, all of a sudden they can access any website that I have used that provider for. When the provider goes under or decides to stop operating the service, I'm probably SOL. If the service has an outage, I cannot use any website using it.

I also have issues with companies selling my personal info. At this point, yes. I do put more trust in a random website keeping my credentials safe for that website than Google keeping my credentials safe for every website that Google has in its clutches.

Public/private key crypto can work, but has issues of its own.


I'm with 'sillysaurus3 on this one: I have no interest in using a site that only uses the likes of Facebook and Google for authentication.

You've just listed four good reasons to reject centralisation of credentials with those organisations, and any one of them would be sufficient disincentive in my case.

I can think of at least one more good reason: if I'm dealing with an organisation, particularly one I'm paying for services, and I have unique credentials with them, then if anything goes wrong on their side it's going to be their responsibility to fix it and make good any damage. If I'm dealing with them and using a third party for credentials, and that third party gets compromised in some way, then it seems likely it will end up being my responsibility to sort everything out and pay any related costs.


I like to spread a little trust around, rather than dump a lot of trust on one, monolithic provider.


I don't want to be utterly screwed if Google closes my account. I'd like to be able to give a fallback e-mail address, for example.


You can offer to let users log in with Facebook Connect and Google Accounts, etc. If I'm going to try out a new service, I'd rather use one of my existing log-ins than trust a brand new start-up to store my information securely. It's not their core competency.


Can you not just use a password manager and generate unique credentials for the new service, and give them a unique e-mail address as well if you're worried about spam?

Or are you worried about whatever data you put on the service after you've logged in?


Sure, I could do that. Or I could just log in with Facebook, Google, or OpenId if they're available. I'd rather have more choice in the matter.

Signing up with a third-party log-in doesn't really do much to protect the data you enter into a site after you've logged in.


Anecdotally, the problem isn't that sites (e.g. Quora) don't trust FB Connect, it's that users don't.

For every project I've worked on that had both FB Connect and in-house login, almost no users used FB Connect, no matter how the two options were visually presented. I've seen iOS apps get absolutely HAMMERED in their App Store reviews simply because they don't have any login other than FB and users refuse to use it.

I agree this is a problem, but if I'm a line-level engineer or product guy, my short-term goal isn't to come up with a high-level solution, it's to do what works and won't fuck with my acquisition funnel. Which is currently roll your own.


I think I know why that is.

Users don't trust that the app won't do anything else besides authentication: posting to your wall, spamming your friends, reading all your information, etc.


> please try not to store user credentials at all

Sorry for nitpicking, but in my understanding of the terminology, non-anonymous authentication always requires some sort of credential, and a public key/OpenID/Google account/whatever is one, just as much as a good old username-password pair.

And in the case of verifying certificates/signatures, we're not outsourcing the job to any third party, but doing the verification ourselves. This is an important distinction between user-owned keypairs and Persona/Facebook/Google/OpenID - I'm not sure entrusting user identities and authentication to a third party, instead of establishing secure means of authentication ourselves, is a wise decision.


If you use one account for everything, there's a single point of failure; how is that better for the user's security? One account gone and everything's gone, if you can't get it back.


The problem with an expensive KDF on the server is that you are opening yourself up to a higher risk of DoS. A more computationally intensive KDF means better protection against crackers, but also makes it easier for someone to DoS you.

Better approaches:

1) Let the client do the expensive KDF (e.g. in JavaScript if it's a web app) and send you only the resulting key. You can then store SHA256(key) instead of KDF(password, salt).

2) Use public keys. This requires some fancy infrastructure like keychains, keybase.io, PGP software, etc., but it is the most secure solution. The user stores only his public key on the server and signs an authentication token when he wants to get in. The server has absolutely no secret material to leak, while the user has complete control over the safety of his keys. This is how SSH works. The reason it is not widely used on the web is that we don't yet have nice infrastructure: nice UIs for keychains and protocols for one-click sign-in via crypto signatures. We should all work on this to make it better for our kids ;-)
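To make 2) concrete, here's a rough sketch of the SSH-style challenge/response flow using Ed25519 (Python, assuming the third-party `cryptography` package; a real system also needs key distribution, challenge expiry, replay protection, etc.):

    import os
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    # Client side: the private key never leaves the user's machine.
    private_key = Ed25519PrivateKey.generate()
    public_key = private_key.public_key()       # this is all the server ever stores

    # Server side: issue a fresh random challenge for this login attempt.
    challenge = os.urandom(32)

    # Client signs the challenge and sends the signature back.
    signature = private_key.sign(challenge)

    # Server verifies against the stored public key - no secret material to leak.
    try:
        public_key.verify(signature, challenge)
        print("authenticated")
    except InvalidSignature:
        print("rejected")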


People have been "worrying" about DoS attacks from password hashes for years. Strangely, they haven't become an issue in practice. My presumption is that this is because if you're going to launch an application-level denial of service attack against a particular application, it is already trivial to come up with one regardless of how you store passwords.


People haven't ever deployed reaaaally hardcore KDFs on their servers yet. I myself would like my password stretched with >100 MB of memory, taking about a second on 8 threads. This kind of setup might be too expensive for a shared server.


The whole point of adaptive KDFs is that you can set them wherever you'd like, so it's always going to be possible to take one of them and give it an unreasonable parameter. But you're not seeing anyone credibly advocating that anyone do that in the real world.


Neither of these is a better option, in my opinion, because the complexity of setting them up and ensuring security far outweighs the risk of a DoS attack on your infrastructure.

It's relatively trivial to rate-limit access to particular endpoints if you are particularly concerned about DoS attacks, which means you've got a simpler system overall – and simplicity in security is reaaaaaallly worth it!


If someone can DoS your login API, they can do it to any other API. Rate limit it. It's not hard. It's actually even easier because people log in so rarely: you can really ratchet down the rate and not mess up the casual user's experience.
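For what it's worth, that rate limiting can be a handful of lines; a minimal in-memory sketch (illustrative only - a real deployment would also key on IP and use shared storage like Redis):

    import time
    from collections import defaultdict, deque

    WINDOW = 300        # seconds to look back
    MAX_ATTEMPTS = 10   # login attempts allowed per account in that window

    attempts = defaultdict(deque)   # username -> timestamps of recent attempts

    def allow_login_attempt(username):
        now = time.time()
        recent = attempts[username]
        while recent and now - recent[0] > WINDOW:   # drop attempts outside the window
            recent.popleft()
        if len(recent) >= MAX_ATTEMPTS:
            return False    # throttled - don't even run the expensive KDF
        recent.append(now)
        return True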


> 1) Let the client do expensive KDF (e.g. in JavaScript if it's a web app) and send you only the resulting key. You can then store SHA256(key) instead of KDF(password, salt).

So $hacker has to simply do a rainbow-table attack... no security gained there, except for a vastly bigger key space.


The reason for his suggestion was to prevent an expensive hashing algo from being a means for creating a DOS, not to make the rainbow table attack any harder... although if you offload an even more expensive operation to the client than you were going to perform on the server, it does make the attack harder.


I assume that key was generated through an expensive KDF with a proper salt and is at least 32 bytes long. So a "rainbow table attack" is out of the question here. That's why I said sha256(key), not sha256(key + salt).


Except an attacker wouldn't have to perform that expensive operation: she could just iterate over the range of the KDF. Unless there were rate-limiting!


Getting them to iterate over the range of the KDF is enough, isn't it?


Well I guess if the range is large enough it might not be feasible. But the hypothetical system is not rate-limiting (if it were, doing the KDF server-side wouldn't be a big deal), and it is not storing an individual salt, so time is the only thing standing in the way of this attack.


Time is the only thing standing in the way of any attack (well, time and memory, I guess)


The vastly bigger key space is all you need to make the rainbow-table attack completely beyond the realm of possibility (isn't it?)


After reading this, the title is linkbait IMO (the author calls it a "sensationalist title")


It is, insultingly so.


My guess is that anyone who still does this does not actually read technology blogs or Hacker News. The answer for a long time now has been "use scrypt or bcrypt". I think that answer still holds (IANAC).


I get the impression that, even on HN, after some high profile database breach, people still ask "were they using hashed passwords?" rather than "were they using a key derivation function and if so which one with what parameters?".

This tells me that the HN crowd still hasn't got it.

My own attempt at this message: http://www.justgohome.co.uk/blog/2014/02/salting-hashing-and...


Certainly, it's a message that bears repeating. New devs enter the workforce every day.


Or some variety of PBKDF2 if adhering to standards is more your thing, or you intend to market the security of your system to enterprise customers. The likes of Python's Django use this by default.


Enterprise customers do not care about PKCS "standards" and don't as a rule really know much about password hashing. A good reason to use PBKDF2 instead of bcrypt is if your password hash needs to double as an actual key derivation function. Other than that, I can't think of any authentic good reasons to use it.

It doesn't matter, though. PBKDF2, bcrypt, scrypt: all fine. Just stop using salted, peppered, or lightly-battered hashes.
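(For the "double as an actual key derivation function" case - i.e. turning the password into an encryption key rather than just a stored verifier - a hedged standard-library sketch, parameters purely illustrative:)

    import hashlib, os

    def derive_encryption_key(password, salt=None):
        # Stretch a password into a 256-bit key suitable for feeding to a cipher.
        salt = salt or os.urandom(16)
        key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200000, dklen=32)
        return key, salt    # the salt must be kept so the same key can be re-derived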


Yes, right. Sometimes I think people (myself included) leave out PBKDF2 because it's hard to spell/pronounce. :)


I might remember it now that I know KDF = key derivation function.

Since the GP mentions Django, it's worth remembering in this connection that there are extra security concerns with expensive KDF algorithms. There was a vuln in Django where people could submit 'passwords' of arbitrary size - a 20 character password isn't a problem, but 1MB worth of text being expensively hashed a few times at once could be used as a nice DoS vector.[0] So there's more to the problem than just bunging in PBKDF2/bcrypt and forgetting about it.

[0] https://www.djangoproject.com/weblog/2013/sep/15/security/
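The mitigation is simply to bound the input before it ever reaches the KDF; if I remember right, Django's fix was a cap of roughly this order:

    MAX_PASSWORD_BYTES = 4096   # roughly the limit Django adopted, if memory serves

    def check_password_size(password):
        # Reject absurdly long "passwords" before burning CPU on an expensive KDF.
        if len(password.encode("utf-8")) > MAX_PASSWORD_BYTES:
            raise ValueError("password too long")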


Hopefully the people behind password-hashing.net will give us an up to date recommendation when their contest concludes.


For quite a while plentyoffish.com sent me regular emails including "remember your password is"... in plain text. They seem to have stopped now. Some evil hacker could have stolen my ever so valuable list of potential dates.


Just curious. Isn't this kind of KDF "abuse"? Because, for storing password information in databases, KDFs aren't used for key derivation (the generated key material is never used as a key for any cipher), but merely as slow [salted] hash functions.

If so, why say "don't use hash functions, use KDFs" when KDFs are used as hash functions in such cases? Maybe it's better to say something along the lines of "don't use fast hash functions, use slow ones"?


A hash function is a primitive. A KDF is a construction. The construction best used for storing passwords happens to have basically the same goal as the one used to stretch a low-entropy secret into a higher-entropy secret.


Thanks, that clarifies it.

But would it be (more) correct to say "key stretching function" instead of KDF?


I wouldn't say that KDFs used this way is "abuse" any more than hash functions used for secure storage is abuse. It all comes down to what you mean by "hashing".

The problem is that the same word "hashing" is widely used (incl. by experts) for two different things: normal, insecure hashing and cryptographic hashing. As he points out, the two have different purposes, i.e. CRCs for large files, git blobs, etc. for the first, and cryptographically secure storage of small strings for the second.

KDFs meet all the requirements of cryptographic hashing. Maybe a new term is needed for the process of cryptographic hashing, like "scrambling", or something that suggests secrecy, rather than just generating a unique value.


This article shows PBKDF2 as being barely any slower than sha2 but makes no mention of how many rounds it is configured for. The note at the end implies that something is misconfigured. I'm not a rubyist, but something is definitely wrong there. PBKDF2 is maybe not the best choice but is certainly safe and better than sha3 if configured correctly.

If the author is reading here, it's probably best to take PBKDF2 out of the graph until you get it properly configured. Otherwise, people may choose sha3 as their password hashing function.


You don't talk about also using a pepper, combined with a salt. A pepper is a constant secret you append before hashing, increasing entropy and globally making guessing much harder. Webapp2 from Google has a nice implementation, coming from Werkzeug, to be found here: http://webapp-improved.appspot.com/api/webapp2_extras/securi...


This is deeply pointless. Any attacker that gets the password database is practically guaranteed to get the "pepper" as well. Just Use Bcrypt, pick as high a work factor as you can afford, and worry about something else instead.

(if you have a place to keep this extra secret where such an attacker can't get it, why not just put the passwords there?)


As a counterargument, yes I can think of circumstances where an attacker gets read access to the password database, but not the application configuration (or code, depending) that contains the pepper.

Using a pepper is another example of security in depth. It doesn't protect against the situation where your attacker has root access on your server, but it increases the work that an attacker has to do (or the luck that the attacker has to have).
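For reference, the pepper scheme being debated usually looks something like this (Python sketch, not an endorsement; the secret comes from deployment config or an environment variable, never from the database):

    import hashlib, hmac, os

    # Loaded from the deployment environment - deliberately NOT stored with the hashes.
    PEPPER = os.environ["PASSWORD_PEPPER"].encode()

    def peppered_hash(password, salt):
        # Mix in the site-wide secret first, then run the usual slow, salted KDF.
        peppered = hmac.new(PEPPER, password.encode(), hashlib.sha256).digest()
        return hashlib.scrypt(peppered, salt=salt, n=2**14, r=8, p=1, dklen=32)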


> (if you have a place to keep this extra secret where such an attacker can't get it, why not just put the passwords there?)

The argument is that you can put the key into the code base or into the deployment environment. Passwords are user changeable and can't go there. And, just maybe, you only lose one or the other but not both.

(NB: I'm not defending the pepper, but at least one very major software security consulting firm uses this argument. I don't know how much time they spend doing pen tests and getting a feel for when the pepper would have actually saved anyone's ass, and when I pressed I only got a vague mumble about customer confidentiality.)


Yes, but let's imagine this:

The salt is not hidden: if you get the passwords, you get the salt.

But then you might suspect user X is using a weak password (that you know).

Then you use that to brute-force the pepper (a defence would be to have a big pepper).

Or, you know, if you already got in and got the password hashes, it shouldn't be too difficult to get the source code/config info.


Oh, another argument is that sometimes the attacker gets to your DB at time t1 and your codebase at time t2, and if you have a system that changes your pepper regularly, the pepper they get at t2 may not be the one that was used at t1.

All in all, it seems like a lot of work (that could be spent elsewhere) for little gain. I've been burned enough to know that any tweaks to a crypto system to "make it better" sometimes shoot you in the foot instead.


Correct me if I'm wrong, but I think a large proportion of database hacks are still SQL injection attacks.

If you assume that the password handling code is on a webserver and the database is on a separate server, there is no reason to assume that an attacker getting read access to the DB through an injection attack will be able to access a pepper on the webserver.

Defense in depth is rarely a bad thing. Especially when it doesn't actually harm the end-user experience.


First, no, SQL injection isn't the primary way people lose password databases. It's a significant component of those incidents, but just one of them.

Second, even with SQL injection, keeping a secret key somewhere on the disk is unlikely to be helpful, because SQL injection attacks usually cough up remote code execution, and even more frequently provide direct read access to the filesystem; on many hugely popular frameworks, "arbitrary file read" is itself remote code execution.


Hence why I mentioned separate servers. Even if you did screw up the database permissions and allow an injection attack to have more than just read permissions, they'd still only have access to the DB server filesystem.

Unless there is some extra network vulnerability that lets them then hop to the webserver to read the secret pepper straight out of memory, I can't see why it isn't worthwhile. It may only provide a modicum of extra security, but it also is trivial to implement. Unless there is a cryptographic downside to adding the pepper (possible - outside my area of expertise), I can't see why not?


You're suggesting a separate server to store a hash key on? If you think the second server is that secure, just put the passwords on it.


No I mean I already have two physically separate servers; one with the web service (Apache/Nginx + Rails/PHP/whatever), one with the database. The webserver handles the hashing of passwords as they come in, then sends it to the database for storage.

You're starting to make me feel odd for not having the database on the same server as the web service now. I feel we've probably got some crossed wires here...


Yes, exactly.


I see people saying "just use scrypt or bcrypt"!

I noticed in Django's admin this line (in the user section, with the default authentication app):

    algorithm: pbkdf2_sha256 iterations: 12000 salt: 8c8ttQ****** hash: pN+2tq**************************************
Am I fine? Is there something more secure? Why isn't Django's default scrypt or bcrypt? And what is pbkdf2_sha256 exactly?


http://en.wikipedia.org/wiki/PBKDF2

I think it's fine, although some people believe [sb]crypt to be more secure. I think PBKDF2 is actually the default in ASP.NET now.
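Those admin fields map directly onto a standard PBKDF2-HMAC-SHA256 call; roughly the check Django performs looks like this (from memory, so treat the encoding details as approximate):

    import base64, hashlib, hmac

    def check_pbkdf2_sha256(password, iterations, salt, encoded_hash):
        # Re-derive with the stored salt and iteration count, compare in constant time.
        derived = hashlib.pbkdf2_hmac("sha256", password.encode(), salt.encode(), iterations)
        encoded = base64.b64encode(derived).decode("ascii").strip()
        return hmac.compare_digest(encoded, encoded_hash)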


PBKDF2 with SHA256 hashing is the default password storage mechanism in Django. Recommending PBKDF2 is one of the things that tptacek criticized “Practical Cryptography With Go” for[0].

Edit: Django has support for bcrypt but it's not the default because of third-party dependencies[1].

[0] https://news.ycombinator.com/item?id=7596280

[1] https://docs.djangoproject.com/en/1.7/topics/auth/passwords/


tptacek faulted 'Practical Cryptography With Go' for recommending PBKDF2 over bcrypt and scrypt; however, this is not what Django is doing.

Django chose PBKDF2 of the three primarily because it could be implemented without pulling in third-party dependencies, which is very helpful for Django's deployment story. Given the choice of the three, I think most Django core developers would say that scrypt or bcrypt would be better, but of the three options, PBKDF2 was deemed the most acceptable for driving adoption. Before PBKDF2 was added, Django was using a fast hashing algorithm and had no way to select another, so all three were considered improvements, and PBKDF2 was considered the 'safest' for adoption, not the optimal one. Note that it is now fairly easy to add bcrypt or scrypt to your Django deploy if you choose to.

I'd also add that the parent article talks about PBKDF2, but I am not sure it is using it correctly. It makes no mention of how many rounds it is configured for, and I'm not sure what they mean by 'dumb iterations'. It should be possible to make it as slow as scrypt or bcrypt in a single-machine benchmark. The advantage of something like scrypt is that it adds complexity in memory as well as CPU.

Edit: For anyone interested in the decision-making here, this was the summary ticket on Django password hashing and my summary at the time:

https://code.djangoproject.com/ticket/15367

https://groups.google.com/d/msg/django-developers/ko7V2wDVsd...


No, that's not accurate. I noted that the book didn't do a very good job of explaining the three functions or make a clear recommendation based on facts. That's different from dinging someone simply for using PBKDF2, which I wouldn't do.


I think we're saying basically the same thing. You're not criticizing the use of PBKDF2 as a bad choice as the parent comment implies.


I should've written “blindly recommending” instead of “recommending”.

(That's the passage I was referring to:

> If you're explaining crypto to a reader, and you're at the point where you're discussing KDFs, you'd think there'd be cause to explain that PBKDF2 is actually the weakest of those 3 KDFs, and why.

Also, I'm not arguing against the sanity of the Django developers' choice of PBKDF2 as the default password hashing mechanism - IMO marginally better security wouldn't be worth the increased complexity for newcomers starting a new project.)


I found a decent, almost but not quite exact match for your question here:

http://stackoverflow.com/questions/4433216/password-encrypti...


pbkdf2_sha256 means PBKDF2 using HMAC-SHA256 as its underlying hash.

(Django's bcrypt_sha256 hasher is the one that runs the password through SHA256 first, to work around a password-size limitation - I can't find the page explaining this though.) https://docs.djangoproject.com/en/dev/topics/auth/passwords/

You can switch to other hashers, of course, as explained on the page.

Now, is it only my impression, or is there a bug in the benchmark where the number of iterations for pbkdf2 is fixed at 1?


PBKDF2 is an older KDF that is probably a bit less secure (especially in the light of FPGAs), but well-known and standardized.

It's perfectly okay.


So several iterations of md5 + salt is not secure? I'm really wondering how much time it takes to find a password hashed by 5 iterations of md5 + a salt by exhaustive search.


With many, many iterations, simply iterating a decent hash function is a valid 80% solution to the problem. 5 iterations of a cryptographic hash function won't buy you anything at all, though.


It's not advisable to use MD5 for anything. Its security is severely compromised. Weaknesses in MD5 have already been exploited to create a fake root CA[1] and a fake Microsoft code-signing certificate[2]. There isn't a practical publicly-known preimage attack, but a theoretical attack was published in 2009[3]. Additionally, MD5 can be calculated far too quickly to use for password hashing. A commodity GPU from 2009 can calculate 200 million MD5 hashes per second[4]. For password hashing you should use something like bcrypt or scrypt, which is specifically designed to be computationally expensive.
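A quick way to feel the speed gap yourself (Python; absolute numbers vary by machine, the ratio is the point):

    import hashlib, os, time

    password, salt = b"hunter2", os.urandom(16)

    start = time.perf_counter()
    for _ in range(1000000):                    # a million salted MD5 guesses
        hashlib.md5(salt + password).digest()
    md5_time = time.perf_counter() - start

    start = time.perf_counter()
    hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1)   # a single scrypt guess
    scrypt_time = time.perf_counter() - start

    print("1e6 MD5 guesses: %.2fs, 1 scrypt guess: %.3fs" % (md5_time, scrypt_time))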

[1] http://www.win.tue.nl/hashclash/rogue-ca/

[2] http://blogs.technet.com/b/srd/archive/2012/06/06/more-infor...

[3] http://link.springer.com/chapter/10.1007%2F978-3-642-01001-9...

[4] http://bvernoux.free.fr/md5/index.php


Hashing a hash is not generally considered secure because you have to assume that if your system was compromised, the hackers will know what methods you used, including the list of salts. If the hash runs quickly, then you didn't really cost them any extra time/work.


Assuming "several iterations" means "a million or more iterations" then you have captured most of what bcrypt gets you. You've broken their rainbow tables and they have to brute-force to find users using "passw0rd". You can tune the "several iterations" the same way you can tune bcrypt.

That said, don't roll your own. You probably screwed it up somewhere. Just use the bcrypt library call. Or scrypt to let you roll +2 against GPU attacks.


That's right! A KDF seems to be better than plain hashing in the case where the server gets compromised.


The recent post "Fairy Tales in password hashing"[0] with regard to scrypt seems somewhat relevant here.

[0]: http://vnhacker.blogspot.com/2014/04/fairy-tales-in-password...


No, it's not relevant to this discussion at all.


Hardly, because it's not about scrypt itself but about how (according to the author) the library's API encourages incorrect usage.

A reader of your comment might assume that the article is about some perceived flaw in scrypt itself.


I can see that perspective. I could have made it more clear.

I see it as it being more about incorrect usage or application, rather than a fault with scrypt.


What I'm using for my latest project is not a hash: I use an internal 128-long key to encrypt using PBKDF2WithHmacSHA1 and persist the result to a blob.


If you have a secure place to put a key for reversibly encrypting passwords, just put the passwords in that place instead. I'm sure you'll be fine.


Might be nice to mark your comment as a joke - someone may take it seriously.


The biggest issue here is that users are allowed to pick their own passwords in the first place. Sure, you can require them to use passwords with a capital letter and with a number and with a punctuation character, but that will just make them pick "Password1."

Better: Use one-time passwords sent via SMS. Or send a one-time-login URL via email.

If you do have to use a password, just generate a 10-digit numeric code. Sure, some of your customers might complain, but at least you aren't responsible for disclosing people's eBay password when your site gets hacked.
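The one-time-login URL really isn't much code either; a minimal sketch of the token handling (storage and the actual email sending omitted; the URL and expiry window are made up for illustration):

    import hashlib, secrets, time

    TOKEN_TTL = 15 * 60     # links expire after 15 minutes (illustrative)

    def issue_login_token(email, store):
        token = secrets.token_urlsafe(32)                     # goes to the user by email
        digest = hashlib.sha256(token.encode()).hexdigest()   # only the hash is stored
        store[digest] = (email, time.time() + TOKEN_TTL)
        return "https://example.com/login?token=" + token     # placeholder URL

    def redeem_login_token(token, store):
        digest = hashlib.sha256(token.encode()).hexdigest()
        email, expires = store.pop(digest, (None, 0))         # single use: remove it
        return email if time.time() < expires else None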


That's not "some of your customers might complain" territory, it's "your business failed because nobody signed up" territory.

2FA basically ensures security via a second channel, and it's perfectly possible to store passwords in a secure format. I'm not convinced your ideas there are worth the cost.


> your business failed because nobody signed up

Why? Almost all websites require email confirmation; sending someone a login-URL via email actually has less friction because the password-choosing step is removed!

> it's perfectly possible to store passwords in a secure format

But it's very hard to do so. Even if you use scrypt, it is very hard to make sure your whole system is actually secure against password leakage.

The simple truth is that letting your users choose their own passwords is a liability; and I've decided to avoid this liability.


Re: "Password1". There was an interesting paper, I think by someone from Microsoft, that argued that when users pick silly passwords they are actually being rational. They (the users) informally decide that the pain of overcomplex password schemes just isn't worth it. In other words, remembering passwords or using security-related programs and practices is a high price they have to pay everyday (while we computer literate people often disregard this cost, it is there), while the relatively uncommon security breach is something they often never see.

Maybe I'm misrepresenting what the paper states, but my takeaway from it was "don't assume users are dumb when they pick silly passwords. They simply are not willing to use an overcomplex system that for them turns out to be not worth the effort."

I just tried to find this paper online but I can't even remember the title :(


Correct horse battery staple. http://xkcd.com/936/

We are told to not re-use passwords. This is not helped by every single shopping web site out there requiring an account (and therefore a password) in order to buy something. Fair enough for big sites like Amazon - I'm actually likely to come back at some time in the future, although I dislike the way it tries to store my card number each time.

On most sites, requiring me to create an account discourages me from shopping there. I'm not likely to come back unless I suddenly have a burning need for another obscure once-in-a-lifetime widget, so why do I need an account? If I do come back, you still only need my card number and a delivery address.

As it stands, the sheer number of accounts that I have means that I invariably set an impossible to remember password and immediately forget it, relying on the password reset mechanism. This is not ideal.


Honestly, I just wish I could elect one-factor non-password login on such rarely used sites. Just put a button next to the username box, "login by email", and use my email address as my username. So I type in pxtl@myemailhost.ca and then click that button, get a link in my email to auth the session cookie, and I'm in. The hard implementation detail would be polling the server from the browser window to find out when I've authed the session from email, since I might want to auth from my phone.

Password reset without the password. If my email account is compromised then everything is screwed, but with password-reset emails that was already true.

Of course, this is potentially vulnerable to abuse... but again, password-reset emails have the same problem.


Have you seen what modern password cracking tools do? We don't see old-school brute force anymore: the things users actually do are checked first. Three words one after the other, one or two letters replaced, keyboard walks... those things are tested relatively early in the process.

So HorseBatteryStaple sucks as a password, along with anything else you can easily remember. If you want security, you probably want 2-factor authentication and a different password for every site, probably stored in something like a KeePass DB.


Testing several words after each other is an old-school brute force method. How hard a password is to crack is basically a measure of the amount of entropy encoded, and there is more than you think in a collection of several words. The comic uses four words for a reason, not three. Sure, replace a few characters if you wish - it does increase the entropy slightly.
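The arithmetic behind the comic, for concreteness (assuming a 2048-word list, which is roughly what it uses):

    import math

    print(math.log2(2048 ** 4))   # four random common words: 44 bits
    print(math.log2(2048 ** 3))   # three random common words: 33 bits
    print(math.log2(95 ** 8))     # eight random printable-ASCII characters: ~52.6 bits

Of course those numbers only hold if the words really are chosen at random; a favourite phrase or lyric is exactly the kind of thing the cracking tools mentioned above try first.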

Also, know your target. If your target is to secure your account against a web-based brute force, as depicted in the comic, the attacker is likely to be rate-limited by the server, and a reasonable password is likely to be sufficient. If the attacker gets access to the hashed password database, then that's a different matter, but if you have sufficient entropy in your password it can still be secure.

But my main point is this - why do I need an account and password for uncle bob's glass cutting tool emporium, when I am only likely to make a single order in my lifetime? If I don't have an account, and therefore have no password, then there is nothing to hack.



