
Please stop hashing passwords - DanielRibeiro
http://blog.tjll.net/please-stop-hashing-passwords/
======
ttflee
The tl;dr from the article:

> Stop associating “hashed” with “secure” when it comes to passwords. If
> you’re storing user credentials MD5-ed without salt, you’ve put a thumb
> tack in front of a steamroller - just a minor annoyance that won’t offer you
> any real security. You have to do it right or hashing means nothing.

> Start relying on key derivation functions for password storage.[1] Are they
> perfect? Will they be the best practice in five years? Probably not, but
> they’re your best option when rolling your own user credential storage.

> Don’t sacrifice security for performance. A slightly slower malloc in
> OpenSSL would have been much preferred, in my opinion, than the disaster
> that is Heartbleed.

\- [1] There’s a fair bit of dispute over the ideal KDF for storing passwords,
but generally speaking, any of the three covered in this topic are popularly
accepted options.

~~~
DiabloD3
Re: slower malloc, there is no actual proof that, say, modern glibc's malloc
implementation is slower by any useful measurement of the term. Heartbleed,
however, didn't happen because someone failed benchmarking 101, it happened
because someone failed bounds checking 101.
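
To make the bounds-checking point concrete, here is a minimal sketch of a heartbeat-style echo handler. The record layout (a 2-byte big-endian claimed length followed by the payload) is a simplified assumption for illustration, not OpenSSL's actual code, and `echo_payload` is a hypothetical name.

```python
import struct


def echo_payload(record: bytes) -> bytes:
    """Echo back the payload of a simplified heartbeat-style record.

    Assumed layout (illustrative only): a 2-byte big-endian claimed
    length, followed by the payload bytes.
    """
    if len(record) < 2:
        raise ValueError("record too short to contain a length field")
    (claimed_len,) = struct.unpack(">H", record[:2])
    payload = record[2:]
    # The bounds check: never trust the length the peer claims to have sent.
    if claimed_len > len(payload):
        raise ValueError("claimed length exceeds bytes actually received")
    return payload[:claimed_len]
```

A record that claims 65535 bytes but carries only 3 is rejected instead of being answered with whatever happens to sit next to it in memory.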

~~~
kzrdude
The point that I and others want to make, is that regardless of the ultimate
cause of failure, we want to criticize the whole design approach. Security in
layers needs conservative coding style, both safe and sound internal APIs,
bounds checking and an overly cautious memory allocator.

It's not just about finding and fixing bugs, but how to improve the whole
process.

------
keithwinstein
I agree with the basic idea here ("Stop associating 'hashed' with 'secure'
when it comes to passwords"), and I agree with the recommendation to use scrypt
or bcrypt when necessary because "they’re your best option when rolling your
own user credential storage."

But let's go for a part 3: please try not to store user credentials at all, if
you can avoid it! It's not a great idea for every random Web site to create
its own notion of identity for every user -- which requires a means of proving
that identity, usually a password, and a way for that means to get
compromised.

When it's feasible, better would be to outsource the job of verifying user
credentials to some third party and only accept online cryptographic proofs of
identity, whether that's via:

* an SSH client keypair (e.g. git push to GitHub)

* requiring requests be signed with a PGP keypair (e.g. uploading to Debian/Ubuntu)

* Mozilla Persona/BrowserID

* a TLS client certificate

* Facebook Connect

* Google Accounts

* OpenID

There's another benefit here -- no matter how many times you iterate a KDF,
your website is still going to be vulnerable to an online compromise where the
badguy just grabs passwords as users log in. (I understand systems like Meteor
are nudging sites into using SRP-in-JavaScript, which is great, but in an
online compromise that can just change.)

And while it may be good to use bcrypt vs. SHA-1, after all it's the _users_
whose interests are ultimately at stake if the password DB is revealed, and
yet the users generally have no way of knowing (cryptographically) that any
given site really is using bcrypt. That suggests that the security dust is
being applied in the wrong place -- the party that actually cares should be
able to verify that the right thing is happening.

It would have been great if Mozilla had had more resources to stick with
Persona (and of course if Facebook and Google could have afforded to endorse
it).

Unfortunately even sites that are founded by ex-Facebook employees (like
Quora) don't actually trust Facebook Connect enough to rely on it -- Quora
uses FB Connect to let you "Sign Up with Facebook" but then requires you to
establish your own Quora password just in case Facebook screws them over
someday. (Then they store the password via unknown means...)

~~~
sillysaurus3
If a website's only option for credentialing is Facebook Connect, I won't be
using that website. Ditto for Google Accounts.

Maybe it's just me. But if many others on HN feel the same way, then that
could be relevant for a new startup who wants to gain traction among a core
group of initial users. E.g. I never would have used Dropbox or Airbnb if
their only option to login was Facebook.

~~~
keithwinstein
Let's talk about why -- do you place greater trust in AirBnB (or random
websites X, Y and Z) to keep your credentials safe than you do in Google?

Do you not want the identity provider (e.g. Google) to know every place you
log in? (This was something Persona solved, alas...)

Do you not want to be reliant on a megacompany like Google whose spam filters
might someday hit a false positive, causing Google to ban your account, and
there's nobody to call and you're locked out of everywhere?

Or is it about separation, i.e. you don't want any single notion of your
identity to have too much power if compromised (and you're careful to use
unrelated credentials, e.g. distinct passwords, on every website)?

Or something else?

Are you ok with the way that GitHub and Ubuntu outsource the storing of
credentials to authenticate a "push" by checking against a public key, for
which only the user holds the private key? What if more services worked this
way?

~~~
sillysaurus3
_Do you not want the identity provider (e.g. Google) to know every place you
log in? (This was something Persona solved, alas...)

Do you not want to be reliant on a megacompany like Google whose spam filters
might someday hit a false positive, causing Google to ban your account, and
there's nobody to call and you're locked out of everywhere?

Or is it about separation, i.e. you don't want any single notion of your
identity to have too much power if compromised (and you're careful to use
unrelated credentials, e.g. distinct passwords, on every website)?_

Indeed, those are some excellent reasons to avoid any centralized login
system. :) Most people won't care, but early adopters might. Startups don't
need to care about early adopters after the 'early' stage, but the early stage
is critical, so it's just something to keep in mind.

 _Are you ok with the way that GitHub and Ubuntu outsource the storing of
credentials to authenticate a "push" by checking against a public key, for
which only the user holds the private key? What if more services worked this
way?_

That'd be lovely. Unfortunately the key management problem hasn't really been
solved: there's no way to make it easy for average users to create a key and
use it on a bunch of different devices. "What you know" (a password) is still
way more convenient than "what you have" (a keyfile), unfortunately.

I don't think there's any way to solve that without using a third party to
sync keys across your devices. Something like that _might_ be able to be done
securely, but it'd require a lot of thought and care. (Ultimately we have to
trust the service provider with our credentials anyway, so trusting them to
sync keys doesn't seem like too far of a stretch.)

------
oleganza
The problem with expensive KDF on the server is that you are opening yourself
to a higher risk of DoS. More computationally intensive KDF - higher
protection against crackers, but easier for someone to DoS you.

Better approaches:

1) Let the client do expensive KDF (e.g. in JavaScript if it's a web app) and
send you only the resulting key. You can then store SHA256(key) instead of
KDF(password, salt).

2) Use public keys. This requires some fancy infrastructure like keychains,
keybase.io, PGP software etc, but it is the most secure solution. User saves
only his public key on the server and signs authentication token when he wants
to get in. Server has absolutely no secret material to leak, while the user
has complete control over safety of his keys. This is how SSH works. The
reason it is not widely used on the web is because we don't yet have nice
infrastructure: nice UI for keychains and protocols for one-click sign-in via
crypto signatures. We all should work on this to make it better for our kids
;-)
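
A rough sketch of approach 1, using only Python's standard library. The function names are hypothetical, PBKDF2 stands in for whatever KDF the client would actually run, and the per-user salt is an added assumption (the server would have to hand it to the client before login).

```python
import hashlib
import hmac


def client_derive_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Runs on the client: the deliberately slow step."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def server_store(derived_key: bytes) -> bytes:
    """Runs on the server at signup: one cheap hash of the derived key."""
    return hashlib.sha256(derived_key).digest()


def server_verify(derived_key: bytes, stored: bytes) -> bool:
    """Runs on the server at login: cheap hash plus constant-time compare."""
    return hmac.compare_digest(hashlib.sha256(derived_key).digest(), stored)
```

The server's work per login attempt is a single SHA-256, so the expensive part can no longer be used to exhaust server CPU. Note that the derived key itself becomes the credential, so it still has to travel over TLS.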

~~~
mschuster91
> 1) Let the client do expensive KDF (e.g. in JavaScript if it's a web app)
> and send you only the resulting key. You can then store SHA256(key) instead
> of KDF(password, salt).

So $hacker has to simply do a rainbow-table attack... no security gained
there, except for a vastly bigger key space.

~~~
crusso
The reason for his suggestion was to prevent an expensive hashing algo from
being a means for creating a DOS, not to make the rainbow table attack any
harder... although if you offload an even more expensive operation to the
client than you were going to perform on the server, it does make the attack
harder.

~~~
jessaustin
Except an attacker wouldn't _have_ to perform that expensive operation: she
could just iterate over the range of the KDF. Unless there were rate-limiting!

~~~
taejo
Getting them to iterate over the range of the KDF is enough, isn't it?

~~~
jessaustin
Well I guess if the range is large enough it might not be feasible. But the
hypothetical system is _not_ rate-limiting (if it were, doing the KDF server-
side wouldn't be a big deal), and it is _not_ storing an individual salt, so
time is the only thing standing in the way of this attack.

~~~
taejo
Time is the only thing standing in the way of any attack (well, time and
memory, I guess)

------
mcescalante
After reading this, the title is linkbait IMO (the author calls it a
"sensationalist title")

~~~
Wohui
It is, insultingly so.

------
NateDad
My guess is that anyone who still does this does not actually read technology
blogs or Hacker News. The answer for a long time now has been "use scrypt or
bcrypt". I think that answer still holds (IANAC).

~~~
vertex-four
Or some variety of PBKDF2 if adhering to standards is more your thing, or you
intend to market the security of your system to enterprise customers. The
likes of Python's Django use this by default.

~~~
NateDad
Yes, right. Sometimes I think people (myself included) leave out PBKDF2
because it's hard to spell/pronounce. :)

~~~
tragic
I might remember it now that I know KDF = key derivation function.

Since the GP mentions Django, it's worth remembering in this connection that
there are extra security concerns with expensive KDF algorithms. There was a
vuln in Django where people could submit 'passwords' of arbitrary size - a 20
character password isn't a problem, but 1MB worth of text being expensively
hashed a few times at once could be used as a nice DoS vector.[0] So there's
more to the problem than just bunging in PBKDF2/bcrypt and forgetting about
it.

[0]
[https://www.djangoproject.com/weblog/2013/sep/15/security/](https://www.djangoproject.com/weblog/2013/sep/15/security/)
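
A minimal sketch of the mitigation, assuming a hypothetical `hash_password` helper: reject oversized input before it ever reaches the expensive function. The cap value here is illustrative, not a recommendation.

```python
import hashlib

MAX_PASSWORD_BYTES = 4096  # illustrative cap; pick one that suits your application


def hash_password(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Hash a password with PBKDF2-HMAC-SHA256, refusing oversized input."""
    raw = password.encode("utf-8")
    # Check the length first, so an attacker cannot force the server to
    # run the slow KDF over megabytes of attacker-chosen text.
    if len(raw) > MAX_PASSWORD_BYTES:
        raise ValueError("password too long")
    return hashlib.pbkdf2_hmac("sha256", raw, salt, iterations)
```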

------
drdaeman
Just curious. Isn't this kind of KDF "abuse"? Because, for storing password
information in databases, KDFs aren't used for _key derivation_ (the generated
key material is never used as a key for any cipher), but just as slowly
performing [salted] hash functions.

If so, why say "don't use hash functions and use KDFs" when KDFs _are_ used as
hash functions in such cases? Maybe it's better to say something along the
"don't use fast hash functions, use slow ones" lines?

~~~
tptacek
A hash function is a primitive. A KDF is a construction. The construction best
used for storing passwords happens to have basically the same goal as the one
used to stretch a low-entropy secret into a higher-entropy secret.
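
One way to see the primitive/construction distinction: PBKDF2 can be written in a few lines on top of HMAC, which is itself built on a hash function. The sketch below computes only the first 32-byte output block of PBKDF2-HMAC-SHA256; `pbkdf2_sha256_block` is a hypothetical name, and in practice you would call `hashlib.pbkdf2_hmac` directly.

```python
import hashlib
import hmac


def pbkdf2_sha256_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """First 32-byte block of PBKDF2-HMAC-SHA256, built from the HMAC primitive."""
    # U_1 = HMAC(password, salt || INT(1)); U_i = HMAC(password, U_{i-1})
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
    result = int.from_bytes(u, "big")
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        # The output block is the XOR of every U_i.
        result ^= int.from_bytes(u, "big")
    return result.to_bytes(32, "big")
```

For a 32-byte output this agrees with `hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)`.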

~~~
drdaeman
Thanks, that clarifies it.

But would it be (more) correct if we'd say "key stretching function" instead
of KDF?

------
po
This article shows PBKDF2 as being barely any slower than sha2 but makes no
mention of how many rounds it is configured for. The note at the end implies
that something is misconfigured. I'm not a rubyist, but something is
definitely wrong there. PBKDF2 is maybe not the best choice but is certainly
safe and better than sha3 if configured correctly.

If the author is reading here, it's probably best to take PBKDF2 out of the
graph until you get it properly configured. Otherwise, people may choose sha3
as their password hashing function.
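
To illustrate why the round count matters so much in a benchmark like this, here is a small sketch that times PBKDF2 at a given iteration count and doubles it until one hash reaches a target duration. The helper names and the target value are assumptions for illustration; this is written in Python, not the article's Ruby.

```python
import hashlib
import time


def time_pbkdf2(iterations: int) -> float:
    """Return the seconds taken by one PBKDF2-HMAC-SHA256 call."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"benchmark-password", b"benchmark-salt", iterations)
    return time.perf_counter() - start


def calibrate(target_seconds: float = 0.1, start: int = 1_000) -> int:
    """Double the iteration count until one hash takes at least target_seconds."""
    iterations = start
    while time_pbkdf2(iterations) < target_seconds:
        iterations *= 2
    return iterations
```

With a single round PBKDF2 costs about as much as a couple of plain hashes, so a benchmark that leaves the count at a tiny value will show it as barely slower than SHA-2.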

------
nichochar
You don't talk about using a pepper also, combined with a Salt. A pepper is a
constant secret you append before hashing, increasing entropy and globally
making the guessing much harder. Webapp2 from google has a nice
implementation, coming from werkzeug, to be found here:
[http://webapp-improved.appspot.com/api/webapp2_extras/security.html](http://webapp-improved.appspot.com/api/webapp2_extras/security.html)
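
For readers unfamiliar with the idea, a minimal sketch of salt plus pepper, using the standard library rather than webapp2. `hash_with_pepper` is a hypothetical name, and the sketch mixes the pepper in with HMAC instead of plain appending, which is a substitution on my part.

```python
import hashlib
import hmac


def hash_with_pepper(password: str, salt: bytes, pepper: bytes,
                     iterations: int = 100_000) -> bytes:
    """Salted, peppered password hash.

    salt:   random per user, stored next to the hash in the database.
    pepper: one site-wide secret, kept outside the database (config or code).
    """
    # Mix in the site-wide secret first (HMAC here, not plain concatenation).
    peppered = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    # Then run the slow, salted KDF as usual.
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, iterations)
```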

~~~
nialo
This is deeply pointless. Any attacker that gets the password database is
practically guaranteed to get the "pepper" as well. Just Use Bcrypt, pick as
high a work factor as you can afford, and worry about something else instead.

(if you have a place to keep this extra secret where such an attacker can't
get it, why not just put the passwords there?)

~~~
danielweber
_(if you have a place to keep this extra secret where such an attacker can't
get it, why not just put the passwords there?)_

The argument is that you can put the key into the code base or into the
deployment environment. Passwords are user changeable and can't go there. And,
_just maybe_ , you only lose one or the other but not both.

(NB: I'm not defending the pepper, but at least one very major software
security consulting firm uses this argument. I don't know how much time they
spend doing pen tests and getting a feel for when the pepper would have
actually saved anyone's ass, and when I pressed I only got a vague mumble
about customer confidentiality.)

~~~
raverbashing
Yes, but let's imagine this:

Salt is not hidden, if you get the passwords, you get the salt

But then you might suspect user X is using a weak password (that you know)

Then you use that to bruteforce the pepper (a defence would be to have a big
pepper)

Or you know, if you already got in and got the password hashes it shouldn't be
too difficult to get the source code/config info.

~~~
danielweber
Oh, another argument is that sometimes the attacker gets to your DB at time
t1, and your codebase at time t2, and if you have a system that changes your
pepper regularly the t1 may be out of sync at t2.

All in all, it seems like a lot of work (that could be spent elsewhere) for
little gain. I've been burned enough to know that any tweaks to a crypto
system to "make it better" sometimes shoot you in the foot instead.

------
cocoflunchy
I see people saying "just use scrypt or bcrypt"!

I noticed in Django's admin this line (in the user section, with the default
authentication app):

    
    
        algorithm: pbkdf2_sha256 iterations: 12000 salt: 8c8ttQ****** hash: pN+2tq**************************************
    

Am I fine? Is there something more secure? Why isn't Django's default scrypt
or bcrypt? And what is pbkdf2_sha256 exactly?

~~~
goblin89
PBKDF2 with SHA256 hashing is default password storage mechanism in Django.
Recommending PBKDF2 is one of the things that tptacek criticized “Practical
Cryptography With Go” for[0].

Edit: Django has support for bcrypt but it's not the default because of third-
party dependencies[1].

[0]
[https://news.ycombinator.com/item?id=7596280](https://news.ycombinator.com/item?id=7596280)

[1]
[https://docs.djangoproject.com/en/1.7/topics/auth/passwords/](https://docs.djangoproject.com/en/1.7/topics/auth/passwords/)
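
To answer "what is pbkdf2_sha256 exactly" in code: the four fields shown in the admin are enough to recompute the hash. The sketch below uses only the standard library, not Django itself. The `$`-separated layout and the base64 encoding are assumptions modeled on those fields, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac


def encode_pbkdf2_sha256(password: str, salt: str, iterations: int) -> str:
    """Build an 'algorithm$iterations$salt$hash' string (assumed layout)."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt.encode("utf-8"), iterations
    )
    b64 = base64.b64encode(digest).decode("ascii")
    return f"pbkdf2_sha256${iterations}${salt}${b64}"


def verify(password: str, encoded: str) -> bool:
    """Recompute the hash from the stored parameters and compare."""
    algorithm, iterations, salt, _stored_hash = encoded.split("$", 3)
    if algorithm != "pbkdf2_sha256":
        raise ValueError("unsupported algorithm: " + algorithm)
    candidate = encode_pbkdf2_sha256(password, salt, int(iterations))
    # Constant-time comparison of the two encoded strings.
    return hmac.compare_digest(candidate.encode("ascii"), encoded.encode("ascii"))
```

Storing the iteration count next to each hash is what lets a site raise the count later without invalidating existing entries.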

~~~
po
tptacek faulted 'Practical Cryptography With Go' for recommending PBKDF2
_over_ bcrypt and scrypt; however, this is not what Django is doing.

Django chose PBKDF2 of the three primarily because it was easily implemented
without pulling in third-party dependencies, which is very helpful for
Django's deployment. Given the choice of the three, I think most all Django
core developers would say that scrypt or bcrypt would be better but of the
three options, PBKDF2 was deemed to be the most acceptable for driving
adoption. Before PBKDF2 was added it was using a fast hashing algorithm and
had no way to select one so all three were considered improvements and PBKDF2
was considered the 'safest' for adoption, not the optimal one. Note that it is
now fairly easy to add bcrypt or scrypt into your django deploy if you choose
to.

I'd also add that the parent article talks about PBKDF2 but I am not sure it
is using it correctly. It makes no mention of how many rounds it is configured
for and I'm not sure what they mean by 'dumb iterations'. It should be possible
to make it as slow as scrypt or bcrypt on a single machine benchmark. The
advantage of something like scrypt is that it pulls in complexity on memory as
well as CPU.

 _Edit_ For anyone interested in the decision making here, this was the
summary ticket on Django password hashing and my summary at the time:

[https://code.djangoproject.com/ticket/15367](https://code.djangoproject.com/ticket/15367)

[https://groups.google.com/d/msg/django-developers/ko7V2wDVsdg/N8yYKROKj-MJ](https://groups.google.com/d/msg/django-developers/ko7V2wDVsdg/N8yYKROKj-MJ)

~~~
tptacek
No, that's not accurate. I noted that the book didn't do a very good job of
explaining the three functions or make a clear recommendation based on facts.
That's different from dinging someone simply for using PBKDF2, which I
wouldn't do.

~~~
po
I think we're saying basically the same thing. You're not criticizing the use
of PBKDF2 as a bad choice as the parent comment implies.

~~~
goblin89
I should've written “blindly recommending” instead of “recommending”.

(That's the passage I was referring to:

> If you're explaining crypto to a reader, and you're at the point where
> you're discussing KDFs, you'd think there'd be cause to explain that PBKDF2
> is actually the weakest of those 3 KDFs, and why.

Also, I'm not arguing the sanity of Django developers' choice of PBKDF2 as
default password encryption mechanism—IMO marginally better security wouldn't
be worth the increased complexity of starting new project for newcomers.)

------
baby
So several iterations of md5 + salt is not secure? I'm really wondering how
much time it takes to find a password hashed by 5 iterations of md5 + a salt
by exhaustive search.

~~~
triangleman83
Hashing a hash is not generally considered secure because you have to assume
that if your system was compromised, the hackers will know what methods you
used, including the list of salts. If the hash runs quickly then you didn't
really cause them any more time/work.

~~~
danielweber
Assuming "several iterations" means "a million or more iterations" then you
have captured most of what bcrypt gets you. You've broken their rainbow tables
and they have to brute-force to find users using "passw0rd". You can tune the
"several iterations" the same way you can tune bcrypt.

That said, don't roll your own. You probably screwed it up somewhere. Just use
the bcrypt library call. Or scrypt to let you roll +2 against GPU attacks.
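
As a sketch of what "just use the library call" can look like, here is scrypt via Python's `hashlib.scrypt` (available when Python is built against an OpenSSL with scrypt support). The cost parameters and helper names are illustrative assumptions, not tuned recommendations.

```python
import hashlib
import hmac

# Illustrative cost settings: n is the CPU/memory cost, r the block size,
# p the parallelism factor. Tune them for your own hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1}


def scrypt_hash(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte hash with scrypt; the salt should be random per user."""
    return hashlib.scrypt(password.encode("utf-8"), salt=salt, dklen=32, **SCRYPT_PARAMS)


def scrypt_verify(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute and compare in constant time."""
    return hmac.compare_digest(scrypt_hash(password, salt), expected)
```

A typical call would generate the salt with `os.urandom(16)` at signup and store it alongside the hash.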

------
cordite
The recent post "Fairy Tales in password hashing"[0] in regards to scrypt
seems somewhat relevant here.

[0]: [http://vnhacker.blogspot.com/2014/04/fairy-tales-in-password-hashing-with.html](http://vnhacker.blogspot.com/2014/04/fairy-tales-in-password-hashing-with.html)

~~~
skrebbel
Hardly, because it's not about scrypt itself but about how (according to the
author) the library's API encourages incorrect usage.

A reader of your comment might assume that the article is about some perceived
flaw in scrypt itself.

~~~
cordite
I can see that perspective. I could have made it more clear.

I see it as it being more about incorrect usage or application, rather than a
fault with scrypt.

------
eldelshell
What I'm using for my latest project is not a hash, but using an internal
128 long key to encrypt using PBKDF2WithHmacSHA1 and persisting the result to
a blob.

~~~
tptacek
If you have a secure place to put a key for reversibly encrypting passwords,
just put the passwords in that place instead. I'm sure you'll be fine.

~~~
mnw21cam
Might be nice to mark your comment as a joke - someone may take it seriously.

------
jakobe
The biggest issue here is that users are allowed to pick their own passwords
in the first place. Sure, you can require them to use passwords with a capital
letter and with a number and with a punctuation character, but that will just
make them pick "Password1."

Better: Use one time passwords sent via SMS. Or send a one-time-login URL via
email.

If you do have to use a password, just generate a 10 digit numeric code. Sure,
some of your customers might complain, but at least you aren't responsible for
disclosing people's ebay password when your site gets hacked.
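
A minimal sketch of both ideas using Python's `secrets` module. The function names are hypothetical, and delivery (SMS or email), expiry, and single-use enforcement are left out.

```python
import hashlib
import secrets
from typing import Tuple


def new_login_token() -> Tuple[str, str]:
    """Return (token to put in the emailed URL, hash to store server-side)."""
    token = secrets.token_urlsafe(32)
    # Store only a hash, so a leaked database does not reveal usable tokens.
    stored = hashlib.sha256(token.encode("ascii")).hexdigest()
    return token, stored


def new_numeric_code(digits: int = 10) -> str:
    """Generate a random numeric code from a cryptographically secure source."""
    return "".join(secrets.choice("0123456789") for _ in range(digits))
```

Because the token is long and random rather than user-chosen, a single fast hash is adequate for storing it; the slow KDFs discussed elsewhere in the thread exist to compensate for low-entropy human passwords.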

~~~
matthewmacleod
That's not "some of your customers might complain" territory, it's "your
business failed because nobody signed up" territory.

2FA basically ensures security via a second channel, and it's perfectly
possible to store passwords in a secure format. I'm not convinced your ideas
there are worth the cost.

~~~
jakobe
> your business failed because nobody signed up

Why? Almost all websites require email confirmation; sending someone a login-
URL via email actually has less friction because the password-choosing step is
removed!

> it's perfectly possible to store passwords in a secure format

But it's very hard to do so. Even if you use scrypt, it is very hard to make
sure your whole system is actually secure against password leakage.

The simple truth is that letting your users choose their own passwords is a
liability; and I've decided to avoid this liability.

