
Storing Passwords in a Highly Parallelized World - hynek
https://hynek.me/articles/storing-passwords/
======
tptacek
It does not really matter what password hash you choose in 2016. Argon2 is
great, and if you have a library for your platform that does it, use it. But
if all you have is PBKDF2, please don't freak out and come up with some
complicated alternative to storing passwords altogether.

The real, big risk is not using a password hash at all, and instead using
"salted hashes".

~~~
joveian
Unless you generate a high entropy password for the user, in which case you
can just use one round of a hash function and be done. For web authentication,
most users will just have their browser remember it anyway, and if not they
can write it on a piece of paper.
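
A minimal sketch of that approach, using only the standard library (the
helper name is illustrative):

    import hashlib
    import secrets

    def issue_password():
        # ~128 bits of entropy: beyond any dictionary or rainbow table,
        # so a single round of a fast hash is enough.
        password = secrets.token_urlsafe(16)
        digest = hashlib.sha256(password.encode()).hexdigest()
        # Show the password to the user once; store only the digest.
        return password, digest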

~~~
sasas
Are you referring to the comment -

> The real, big risk is not using a password hash at all, and instead using
> "salted hashes".

If so, your statement is incorrect. Without salting you are vulnerable to
trivial rainbow table attacks [1]

[1]
[https://en.wikipedia.org/wiki/Rainbow_table](https://en.wikipedia.org/wiki/Rainbow_table)

~~~
cakoose
I think the parent was assuming a lovely world where all passwords would be
large randomly-generated strings. In that case, salting is no better than just
increasing the size of the randomly-generated string, right?

~~~
joveian
Yes, thanks. And the thing is they don't even have to be that long. Still
inconvenient to enter a bunch of those every day, but the same is true with
any password and most of us don't do that - we maybe set a master password and
our web browser remembers them.

Salting is one of the best and easiest measures if there is a possibility of
lowish entropy passwords, but if not it doesn't help (edit: on further thought
I think there is a fairly substantial entropy range where the only viable
attack is a very large scale batch attack and salting would prevent that,
although the cost/benefit on actually doing that makes it an unlikely attack
anyway). Iteration and memory-hard functions also effectively add just a few
bits of entropy. But when users choose passwords, a huge portion will be of
low enough entropy that nothing you can do will help enough.

A second factor does help in that case, and can provide additional protection
to high entropy passwords (as long as the second factor is distinct from the
system that stores the high entropy password, which in theory it should be to
be called a second factor).

technion: yeah, my suggestion won't work if there is any way for the user to
set the password.

------
icebraining
I don't have anything to add to this discussion, I just wanted to thank the
author; this blog is essentially what I've always wished my blog to be: clean,
with few but very informative updates, especially for us Pythonistas. The
articles about deployment have been particularly useful. Thanks!

~~~
hynek
Aw thank you!

------
advisedwang
Is there any reason to prefer Argon2 over scrypt? The article dismisses it as
not popular, but doesn't say why we should invest in making Argon2 popular
over scrypt.

~~~
jbandela1
One of the nice things that Argon2 has is server relief. Instead of doing the
entirety of the computation on the server, you can have the client do the most
demanding parts and send the intermediate values to the server, which can
then finish the much easier computation.

~~~
fryguy
Really, any password hash `PH(m,p)` can get server relief by adding a regular
cryptographic hash function `H` and making H∘PH your hash function. That is,
you store `m||H(PH(m,p))` in the database, and have the client compute
`PH(m,p)` and send it to the server rather than sending the password `p`. You
still have to send the salting information `m` to the client though, or use a
salt the client can derive itself (e.g., from the username).

I don't see the server relief in Argon2's latest specification document
though.
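
A sketch of that construction, assuming argon2-cffi's low-level
`hash_secret_raw` (parameters and function names are illustrative):

    import hashlib
    import hmac
    from argon2.low_level import Type, hash_secret_raw

    def client_hash(password: bytes, salt: bytes) -> bytes:
        # The expensive, memory-hard part PH(m, p) runs on the client.
        return hash_secret_raw(secret=password, salt=salt, time_cost=3,
                               memory_cost=65536, parallelism=2,
                               hash_len=32, type=Type.I)

    def server_verify(intermediate: bytes, stored_digest: bytes) -> bool:
        # The server stores m || H(PH(m, p)) and only computes the cheap
        # outer hash H over what the client sent.
        return hmac.compare_digest(hashlib.sha256(intermediate).digest(),
                                   stored_digest)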

~~~
creshal
I.e., CRAM-MD5 and related password schemes. It's a shame they fell out of
favour; I'd sure prefer something like that over sending plain text passwords
via TLS…

~~~
fryguy
Ideally we eventually get rid of passwords and use public/private keypairs in
the browser, or have some browser-based SRP.

------
worried_citizen
Can someone help me understand why the author of the Python library chose to
raise an exception on verification failures as opposed to returning False?

~~~
herge
On that topic, if it's raising an exception in the middle of testing the hash,
could this permit a timing attack?

~~~
hynek
It isn’t raised in the middle of testing the hash. The testing is completely
done in the Argon2 C library and the bindings raise an error if it returns an
error. The Python library doesn’t do anything smart at all except calling C
functions on strings.
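
For what it's worth, the resulting calling convention looks roughly like this
(argon2-cffi's `PasswordHasher` and its `VerificationError`; one plausible
rationale for raising is that a failed check cannot be accidentally ignored
the way an unchecked boolean can):

    from argon2 import PasswordHasher
    from argon2.exceptions import VerificationError

    ph = PasswordHasher()
    stored = ph.hash("s3kr3tp4ssw0rd")

    def login_ok(candidate):
        try:
            ph.verify(stored, candidate)  # raises on mismatch
            return True
        except VerificationError:
            return False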

------
falcolas
Given the recent ACM article about faster-than-CPU persistent memory on the
horizon, how does that affect memory-hard hashing algorithms?

A machine with a measly 256GB of memory can do around half a million Argon2
hashes simultaneously (using the Python library defaults for memory complexity
of 512k), and that doesn't include the memory built into high end GPUs, or
what could be added to ASICs or FPGA boards.
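
The arithmetic behind that figure (assuming the default means 512 KiB per
hash):

    # 256 GiB of RAM divided by 512 KiB per Argon2 instance:
    instances = (256 * 1024**3) // (512 * 1024)
    print(instances)  # 524288, i.e. roughly half a million hashes at once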

~~~
romaniv
I do not think the ACM article is very relevant here, because what it speaks
about is, essentially, the fact that what we traditionally thought of as
"disks" is now getting as fast as RAM, because persistence is achieved via new
technologies.

 _A machine with a measly 256GB of memory can do around half a million Argon2
hashes simultaneously_

Not necessarily. There is always the limitation of memory bandwidth. Although,
I do not see saturating that bandwidth as a design goal of Argon2.

~~~
tracker1
Precisely… if a hash is _too_ intensive, it's a clear vector for DDoS
attacks.

------
kevindeasis
I'm skimming info from all over the place. Does anyone know the performance
of Argon2 compared to the other stuff? I'm figuring out whether to use bcrypt
or Argon2 for a side project.

[https://www.npmjs.com/package/bcrypt](https://www.npmjs.com/package/bcrypt)

[https://github.com/ranisalt/node-argon2](https://github.com/ranisalt/node-argon2)

edit: found it. I'm still reading

[https://password-hashing.net/submissions/specs/Argon-v3.pdf](https://password-hashing.net/submissions/specs/Argon-v3.pdf)

------
viraptor
Can you unroll Argon2 for a CPU-vs-memory tradeoff like scrypt? Both are
supposed to be memory-intensive, but in scrypt you can get a 2x slowdown by
using 2x less memory and just recomputing the missing bits.
([http://blog.ircmaxell.com/2014/03/why-i-dont-recommend-scrypt.html](http://blog.ircmaxell.com/2014/03/why-i-dont-recommend-scrypt.html))

~~~
fryguy
From the white paper
[https://password-hashing.net/argon2-specs.pdf](https://password-hashing.net/argon2-specs.pdf):

> Argon2i is more vulnerable to tradeoff attacks due to its data-independent
> addressing scheme. We applied the ranking algorithm to 3-pass Argon2i to
> calculate time and computational penalties. We found out that the memory
> reduction by the factor of 3 already gives the computational penalty of
> around 2^14. The 2^14 Blake2b cores would take more area than 1 GB of RAM
> (Section 2.1), thus prohibiting the adversary to further reduce the
> time-area product. We conclude that the time-area product cost for Argon2d
> can be reduced by 3 at best.

------
vog
Password hashes focused too long on computation time, neglecting memory usage.
It's great to see some progress on that topic.

On the other hand, we still see password databases stored in plain MD5,
sometimes even without salt. So in addition to providing better password
hashes, making them more widespread is important, too.

~~~
jtheory
One thing I wonder about, as password hashes get both processor- and
memory-hard: are we opening a trivial DoS vector on our servers, by basically
letting any IP submit a request for a server to dedicate this (non-trivial)
level of RAM and processor to a given task (hashing a random string to verify
that this login is not correct)?

I suppose the answer is yes, but it's worth it. It's possible to fix this
sort of hole (partially, at least) by capping the number of hashes processed
concurrently. But most simple implementations will just assume "we're not
likely to have more than X users ever signing in concurrently, so we can set
the work factors based on that plus some headroom".
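
A sketch of such a cap, assuming argon2-cffi in a threaded server (the limit
of 8 is illustrative):

    import threading
    from argon2 import PasswordHasher

    ph = PasswordHasher()
    _slots = threading.BoundedSemaphore(8)  # at most 8 hashes in flight

    def verify_capped(stored_hash, candidate):
        # Excess requests queue here instead of exhausting RAM; a real
        # server might time out and return a 503 instead of blocking.
        with _slots:
            return ph.verify(stored_hash, candidate)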

~~~
nly
If you're not throttling the number of incorrect password guesses per IP, or
per IP per account, or both, you're opening your users up to dictionary
attacks anyway. If I can get a box close to yours (~5ms is doable if we use
the same colo), then it's completely feasible.

You're right, however, that this is going to be painful, because most web
frameworks don't have any of this throttling architecture (like persistent
in-memory hashtables) in place, and as soon as they switch to Argon2/scrypt,
it's going to be an easy DoS vector… particularly for services running on
weak VPSs.
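
A minimal sketch of the kind of in-memory throttle meant here (fixed window
per IP; all names and numbers are illustrative):

    import time
    from collections import defaultdict

    WINDOW = 60        # seconds
    MAX_FAILURES = 10  # failed attempts allowed per IP per window

    _failures = defaultdict(list)  # ip -> timestamps of recent failures

    def allow_attempt(ip):
        now = time.monotonic()
        _failures[ip] = [t for t in _failures[ip] if now - t < WINDOW]
        return len(_failures[ip]) < MAX_FAILURES

    def record_failure(ip):
        _failures[ip].append(time.monotonic())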

It's another reason why solid, secure password authentication protocols, that
do client-side hashing, will eventually happen. Even in fancy zero-knowledge
asymmetric protocols however, online rate-limiting is essential.

~~~
jtheory
Rate limiting per IP assumes an attack from a single IP, or a very small
range of them (so it only defends against a trivial DoS, not DDoS… which is
sadly easy to set up these days).

Per-IP-per-account limiting likewise doesn't work if the attacker has a large
list of usernames. Even brute-force "dictionary" attacks can dodge simple
limiters by submitting one password with 2 million different usernames, then a
second password with 2 million usernames, etc.

I'm not saying these are _bad_ (though if someone can trivially stop your real
users from signing in by hitting the limit on their accounts, that's just a
DoS in another shape). But we're agreed already... these are non-simple
problems, really.

------
seanwilson
It's sounding like it's too difficult to keep up with password security to
try to do it all yourself. You'd be better off outsourcing it, such as with
OAuth, if you can.

~~~
continuational
While it certainly would be nice to outsource this problem,

    from argon2 import PasswordHasher
    ph = PasswordHasher()
    hash = ph.hash("s3kr3tp4ssw0rd")
    ph.verify(hash, "s3kr3tp4ssw0rd")

is pretty damn easy.

~~~
ajross
You're missing the point (which is hilarious, because the very first sentence
of the linked article spells it out explicitly). The APIs for password hashing
are simple, and always have been (cf. crypt(3)).

It's the problem area that is complicated and rapidly changing. And use of
APIs and mechanisms that just a few years back were Best Practices is now
discouraged. That's not something that can be treated by a cute Python API.

~~~
creshal
The best practice a few years ago was "use the best PBKDF available", which
hasn't changed – just what algorithm to use. This isn't different from any
other part of crypto – we have a constant algorithm churn as we identify and
replace problematic algorithms.

And OAuth is no exception: If you implemented OAuth five years ago, it was
OAuth1, which was obsolete and ripe for replacement… _four_ years ago.

~~~
ajross
Uh... the specific advice I was replying to was _precisely_ to use a specific
Python wrapper around Argon2, which is certainly not isomorphic to the "best
PBKDF available" (even if it might be that right now).

And the OAuth point is misdirected: we're talking about hash choices here,
which OAuth is not. An OAuth1 client, while perhaps obsolete for other
reasons, _would certainly_ be protected against the need for a change in
password hash, because all that stuff happens upstream and it never sees the
password. That is by design; therefore remote authentication against
highly-trusted password verifiers[1] is a good idea, which is what the
great-grandparent post was saying.

[1] Which, sure, has its own list of worries unrelated to hash behavior.

~~~
creshal
So what is your point, then? No matter what you do – whether you verify
passwords yourself or externally – the state of the art constantly changes and
you have to update your code _either way_. There's no escape from that. The
reasons change, but not the code churn.

~~~
ajross
Uh... no. That's exactly wrong. If bcrypt or whatever gets broken tomorrow,
and my OAuth provider needs to reset all the passwords in the database (Linode
literally did this last week, though not because of a hash change, it's a
common kind of thing), _I don't need to change a line of code._

Ergo, if you're worried about changes to password hash vogues, OAuth is a good
idea. Thus the point way upthread, which both you and the other poster seem to
have missed.

~~~
creshal
Because it's a nonsensical point. You trade one very simple thing to worry
about – password hashing algorithms, for which you can count all the
variations of the past ten years on one hand – for a massively complex system
like OAuth (1.0? 1.0a? 2.0?), which in turn relies on TLS for transport
security, for which best practices change every two or three months.

If bcrypt gets broken tomorrow, your passwords are safe until you have a data
breach. If OAuth gets broken tomorrow, you're _immediately_ at risk.

------
zeveb
One concern I have with Argon2 is that it uses BLAKE internally. This is
understandable, since BLAKE was developed by one of the conveners of the
Password Hashing Competition, but it seems to me that using SHA3 internally
would be a better choice, in order to leverage the industry-wide investments
in development & testing which we're likely to see.

Assuming that it's well-designed, Argon2's structure shouldn't rely on BLAKE.

~~~
dchest
SHA-3 (Keccak) is not a better choice, because it has a considerable
performance difference between hardware and software implementations, while
BLAKE is very fast in software (i.e., on current general-purpose CPUs), which
is good for most defenders (as they use general-purpose hardware) and bad for
attackers (as the cost of specialized hardware is increased). None of the
Password Hashing Competition submissions used Keccak, and even the authors of
Keccak recommended against using it for password hashing.

~~~
pbsd
Gambit used Keccak.

~~~
dchest
Ah, true, I forgot about it. With the following remark in specs
([https://password-hashing.net/submissions/specs/Gambit-v1.pdf](https://password-hashing.net/submissions/specs/Gambit-v1.pdf)):

 _Keccak is known to be very fast in hardware, which opens up the path to
highly optimized cheap circuits._

(although with a strange follow-up: "but the same can be said about modern
CPUs and GPUs, which closes the gap")

As I wrote when discussing candidates, "I like the simplicity of Gambit, but
it would look a bit silly if we selected an algorithm as a winner of
competition with a notice that it better be used in the future, when Intel
adds a SHA-3 instruction."

------
zkhalique
When can we expect something for PHP?

~~~
hynek
Cf
[https://twitter.com/ircmaxell/status/685207462992035841](https://twitter.com/ircmaxell/status/685207462992035841)
& ff

------
whyagain2015
(Serious question) Why don't people just use Firefox Sync or Chrome's
password store? Why worry about all this? In both cases passwords are
salted+encrypted and uploaded.

~~~
Buge
Firefox Sync and Chrome password store are for users storing passwords. They
are not hashed (because that would make it impossible to log into a site,
because you need the actual password to log in, and a hash is irreversible). I
don't know what you mean by salted, because as far as I know, salts are only
used with hashes and these are not hashed.

Bcrypt, scrypt, and Argon2 are for server-side storage of login information.
The server does not store the actual password, it just stores a hash.

