
Enough with the Salts: Updates on Secure Password Schemes - Jonhoo
https://www.nccgroup.trust/us/about-us/newsroom-and-events/blog/2015/march/enough-with-the-salts-updates-on-secure-password-schemes/
======
aidenn0
Not mentioned in the article is why salting is important even if rainbow
tables aren't part of your threat model:

There is more than one reason, but they all boil down to one fact: two
accounts with the same password will hash to the same value if you do not have
a salt.

As an example, if an attacker gets a dump of all the hashed passwords, the
weak passwords will be immediately apparent just from an analysis of
duplicates.
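A quick stdlib sketch of the duplicate-leak problem (SHA-256 here is purely for illustration; it's far too fast to be a real password hash):

```python
import hashlib
import os

password = b"hunter2"  # two accounts happen to share this password

# Unsalted: identical passwords produce identical digests, so duplicates
# (and therefore popular/weak passwords) are visible in a raw dump.
unsalted = [hashlib.sha256(password).hexdigest() for _ in range(2)]
print(unsalted[0] == unsalted[1])  # True

# Salted: a fresh random salt per account makes the stored values differ
# even though the underlying password is the same.
salted = []
for _ in range(2):
    salt = os.urandom(16)
    salted.append(hashlib.sha256(salt + password).hexdigest())
print(salted[0] == salted[1])  # False
```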

~~~
firebird84
I almost feel as though this is at least half of the point of using salts.
We're trying to make things slow, so forcing an attacker to crack the same
password twice (or more) in cases where it's reused seems beneficial and very
cheap. Weak passwords will still be cracked first but it will still require at
least SOME work.

~~~
hinkley
That was the reason salts were used in /etc/passwd. You couldn't tell if John
and James had the same password, and you couldn't tell if John used the same
password on multiple machines. Or more critically, if the root password was
the same on the entire cluster.

As you two have already discussed, it mostly provides herd safety. You can't
target the 'weak ones' because all the passwords look roughly the same.

------
nostromo
People need to read this article very carefully before forming any sort of
take-aways.

The title and the start of the article suggest that salts are irrelevant. Then
only a few paragraphs later, the author states that salts are indeed crucial.

Later in the article, we learn that it isn't really about salts at all, but
about the importance of a high work factor / number of rounds.

The problem with the way this article is presented is that a casual reader
will be left with the idea that salts are no longer important, which is false.

A better title would be something like, "Salts are important, but so is a high
work factor."

~~~
tptacek
Careful. Salt _quality_ is almost totally unimportant. And there are no well-
known, well-reviewed password hashes that aren't randomized. The lesson I'd
take away from an article like this is: "don't DIY your password hash".

Historically, every discussion I've seen that tried to acknowledge the
"importance" of salts devolved into people defending their DIY "secret-salt"
schemes. "Important salts" seem like an invitation to reject bcrypt, scrypt,
and PBKDF2.

I think part of the problem is the term. I think we should stop talking about
"salts", and instead just say whether a given password scheme is _randomized_
or not.

~~~
markdown
Google AppEngine is part of the Google Cloud Platform. It's a pretty big deal.

If you want to use it (at least with Python), you need to roll your own. You
can't upload scrypt, bcrypt, or PBKDF2 with your app, since code with C
extensions isn't kosher on there.

~~~
sdevlin
Isn't PBKDF2 in the stdlib as hashlib.pbkdf2_hmac?

~~~
bjterry
Google has had support for PBKDF2 in PyCrypto since around August 2012[1].

1:
[https://code.google.com/p/googleappengine/issues/detail?id=5...](https://code.google.com/p/googleappengine/issues/detail?id=5303)

------
zaroth
Salt is a requirement but not a panacea. AshleyMadison's salted MD5 proves
that well enough. The point of a salt is so that same password does not result
in the same hash for two different users. The key point is that attack time
scales linearly with user count, which makes a difference in the actual crack
rate _when the breach is large_.

When cracking passwords, to crack the most hashes, your inner loop is
iterating the _salts_ from the database while the outer loop moves down the
dictionary. Crucially, that means the salt should always be the first bits fed
to your hashing function, or else the cracker can optimize away some hashing
cycles across users. HMAC will do the right thing here if salt is passed as
the 'key' (although the way it expands the key to the block size is somewhat
flawed). Feeding the salt to the hashing function first approximately halves
the crack rate versus appending salt at the end, by preventing any reuse of
hash function state in the inner loop. [1]

[1] -
[http://article.gmane.org/gmane.comp.security.phc/2325/](http://article.gmane.org/gmane.comp.security.phc/2325/)
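The state-reuse argument can be demonstrated with stdlib hashlib (plain SHA-256 stands in for a real password hash; it's the midstate trick that matters):

```python
import hashlib

salts = [b"salt-a", b"salt-b", b"salt-c"]  # per-user salts from a dump
guess = b"password123"                     # one dictionary entry

# Scheme H(password || salt): every user shares the same prefix, so a
# cracker absorbs the guess once and forks the midstate per salt.
midstate = hashlib.sha256(guess)
reused = []
for s in salts:
    h = midstate.copy()  # reuse the already-hashed password bytes
    h.update(s)
    reused.append(h.hexdigest())

# Same digests as hashing from scratch, with the password work done once:
assert reused == [hashlib.sha256(guess + s).hexdigest() for s in salts]

# Scheme H(salt || password): no shared prefix, so there is no midstate
# to reuse and each user costs a full hash computation.
per_user = [hashlib.sha256(s + guess).hexdigest() for s in salts]
```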

------
nsxwolf
Serious question: Why isn't there a "best practice" for this? A library
everybody uses to store passwords in the most secure way possible, and when
more secure ways are found, the library is updated and not your code?

Everything is so low level, manual and error prone for such a basic part of
app security.

~~~
mkozlows
The library can't be updated automatically, because suddenly all your stored
passwords wouldn't work.

If it's a big monolithic library like Spring, they just deprecate APIs and add
new ones (which is in fact what they do); if you're in a microframework
language like Node, you're on your own to rip out your md5 module and switch
it to bcrypt.

~~~
masklinn
> The library can't be updated automatically

Why not?

> because suddenly all your stored passwords wouldn't work.

Passlib handles that with a concept of "deprecated algorithms"[0]: a
cryptcontext has a list of algorithms it can accept as inputs to validate
against, and a subset of these can be marked as deprecated[1]. Passwords
hashed with deprecated algorithms will still validate, but they'll be flagged
as needing an upgrade. The `verify_and_update` method returns both the
validity of the password and a new hash if it should be upgraded, so the
normal pattern is:

    
    
        valid, new_hash = pass_ctx.verify_and_update(password, old_hash)
        if valid:
            if new_hash:
                ...  # store new_hash for the user
            ...  # password was valid, proceed
        else:
            ...  # password wasn't valid
    

[0] [http://pythonhosted.org/passlib/lib/passlib.context-
tutorial...](http://pythonhosted.org/passlib/lib/passlib.context-
tutorial.html#deprecation-hash-migration)

[1] in recent versions you can even set the _deprecated_ list to "auto", this
will consider all algorithms but the default (the one used if you encrypt with
the cryptcontext without requesting anything specific) to be deprecated. It's
a thing of beauty.

------
ixtli
Interesting side note: one of the most successful altcoins (a few years ago,
back when people cared about this stuff) was called litecoin. The major
difference is that it used scrypt instead of SHA because it was hard to
parallelize. The idea was this would keep rich groups from taking over the
network by throwing money at hashing clusters.

I built a machine in a milk crate that had 5xR9s to mine it. I broke even and
sold it after I got bored :) It was fun for a month or two.

~~~
timo_h
This was probably the most notorious misuse of scrypt.

Litecoin tuned scrypt's memory usage down to 128 KB, which made mining ~10
times faster on GPUs than on CPUs.

------
davidrusu
Is anyone here using Scrypt in production?

I had to decide on a hashing scheme recently and ended up going with Bcrypt
just because of how new Scrypt is.

~~~
Karunamon
I know Keybase ([http://keybase.io](http://keybase.io)) uses scrypt, but
that's the only one I know of off the top of my head.

~~~
malgorithms
Details, in case it's helpful/interesting to anyone:
[https://keybase.io/docs/api/1.0/call/login](https://keybase.io/docs/api/1.0/call/login)
. Also of note, the hashing is done client side, not server-side.

~~~
hinkley
Doesn't that mean the hash effectively _is_ the password? If I steal the
password file, what keeps me from just logging in as everybody?

~~~
cpeterso
The client's hash is the password but the password file contains salted hashes
of the client's hash. An attacker that has the client's password hash can
replay it, but they don't know the user's password. An attacker with the
password file doesn't know any clients' password hashes or whether any of the
password hashes are reused by different users.
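A rough sketch of that split (PBKDF2 from the stdlib stands in here; Keybase itself uses scrypt, and all function names below are made up for illustration):

```python
import hashlib
import hmac
import os

def client_stretch(password: bytes, salt: bytes) -> bytes:
    # Client side: the expensive stretch happens in the client, and only
    # this derived value ever travels to the server.
    return hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)

def server_record(client_hash: bytes) -> tuple[bytes, bytes]:
    # Server side: the client's hash is treated as the "password" and is
    # itself salted and hashed before storage, so a stolen password file
    # contains nothing that can be replayed directly.
    server_salt = os.urandom(16)
    stored = hashlib.pbkdf2_hmac("sha256", client_hash, server_salt, 100_000)
    return server_salt, stored

def server_verify(client_hash: bytes, server_salt: bytes, stored: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", client_hash, server_salt, 100_000)
    return hmac.compare_digest(candidate, stored)  # constant-time compare
```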

~~~
sarciszewski
Yo dawg, I heard you liked hashes...

Meme reference aside, this is actually a sane way to do things.

------
zeveb
I never understand why PBKDF2 gets so little love. It's easy to use, it's easy
to implement correctly, it's standardised. scrypt is cooler, but PBKDF2 is
just fine.

~~~
dchest
PBKDF2 is actually a crappy standard:

\- The work factor depends on the length of the output. For example, if you
compute and store 64 bytes of PBKDF2-HMAC-SHA256, an attacker needs to compute
only the first 32 bytes to verify a guess, and with PBKDF2 that means doing
half the work.

\- In most common use (with HMAC), it inherits an HMAC "vulnerability": a
password longer than the hash function's block size is equivalent to its hash
(see this Twitter thread:
[https://twitter.com/dchest/status/421595430539894784](https://twitter.com/dchest/status/421595430539894784)).
For example,
`plnlrtfpijpuhqylxbgqiiyipieyxvfsavzgxbbcfusqkozwpngsyejqlmjsytrmd` and
`eBkXQTfuBqp'cTcar&g*` have the same PBKDF2-HMAC-SHA1 hash. (Scrypt is also
vulnerable, BTW, as it uses PBKDF2-HMAC-SHA256.)

\- PBKDF2 is commonly used with hashes which are fast on GPU or custom
hardware and require tiny memory, thus making defense/attack cost ratio very
bad. See scrypt paper for cost estimates
([https://www.tarsnap.com/scrypt/scrypt.pdf](https://www.tarsnap.com/scrypt/scrypt.pdf)).
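The HMAC quirk is easy to check with the stdlib: HMAC replaces any key longer than the block size with its digest, so a password over 64 bytes and its raw SHA-1 digest produce identical PBKDF2 output (the printable-ASCII pair above is just a lucky instance of this):

```python
import hashlib

long_pw = b"x" * 65                         # longer than SHA-1's 64-byte block
digest_pw = hashlib.sha1(long_pw).digest()  # what HMAC reduces the key to

salt = b"NaCl"
h1 = hashlib.pbkdf2_hmac("sha1", long_pw, salt, 1000)
h2 = hashlib.pbkdf2_hmac("sha1", digest_pw, salt, 1000)
print(h1 == h2)  # True: two different "passwords", one hash
```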

BTW, we had Password Hashing Competition ([https://password-
hashing.net](https://password-hashing.net)), the (tweaked) winner of which
will hopefully be widely used in the future.

~~~
CodesInChaos
It's difficult to implement efficiently:

* If you use the underlying hash as a black-box, each HMAC invocation compresses 4 blocks. An optimized implementation will only require 2. Many implementations (e.g. the .NET one) don't include that optimization and thus suffer from a 2x slowdown compared to the attacker's implementation.

* If you use HMAC naively, it'll hash the whole password on each iteration, opening a DoS vector. Django made this mistake (see [http://arstechnica.com/security/2013/09/long-passwords-are-g...](http://arstechnica.com/security/2013/09/long-passwords-are-good-but-too-much-length-can-be-bad-for-security/) )
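The naive-HMAC pitfall in the second bullet can be avoided by keying the HMAC once and cloning its state each iteration; a single-block PBKDF2-HMAC-SHA1 sketch (checkable against the stdlib's own `hashlib.pbkdf2_hmac`):

```python
import hashlib
import hmac

def pbkdf2_sha1_block1(password: bytes, salt: bytes, iterations: int) -> bytes:
    # Key the HMAC once; .copy() reuses the padded-key compressions
    # instead of re-processing the password on every iteration.
    keyed = hmac.new(password, digestmod=hashlib.sha1)

    def prf(data: bytes) -> bytes:
        m = keyed.copy()
        m.update(data)
        return m.digest()

    u = prf(salt + b"\x00\x00\x00\x01")  # U1, big-endian block index 1
    out = bytearray(u)
    for _ in range(iterations - 1):
        u = prf(u)                       # U2 .. Uc
        for i, b in enumerate(u):
            out[i] ^= b                  # T1 = U1 xor U2 xor ... xor Uc
    return bytes(out)
```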

It abuses the underlying hash:

It xors together the outputs of different iteration numbers. This can
interfere with the feed-forward of certain hash constructions. As far as I can
tell this does not cause security issues, but it's still very dirty.

\---

Personally I'd go so far as to say that it's worse than PBKDF1 with a
sufficiently wide underlying hash.

------
lucb1e
Reading this title I was wondering what news they have over
PBKDF2/Scrypt/Bcrypt and whether it would really be better. Read through it
quickly and was disappointed. Contrary to what they claim, this advice was popular even
in 2010: [https://security.stackexchange.com/questions/211/how-to-
secu...](https://security.stackexchange.com/questions/211/how-to-securely-
hash-passwords)

------
Illniyar
"The multiple salts don’t really do much since they’re all likely known to an
attacker or can be quickly calculated given knowledge of the other two
component"

I've never heard before that you can calculate the sitewide salt (i.e. in
memory and unknown to the attacker) from having both the user password and the
per-user salt. How is this done?

~~~
zaroth
The second half of that sentence is false, assuming the site salt has enough
entropy, i.e. 32-bytes from a CS-PRNG.

However, if your server is compromised, pulling the site-wide salt (aka
'pepper') from memory is presumably trivial, so you typically assume it is
known to the attacker.

~~~
Illniyar
I'm not really familiar with the field: is it really trivial to compromise
both the database and the application server? And is it really that trivial to
extract data from memory?

Is there a good resource that reviews attacks that actually occurred and
analyzes what and how it could have been prevented?

------
zkhalique
These days, what is wrong with rolling your own encryption of passwords using
key strengthening and salting?

Like this:

hash = sha1 applied recursively 4071 times ( password . salt ) . salt

Serious question. Don't just say religiously, "just use bcrypt". Tell me what
is really wrong. What attacks will succeed today? Any crypto enthusiasts in
the audience?
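For concreteness, here is one reading of that construction (taking the trailing `. salt` to mean the salt is stored alongside the digest); illustration only, not an endorsement:

```python
import hashlib

def diy_hash(password: bytes, salt: bytes, rounds: int = 4071) -> bytes:
    # Iterate SHA-1 over (password . salt), as sketched above.
    h = hashlib.sha1(password + salt).digest()
    for _ in range(rounds - 1):
        h = hashlib.sha1(h).digest()
    return h  # stored together with the salt

# Iteration does add work, but SHA-1 stays cheap and memory-light on
# GPUs/ASICs, which is the gap bcrypt/scrypt are designed to close.
```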

~~~
mkozlows
The non-expert answer is: Because crypto math is difficult for randos to
analyze, and it's not obviously the case that running an algorithm 4,000 times
is 4,000 times harder to break than running it once.

And from a purely pragmatic perspective, why even consider it? You have a
bcrypt/scrypt library, writing your own code is harder and almost certainly
worse, so there is literally no reason to spend even a second thinking about
it.

~~~
zkhalique
Yes, there are reasons to think about it. For example if I want to understand
the principles that make the usual algorithm strong.

There is a reason why Apple makes every app implement its own receipt
validation scheme. So a dedicated attacker can't crack all apps at once. Same
here.

------
mtgx
Aren't some new password hashing schemes coming out now?

[https://password-hashing.net/](https://password-hashing.net/)

~~~
sarciszewski
There are still some tweaks to be made to Argon2 before it's finalized, mostly
around GPU resistance.

Then there's the standardization for the crypt(3) format.

------
coldcode
Of course all this might go out the window once quantum computers become
mainstream.

~~~
dchest
How so? Currently known symmetric crypto algorithms are secure against attacks
by quantum computers. More than that, hash functions, which are the building
blocks of password hashing functions, are used in "post-quantum crypto",
e.g. for signatures ([https://en.wikipedia.org/wiki/Post-
quantum_cryptography#Hash...](https://en.wikipedia.org/wiki/Post-
quantum_cryptography#Hash-based_Cryptography),
[http://sphincs.cr.yp.to/](http://sphincs.cr.yp.to/)). (Although more research
should be done on memory-hard functions.)

