
Why I think using sha256crypt or sha512crypt is dangerous - bellinom
https://pthree.org/2018/05/23/do-not-use-sha256crypt-sha512crypt-theyre-dangerous/
======
tialaramex
Article opens by claiming "DES-Crypt has a core flaw in that, not only was the
algorithm reversible ..."

That's not true in any meaningful sense and points to the fact that the author
hasn't any clue what they're talking about. Which doesn't magically invalidate
their experiments and data, but it does mean you should probably take their
opinion with an extra tablespoon of salt.

DES of course is reversible, it's an encryption algorithm, so if you have the
key you can just reverse it; that's the whole point. But what DES-crypt does
is something else. First it takes the DES algorithm and tampers with the
internals, using 12 bits of random data for this, a salt (today we obviously
use larger salts, but even this one makes time-space tradeoffs painful: 4096
times worse than attacking plain DES). Then it encrypts a bunch of zero bytes
using the password as the key, then encrypts the result, and that result, and
so on, 25 times in total.

This isn't a reversible algorithm. You can't take some encrypted DES bytes
and run DES backwards 25 times to find out what the key was, even knowing the
original input was all zero bytes; you quickly get conflicting possibilities.

What you _can_ do, in reasonable time, is try all the plausible passwords in
turn, for a particular salt value. This is what DES crackers do, and for
passwords like "ksdcunfm" it works pretty well on commodity hardware, which is
why we don't use DES Crypt any more.

But that's not reversing the algorithm.
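A toy sketch of that structure (this is NOT real DES — the block function is a hash-based stand-in with hypothetical names — it only illustrates the iterated, non-invertible construction described above):

```python
import hashlib

def toy_block_encrypt(key: bytes, salt: bytes, block: bytes) -> bytes:
    # Stand-in for salt-perturbed DES: not the real cipher, just a
    # keyed one-way mixing function to illustrate the control flow.
    return hashlib.sha256(key + salt + block).digest()[:8]

def toy_des_crypt(password: bytes, salt: bytes, rounds: int = 25) -> bytes:
    # DES-crypt encrypts a 64-bit all-zero block with the password as
    # the key, then re-encrypts the result, 25 times in total.
    block = bytes(8)
    for _ in range(rounds):
        block = toy_block_encrypt(password, salt, block)
    return block
```

Nothing here can be "run backwards": the only way from output to password is guessing candidates and running the loop forward for each one.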

~~~
atoponce
> Article opens by claiming "DES-Crypt has a core flaw in that, not only was
> the algorithm reversible ..."

> That's not true in any meaningful sense

That's exactly right, and when I published the article, I recognized that I
needed to change it, but didn't get to it. I've fixed it. Thanks.

> and points to the fact that the author hasn't any clue what they're talking
> about.

Ouch.

------
slrz
How is the timing here relevant to attacking shacrypt? The run time is a
function of the length of the _attacker-provided_ string containing the
password guess. That's totally fine, the attacker already knows that string.
The output is a fixed-length string that is then compared to a stored (also
fixed-length) hash using a constant-time comparison function.

The run time does not depend on the length of any secrets here. I don't see
the issue. How can I use that to determine the length of the actual password?
That length isn't even available to the password verification system anymore.
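For reference, the comparison step being described looks like this in Python's stdlib (function name here is illustrative):

```python
import hmac

def verify_hash(stored_hash: bytes, computed_hash: bytes) -> bool:
    # Both arguments are fixed-length digests; compare_digest runs in
    # time independent of where the inputs first differ, so an attacker
    # learns nothing about the stored hash from timing this check.
    return hmac.compare_digest(stored_hash, computed_hash)
```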

~~~
blackflame7000
I think the author is arguing that if someone were to send a 1GB password, it
could cause a denial-of-service (DoS) attack on the server performing the SHA
hash of the password. Seems to me the simplest solution is to limit the length
of password fields to something reasonable.
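A minimal sketch of that mitigation (the cap value is an arbitrary assumption):

```python
MAX_PASSWORD_BYTES = 1024  # arbitrary but generous cap

def check_password_length(password: str) -> None:
    # Reject oversized input before any expensive hashing runs, so a
    # 1GB "password" cannot tie up server CPU time.
    if len(password.encode("utf-8")) > MAX_PASSWORD_BYTES:
        raise ValueError("password too long")
```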

~~~
colejohnson66
I always thought you’re supposed to hash on the client side first to avoid
sending the password to the server, and then hash again on the server. Even if
the password is a billion characters long, the server will only see a 60(?)
byte array and hash that.

~~~
dbt00
No, that's not how it works.

If the client could simply send a hashed password over as an authenticator,
then hashing the password database would provide no protection against theft,
because you could simply use the stored hash as the password. This scheme
would only protect users who reuse the same password elsewhere.

Hashing with a negotiated salt, computed on both the client and the server,
protects the password in transit but means storing the password in plaintext
on the server. We use TLS (or SSH, or other encrypted protocols) to protect
passwords in flight instead, and hash passwords on the server to protect them
from theft.

(Remember, the original attack password hashing was protecting against was a
world readable file that contained all usernames and passwords, /etc/passwd on
Unix. Eventually the world moved to "shadowed" passwords, where the hashes
were stored in a root-readable-only file called /etc/shadow.)

There are systems where the user enters a key locally and it's not transmitted
over the wire. Kerberos and SRP are two of them.

~~~
lucideer
> _If the client could simply send a hashed password over as an authenticator,
> then hashing the password database would provide no protection against
> theft, because you could simply use the stored hash as the password._

I'm not sure I understand this sentence. Are you reading the parent message as
proposing hashing client-side using the same algorithm as the server-side
hash, and comparing directly on the server against the database-stored hash?
That would be outrageously stupid of course, but I'm certain the commenter was
not proposing that.

They said:

> _hash on the client side first to avoid sending the password to the server,
> and then hash again on the server._

Note, hashing _again_ on the server. To describe this slightly differently,
think of the clientside algorithm as effectively being a password generator,
with the user inputting their own memorable password as a kind of encryption*
key for their "real" auth API password—the main advantage is fighting password
reuse.

*encryption!=hashing but I'm just using the term here to illustrate intent.
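A sketch of that split, using PBKDF2 from Python's stdlib as a stand-in for whichever KDFs you would actually choose (all names and parameters here are illustrative):

```python
import base64
import hashlib
import os

def client_derive(memorable_password: str, site: str) -> str:
    # Client side: stretch the memorable password into a fixed-length,
    # per-site "real" password, so the memorable one never leaves the
    # machine and isn't reused verbatim across sites.
    digest = hashlib.pbkdf2_hmac(
        "sha256", memorable_password.encode(), site.encode(), 100_000)
    return base64.b64encode(digest).decode()

def server_store(derived_password: str, salt=None):
    # Server side: hash *again* with a per-user random salt before
    # storing, so a stolen database row is still not an authenticator.
    salt = salt if salt is not None else os.urandom(16)
    stored = hashlib.pbkdf2_hmac(
        "sha256", derived_password.encode(), salt, 100_000)
    return salt, stored
```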

------
peeters
> What this clearly demonstrates, is that the only factor driving execution
> time, is the number of iterations you apply to the password, before
> delivering the final password hash.

Isn't the reason this is true that bcrypt just truncates the input at 72
characters? And if that's correct, aren't there basically a few options, any
of which basically invalidate the argument?

1. Enforce a password length limit. Fixes the unbounded growth of the
shacrypt CPU time while simultaneously ensuring bcrypt is not ignoring input.

2. Pre-hash the input to bcrypt by passing it through SHA512. Makes bcrypt
now grow in the exact same way as shacrypt.

3. Don't enforce a password limit, but truncate the password. Fixes the CPU
time.

I don't know enough about bcrypt in practice to know which of those options is
most commonly advised, but this seems like a bit of a strawman. Passing 4096
bytes to bcrypt is the same as passing 72 bytes, because it ignores the last
4024 bytes, right?
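For what it's worth, option 2 is commonly sketched like this (base64 because raw digests can contain NUL bytes, which many bcrypt implementations treat as a string terminator; this is a sketch, not a vetted recipe):

```python
import base64
import hashlib

def prehash_for_bcrypt(password: bytes) -> bytes:
    # Compress arbitrary-length input to a fixed-size digest first.
    # The 88-byte base64 encoding still gets truncated to 72 bytes by
    # bcrypt, but 72 base64 chars still carry 54 digest bytes -- far
    # more entropy than any human-chosen password.
    return base64.b64encode(hashlib.sha512(password).digest())

# Two passwords agreeing in their first 72 bytes would collide under
# bcrypt's silent truncation, but they get distinct pre-hashes:
a = prehash_for_bcrypt(b"x" * 72 + b"A")
b = prehash_for_bcrypt(b"x" * 72 + b"B")
```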

~~~
minitech
> Pre-hash the input to bcrypt by passing it through SHA512. Makes bcrypt now
> grow in the exact same way as shacrypt.

The pre-hash is only applied once, so the change in runtime is linear and very
small compared to bcrypt.

~~~
peeters
Edit: I stand corrected; the summary in the link below didn't agree with the
more detailed steps further down the page. While the hash is used as the input
to the next round, it is (for reasons beyond my understanding) repeated as
necessary to match the length of the original password. But it still seems
like if you're prehashing in both cases, the performance issue goes away.

Is that true? I thought shacrypt stretching worked by passing the output of a
round of hashing as the input to the next round of hashing, so it would go
from 4096 bytes to 64 bytes (e.g.) after the first round.

That's how it's explained here:
[https://www.vidarholen.net/contents/blog/?p=33](https://www.vidarholen.net/contents/blog/?p=33).
I'd love to be proven wrong.

So it seems shacrypt and bcrypt would be equivalent. Do the expensive work
from 4096 to <72 once, then subsequent rounds are all on the same size of
input.

I think my point stands, this is comparing apples and oranges unless you pre-
hash the input to bcrypt and include that in the benchmark.
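A grossly simplified sketch of the quirk in question (this is not the real sha512crypt algorithm, which has many more steps; it only shows how repeating the digest out to the password's length keeps per-round cost tied to password length):

```python
import hashlib

def simplified_stretch(password: bytes, rounds: int = 5000) -> bytes:
    digest = hashlib.sha512(password).digest()
    for _ in range(rounds):
        # Repeat the running digest out to len(password) bytes, so the
        # data hashed each round scales with the password length rather
        # than collapsing to 64 bytes after the first round.
        repeated = (digest * (len(password) // 64 + 1))[:len(password)]
        digest = hashlib.sha512(repeated + password).digest()
    return digest
```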

------
waynecochran

        Passive observation of execution times can predict password length (smaller concern).
    

Seems like this can radically narrow the focus of a dictionary attack. Also,
if you are just trying to get a foothold into a machine, you can focus your
energy on short passwords. Of course, I am not sure of all the scenarios where
you are privy to how long the encryption took.

~~~
atoponce
> _Seems like this can radically narrow the focus of a dictionary attack._

It's leaking information, and we don't like that, but it's not fatal. 96% of
users will have passwords shorter than 16 characters (see
[https://blog.cynosureprime.com/2017/08/320-million-hashes-
ex...](https://blog.cynosureprime.com/2017/08/320-million-hashes-
exposed.html)). That is within the first "step" in the sha512crypt hashing
process, and longer passwords may have enough entropy that learning their
length provides no practical advantage in cracking them, should the database
be compromised.

The larger concern is CPU load. There are load jumps between 1-15 characters,
16-23, 24-80, and so on. Even though 96% of users will fall in the 1-16
range, if you set a minimum length requirement of, say, 12 characters, you
may find you need to decrease your sha512crypt cost to handle your tested
load, because more of your users are passing 16 characters in length (yay)
than before (although most will likely use _exactly_ 12 characters).

Does that make sense?

------
LinuxBender
The DoS aspect is mitigated via PAM and user management controls in the OS
(for creating /etc/shadow entries).

scrypt and bcrypt are a long way out for supported Linux distros. Support
depends on glibc making changes that its maintainers have stated they won't
make. So for now, sha512 shadow hashes are the best most people will have in
their datacenters.

Perhaps wrap that rascal with MFA / 2FA to be the first line of defense as a
mitigating or compensating control.

~~~
derefr
> This depends on glibc to make changes that they have stated they won't make.

I'm not sure I understand what part of Linux's account security model relies
on glibc to make changes. Can't PAM pull in whatever libraries it likes?

If not, then why hasn't (any popular distro of) Linux moved toward the
Windows/macOS model of unifying account management + AAA onto a single
"directory services only" code-path, where a system that's not "part of a
directory" is actually just part of its own "directory of one", with your
system running its own little LDAP+Kerberos daemons or what-have-you that your
local accounts reside in?

In that sort of model (which, again, is what Windows and macOS both use
exclusively nowadays), the kernel, C library, etc. aren't involved in deciding
AAA policy; they just implement the simple AAA _mechanism_ of "let users with
the right ACL token do the things the ACL token says they can do" and then
rely on one or more [local or remote] directory-service servers to actually
authenticate users and issue those tokens to them. Drastically smaller attack
surface.

~~~
LinuxBender
There is talk of extending crypt in glibc to support external libraries [0]
and that would do exactly as you suggest.

[0] -
[https://sourceware.org/bugzilla/show_bug.cgi?id=16814](https://sourceware.org/bugzilla/show_bug.cgi?id=16814)

And certainly, if you move the authentication to another service, such as I
mentioned with MFA, or LDAP+Kerberos as you mentioned, that could address
this, assuming that system supports scrypt / bcrypt.

~~~
derefr
I’m still confused—why is the idea to extend crypt(3), rather than for PAM to
just _call something besides crypt(3)_ , pulling in a better hashing library
at the libpam level rather than at the libc level?

~~~
mjevans
Lots of services call crypt, and it expects a variable-length string which
describes the stored secret to compare the actual password against. It is
_very_ possible to update this one place and 'solve' support for MANY
dependent programs and libraries.

(towards the bottom of the manual page: [http://man7.org/linux/man-
pages/man3/crypt.3.html](http://man7.org/linux/man-pages/man3/crypt.3.html) )

------
john37386
SHA is a cryptographic algorithm used for secure hashing, hence the name. The
main goal of hashing is confirming the integrity of an object. This means the
object you read now is the same object as when it was stored. The goal was
never to encrypt, even if it's possible to do so.

Unsalted hashes all share the same weakness: they are vulnerable to rainbow
table attacks. I agree with the OP that SHA is dangerous if used the wrong
way.

~~~
danbruc
_The main goal of hashing is confirming the integrity of an object. [...] The
goal was never to encrypt even if its possible to do so._

That is not true; cryptographic hash functions are just cryptographic
primitives with certain guarantees about their behavior, and they do not
imply any usage. There is nothing fundamentally wrong with using a
cryptographic hash function as a building block for an encryption algorithm.
The important point is that almost no use case of cryptography requires
exactly and only the guarantees provided by cryptographic hash functions.
That is why they are most of the time used as building blocks of more complex
algorithms and not on their own.

Ensuring that a file was not modified while you did not look is just an
exceptional case where using a cryptographic hash function alone may be good
enough. But even in this case you might want to consider that an attacker may
be able to change the file and its hash together, and maybe you actually need
an authenticated hash that cannot be forged by an attacker. In that case a
cryptographic hash function would again be part of the algorithm, but it
would not be the only building block.
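That "authenticated hash" is what HMAC provides; a minimal stdlib sketch:

```python
import hashlib
import hmac

def authenticated_digest(key: bytes, data: bytes) -> bytes:
    # Anyone can recompute a plain hash after altering the file, but
    # forging this tag requires knowing the key.
    return hmac.new(key, data, hashlib.sha256).digest()

tag = authenticated_digest(b"shared secret", b"file contents")
# Verify with a constant-time comparison:
ok = hmac.compare_digest(
    tag, authenticated_digest(b"shared secret", b"file contents"))
```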

