
A password hash storage scheme that prevents efficient password cracking - monort
https://github.com/PolyPasswordHasher/PolyPasswordHasher
======
akerl_
I'm curious what threat model this protects against that isn't also trivially
covered by bcrypting hashes and then encrypting them with a symmetric key that
has to be provided to the system at boot time by a human or keywhiz or
similar. It seems they're trying to protect against an attacker who has DB
access but not RAM access, which is a very important vector to consider but
this seems to be an overly complex way to handle that.

~~~
leni536
> boot time by a human

From the paper:

> The server will restart periodically. All data kept in memory is lost at
> this point and the server must restart _using only the data on disk_ – which
> is attacker visible.

So I guess your solution is correct and trivial but the author's solution
doesn't use data from outside, not even a user provided symmetric key.

~~~
akerl_
It does use data provided from outside: you either need to provide enough
valid passwords for it to recreate the secret, or leak data from the stored
secrets, as described here:

https://github.com/PolyPasswordHasher/PolyPasswordHasher#if-the-attacker-cant-check-individual-accounts-how-does-the-server-check-the-first-account-after-rebooting

~~~
leni536
They leak data, but their threat model assumes the attacker can get this
leaked data too (they leak it to the hard disk?).

>provide enough passwords

Why don't they just store some dummy users' credentials (with zero
privileges) directly on the hard drive? The server could simply provide valid
passwords from them after reboot, and the attacker is welcome to those
passwords since they can't do anything with them. Maybe their leak strategy
involves something like this?

------
sillysaurus3
The reason I had difficulty using scrypt is that I didn't want to DoS my
own server. Scrypt asks you to choose a "work factor," and the higher the work
factor, the harder it is to reverse the hash. Trouble is, if you set it too
high, it's a bit painful to change it later. If you make a computationally
difficult hashing scheme, then a lot of simultaneous logins can force your
server to use all of its CPU.

Does this suffer from the same issue?

It isn't necessarily an important issue. Maybe nobody will figure out they can
take down your service that way. Maybe rate limiting is enough. Maybe you can
solve it by having servers dedicated to hashing.

I was just curious about the performance characteristics:

 _Suppose that three people have passwords that are each randomly chosen and 6
characters long. A typical laptop can crack those passwords in about 1 hour.
If you take the same passwords and protect them with PolyPasswordHasher, every
computer on the planet working together cannot crack the password in 1 hour.
In fact, to search the key space, it would take every computer on the planet
longer than the universe is estimated to have existed._

The contradiction is that if users can log in to your service quickly, then
an attacker can try a lot of passwords quickly. So is this more expensive than
scrypt?

If an attacker has read-only disk access, they can swipe any unencrypted data.
Scrypt protects you from this, and it's straightforward about the requirements
and computation costs.

In practice, what kind of CPU usage would a PolyPasswordHasher scheme need?
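
To make the work-factor trade-off concrete, here's a minimal sketch using
Python's stdlib `hashlib.scrypt` (this is plain scrypt, not
PolyPasswordHasher; the parameter choices are illustrative):

```python
import hashlib
import os
import time

def hash_password(password: bytes, salt: bytes, n: int) -> bytes:
    """scrypt with work factor n (a power of two); doubling n roughly
    doubles both the CPU time and the memory used per login attempt."""
    return hashlib.scrypt(password, salt=salt, n=n, r=8, p=1,
                          maxmem=2**26, dklen=64)

salt = os.urandom(16)
for n in (2**12, 2**13, 2**14):
    start = time.perf_counter()
    hash_password(b"correct horse battery staple", salt, n)
    print(f"n={n}: {time.perf_counter() - start:.3f}s per login")
```

Every legitimate login pays that cost, which is exactly why a flood of
simultaneous logins at a high `n` can pin the server's CPU.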

~~~
nly
Online DoS shouldn't _really_ be a problem, since you should be rate limiting
logins on a per-IP basis to prevent online brute-force anyway. Of course,
typically nobody does that.

'Server relief' is dumb... correctly implementing a 'hashcash'-like scheme to
prevent excessive CPU resource commitment is non-trivial. It will require
persistent pre-login server state and/or cryptographic IP/session binding,
both of which open up a second DoS vector.

Implementing Scrypt etc in the browser doesn't completely prevent DoS, and
sacrifices your ability as an operator to ensure passwords are being
appropriately salted and munged (because of third party clients, unauthorised
API users etc). Plus, attackers can just start using malicious webpages or
botnets to crowdsource their hashing efforts which, because it scales
horizontally, could actually _speed things up_ for an attacker if your
server isn't doing any rate-limiting on the final hash+authentication stage.

~~~
saurik
Limiting logins per IP address is a luxury that cannot be afforded by
developers of applications targeting some Eastern countries, or ones expected
to be primarily used by mobile devices, due to pervasive usage of NAT and
shared IP addresses. You are simply trading one form of DoS for another :/.

~~~
gkanapathy
Seems like it should be possible to rate limit _failed_ logins, and of course
failed logins on a per-account basis.

~~~
zkhalique
Either way, how are you going to rate-limit failed logins while at the same
time not allowing a DDOS of a user login? If I am using a botnet to keep
trying passwords for Sarah Palin, how are you going to know when the real
Sarah Palin logs in from a new computer? Sarah Palin will never be able to log
in again from a new computer, unless she uses a key from her old one.

~~~
Ntrails
I would assume that you'd simply do (increasing) timed lockout periods by
user/ip combination.

At some point you have to accept that administrators will need to do some
work, and if 200 IPs are trying to log into the same account 5 times every 15
minutes you should probably email the user and lock the account.
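
A minimal sketch of that kind of escalating lockout per (user, IP) pair
might look like this (illustrative Python only; the 15-second base and the
doubling policy are made-up parameters):

```python
import time
from collections import defaultdict

BASE_LOCKOUT = 15.0  # seconds; doubles with each consecutive failure

# (user, ip) -> (consecutive_failures, locked_until_timestamp)
state = defaultdict(lambda: (0, 0.0))

def attempt_allowed(user, ip, now=None):
    now = time.time() if now is None else now
    _, locked_until = state[(user, ip)]
    return now >= locked_until

def record_failure(user, ip, now=None):
    now = time.time() if now is None else now
    failures, _ = state[(user, ip)]
    failures += 1
    state[(user, ip)] = (failures, now + BASE_LOCKOUT * 2 ** (failures - 1))

def record_success(user, ip):
    state.pop((user, ip), None)  # a real login resets the counter
```

Because the key is the (user, IP) pair, a botnet hammering one account from
200 IPs doesn't lock out the real user arriving from a fresh IP, which is the
scenario raised above.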

------
jamesrom
I haven't read the full paper, so I don't have a complete understanding of
this scheme...

But what stops a malicious attacker from registering a bunch of dummy
accounts, and then using the known passwords to reconstruct the Shamir Secret?

~~~
azinman2
Similarly, what do you do for the first couple of users that need to
legitimately log in before the server knows the secret?

(I also haven't read the paper, but the concept is intriguing).

~~~
mvksaa
There is the option to "leak" a portion of the users' salted hashes, so that
the users can be verified upon reboot. This leaked information harms overall
security but, even with this option, the scheme is still better than regular
salted-hashes.

------
cmrx64
[https://password-hashing.net/](https://password-hashing.net/)

~~~
qrmn
This is a very solid point. We already had an open cryptographic competition
to select a password hashing standard, and we have a winner: Argon2 is being
developed as a new standardised password hash (i.e. slow-hash).

If you're going to introduce a new one, it should go through at least the same
level of cryptanalytic scrutiny before anyone uses it.

~~~
monort
This is not a password hashing algorithm, but a storage scheme. Its purpose
is orthogonal to Argon2.

~~~
Beltiras
Complementary?

------
mcherm
This seems useless to me because of the well-known tendency for people to use
the same username and password on multiple systems. A hacker need simply try a
series of known username and password combinations until they have generated
enough successes to crack the system.

------
sakopov
Were incorrectly stored passwords really the problem in the last few major
security breaches? For instance, I read that the Sony hack was an act of
social engineering. Anthem was hacked via a phishing attack. The Ashley
Madison hackers got in via SQL injection. I guess my question is: is all of
this complexity really worth it for an average company when it doesn't even
seem to be the root of the problem?

~~~
mikeash
It's not really about keeping people from breaking into _your_ site, but
rather keeping them from using the data they get from your site to break into
_other_ sites.

Basically, a lot of people will use the same credentials on both (for example)
Ashley Madison and their bank. If AM does a bad job storing their passwords,
then an SQL injection there can be used to reconstruct the users' passwords
and then log into many of their banks.

~~~
pakled_engineer
HSM are freely around for Ashley Madison to buy with their multi millions that
are optimized for hi speed crypto functions with physical keys that can't be
extracted by a remote attacker stealing the db. Since they don't bother to use
easily found current solutions don't see why any of these corps with massive
breaches would bother to use this beta software solution either.

------
ars
It seems to me that after a reboot it's impossible to register new
accounts, or change any passwords, because the server does not yet know the
"line".

You would need enough logins for it to calculate the true line before it could
create any new passwords.

------
dvh
What's wrong with PBKDF2?

~~~
brohee
Plenty: PBKDF2 is ASIC-friendly and not memory-hard, both of which are
addressed by better schemes...

------
rstuart4133
It doesn't look like an improvement to me.

During normal operation it has a blackbox that validates passwords quickly.
The PolyPasswordHasher is the key to how it does it - but really it doesn't
matter. All that matters is that a password (or a hash of it) goes in, the
blackbox maybe reads some data from the (effectively public) password
database, and a yes/no comes out at the cost of stuff-all CPU time.

The point is - if you can reproduce the blackbox once per GPU node on a
password cracker it's no better than any other scheme.

So the security of the scheme rests on the attacker not being able to
reproduce the blackbox. If it were a real blackbox sitting somewhere secure
then maybe - but it's not. They make a large point of saying this scheme has
no secret hardware sauce - it's just software, in reality just another
process running on the web server. If you managed to get a copy of the memory
image of that process it's game over. But we can't use that weakness to attack
it because it is acknowledged in the paper.

During normal operation the blackbox creates its secrets by effectively
combining a _lot_ of passwords (say N) from users. Thus you need N valid
passwords before you can crack one. I think they achieved that, and it does
indeed make it near impossible to recreate the blackbox by brute force - if
all you have is the public password database and a copy of the code.

But it's not that simple. Firstly, they acknowledge a naive implementation
doesn't work because an attacker can just create N accounts and feed the
passwords to the blackbox so it can initialise itself. In addition, there is a
bootstrap problem - no user can validate their password until N passwords have
been gathered via N attempted logins.

They get around that by having N (or close to N) special accounts which they
call protector accounts. These accounts are presumably created by the server
admin and thus can be trusted. They are used for both bootstrapping (implying
they are available at boot time) and preventing the flood attack (by insisting
some of the special accounts must be part of the N).

And so we get to the real issue: the passwords for those accounts must be
entered when the machine boots. Effectively they are the secret the blackbox
needs to do its stuff. Replacing the PolyPasswordHasher with a process that
demands a single secure password be entered when the machine boots would work
just as well. E.g., that password is an extra salt fed to a cryptographic hash
along with the password and the on-disk salt, and the result must match the
hash stored on disk. Without knowing that salt there is no way to validate the
passwords.
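
That single-boot-secret alternative can be sketched like this (a simplified
Python illustration of the idea above; a real deployment would put a slow
hash like scrypt underneath, and the names are mine):

```python
import hashlib
import hmac
import os

def hash_password(password: bytes, salt: bytes, pepper: bytes) -> bytes:
    # Only the salt and digest are stored on disk; the pepper lives in RAM
    # and is re-entered at boot, so on-disk data alone can't be brute-forced.
    return hmac.new(pepper, salt + password, hashlib.sha256).digest()

pepper = os.urandom(32)   # stands in for the single boot-time secret
salt = os.urandom(16)
stored_digest = hash_password(b"hunter2", salt, pepper)

def verify(password: bytes) -> bool:
    candidate = hash_password(password, salt, pepper)
    return hmac.compare_digest(stored_digest, candidate)
```

An attacker with only the disk image has the salt and digest but not the
pepper, so offline guessing is blocked, which is the equivalence being
claimed.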

Sure, entering N passwords isn't hard, but would you enter N passwords when
there is a method offering equivalent security where you enter just one? And
even that isn't the real issue - the real issue is that fuck all corporations
on the planet have employees willing to be woken up at 2 AM when a server
boots to enter just one password.

