
New developments in password hashing: ROM-port-hard functions - colinprince
http://www.openwall.com/presentations/ZeroNights2012-New-In-Password-Hashing/
======
DenisM
The problem with blind hashing is that the database transaction log likely has
enough data to link a blind hash back to the user record. Further, databases
such as PostgreSQL or Oracle have version stamps on all records, which allow
fairly precise calculation of record age, allowing records to be linked
together. MSSQL, depending on your index structure (i.e. if you use heaps),
might also betray the fact that records were created at the same time due to
the ordering of records. And the page allocation algorithm is almost certain
to betray some of the same, albeit with less precision.

Bottom line: if you're worried about database file or backup leaks, blind
hashing adds no meaningful security. It only helps against SQL injection and
text-dump vectors.

Edit: Duh, it's covered further in the slides!

~~~
nn2
Don't use an off-the-shelf database. They are overkill for this.

This only needs a simple single-table key->value store where every transaction
is a single access. I would just use a hash table mapped directly to the raw
disk. Done right, it doesn't need transactions with a log.

The hash spreads out the allocation, so the allocation leaks no information.
If you don't have a log, there are no timestamps.

Modern disks are big enough that you don't need to worry about resizing the
hash table, and the random salt makes collisions unlikely enough.
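
A minimal sketch of that single-table idea: a fixed-size open-addressing hash table laid out over a flat file (standing in for the raw disk). The class name, slot layout, and table size are illustrative assumptions, not something from the slides.

```python
import hashlib
import mmap

SLOTS = 1 << 16                    # fixed table size; "disk is big enough"
KEY_LEN, VAL_LEN = 32, 32
SLOT_LEN = 1 + KEY_LEN + VAL_LEN   # 1 occupancy byte + key + value

class DiskHashTable:
    """key->value store mapped directly onto a flat file, no log."""

    def __init__(self, path):
        size = SLOTS * SLOT_LEN
        with open(path, "a+b") as f:
            f.truncate(size)       # pre-allocate the whole table
        self._f = open(path, "r+b")
        self._mm = mmap.mmap(self._f.fileno(), size)

    def _slot(self, key, probe):
        # hashing the key spreads records uniformly over the table,
        # so physical placement leaks nothing about insertion order
        h = hashlib.sha256(key + probe.to_bytes(4, "big")).digest()
        return int.from_bytes(h[:8], "big") % SLOTS

    def put(self, key, value):
        assert len(key) == KEY_LEN and len(value) == VAL_LEN
        for probe in range(SLOTS):
            off = self._slot(key, probe) * SLOT_LEN
            if self._mm[off] == 0 or self._mm[off + 1:off + 1 + KEY_LEN] == key:
                self._mm[off:off + SLOT_LEN] = b"\x01" + key + value
                return
        raise RuntimeError("table full")

    def get(self, key):
        for probe in range(SLOTS):
            off = self._slot(key, probe) * SLOT_LEN
            if self._mm[off] == 0:
                return None        # empty slot: key was never stored
            if self._mm[off + 1:off + 1 + KEY_LEN] == key:
                return self._mm[off + 1 + KEY_LEN:off + SLOT_LEN]
        return None
```

Every lookup is a handful of page-sized reads with no write-ahead log, so there are no transaction timestamps to correlate.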

------
StavrosK
Unfortunately, I feel that I'm missing something. Where does ROM come into the
equation, and why is low memory usage bad?

~~~
DenisM
The threat model is that an attacker has stolen your database backup and is
now trying to brute-force the passwords in it by computing all plausible
permutations of characters, hashing them, and checking for a match. Because
the attacker is attacking a mass of passwords, he is likely to use a parallel
computer, such as a GPU or an ASIC, to decrease the cost of the attack.
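
A minimal sketch of that attack: enumerate candidate passwords, hash each with the stolen salt, and compare against the leaked digest. The fast hash used here (plain SHA-256) is exactly the kind that makes this cheap; real attackers run millions of such guesses in parallel on GPUs.

```python
import hashlib
import itertools
import string

def crack(leaked_digest: bytes, salt: bytes, max_len: int = 4):
    """Brute-force lowercase candidates against one leaked salted digest."""
    for n in range(1, max_len + 1):
        for combo in itertools.product(string.ascii_lowercase, repeat=n):
            guess = "".join(combo).encode()
            # one cheap hash per guess -- this loop is trivially
            # parallelisable across GPU cores, which is the whole problem
            if hashlib.sha256(salt + guess).digest() == leaked_digest:
                return guess
    return None
```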

The defense against this scenario is to make hashes expensive to compute and
hard to parallelise. If an algorithm uses a lot of RAM, it is hard to
parallelise, because a piece of RAM can be used by only one computation at a
time, so the algorithm is hard to attack in bulk. Your own server, by
contrast, computes hashes sequentially as users log in, so a single piece of
RAM can be freely reused. This is why low RAM usage is bad.
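
A concrete instance of that "expensive and RAM-hungry" defense is scrypt, which ships in Python's standard library. The parameters below are illustrative: `n` and `r` set the memory cost at roughly 128 * n * r bytes per hash, which the defender pays once per login but the attacker pays once per guess.

```python
import hashlib
import os

def hash_password(password: bytes, salt: bytes) -> bytes:
    # n=2**14, r=8 -> roughly 16 MB of RAM tied up per computation,
    # which starves the parallelism of a GPU cracking rig
    return hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1,
                          maxmem=2**25, dklen=32)

salt = os.urandom(16)
stored = hash_password(b"correct horse", salt)
```

Verification just recomputes the hash with the stored salt and compares, so the server pays the memory cost only for the one login being processed.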

I did not quite get the idea behind the ROM thing, but I think it means you
have a large multi-GB piece of strongly random data which is used during hash
computation, either entirely or piecemeal, with pieces picked pseudorandomly
and unpredictably. A GPU cannot store a multi-GB chunk of data, so it has to
go off-chip to fetch the required data, which is easy for a sequential
machine but creates a huge bottleneck for a GPU, making parallelism hard
again.
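
A toy sketch of that ROM idea as described above (not the actual algorithm from the slides): keep a large block of random data and fold pseudorandomly chosen pieces of it into the hash state, so that cheap access to the whole block is needed to compute any hash at all.

```python
import hashlib
import os

ROM = os.urandom(1 << 20)   # 1 MB stand-in for the multi-GB ROM
BLOCK = 64
ROUNDS = 1000

def rom_hard_hash(password: bytes, salt: bytes) -> bytes:
    state = hashlib.sha256(salt + password).digest()
    for _ in range(ROUNDS):
        # the next ROM index is derived from the evolving state, so the
        # access pattern cannot be predicted or prefetched in advance
        idx = int.from_bytes(state[:8], "big") % (len(ROM) - BLOCK)
        state = hashlib.sha256(state + ROM[idx:idx + BLOCK]).digest()
    return state
```

Each round's read depends on the previous round's output, so an attacker without fast random access to the full ROM stalls on every round.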

~~~
StavrosK
Thank you, that's more or less what I understood too, then. It seems to me
that not every server can spare the quoted tens of GB of RAM just for
authentication; that's why it seemed odd to me.

~~~
DenisM
If it costs you $X to add 32 GB of RAM to your machine, it will cost an
attacker $X * 1000 to add 32 GB of RAM to a GPU; GPUs do not come with 32 GB
of RAM today, so it has to be custom hardware.

To sum up, it's one of those ways to make sure that the attacker's cost grows
much faster than the defender's, but it's not without cost to the defender.

