Hacker News new | past | comments | ask | show | jobs | submit login
A mode for password hashing (bramcohen.com)
37 points by janvdberg on Nov 18, 2014 | hide | past | favorite | 31 comments



Wishy washy, hand wavy, security mumble jumble:

> This operation is repeated some number of times until enough work has been done, and then output is extracted. It’s reasonable to set the number of rounds to be be double the number of blocks stored in bulk memory. It might be okay to go far lower, but that feels sketchy. In particular, going lower might result in custom hardware being able to effectively pipeline the next block being uninitialized (which might be able to produce a small speedup on a general CPU anyway, further research needed).

"Some number of times", "reasonable", "feels sketchy", "further research needed"...

> There’s also an encryption function used, probably AES128...

"Probably AES128"... If there's one thing that you shouldn't tolerate in security, is relying on your feelings about mathematics. You can DO the math, your feelings on what the result might or might not be are not very interesting.

The scrypt paper does all the research and math that this article doesn't do. It even includes dollar cost estimates for breaking passwords hashed with various algorithms. Yes, the scrypt algorithm itself already does what you need. The paper is highly recommended for developers who ever touch passwords.

http://www.tarsnap.com/scrypt.html

I also consider PBKDF2 acceptable for general password hashing.


PBKDF2, bcrypt, and scrypt are the three that most people find acceptable.

This blog post isn't meant to be an in-depth research paper, its tone is more, "hey, I have a neat idea, what does the community think?"


PBKDF2, BCrypt, and SCrypt were also "Hey, I have a neat idea. What does the community think?"

Just some people are more precise at writing, and offer better mathematics than other people.


"Offer better mathematics" is the whole point, don't you think? This is ultimately math. If the math is lacking in a proposed mathematical construct then what makes it worthwhile?


Scrypt also came with an academic paper in PDF format. Maybe one will follow when the ideas are more solid.


I don't have time to even follow well thought academic ideas. Why should I be bothered trying to develop ideas of other people who don't put forward even a little bit of effort?

The concept is similar to SCrypt. He wants to utilize a memory-hard problem to make the difficulty of creating ASICs more difficult. However, this blog post has no ideas on how VLSI circuits work, nor any crypto calculations, nor any math or proofs that describe his thoughts.

He has a supposition: that using maybe AES (or any block cipher) as a random-number-generator, he can randomly traverse memory to increase the difficulty of hashing. (I'm assuming he's just using any ol' block cipher for this process). He leaves a component of the calculation in one memory block to keep it "inside the cache", presumably to keep it working quickly on modern CPUs.

Maybe. I dunno. He has no example code, which makes this sort of thing _very_ hard to discuss. I don't even know if I'm reading his post correctly.

But assuming this is how he's doing it (and mind you, I really don't know what he's talking about because he's so imprecise), my $0.02 of "serious" criticism is that its not obvious that these calculations cannot be pipelined and simplified in an ASIC or FPGA... at least any more than SCrypt (or BCrypt) was.

And remember, SCrypt has ASICs coming out of the pipeline, and also ended up being much much faster on R9 290x GPUs than your standard CPUs.

Honestly, the X11 proposals from the alt-coin community are better researched and more precise than what this blog-post proposes. This proposal here is so obscure and wishy-washy that I can't even discuss its merits.


> I don't have time to even follow well thought academic ideas. Why should I be bothered trying to develop ideas of other people who don't put forward even a little bit of effort?

You shouldn't and I'm not saying you should. I'm suggesting everyone is taking this more seriously than it needs to be.

"Might be a novel idea, here's some things to consider, needs work," is appropriate.

But this idea is clearly stuck in rough draft hell and needs some serious fleshing out before everyone pulls out their magnifying glasses.

Going balls-to-the-wall with scrutiny and criticism before it's warranted does nothing constructive and only discourages people from tinkering. Bad for the community, etc.


A bit irrelevant, but I remember reading about storing password hashes independent of the account information. I can not find it now..

The idea is that you have a passwords table and whenever someone sets a new password / registers, you insert something like "hash(userid + password + salt)" into the table. There is no link between the account and hash, when the user logs in, you need to check the existence of the computed hash on the table. If someone dumps the database, at best, they have some password hashes with no way to connect them to user accounts.

Password changes would require asking the old password but I guess there is no way to implement 'forgot password' functionality here? How would you invalidate old passwords?

Is this something viable? Anyone uses anything like this? I thought about adopting it on some new auth systems we are working on but losing password recovery is downer.

edit: Changing the salt on account information would take care of invalidating old hashes I guess..


If the hash is stored with an account, then an attacker attacks each account by performing the hash on each potential password (dictonary, exhaustive search, etc) and checking to see if it is equal to the stored one.

If the hash is stored independently in another table as you suggest, then an attacker attacks each account by performing the hash on each potential password and checking to see if it is in the set. If that set is small enough to fit in memory (and they could likely store billions) then a simple contains check doesn't buy you very much. It is simpler to raise the work factor of the hash.

If someone is using dedicated hardware it may make it a bit trickier to implement, but it also makes things trickier for you to manage. Better to use a hash designed to thwart such an attacker (like scrypt) up front.


You are right, it's just a weak secondary measure but it does not really make it any more complex for the developer while causing some headache to the attacker :)

In any case, one can seed the hashes table with a couple billion random hashes, making the lookup slower for both parties. It would hurt the attacker more, given the nature of brute force.


This sounds like a good idea at first even though it does have the side-effect of creating a lot of orphan password hashes in your table. Let's step into the shoes of a criminal that has access to your database and hashing algorithm and see if it makes his life that much more difficult.

The first thing he would do is a dictionary attack from a list of common passwords: Pick one user, create the hash and search the hash table for a match. That is one step more than only having to check the hash in the same row. I'd say it is slightly, but not significantly, more difficult.



> Changing the salt on account information would take care of invalidating old hashes I guess..

Any chance you would have time to elaborate a bit? I like the idea of having a table full of hashes with no (clear) link to accounts.


User table has two columns: userid; salt.

Password table has one column: hash(userid+password+salt).

Login process: given userid and password, look up salt. Check if hash(userid+password+salt) appears in Password table.

Password reset process: Given a userid and new password, change salt to a new random value and save to User table. Insert hash(userid+new_password+new_salt) into Password table. The old hash remains in the Password table and could still be used to log in, but in theory is useless because the salt has been destroyed.

Implementation detail: Make sure the old salt really is destroyed.

EDIT: After thinking about this, I think the reason not to do it is that it doesn't really win you much protection over the usual approach -- a single table with userid; salt; hash(userid+password+salt).

In that scenario, if your database is stolen then the attacker will have to brute force each user password individually and compare it to the stored hash for that user. You hopefully used an expensive hash() algorithm so that will be a slow process. The only thing the separate table adds is an extra table lookup during each brute force attempt, which compared to the cost of hash() is basically free.

So yeah, I don't think this really gets you anywhere.

EDIT2: In fact, you get almost exactly the same effect by making your bcrypt()/scrypt() parameters one notch harder. Crypto primitives FTW.


I see, thanks for taking the time to explain!


It's interesting, but it has issues that I think make it more worthwhile to just focus on not getting the tables dumped in the first place.


If we followed that logic then why hash and or salt at all? Instead just "focus on not getting the table dumped in the first place."


The greatest benefit to hashing is to protect the user's privacy from yourself (and other authorized administrators of the application) and as a guarantee that after the point hashing is implemented nobody can get easily expose plaintext version.

Even if all of these fancy hashing algorithms are implemented, it only makes it slightly more difficult to crack weak (common, short, dictionary word) passwords.


If I'm an admin I could just dump the raw HTTP POST requests as they enter, bypassing hashing entirely.


What issues?


It frustrates me to read stuff like this because as much as it is technically sound, and will make a hash marginally more difficult to crack, I worry that complex criticisms of hashing algorithms can have the effect of discouraging application developers from using any hashing on passwords at all.

Now I am going to mention something that will be very unpopular, but it is 100% true, so try to open your mind a little before reading.

Let's compare not hashing, the worst hashing algorithm, and the best hashing algorithm. The worst will be something like MD5 with no salt, and the best using the most advanced hashing, random salts, and CPU-heavy calculations. We'll compare the effect of database compromise by a cracker on a user with a weak password (4-character dictionary word) and one with a strong password (15-character random a-Z, 0-9, and symbols)

* No hashing - weak: cracked; strong: cracked

* Worst hashing - weak: cracked; strong: safe

* Best hashing - weak: cracked; strong: safe

The best hash will protect more medium strength passwords than the worst but there is no hashing algorithm strong enough to protect the weakest passwords, and MD5 with no salt still protects the strongest.

But there are other benefits of using any hashing over none: It protects your users from yourself and other admin-level people on your team and guarantees that your application cannot expose plaintext passwords even after you leave the project.

tl;dr The difference between no password hashing and weak password hashing is significant and the difference between weak hashing and strong hashing is small.


Your comparison is invalid because with 'worst hashing' a strong password is still cracked; there's simply too many workable attacks, the algorithm is too fast, and FPGA implementations are trivial.

A password function is the last line of defense, and never foolproof. Looking at the history of successful password exfiltrations over time, attackers have about a month or two before somebody notices they got away with your data. So your password functions should ensure that passwords are strong enough that with current-day technology they can withstand at least that much time out in the wild; otherwise, don't even use a hash.


This is pretty much why HSMs and good PAKE protocols are important. Key stretching only buys you a degree of computational hardness, whereas not storing or exchanging password derived material at all makes offline cracking provably hard.

Hashing is just a stopgap measure that hopefully gives your users time to do something before they're compromised. It's hard to say how effective it is in practice when a new db dump goes to auction.


I particularly like reading this, and then comparing to something like this: http://www.tarsnap.com/scrypt/scrypt.pdf


Whatever people think of this idea, it's probably a bit late to be coming up with new designs. Any future adoption of a different password hashing algorithm will probably be determined by the PHC[0]... at least, I'm not aware of a more comprehensive ongoing effort to improve the state of the art.

[0] https://password-hashing.net/


I'm not sure what I think of it, but don't dismiss it because it's not established by X or Y person.

Judge it on its merits, if any. If we can't poke holes in it, then maybe its worth those same people's time to look at.


PHC isn't about pedigree, it's a formal contest that was open for submissions for a while now. Now they're in the elimination phase.


Has he coded it? In what way is it better than [bs]crypt?


[deleted]


that's seem fit....


Is there any way to measure empirically the safety of a hashing algorithm?


Get a large site to adopt it, leak the password hash database. Voilà!




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: