
"It’s important to note that salts are useless for preventing dictionary attacks or brute force attacks. You can use huge salts or many salts or hand-harvested, shade-grown, organic Himalayan pink salt. It doesn’t affect how fast an attacker can try a candidate password, given the hash and the salt from your database."

Why are you storing the whole salt in your database? Isn't it much more common to keep half of it in a configuration file? I know Django has a SECRET_KEY parameter for this sort of thing, and hopefully other frameworks do also.

For that matter, why is authentication being handled by the web server? If you've got data worth stealing (billing, emails, medical), you can afford to spring the extra few hundred for a proper authentication server.

Gawker's password handling (7-bit salt, in the database, digested with crypt) seems like the worst possible implementation of secure password storage.



That salt is a public value. The security of salted password schemes is meant not to depend on the secrecy of the salt.

Every time this topic comes up, 15 people chime in with various schemes in which some of the "salt" is derived from the hostname and some of it is stored in an encrypted vault and some of it is inferred from the color of the user's eyes. This is why Coda is making fun of "Himalayan pink salt".

To understand how irrelevant these details are, consider AES encryption. In addition to hiding an AES key, you can also hide portions of the AES CBC IV (a public value). You could use a random number of rounds. You could mix in tweaks with the round key. All of these things are possible, but (a) nobody analyzes AES based on those random hacks, because they don't fundamentally alter the properties of AES, and (b) nobody does those things, because they are silly.

Gawker could use the best conceivable practices in cryptography to obscure passwords; they could be using Colin Percival's scrypt function (which nobody uses yet) to be (in some way) provably resilient to hardware-assisted cracking. You could still level this criticism at them, for "not doing something to further obscure the encryption they used".

This is not a new argument; what I am re-explaining here is Kerckhoffs' principle.


Chrome OS is using scrypt!


> That salt is a public value. The security of salted password schemes is meant not to depend on the secrecy of the salt.

I don't understand why you'd add extra content to the password unless you keep it secret. The whole point of a salt/nonce is to prevent attackers from attacking the digest, right? You need some per-user data, to defend against rainbow tables, and some per-site data, to protect weak passwords.

My fundamental objection to schemes like bcrypt/scrypt is that they impose a heavy performance penalty on authentication to avoid a relatively rare case; besides, any theoretical entity capable of reversing a typical salted-password implementation is also capable of reversing bcrypt/scrypt.


"Reversing" bcrypt? "Reversing" salted hashes?

You're exactly the guy I'm talking about. "Oh, I use AES, but I don't just use AES; I store the secret IV for AES in a cookie so even my server can't decrypt it unless the client comes back with the IV so it's like two guys in the silo with the missile keys". Seriously, I just found that piece of code yesterday. Did you write it? Stop writing that stuff.


I can probably top you: some many years ago, I discovered that Network Solutions (the original domain name provider) was using the first two characters of the password as the salt for their user accounts, presumably so that the salt could also be secret. Of course, this had the side-effect of making the first two characters of the password visible in plain text if you looked at the hashes. I reported this as a bug and never heard back. I've also never done business with Network Solutions since then.


So the salt was just appended to the hash without hashing the concatenated hash + salt, i.e. was this the method they chose: saltedhash(password) = hash(password).salt?

Apart from the obvious flaw of including part of the password in plain text, how secure is this method compared to the following one, where the salt is not a secret and where the concatenated hash + salt is hashed: saltedhash(password) = hash(hash(password).salt)?


Using two-character salts might be as big of a wtf here.


Perhaps in retrospect, but this was standard for the Unix crypt() function and passwd files of the time: http://linux.die.net/man/3/crypt

The dangers of rainbow table attacks weren't well considered back then.

The real error was misreading the man page and using the first two characters as the salt, which is then published as the first two characters of the hash. It's sort of an easy error to make, because to decrypt, you do use the first two characters of the hash. Understandable for a beginner working on a school project, but pretty ludicrous for a large company holding control over most of the domains on the internet at the time.
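To make the misreading concrete, here is a toy sketch of the crypt(3) convention (SHA-1 stands in for DES, and the function and salt values are purely illustrative): the two-character salt is stored as the first two characters of the output, so verification re-reads it from the stored hash. Seeding the salt from the password instead leaks those characters.

```python
import hashlib

def toy_crypt(password: str, salt: str) -> str:
    # crypt(3)-style: the output begins with the two-character salt,
    # followed by a digest of salt + password (DES in the real thing;
    # truncated SHA-1 here as a stand-in).
    digest = hashlib.sha1((salt + password).encode()).hexdigest()[:11]
    return salt + digest

# Correct usage: random salt, recovered from the stored hash at check time.
stored = toy_crypt("hunter2", "Qx")
assert toy_crypt("hunter2", stored[:2]) == stored

# The Network Solutions mistake: salt = first two characters of the
# password, which then sit verbatim at the front of every stored hash.
leaky = toy_crypt("hunter2", "hunter2"[:2])
print(leaky[:2])  # prints "hu" -- the password's first two characters
```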


No; I use standard, time-proven, secure designs. SHA1(salt + password) is sufficient in almost all cases, and if anybody is capable of deriving the original input from the digest, they can do so no matter what digest algo is used.

I know there's lots of ways to screw up security, but most of them derive from lazy people taking shortcuts. They run the httpd, database, and authentication all off the same server so a vulnerability in one compromises all. They store secrets in the database because figuring out secure storage would take half an hour of research.

Replacing a poorly-implemented SHA1-based system with a poorly-implemented bcrypt-based one won't help security.


SHA1(salt || password) is an incompetent design that is debatably even easier to crack than the Gawker hashes. The insecurity of that construction is why we have PBKDF2.
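The cost difference is easy to see with nothing but the Python standard library: SHA1(salt || password) charges an attacker one hash invocation per guess, while PBKDF2 (hashlib.pbkdf2_hmac) charges the full iteration count per guess. A minimal sketch, with an illustrative iteration count:

```python
import hashlib
import os

salt = os.urandom(16)
password = b"correct horse battery staple"

# The naive construction: one SHA-1 invocation per candidate guess.
naive = hashlib.sha1(salt + password).hexdigest()

# PBKDF2-HMAC-SHA1: every candidate guess costs the attacker the
# full iteration count.
iterations = 100_000  # illustrative; tune to your own hardware budget
derived = hashlib.pbkdf2_hmac("sha1", password, salt, iterations)

print(len(derived))  # prints 20 -- same output size as a raw SHA-1 digest
```

Same 20-byte output either way; the only thing PBKDF2 adds is the per-guess work factor, which is the entire point.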

If the entire knowledge you have of cryptography comes from _Applied Cryptography_ --- wait; let me extend that: if you even feel the need to cite _Applied Cryptography_ --- you should be careful debating crypto constructions. You're not going to end up happy.


I'm curious what the attack is that makes that easier to crack than the Gawker way. (I'm sure you're right, I just didn't know that it would be easier.)


SHA1 might (I think it is, but I'm not sure) be faster than DES; among other differences, DES crypt(3) has to run the DES key schedule before producing a hash. Data slips through SHA1 like a greased seal.


Looks like you're right... these benchmarks show SHA1 about 5x faster than DES: http://www.cryptopp.com/benchmarks.html


Ah, alright. I was thinking there was some kind of length-extension weakness (which doesn't apply here, which is why I was confused) or some other attack in the cryptographic sense. Thanks.


SHA1(salt || password) does have a length-extension property; you can easily compute SHA1(salt || password || junk || arbitrary-data) for any given hash. That's not very useful for password hashes, but is devastating for the kinds of SHA1-based authentication schemes that people who write their own password hashes seem to come up with.
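For those homegrown authentication schemes, the standard fix is HMAC rather than SHA1(key || message), since HMAC's nested construction is not subject to length extension. A minimal sketch of the two patterns (key and message values are illustrative):

```python
import hashlib
import hmac

key = b"server-secret"
msg = b"user=alice&role=user"

# Vulnerable pattern: SHA1(key || message). Anyone who sees this digest
# can compute SHA1(key || message || padding || suffix) for an arbitrary
# suffix without knowing the key (length extension).
bad_mac = hashlib.sha1(key + msg).hexdigest()

# Standard fix: HMAC, which hashes twice with derived keys, so the
# extension property of the underlying hash doesn't apply.
good_mac = hmac.new(key, msg, hashlib.sha1).hexdigest()

# And always compare MACs in constant time.
assert hmac.compare_digest(good_mac, hmac.new(key, msg, hashlib.sha1).hexdigest())
```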


Yeah, by "doesn't apply" I meant "that's not very useful for password hashes".


A machine with a few GPUs can compute hundreds of millions of SHA1 hashes per second.

http://www.win.tue.nl/cccc/sha-1-challenge.html



I don't have a dog in this race either way, but I'm curious what you dislike about Schneier's book.


It's not deep enough to provide true knowledge of cryptography and cryptographic attacks while it also doesn't give practical advice on what to actually do in situations that require cryptography (read: always use high-level primitives). Applied Cryptography is pretty good (if outdated), I think, if you're seeking to gain a beginner-level knowledge of cryptography. Practical Cryptography, on the other hand, is a far better choice for what to do when actually using cryptography (although even that is outdated now).


If you think _Practical_ is outdated (and I'm not saying it isn't), you should come over and let us buy you coffee sometime.


Haha I'll take you up on that when I get back to Northwestern. I'll admit that the only reason I think it's outdated is because I was just teaching basic cryptography to the network security students and ran across a few things that made me think "Hmm, Niels/Schneier should really include this in their next printing." Some things I'm thinking of are EAX/GCM instead of the conventional CTR.


Quoting me: "Lots of random facts about crypto trivia. Not a lot of context. Even less information about how to actually safely use crypto primitives. You'll come out of it knowing how to get CAST or IDEA into your code --- two ciphers nobody uses anymore --- but not how to properly choose an IV for CBC mode."


> No; I use standard, time-proven, secure designs. SHA1(salt + password) is sufficient in almost all cases, and if anybody is capable of deriving the original input from the digest, they can do so no matter what digest algo is used.

Anyone can create an input that hashes to a given value. The relevant factor is how long it takes to create that input. I hope you can see the difference between that process taking 3 seconds vs 40 years.


> they impose a heavy performance penalty on authentication to avoid a relatively rare case

You're authenticating over the internet. What is 1/10th of a second to authenticate the first time you want to log in relative to everything else? It's like complaining you have to put the key into your car before starting a five hundred mile road trip. Yes, it takes a few seconds. But worth it? Most definitely.


If you have 10 people logging in per second, a one-second hash puts a delay on subsequent requests which is certainly noticeable.

Maybe the correct thing to do is to run the key derivation on the client side, but that would erode the experience for mobile phone users.


There is no way to securely do this clientside. It's hard to imagine a situation in which login overhead is painful where scaling in general isn't already a huge concern; presumably, anything you do after login is going to be more painful than bcrypt.


I would imagine some kind of zero-knowledge proof would work here, but that would require more server interaction than just doing the hashes in the first place.


If you have ten people logging in per second, you've got to have more than one server. Distribute the login requests.

And if it really kills you to make it take a full second, then make it take 1/10th of a second: there, now your hashing is faster than the time it takes to dynamically generate a page.

People really need to learn that security doesn't come free, and some times you just need to bite down and say "You know what? I never plan on getting broken into, but just in case I do I'll take the tenth of a second extra computation in exchange for doing the right thing."
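That "tenth of a second" is exactly the knob adaptive hashes expose. A sketch of the idea using stdlib PBKDF2, whose iteration count plays the same role as bcrypt's cost factor (iteration counts here are illustrative, not recommendations):

```python
import hashlib
import os
import time

def hash_password(password: bytes, salt: bytes, iterations: int) -> bytes:
    # Work factor is linear in the iteration count; raise it as hardware
    # gets faster, the same way you'd bump bcrypt's cost parameter.
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)

salt = os.urandom(16)

for iterations in (1_000, 100_000):
    start = time.perf_counter()
    hash_password(b"hunter2", salt, iterations)
    elapsed = time.perf_counter() - start
    print(f"{iterations:>7} iterations: {elapsed:.4f}s")
```

Pick the iteration count that lands login at whatever wall-clock cost you're willing to pay, and the attacker pays the same cost per guess, billions of times over.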


Basic version: The salt prevents an attacker from being able to pre-compute a mapping from password <-> hash value. It's most effective when it changes per-password and it doesn't matter if it's public.

http://en.wikipedia.org/wiki/Salt_(cryptography)


To make it even more obvious, it's to prevent this (I haven't done SQL in a long while so this may be broken):

  SELECT username FROM users WHERE password=HASH('secret');
I've seen systems where that statement will give a list of all usernames with the password "secret".
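The reason that query works is that identical passwords hash identically when there's no salt, so the database groups them for you; a per-user salt breaks the grouping. A quick sketch:

```python
import hashlib
import os

def unsalted(pw: str) -> str:
    return hashlib.sha1(pw.encode()).hexdigest()

def salted(pw: str, salt: bytes) -> str:
    return hashlib.sha1(salt + pw.encode()).hexdigest()

# Two users, same password: the unsalted hashes collide, so a single
# WHERE password = HASH('secret') matches both rows.
assert unsalted("secret") == unsalted("secret")

# With per-user salts, the stored values differ, and the query can
# only ever match the one row whose salt you plugged in.
salt_a, salt_b = os.urandom(8), os.urandom(8)
assert salted("secret", salt_a) != salted("secret", salt_b)
```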


If you use a salt you could still do this 'attack':

    SELECT username FROM users WHERE password = HASH(salt||'secret');
This is academic because you already know the password ('secret').

Salts make rainbow tables (essentially precomputed hashes of, say, all English words) infeasible.


The point of the salt is not to add extra complexity to the secret information; it is to make your transformation of the password into a hash different from everyone else's such transformation. This prevents attackers from getting a supercomputer or two, calculating a mondo rainbow table, and using it to crack every password hash in existence. The salt forces them to perform this computational feat separately each time they want to crack a password.


I may be mistaken, as I was only browsing. I was looking at Django's auth system last night (as all of this had me curious about the backend). I don't think that SECRET_KEY is used for part of the salt at all, I think it's primarily used for validation of site-generated data, signing requests, and cookie encoding/decoding.


You are correct, Django's hashes are stored as salt$digest, or maybe salt$function$digest nowadays. It wouldn't be very good if all your password hashes became invalid because you changed the secret key.


You're probably correct; when I used Django I had to reimplement the auth, since their built-in doesn't support roles, and I didn't look too closely at their implementation.


If you want to crack passwords ultrafast, you make a look-up table of the hashes of all probable passwords. Without a salt, this is a rather small table. A 7-bit salt makes the table larger, but not hugely larger, and not troublesomely larger.

The 128-bit salt used by bcrypt makes the table intractably large. You cannot precompute it.
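The blow-up is easy to quantify: every salt bit doubles the precomputed table. A back-of-envelope sketch, assuming a million candidate passwords at 20 bytes per stored SHA-1 digest (both numbers illustrative):

```python
# Back-of-envelope precomputed-table sizes. Assumptions: one million
# candidate passwords, 20 bytes stored per SHA-1 digest.
candidates = 10**6
entry_bytes = 20

no_salt = candidates * entry_bytes     # ~20 MB: trivially precomputable
salt_7bit = no_salt * 2**7             # ~2.6 GB: annoying, still feasible
salt_128bit = no_salt * 2**128         # ~6.8e45 bytes: never happening

print(f"no salt:      {no_salt / 1e6:.0f} MB")
print(f"7-bit salt:   {salt_7bit / 1e9:.1f} GB")
print(f"128-bit salt: {salt_128bit:.3e} bytes")
```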

Of course, you know the salt (because it's stored right there in /etc/shadow), so you can still run through dictionary words and try them all. But bcrypt is designed to take arbitrarily long amounts of real time to do this.

So in the case of bcrypt, it's not really an issue that the salt is stored right there alongside the hashed passwords.


Those password crackers that are belting through billions and billions of passwords an hour with just a couple of video cards aren't using rainbow tables, are they? You could have a zero-bit salt: you're still boned.

Bcrypt is not better because it has a better salt. It's better because one iteration of bcrypt takes a long time, and millions of iterations take an intractably long time.


Bcrypt is better for BOTH those reasons working together; removing either the salt or the variable-cost key scheduler would be Bad.

Obviously, the variable-cost key scheduler is the central notion to the thing, but not having a large salt completely nerfs it.

bcrypt uses a 128-bit salt, and it uses it for good reason. See the paper, sections 6.2.1 and 6.2.2.


You're not wrong. But a 128 bit salt on SHA1 would do zilch to slow down an attacker. You see my point. I see yours.


Yes... the point I see you making is something like this:

Putting the 128-bit salt on there prevents a precomputed dictionary attack, a constant time operation.

But SHA1 itself is a constant time operation, so having a salt only slows down an attacker in a wall time sense, but not the more important time-complexity sense.

We all agree the salt is not particularly important for the constant time hash. (It's only practically important when the hash takes a lot of wall time relative to the wall time of a precomputed dictionary lookup, and constant time hashes gradually lose this edge due to Moore's Law.)

The point you see me making is that bcrypt is not a constant time operation (due to the variable-cost key schedule--2^cost, actually), and allowing people to use a constant time precomputed dictionary lookup by not having a large salt would make it as bad as no-salt SHA1.

So we all agree that the large salt is vitally important for the non-constant time bcrypt.

Not that either of these points are relevant to my initial assertion that giving the salt to an attacker is not something people worry about. The salt is there to prevent a precomputed dictionary attack, and a large salt does this no matter how well-known it is.


> Why are you storing the whole salt in your database? Isn't it much more common to keep half of it in a configuration file? I know Django has a SECRET_KEY parameter for this sort of thing, and hopefully other frameworks do also.

How is that really any better? You should assume that if an attacker can get to your database, they can get to your web servers and take all of that as well. Such a scheme certainly wouldn't have saved Gawker.


In Gawker's case this is exactly right. The database AND the source code was accessed so the config file defense would not have worked.


Config files are not typically world-readable; the httpd reads them before dropping permissions. Otherwise, a vulnerability in the httpd would allow access to everything in the config (remote server passwords, signing keys, etc).

There's no reason why compromising the database would allow attackers into the web server, unless you've configured SSH to allow signing in from arbitrary remote systems.


If you lose code execution on your server to an attacker, you're done. 100% fucked. Everything in your environment needs to get stripped down and rebuilt from trusted sources. Do not be one of those people who rationalizes "oh, I just lost uid=4294967294". Gawker lost root. So will you.


And if you lose root, the disk must be reimaged. Rootkits are too advanced these days to ever make the assumption that you have removed them.


It's possible that sometime not too far in the future that advice might need to be upgraded to "If you lose root, unplug and destroy the hardware and install a completely new machine from trusted sources".

http://www.theregister.co.uk/2010/11/23/network_card_rootkit...

(Why yes, I _did_ have that problem weighing on my mind while I investigated several machines that had a weekend's worth of exposure to the recent Exim remote root exploit...)


Direct DB access isn't typically world-accessible either. It's not like I can do an XML request to get my hashed password on any site.

The point is, if someone has gotten enough access that they can actually get a raw copy of the database, it's just as likely they can get a raw copy of the config files, or /etc/shadow, or whatever else is on the host system.


> Config files are not typically world-readable; the httpd reads them before dropping permissions. Otherwise, a vulnerability in the httpd would allow access to everything in the config (remote server passwords, signing keys, etc).

I think you give people way too much credit. Maybe config files aren't "typically" world-readable but they ARE typically httpd-readable.

It's a mistake to assume that, because you might follow best practices, the rest of the people out there do. You said yourself in a comment on this same post that many people make mistakes because they're too lazy to take that "half an hour of research." Let's assume that that is typical.

> There's no reason why compromising the database would allow attackers into the web server, unless you've configured SSH to allow signing in from arbitrary remote systems.

Oh, you mean the default behavior? See above.



