
On Passwords - tortilla
http://jasonseifer.com/2010/03/21/on-passwords
======
oomkiller
BCrypt is the way to go. I recently used the ruby-bcrypt library and it's
really simple to use; the docs are also good at getting you started. I did
have a couple of questions though. Can I tune the :cost parameter without
having to recalculate the existing hashes? Also, how would I handle hashing
SSNs? The domain of the data is very limited, so every possible value could
easily be hashed, even with high-cost bcrypt.

~~~
ehsanul
Yes, the cost parameter can be tuned without rehashing. This is because the
cost is actually stored with the hash, so the == method of the Password object
knows what cost to use to check if a password matches. Don't take my word for
it, try it in irb (I just did). From the documentation:

 _In addition, bcrypt() allows you to increase the amount of work required to
hash a password as computers get faster. Old passwords will still work fine,
but new passwords can keep up with the times._

Obviously, old passwords would be weaker than new passwords this way. An easy
solution to this problem is to check the cost of the hashed password in your
database when a user tries to log in. If the cost is lower than the cost you
want, and the password matches, then replace the stored hash with one computed
at the higher cost, using the password the user just gave you.
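
A minimal sketch of that upgrade-on-login flow. It uses PBKDF2 from Python's
standard library as a stand-in for bcrypt (an assumption, made so the example
has no dependencies); the iteration count plays the role of :cost and is
stored next to the salt and digest, which is the same idea as bcrypt keeping
the cost inside the hash string. The function names, the target value, and the
`iterations$salt$digest` format are all invented for illustration:

```python
import hashlib
import hmac
import os
from typing import Tuple

TARGET_ITERATIONS = 200_000  # hypothetical "cost you want" today


def hash_password(password: str, iterations: int = TARGET_ITERATIONS) -> str:
    """Return 'iterations$salt_hex$digest_hex' so the cost travels with the hash."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the digest using the cost and salt recorded in the stored value."""
    iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


def login(password: str, stored: str) -> Tuple[bool, str]:
    """Check the password; return (ok, value_to_keep_in_the_database)."""
    if not verify_password(password, stored):
        return False, stored
    if int(stored.split("$")[0]) < TARGET_ITERATIONS:
        # Password matched but the old cost is too low: rehash at the new cost.
        stored = hash_password(password)
    return True, stored
```

Only users who actually log in get upgraded, so hashes for dormant accounts
stay at the old cost until their owners come back.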

As for hashing SSNs, you can easily overcome the problem of a limited range of
data by the usual solution: salting. Bcrypt-ruby does salting automatically,
which you can observe by rehashing a password in irb multiple times, getting
different results (the salt is also stored with the hash, which makes this
possible). You can also add additional salt yourself if you really want to.

~~~
oomkiller
Still, if an attacker were to have access to the salt, I would be in the same
situation. I guess I could always increase the cost by a lot, but once they've
calculated all of the results, it's too simple to get the SSNs.

~~~
ehsanul
You're right, salt just protects against rainbow tables, my mistake.
Increasing the cost so that a single brute-force attack would take a few
years/decades is actually fine. I don't see why it would then be "too simple
to get the SSNs", given that you're using a different salt with each.
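
A back-of-the-envelope sketch of that claim, assuming nine-digit SSNs (so at
most 10^9 candidates) and that a per-record salt forces the attacker to redo
the full search for every record. The function name and the per-hash timings
are made up for illustration:

```python
SSN_SPACE = 10 ** 9                    # every nine-digit value, valid or not
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_exhaust(seconds_per_hash, records=1, machines=1):
    """Worst-case time to try every SSN against each salted record."""
    total_seconds = SSN_SPACE * seconds_per_hash * records / machines
    return total_seconds / SECONDS_PER_YEAR


print(round(years_to_exhaust(0.1), 1))   # cost tuned to 0.1 s per hash: about 3.2 years
print(round(years_to_exhaust(1.0), 1))   # cost tuned to 1 s per hash: about 31.7 years
```

The `machines` parameter is there because the search parallelizes perfectly:
a thousand machines turn decades into days, so the cost only buys time
relative to the attacker's hardware budget.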

Here's another idea: Use a secret salt, similar to AWS's secret key. Make it
long enough that it would be pretty much impossible to brute-force (do the
calculations). Does that seem like a workable solution? Of course, if the
"secret" isn't secure, then well, you're in trouble, and if the secret is in
your code in plaintext... Yeah, it's an uphill battle. I'm sure someone more
experienced than me here can provide some solution.
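
One way to read the secret-salt idea is as a keyed hash. A minimal sketch,
assuming HMAC-SHA256 from Python's standard library rather than bcrypt, and a
hypothetical 32-byte secret that lives somewhere other than the database;
`keyed_ssn_digest` is a made-up name:

```python
import hashlib
import hmac


def keyed_ssn_digest(ssn: str, secret: bytes) -> str:
    """HMAC-SHA256 of the SSN under a server-side secret (hypothetical helper)."""
    digits = ssn.replace("-", "")  # treat 123-45-6789 and 123456789 alike
    return hmac.new(secret, digits.encode("ascii"), hashlib.sha256).hexdigest()
```

The secret would be generated once (for example with `os.urandom(32)`) and
loaded at startup. Without it, enumerating all 10^9 SSNs tells an attacker
nothing; with it, the whole table falls quickly, so everything hinges on
keeping the key out of the code and out of the database dump.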

------
jseifer
Author of the blog post here. I posted this mainly to get feedback on the
different password hashing schemes for open source software/frameworks with
the added benefit of some background as to why bcrypt matters. I defer to
fellow HNer tptacek's article about secure password schemes for a complete
explanation on why bcrypt is the best. It's linked in the post but here it is
again for reference:
[http://chargen.matasano.com/chargen/2007/9/7/enough-with-the...](http://chargen.matasano.com/chargen/2007/9/7/enough-with-the-rainbow-tables-what-you-need-to-know-about-s.html).

------
mey
1) Taking longer is not an indication of strength of the algorithm. Just like
banging your head into a wall, multiple times isn't the best way to pass
through it.

2) bcrypt does not appear to be well documented, well analyzed, or well
maintained in the security community. Security through obscurity is not a good
thing, as you have no idea what your exploitation window looks like.

3) If your goal is to limit the effect of rainbow tables and to avoid storing
the password in the clear, and you do not need to retrieve the original
password, the ideal solution is to store a hashed password on disk along with
a salt for that hash. The salt then changes for every password created/saved,
and recovery of all passwords requires a rainbow table for every password.
Not a simple task.

4) Lastly, you should make yourself familiar with the security risk of any
system you use. For example, you shouldn't generally be using MD5 anymore...

~~~
Chronos
"1) Taking longer is not an indication of strength of the algorithm. Just like
banging your head into a wall, multiple times isn't the best way to pass
through it."

You do realize that this is how _all_ modern "cryptographically secure" hash
functions work, right? For that matter, so do the symmetric-key block ciphers
that hashes are closely related to. All of them simply use S-boxes to
obfuscate the numbers, followed by repeated "rounds" of simple bitwise
primitives (bitshift, and, xor, perhaps addition modulo a power of 2). There's
no _inherent_ mathematical reason why _combinations_ of these simple functions
should be strong, when each function individually is weak.

This is precisely why cryptography is such a tricky field to work in.

~~~
mey
Again, the "rounds" do not indicate execution time, thus execution time is not
a good determination of how "secure" a hashing function is.

There are inherent mathematical reasons why the combinations of the functions
should be strong. It's why s-boxes are accepted practice.

Agreed, cryptography is a tricky field. All the more reason to stick to
systems that are carefully reviewed by experts smarter than you and me.

------
blueben
Or, instead of using this odd "security through wasting time" method, you
could just employ a large salt to make rainbow table usage infeasible.

~~~
blueben
Down-voted for pointing out that a salted hash is a better alternative? Weird.

~~~
Chronos
Suppose I have a processor that runs at 1GHz, and furthermore suppose that it
has a built-in hash primitive that can hash one salt+password in a single
clock cycle. Thus, I can try 1 billion passwords per second. That means I can
try every item in an English dictionary in merely 0.1 milliseconds, or a very
thorough password cracker's dictionary in 2 milliseconds. If a site has 1000
users, I can try the cracker's dictionary against _all_ of them in only 2
seconds. That's less time than it takes for me to browse through the list of
password hashes using the scroll wheel on my mouse. For password hashing
primitives, speed DOES matter. A lot.

What's more, if I could do 1 billion MD5s per second, then I could brute force
every possible 8 character base-64 password (for one user at a time) in merely
3.25 days. Even if that one user has chosen his or her individual password by
pulling it straight from /dev/random. Impractical against a site with 1000
low-value users? Sure, most black hats won't want to spend 5 to 10 years of
compute power on that. But that's an utterly practical attack against a
single, high-value account. If you're Twitter and someone just stole your
salted password hashes, you'd better ring up Ashton Kutcher in the next 24
hours and tell him to change his password _right now_.
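
The arithmetic behind those figures, as a quick sketch. The two dictionary
sizes are assumptions chosen to match the 0.1 ms and 2 ms numbers above; the
rest follows from one hash per nanosecond:

```python
HASHES_PER_SECOND = 1_000_000_000   # the hypothetical 1GHz, one-hash-per-cycle chip
ENGLISH_WORDS = 100_000             # assumed size, implied by the 0.1 ms figure
CRACKER_WORDS = 2_000_000           # assumed size, implied by the 2 ms figure
USERS = 1000


def seconds_for(guesses):
    """Time to hash this many salt+password guesses at the assumed rate."""
    return guesses / HASHES_PER_SECOND


english_ms = seconds_for(ENGLISH_WORDS) * 1000      # 0.1 ms
cracker_ms = seconds_for(CRACKER_WORDS) * 1000      # 2 ms
all_users_s = seconds_for(CRACKER_WORDS * USERS)    # 2 s for the whole site

keyspace = 64 ** 8                                  # 8 characters, 64 symbols each
one_user_days = seconds_for(keyspace) / 86400       # about 3.26 days
all_users_years = one_user_days * USERS / 365.25    # about 8.9 years

print(english_ms, cracker_ms, all_users_s)
print(round(one_user_days, 2), round(all_users_years, 1))
```

Note that the salt changes none of these numbers for a single account; it
only forces the work to be repeated per user, which is where the multi-year
figure for all 1000 users comes from.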

Something truly scary: this level of computer power can be easily achieved
_today_ using parallelism. I just now ran a benchmark on my personal Linux
machine against 10,000 small files on a ramdisk, each of a size appropriate
for salted passwords and each with unique data. According to this benchmark,
my Athlon64 running at 800MHz can MD5 100,000 passwords per second using
md5sum, complete with the system call overhead of opening and closing ten
thousand files. Even so, my machine is only 4 orders of magnitude slower than
the monster I described, and one of those orders of magnitude should be
written off because my processor is years behind the times (sub-1GHz and
single core). Shave off another order of magnitude due to the useless system
calls, and one hundred modern $100 commodity processors could do this job
_today_, and you could probably do it for half the price or less if you used
DSPs or video card GPUs. This is trivially within the range of a project like
Seti@Home or Folding@Home, or a dark-hat version of the same (running on a
botnet of stolen CPU cycles).
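
For anyone who wants to repeat the measurement without the file and system
call overhead, here is a rough in-process sketch. It is not the md5sum
benchmark described above: the 24-byte input size is a guess at "a size
appropriate for salted passwords", and the number it prints depends entirely
on the machine and the Python interpreter.

```python
import hashlib
import os
import time


def md5_rate(samples=100_000, size=24):
    """Return MD5 hashes per second over unique random inputs held in memory."""
    inputs = [os.urandom(size) for _ in range(samples)]  # generated before timing
    start = time.perf_counter()
    for data in inputs:
        hashlib.md5(data).digest()
    elapsed = time.perf_counter() - start
    return samples / elapsed


print(f"{md5_rate():,.0f} MD5 hashes per second on one core")
```

Interpreter overhead dominates a loop like this, so treat the result as a
lower bound on what tuned native or GPU code would manage.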

Addendum: also note, these figures imply that _one_ computer with a modern CPU
can dictionary attack a salted password hash file from a 1000-user site in
merely 200 seconds, i.e. about the time it takes to microwave a burrito. No
need to build a botnet first.
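
Spelled out as a sketch, using only the figures quoted in this comment (the
two factors of ten are the rough write-offs described above, not
measurements):

```python
MEASURED_RATE = 100_000        # md5sum hashes per second on the 800MHz Athlon64
MODERN_CPU_FACTOR = 10         # one order of magnitude for a current processor
NO_SYSCALL_FACTOR = 10         # one order of magnitude for dropping file overhead

modern_rate = MEASURED_RATE * MODERN_CPU_FACTOR * NO_SYSCALL_FACTOR  # 10 million/s
cpus_for_billion = 1_000_000_000 // modern_rate                      # 100 processors

CRACKER_WORDS = 2_000_000      # assumed dictionary size, implied by the 2 ms figure
USERS = 1000
attack_seconds = CRACKER_WORDS * USERS / modern_rate                 # 200 seconds

print(modern_rate, cpus_for_billion, attack_seconds)
```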

------
marshy
You guys r idiots. This article rulez!!!1

------
pan69
Somehow I find it very disturbing that these bloggers put a photo of
themselves on their blog. I've seen this a few times recently. Every time it
happens I just have to close the tab immediately. Therefore, sorry I have
nothing meaningful to say about the article. I didn't (couldn't) read it.

~~~
marshy
Whoa dude. You really need to get off the computer for a little while.

