Cheap GPUs rendering strong passwords useless?

tptacek · on June 6, 2011

NO THEY AREN'T.

This is a great example of a bona fide awful trade press article: some guy with a column-inch quota sees a piece by another trade press hack who is in turn digesting a blog post by someone who is "allergic to command line utilities" writing up someone else's GPU password cracking tool. And so, strong passwords are useless!

Except that this has nothing at all to do with why passwords are bad. The problem GPU crackers exploit was solved over a decade ago with adaptive hashing. GPU crackers cannot crack scrypted passwords of any real complexity in tenable amounts of time.

Once again: http://codahale.com/how-to-safely-store-a-password/

code_duck · on June 6, 2011

I really hated that phrase 'allergic to command line utilities'. It reminds me of this odd attitude, which used to be a bit more common, that ignorance about computers is virtuous or amusing.

tzs · on June 6, 2011

Bcrypt seems to have a big design oversight. Given a password stored with cost factor C1 that you would like to increase to cost factor C2, it seems to me that you have to have the plaintext password to do this. Thus, you have to wait until that user next provides the password, verify it, and then re-store it with cost factor C2.

smanek · on June 6, 2011

That's just not true. To oversimplify a bit, let's say BCrypt is a function b(data, cost) that returns a hash.

Right now, I have b(password, c1) = hash_a stored in my database. To increase the cost to c1+c2, I just replace that with b(hash_a, c2) = hash_b in my database. To verify a user's login, I just compute b(b(password, c1), c2) and compare it to the hash_b I have stored.

Am I missing anything?

zcid · on June 6, 2011

Or you can wait until the user signs in again and create hash_b at login with the plaintext password after you validate it against hash_a. I believe that is how the bcrypt documentation recommends you do it.

VMG · on June 6, 2011

That is much less preferable though.

tedunangst · on June 6, 2011

Yes. That's not how bcrypt really works.

smanek · on June 6, 2011

I've read the usenix paper, and I'm reasonably sure I'm right. Care to elaborate on what I'm mistaken about? I (and a few other people I've talked to about schemes isomorphic to this about) may learn something new :-D

tedunangst · on June 6, 2011

OK, so you can chain bcrypt in theory, but if you look at the API and how it stores rounds/salt, I don't think you'll enjoy implementing the code to do so. You can't just store hash_b, you'll need to store hash_a too. It's not worth it.

Basically, the assumption that bcrypt is just a function that returns a hash oversimplifies the detail of how salt is stored in practice.

tptacek · on June 6, 2011

I wrote a comment to the effect of "you can just chain bcrypt" but deleted it because I didn't want to wade into exactly this tarpit. If a client asked me how to upgrade the cost of their password hashes, I'd probably tell them to validate incoming plaintext and then create a fresh hash.

I agree: this is not a big deal in practice. bcrypt with a cost factor of 10 is so much slower than SHA1 that this is just not going to be an issue.

bradleyland · on June 6, 2011

That's no bug, its a feature!

Seriously though, this is a design decision that cannot be easily solved with any non-reversible encryption. If you were able to increase the cost factor, the you'd open the algorithm to attacks that reverse that process.

daeken · on June 6, 2011

Err, no, that's actually not true. Take the case of PBKDF2 -- if you apply 1000 rounds of PBKDF2 to password X to get X' then apply 1000 rounds to PBKDF2 to password X' to get X'', applying 2000 rounds of PBKDF2 to password X == X''. You could increase the cost factor (number of rounds) every time the password is used, for instance.

RyanMcGreal · on June 6, 2011

Unless you run C2 on top of C1.

tedunangst · on June 6, 2011

bcrypt doesn't work like that. The outline is:

    work * costfactor
    finish

The finish step is irreversible, so you can't unwind it and add more work later. In practice though, even low cost factors for bcrypt are tons more work than MD5 or SHA. It's not a big deal.

tzs · on June 6, 2011

I think what he means by run C2 on top of C1 is something like this. Let b(k,w) mean bcrypt with work factor w on input k.

So, initially, you are storing b(pass,C1). Running C2 on top of C1 would mean replacing that with b(b(pass,C1),C2).

Later, when you want even more work, another layer could be added: b(b(b(pass,C1),C2),C3) and so on.

tedunangst · on June 6, 2011

ugh, no thank you.

palish · on June 6, 2011

He didn't say "unwind it". He said "on top of it".

  bcrypt( bcrypt( "foo", c1 ), c2 )

keeperofdakeys · on June 6, 2011

I would say that, given how 'cheap', you could slowly increment the cost-factor over time. You could also ask a user to replace their password after a certain time period, or after the cost factor gets too 'low' - you can even allow the same password to be entered. Having a different cost-factor for every password is a bit of a logistical nightmare though.

The fact is that any significantly high cost factor is much better than md5, sha1 or any other normal hashing function, and should prevent password cracking for 5-10 years. This is a reasonable period of time.

tedunangst · on June 6, 2011

Dear down voters, parent is accurate (except possibly for saying "big").

Joakal · on June 6, 2011

Codahale's article is misleading with salts. Bcrypt can't also protect against brute-forcing or dictionary attacks. Please don't spread that link as an example of good password storage security.

Edit: Since I can't reply to poster. I don't see bcrypt using a unique salt for each password. It's the equivalent of exchanging a SHA256 + Same Salt for Bcyrpt + Built-in-Salt.

Does bcrypt use a unique salt per account?

tptacek · on June 6, 2011

No; you're incorrect.

You can find the answers to your bcrypt questions at the Usenix paper which is the first link in the article you're advising people not to read.

Joakal · on June 6, 2011

I did, it uses a preset list of salts at the time. That doesn't seem unique per account salting raising the possibility of collisions to me.

tptacek · on June 6, 2011

bcrypt uses 128 bit salts. Every bcrypt library I've seen will generate the salt for you.

    [8:48pm:~] RIDGELAND:tqbf [1:45]% irb -rrubygems -rbcrypt
    >> BCrypt::Password.create("test password")
    => "$2a$10$KyxzQ6ZlAAYdLWwpgCbyjuqo53.LzPbsaSK2BiTnzpYEpRVZOHAAy"
    >> BCrypt::Password.create("test password")
    => "$2a$10$d0uUieyk0NWsU7E9CmQNwOsveyznwKwplphsF73yILdEyStmoUGgG"

Notice how they aren't the same hash? And, how I didn't have to go mine pink salts from the Himalayas?

By spreading the notion that there is something wrong with Coda's article when you really aren't familiar with bcrypt, you're doing people a disservice. They are going to go use SHA1 with a big random salt instead, because they've heard of SHA1 before and it therefore sounds more secure. And they will thus be terribly vulnerable to brute-force attacks.

palish · on June 6, 2011

A preset list of salts would still show a different hash for 2 consecutive bcrypt executions. Your example proves nothing.

I'm not saying you're wrong. I'm saying you didn't demonstrate proof, so you shouldn't act like you did.

I'm extremely interested why bcrypt is much more secure than SHA1 + extremely long hash, so perhaps you could shed some photons on that?

Also, in order for bcrypt to generate a unique salt on two different computers, it needs a unique source of randomness. Obviously seeding by "current time" is insufficient, as this would generate the same salt for two different computers at the same time. So what does it use? Mac address of network card + current time?

tptacek · on June 6, 2011

I don't need to prove that bcrypt has a "preset list of salts". Read the paper. Or even just Coda's summary of the paper. What an extremely silly objection to raise. Why would any cryptosystem embed a preset list of nonces?

How frustrating to have to argue about a system with someone who clearly hasn't done any reading on it at all. It's a Usenix paper. It's not a EUROCRYPT paper full of math. It was written by systems programmers, then from the OpenBSD project. You think the OpenBSD programmers devised a password scheme with a "preset list of nonces"?

"MAC address of the network card plus the current time"? To get a random number?

on June 6, 2011

[deleted]

Xk · on June 6, 2011

[edit: palish deleted his comment; excerpts below for what I was responding to:

> Except random numbers aren't random. They only appear to be. If an online casino seeds its random number generator with current time, then you can do the same, and predict which cards it will deal.

> So here you are saying that bcrypt generates a random seed, and I'm thinking "how does bcrypt get around the aforementioned problem?"

> If you don't want to answer my specific question ("why is bcrypt stronger..."), then don't.

]

I'm neither tptacek nor a security expert, but I think I know bcrypt well enough to answer your questions. Apologies if anything I say isn't correct; someone correct me if that's the case.

Starting with the problem: SHA1+salts are well and good. The password isn't in plaintext, so I can't just read it. SHA1 has no real cryptographic attacks on it. The salts prevent rainbow table attacks.

Except there's a major flaw. SHA1 is fast. SHA1 is very fast.

When you hash your password with SHA1+salt (even if you use a 1024 bit salt) an attacker will get a GPU cluster and run billions of hashes per second until they bruteforce your password. Anything under eight characters will fall in under a day, easily.

Now, let's talk bcrypt. Based off of blowfish (which has a reasonably long key-schedule algorithm), bcrypt is designed to be very slow. Very very slow.

If you want to think of a very naive way of doing this, imagine my hashing function is simply salt||SHA1(SHA1(SHA1(SHA1(...(SHA1(password||salt)))))) It is easy to see how this would take significantly longer than simply salt||SHA1(password||salt). Now there's some other details that you don't seem to be interested in, but that's the main idea. If you care to read more, see the paper.

This means that if an attacker would try to bruteforce even the simplest passwords, it would take significantly longer. Look at the chart on page 11, for example. The nice thing about this is that you can, over time, adjust it to take longer and longer as computers get faster and faster.

So, bcrypt isn't some fancy thing which picks salts in some random way (although it does get salts; as for how, I don't know, but /dev/random would work just fine).

If you care about it and want to look in to it more, scrypt is also interesting -- it also uses as much memory as it can to stop simply throwing just CPU time at a problem.

palish · on June 6, 2011

Thanks for the analysis! Much appreciated.

djmdjm · on June 6, 2011

> Mac address of network card + current time?

you're trolling, right?

If not, begin your journey to enlightenment at http://www.cs.berkeley.edu/~daw/papers/ddj-netscape.html (15 years old now!), http://www.schneier.com/paper-prngs.pdf and the kernel source for your favourite OS' /dev/random (the OpenBSD one is pretty well commented)

caf · on June 6, 2011

Read the bcrypt paper: http://www.usenix.org/events/usenix99/provos/provos_html/ind...

The algorithm requires (cost, salt, password) as input, and the output hash includes the cost and salt. When generating a hash for a new password you should of course be choosing this salt randomly, and when checking a password be using the stored cost and salt.

In particular see the Implementation section:

An important requirement of any bcrypt implementation is that it exploit the full 128-bit salt space. OpenBSD generates the 128-bit bcrypt salt from an arcfour key stream, seeded with random data the kernel collects from device timings.

Joakal · on June 6, 2011

Ah. The original article was saying that it was all built-in (salt), only the password is the input. I misread that.

_3u10 · on June 6, 2011

This article is missing a bunch of facts, namely that NTLM passwords have been broken for 2 to 3 years long before the invention of cheap GPUs. Rainbow tables are a far more effective way to attack NTLM hashes. Cracking passwords is trivially parallelizable, if you let your hashes fall into the hands of the enemy then it's only a matter of money.

The fundamental issue is that of salting combined with not disclosing your password database. If you're using Windows you should enable the options for SSHA and disable the NTLM / LM hashes. Instead of using a password use a passphrase. I use 3 levels of password and at the top level you're looking at ~22 characters which contains characters outside the normal ASCII range so an attack is looking at ~65536 combinations instead of ~256. I doubt many attackers include the snowman character. :) (and no that char isnt in my password)

Running the math quickly yields a time to crack in excess of the lifetime of the universe, it would probably be easier to generate a hash collision than crack the password.

pixeloution · on June 6, 2011

GPUs are rendering MD5 and other weak hashing methods useless when a database is compromised. Using bcrypt [properly] can reduce the attempts the fast computer on the planet could make to a handful per second.

thirsteh · on June 6, 2011

Yep. Don't store passwords using (non-password) fast hashes like MD5 or SHA-256. Use bcrypt: http://codahale.com/how-to-safely-store-a-password/

Joakal · on June 6, 2011

His recommendation of not using salts is very misleading and should not be quoted due to poor advice as Bcrypt is also useless against dictionary and brute force attacks.

The aim of password storage security is to prevent pre-computed hashing reasonably enough. No amount of password hashing even if it takes 1000 years to hash a password if your users use '12345678' as a password.

Salting protects against pre-computed hashes (rainbow tables [0]). Otherwise I can pre-compute many bcrypt hashes and then quickly look up the password.

[0] https://secure.wikimedia.org/wikipedia/en/wiki/Rainbow_table...

tptacek · on June 6, 2011

No; you're incorrect.

All secure password storage schemes are randomized. If you insist on using the word "salt", then just know that bcrypt, scrypt, and the PBKDFs build the salt in. You cannot "pre-compute many bcrypt hashes and quickly look up passwords". So: if you're using a scheme that requires you to think about salts, you are virtually certain to be using an insecure storage scheme.

And, dictionary and brute force attacks are the entire problem bcrypt is designed to address. Go back to that article that you didn't really read, follow the first Usenix link, and read the Usenix paper it cites.

Joakal · on June 6, 2011

I'm concerned that you advocate ignorance of 'salt'.

Some functions[0][1] seem to optionally allow the use of salt, are you saying that users should just use passwords as input with bcrypt hashing?

In the article, it does not suggest only bcrypt as well [2].

[0] http://www.mindrot.org/projects/py-bcrypt/

[1] http://au.php.net/crypt

[2] http://www.usenix.org/events/usenix99/provos/provos_html/nod...

tptacek · on June 6, 2011

Bcrypt uses a 128 bit salt. The code I provided above didn't show it, because the library simply generated it for me.

The bcrypt libraries that require users to explicitly provide a salt are themselves doing their users a (slight) disservice, by making it seem as if there was a reasonable option for salt generation other than simply using a CSPRNG.

djmdjm · on June 6, 2011

> The code I provided above didn't show it, because the library simply generated it for me.

... and this is another reason why people should use one of the bcrypt() implementations for their language* - the API makes doing the wrong thing difficult or impossible.

* disclosure: I ported/wrote a couple of them

bigiain · on June 6, 2011

The problem is, I can onlybchoose the password hashing algorithm when I control the server side code. 99.9% of the time I don't, and I don't even get to find out how the people controlling the server side code are storing by password.

Quick quiz: how does HN store passwords? How about apple.com? Facebook? Twitter? Gawker? Sony Pictures? Perlmonks?

A suitably paranoid person would assume any password you've given to somebody else's website is compromised.

Do not reuse passwords ever. Don't even think "I'm just trying this new web service out, I'll use the same password I always use when trying new things out", 'cause you'll end up forgetting to upgrade that password when something gradually changes from "some new and maybe interesting website" to "somewhere that is an important part of my online reputation" or "somewhere I've given authority to charge my credit card".

(and, even more importantly, don't ever fall into that trap when developing server side code "Oh, I'll just do a quick login method for testing that just stores cleartext passwords, I'll fix that bit up before we go live..." Because one day that code _might_ end up live...)

Zak · on June 6, 2011

how does HN store passwords?

Unsalted SHA1.

    (def shash (str)
      (let fname (+ "/tmp/shash" (rand-string 10))
        (w/outfile f fname (disp str f))
        (let res (tostring (system (+ "openssl dgst -sha1 <" fname)))
          (do1 (cut res 0 (- (len res) 1))
               (rmfile fname)))))

...

    (and user pw (aand (shash pw) (is it (hpasswords* user))))

bigiain · on June 6, 2011

Kind of proving my point. You'd hope somewhere that calls itself "hackernews" and which may as well have an echo-bot set up quoting that codahale article about bcrypt any time someone writes "password" - might be handling passwords "properly". SHA1 isn't really all that different from Gawkers MD5 password ballsup...

Instead, all the top 25 passwords found in the recent Sony exposures from here: http://www.troyhunt.com/2011/06/brief-sony-password-analysis... are found in that well known rainbow table, Google...

iains-imac: bigiain$ echo -n 'bailey' | openssl sha1 b1f45ed147d6803ac1a2a91bdea1fab603f910a5

http://www.google.com/search?q=b1f45ed147d6803ac1a2a91bdea1f...

If HN ever gets 0wn3d, anybody with a (password) dictionary word or 8-9 char or less password will almost certainly be exposed. (even if the database doesn't get broken into, anybody who gets enough shell on the webserver to read /tmp/shash will be able to see cleartext passwords passing through...)

Zak · on June 6, 2011

There isn't a database per se - usernames and password hashes are stored in a file. If someone managed to read that file, it would not be difficult for them to obtain most of the passwords.

rimantas · on June 6, 2011

It may store it as plaintext as well, without HTTPS it does not really matter. I mean it is easier to obtain passwords with tools like firesheep instead of trying to break into DB. On the other hand, it is not like you have something secret there. The worst thing that could happen is someone pretending to be you. That said I'd love to have https option.

bcl · on June 6, 2011

There are at least a few obvious solutions. One is to use long passphrases. Another is to stop trying to memorize your passwords, write them down on a postit, encrypted file, etc. The main threat to your security is not someone digging through the postits in your wallet, it is the hash being stolen from someone's insecure database.

r00fus · on June 6, 2011

The bad part is that some sites (even banks!) force you to limit the number or type of characters. In one ridiculous case, I was forced to use an 8 character pwd with only lowercased letters.

The solution is to use tools like KeePass or 1Password to allow maximum length passwords (randomly generated also) wherever possible.

code_duck · on June 6, 2011

I recently signed up for the new payment system for my local water and gas company... you must use your phone number as your user name, and the password, to my suprise, can only be digits, up to 8 characters.

Pretty horrible. I wonder what level of idiot designs these systems.

cdr · on June 6, 2011

That "new payment system" is quite possibly interfacing with, say, mainframe code written in the 70s. That's often the source of banks' ridiculous account restrictions.

code_duck · on June 6, 2011

I can't see any legitimate technical reason to restrict passwords.

In my example, they could put a sensible account system on the front end, and do whatever they need to in the back end to link it to my phone number and a random 8 digit number, if that's what they need.

Same with banks. They can set up a real user name and password from you, store it in a modern system, and link it to the older system behind the scenes. Perhaps there's some arcane legislation that prevents this, but it seems unlikely.

cdr · on June 6, 2011

That's certainly true, but it doesn't usually seem to happen.

Joakal · on June 6, 2011

It's likely that they are storing passwords in plain text or encrypted. If my website can accept forum posts in UTF-8 of 256+ characters, then people can log in with UTF-8 256+ char passwords.

reemrevnivek · on June 6, 2011

I'm far more worried about the plaintext password being stolen from someone's insecure database. But that only reinforces your argument.

flashingleds · on June 6, 2011

Since this has attracted the attention of some very knowledgeable people, can anyone suggest a good textbook/site/paper discussing password security and encryption? Obviously I can google 'computer security' and find something, I just wondered if there was a well regarded standard reference (ala K&R or Art of Electronics). I don't understand the majority of the techniques being discussed here but would quite like to.

tptacek · on June 6, 2011

Sorry, key derivation functions are kind of a back alley in crypto research. Your best bets are Colin's scrypt paper and the original bcrypt paper; the bcrypt paper is on the link at Coda Hale's "How To Store Passwords" article, which you can find plastered all over this thread, and Colin's paper is at:

http://www.tarsnap.com/scrypt/scrypt.pdf

tedunangst · on June 6, 2011

http://tools.ietf.org/html/rfc2898

pragmatic · on June 6, 2011

This article uses NTLM passwords.

Can anyone compare this to the actual MSFT recommended Kerberos security protocol?

Also, how does one go about getting a list of NTLM passwords to work against? You steal the domain controller or somehow get access to it?

http://en.wikipedia.org/wiki/NTLM MSFT does not recommend using NTML which is very old.

http://msdn.microsoft.com/en-us/library/cc236715(v=PROT.10)....

Implementers should be aware that NTLM does not support any recent cryptographic methods, such as AES or SHA-256. It uses cyclic redundancy check (CRC) or message digest algorithms ([RFC1321]) for integrity, and it uses RC4 for encryption. Deriving a key from a password is as specified in [RFC1320] and [FIPS46-2]. Therefore, applications are generally advised not to use NTLM.<68>

brianbreslin · on June 6, 2011

can't you design your software to prevent brute force logins? i.e. lockout for x seconds if y unsuccessful login attempts are made?

city41 · on June 6, 2011

That doesn't matter if someone has a dump of the database. Which as Sony can tell you, is becoming more common.

pragmatic · on June 6, 2011

Yes, but once you are in to the point where you have the db...what else are you going after?

If you can dump the db then (most likely) you also have the credit cards, other information that is useful to you?

This assumes a hacker has a financial motivation.

ceejayoz · on June 6, 2011

> Yes, but once you are in to the point where you have the db...what else are you going after?

There are many conceivable attacks that would result in a read-only dump of password hashes. An administrator's account would be useful in such an administration, turning your read-only access to read-write access.

city41 · on June 6, 2011

I would hope the db didn't store credit card info or at least encrypted it somehow. Granted that's a pretty big hope. But if you can crack the passwords in the db you have, chances are those passwords will work on other sites. People often use one password for everything.

barrkel · on June 6, 2011

Passwords are frequently reused.

brianbreslin · on June 6, 2011

good point. i'll have to read up on how sony's db got extracted first then.

bonzoesc · on June 8, 2011

SQL injection, social engineering a password, somebody's crappy "secret question," insecure PHP, and hundreds of other ways. Closing off known avenues of attack and still mitigating against a leak with peer-reviewed math is what you have to do.

ndl · on June 6, 2011

Hopefully, the hashes are physically secure and accessed in a way that would require a great deal of human effort to steal. If that is the case, then yes, you can control the login process.

The technique in the article is relevant when one has the hashes and wants the plaintext (and according to some here, still easy to mitigate then) - if you're guessing a web login, different game.

drivebyacct2 · on June 6, 2011

NTLM passwords? Meh. An awful benchmark for evaluating modern passwords. Who uses NTLM besides Windows, honestly? (Apparently the person who seemed to have not liked this comment, very quickly, heh)

I'd be curious about the timeliness of GPU cracking a 8 character, alphanumerical mixed case password hashed with SHA1?

tptacek · on June 6, 2011

Like MD5, SHA1 is designed to be fast. For that matter, so is SHA256, and all the SHA-3 candidate functions. The right way to think about this is that all the strong crypto hash functions are unsuitable for use by themselves as password hashes.

Cryptographers have a mechanism for securely turning human-readable text into secret material: key derivation functions. All the modern KDFs are adaptive, meaning that they come with a "pain" dial that you can turn to force attackers to spend more time executing crypto code paths and less time comparing results.

You can build decent KDFs out of functions like SHA1, but if hardware GPU cracking is the future, you're best off with something like Colin Percival's "scrypt", which was explicitly designed to be memory-hard (requiring lots of state to compute) and hard to parallelize:

http://www.tarsnap.com/scrypt/scrypt-slides.pdf

shaggyfrog · on June 6, 2011

I know you're a domain expert and in this thread you've suggested both bcrypt and scrypt. Is there any advantage to one over the other? From my POV it seems that bcrypt has more implementations and seems better-known (in a good way).

tptacek · on June 6, 2011

scrypt is the sounder scheme, but bcrypt is far more widely available. Both are fine.

pauldino · on June 6, 2011

According to http://www.golubev.com/hashgpu.htm "Recovery speed on ATI HD 5970 peaks at 5600M/s MD5 hashes and 2300M/s SHA1 hashes."

So at peak rates I think that would come out to around 26 hours.

Combinations/SHA1HashesPerSecond/SecondsPerHour

(26 * 2 + 10) ^ 8 / (2300 * 1000000) / 3600 = 26

caf · on June 6, 2011

NTLM hashes are MD4, which is of a comparable speed to SHA1 for short input blocks, so the time would be around the same.

At any rate, you can download ighashgpu and try it for yourself.

drivebyacct2 · on June 6, 2011

Something tells me running ighashgpu through VirtualBox or wine wouldn't be wise.