Turbo-charged cracking comes to long passwords (arstechnica.com)
50 points by sunsu 1400 days ago | 47 comments

This entire debate around typed passwords feels like we're still trying to create "a faster horse" instead of "building a car".

We're talking about letting a machine/program know you are really you. Right now a lot of the people concerned with this topic also carry around advanced computers (smartphones) with HD cameras, GPS, compass and Near Field or Bluetooth communication capabilities. Surely it would be possible to use something other than keyboard input to let a machine know you are really you?

A login by 180° close up picture of your face combined with location and IMEI or something? Combining bio data with location and machine or what not... Any ideas?

You're confusing authentication with identification.

Many (most?) web services and software only require the former, not the latter. They don't need to know that you're Joe Bloggs, only that you're the unidentified person that created the account and are the one entitled to use it.

This is a more and more important distinction as privacy issues grow.

I think you're right. This makes it simpler. All that is required is an authentication. Right now this is 'input through keyboard' because a keyboard was the only way to provide input. Thinking about other forms of input would help.

Mozilla Persona.

No password needed at all - your email provider verifies your identity to your browser, and then your browser uses the key generated from that to sign you in.

And how are you signed into your email provider?

Your identity provider doesn't need to be your email provider. You can use something like https://www.persowna.net/ (disclaimer: I wrote it), which can use any manner of authentication it likes. I may just add client SSL certificates as a way of authentication, that will be interesting.

You can also self-host and choose any authentication you want, however specific to you.

Also, this isn't directly relevant to what the GP said, but using a standardized, decentralized way to authenticate is a huge win. It effectively turns all sites into requiring arbitrarily strong authentication (at the very least, everything suddenly supports two-factor auth when it supports Persona), but you still don't have to trust a third party with your authentication (you can run it on your home computer).

It's Personas all the way down... ?

The problem with biometrics and the like is that you can always capture the data somewhere along the way and then fake a client to imitate it. So if your passwordish thing gets compromised, your account can be compromised.

And once that happens, you can't change a password that is your face.

Sure it's fast on algorithms that were designed to be fast (SHA1, MD5, various jokes thought up in five minutes, etc). However, what you want to look at is a proper method. From the performance table:

    bcrypt $2a$ 	3788 c/s 	1583 c/s 	3861 c/s 	626 c/s
That will take a while...

It's also still attacking based on a lot of assumed patterns. Any passphrase that is a quote from popular literature or movies is a bad choice. At this point any grammatically correct phrase is probably a bad choice, as that's a much smaller pool than you get with a handful of truly random words (e.g. as you get with diceware).

I hope we see a much stronger push to two-factor auth soon. Even if it's imperfect (e.g. SMS messages) it's still a huge extra hurdle for the typical cracker.

I wonder whether "correct horse battery staple" is in the RockYou password list.

I just checked and it's not there, I think the leak predates the comic.

It's important to remember one key takeaway from this. It still applies to offline hashes.

That means, if your db is compromised, it's pretty much game over for your users (who will invariably reuse their passwords elsewhere) if you used simple hashes, which is another key takeaway. Bcrypt (or better yet, scrypt) is still the better option for this very reason.

Edit: To clarify, it doesn't matter that you limit logins, lockout users after failed login attempts etc... (although, those are good measures to start) The password hashing scheme must withstand bruteforce cracking of this nature to a feasible degree regardless of what limiting protocols you have in place. Just in case...
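To make the "slow hash" point concrete, here is a minimal sketch of salted scrypt hashing using only Python's standard library. The cost parameters and the function names are assumptions chosen for illustration, not a tuned recommendation; a real deployment would pick costs based on its own hardware.

```python
import hashlib
import hmac
import os

# Cost parameters are illustrative assumptions, not a recommendation.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) using a fresh random per-user salt."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, digest)
```

Every guess an offline attacker makes costs the same memory and CPU as one legitimate login, which is the whole point compared with a single MD5 or SHA-1 call.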

passwords will continue to be the weakest link in the chain for many systems until we start forcing uniqueness:

every webmaster/admin cracks their collection of hashes against common wordlists, and if any given password appears too often (or at all, against a weak wordlist), user is forced to change it.
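A rough sketch of what such an audit could look like. The record layout, `find_weak_accounts`, and `demo_hash` are all hypothetical names for illustration; `hash_fn` has to be whatever scheme the site really uses, and with a properly slow hash the audit is only practical against a short wordlist.

```python
import hashlib
import hmac


def find_weak_accounts(records, wordlist, hash_fn):
    """Return usernames whose stored hash matches any wordlist candidate.

    records: iterable of (username, salt, stored_digest) tuples.
    hash_fn: callable (password_bytes, salt) -> digest, matching the
             site's own hashing scheme (hypothetical here).
    """
    weak = []
    for username, salt, stored in records:
        for word in wordlist:
            if hmac.compare_digest(hash_fn(word.encode("utf-8"), salt), stored):
                weak.append(username)
                break  # one hit is enough to force a password change
    return weak


def demo_hash(password: bytes, salt: bytes) -> bytes:
    # Stand-in for the site's real scheme. The low iteration count only
    # keeps the demo fast and is NOT a recommended setting.
    return hashlib.pbkdf2_hmac("sha256", password, salt, 1000)
```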

"But complex passwords are too difficult to remember"

Not necessarily:


"Must I have a different one of those for every service?!"

Nope. ebay, twitter,facebook:




also, as has been pointed out - bcrypt.

are there any hash ciphers that allow offline cost stretching yet?

That's not a good scheme to be promoting. If I have your twitter password, I have all your passwords. Also, by adding this restriction you are giving away the passwords of other users in the system.

No, that's the point of randomly modifying each password per system.

The example was oversimplified, but I guarantee you if you got one of my passwords you wouldn't work out where the per-system bit is.

If I got two of them I might though :)

These days I just use a password manager with a big phrase for the master. I couldn't tell you what any of my other passwords are.

So exactly why was there a 15-char restriction in the old version? Wouldn't a dictionary attack be fast regardless of length?

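55 chars. The MD5 and SHA-1 algorithms process data in 512-bit (64-byte) blocks, where the final bytes are the 0x80 0x00 ... 0x00 padding followed by the message size in bits as a 64-bit integer. With MD5's little-endian length field, a message of 55 bytes (440 bits) will end with [0x80 0xB8 0x01 0x00 0x00 0x00 0x00 0x00 0x00], a message of 54 bytes (432 bits) will end with [0x80 0x00 0xB0 0x01 0x00 0x00 0x00 0x00 0x00 0x00], and so on; SHA-1 stores the same length big-endian. Anything longer than 55 bytes spills into a second block.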

So the crackers are optimized for single-block messages (passwords), since making the length generic would slow things down. I guess they've added support for that now.
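The 55-byte limit can be checked with a short sketch of the padding rule shared by MD5 and SHA-1: one 0x80 byte, zeros up to 56 mod 64, then the message length in bits as a 64-bit integer. The helper names below are made up for illustration.

```python
def md_padding(message_len: int, little_endian: bool = True) -> bytes:
    """Padding appended to a message of message_len bytes.

    little_endian=True gives MD5's length encoding; False gives SHA-1's.
    """
    # 0x80, then zeros until the total is 56 mod 64, then the 64-bit bit count.
    zeros = (55 - message_len) % 64
    byte_order = "little" if little_endian else "big"
    bit_len = (message_len * 8).to_bytes(8, byte_order)
    return b"\x80" + b"\x00" * zeros + bit_len


def block_count(message_len: int) -> int:
    """Number of 64-byte blocks the compression function has to process."""
    return (message_len + len(md_padding(message_len))) // 64
```

`block_count(55)` is 1 and `block_count(56)` is 2, so a cracker that hard-codes the single-block case tops out at 55-byte candidates.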

probably something making 16 byte memory accesses simpler/faster than bigger ones on GPUs (eg by fitting in some cache or something). GPGPU architectures can still be a bit weird.

So, unless you use a common phrase (spelled correctly) - Long passwords are still a great idea because it would take forever to generate a 55 character random password. Is that correct?

Short answer: if you're talking about 55 perfectly randomly chosen mixed-case alphanumeric characters, yes. Otherwise, you're going to have to be more precise about your definition of "random password" if you're going to perform any analysis.

    >>> from math import log
    >>> log(62 ** 55) / log(2)


55 character uniformly cryptographically chosen alphanum passwords contain over 327 bits of entropy. If an attacker were using an irreversible but thermodynamically ideal computer floating in space passively cooled to the temperature of cosmic background radiation... the amount of waste heat from her computer would be enough to boil away Earth's oceans. I'm too lazy to look up Bruce Schneier's calculation for counting to 2**256, but counting to 2**327 may very well require more energy than our Sun will output between now and when it goes cold.

Now, all of the basic qubit operations are reversible, so the above theoretical lower bound on waste heat doesn't apply to a quantum computer. However, if the hash algorithm doesn't have any weaknesses and has at least 326 bits of state in an optimal implementation, it will still take an attacker 2**163 hash operations on a quantum computer, so at least 2**163 times the clock cycle of the quantum computer.

Or... even given some rather weak assumptions about SHA-1, if your bank and FunnyCatPictures.com both use salted SHA-1, and the stolen password hash is 160 bits, but the password itself is 327 bits, an ideal attacker is going to find your password along with about 2**167 false correct answers. As long as your bank uses a different salt for your account than FunnyCatPictures.com did, then having your password hash from FunnyCatPictures.com doesn't help an attacker.

Basically, as long as your password has at least twice as many bits of entropy as the stolen salted hash, the number of false positives (hash collisions) from cracking the hash is more than the expected number of random guesses needed to get a false positive (hash collision) at your bank, so the attacker is better off just throwing random guesses at the bank.
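The arithmetic behind that argument, as a small sketch. It assumes the hash behaves like a random function and that every character is an independent uniform draw; `entropy_bits` is just a helper name for illustration.

```python
from math import log2


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of `length` symbols drawn uniformly and independently."""
    return length * log2(alphabet_size)


password_bits = entropy_bits(62, 55)  # 55 mixed-case alphanumeric characters
hash_bits = 160                       # SHA-1 output size

# Expected number of candidate passwords mapping to one given hash value,
# expressed as a power of two.
log2_false_positives = password_bits - hash_bits
```

`password_bits` comes out a little over 327 and `log2_false_positives` a little over 167, matching the figures quoted above.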

How does this affect the popular premise that a passphrase is much better than a password?

The popular myth, you mean. The human brain is a crap RNG and a crap producer of entropy. No matter what you do, it will still be crap.

Phrases with good entropy are easier to remember than 'words' with good entropy.

  Horse.. in a box riding a fish!
Which one would you prefer to memorize and type in?

Some nitpicking: the passphrase is not as good as it seems. It consists of two parts ("horse in a box" and "riding a fish"), both of which can be found in e.g. Google's n-gram corpus [1]. Google's corpus has on the order of 2^32 n-grams, so even assuming you chose these two at random, you get only 64 bits of entropy. There is also some punctuation (let's be generous and say it gives 10 bits), so the entire passphrase is about 74 bits of entropy. While very good, this is probably not enough to protect your bitcoin life savings or a TrueCrypt container that can implicate you in a serious crime (the reason being that Moore's law may well make 74-bit passwords easily crackable in 10 years or so).

For comparison, the first password you provided (16 random alphanumeric characters) has 95 bits of entropy.

[1] http://books.google.com/ngrams/graph?content=horse+in+a+box%...

Until you find out the service you use silently drops everything after the 8th character...

Passphrases do not need to be (completely) human-generated.

On a related note, has anyone analyzed the entropy of markov chain generated passphrases?

A while back I played with this a bit.




Summary: Highly entropic Markov-generated phrases are long and hard to remember.

What if I do:

    echo $( shuf -n 4 /usr/local/share/4000-common-words )

Still won't help. This is security through obscurity. It works only if the attacker doesn't suspect you use a passphrase.

4000^4 gives 256000000000000. At 3.3 bits of entropy per decimal digit, that comes to about 50 bits of entropy. Not too shabby, but not that secure either. Your PC's RNG may play dirty tricks on you.
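For anyone worried about the RNG, a sketch of the same idea in Python, drawing from the OS CSPRNG via the `secrets` module. The function names are illustrative; the word list is whatever file you would have fed to `shuf`.

```python
import secrets
from math import log2


def random_passphrase(words: list[str], count: int = 4) -> str:
    """Pick `count` distinct words using the OS CSPRNG, like `shuf -n`."""
    return " ".join(secrets.SystemRandom().sample(words, count))


def passphrase_bits(vocab_size: int, count: int) -> float:
    """Entropy in bits, treating each word as an independent uniform draw.

    Sampling without replacement (as above) is marginally lower.
    """
    return count * log2(vocab_size)
```

`passphrase_bits(4000, 4)` gives just under 48 bits, slightly below the 50-bit estimate from counting decimal digits.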

And of course there are all kinds of legacy systems that limit passwords to 32 or 16 characters (but more than 8), etc., which would further reduce the pool.

Of course you could try your own password-deriving mechanism. Take the first 16 characters of bcrypt(username, site domain): it will produce awesome passwords for any site, and you will have an easy time producing them when needed. Until the hackers begin to suspect what you use, if it becomes widespread.

(Disclaimer - not a cryptographer or security expert or particularly competent in anything)

> Take the first 16 characters of bcrypt(username,site domain)

Hey, that's the very definition of security through obscurity ;)

Here's a thought experiment I use when estimating security of similar password schemes: imagine you asked someone to come up with 1000 different mechanisms of generating passwords based on username and domain. Is your scheme likely to be among them? If yes, this means it provides less than 10 bits of security.

What do you mean "not that secure either"? 8 random upper-lower-numeric characters are 47.6 bits of entropy; the four-word passphrase is 47.8. I'd say that's decently secure, and the suggestion that you bcrypt your username with the site's domain for a salt is pretty much the definition of security through obscurity, like the other commenter said.

Please don't discourage good practices. Four random words are a lot better than "password123", though it would still take 1.5 days to crack if it were stored as an MD5 hash. Six words would take about 65,000 years at 1 Ghash/s, which is pretty damn good, and better than a 12-char random password. Five words would take 16 years, which seems like a pretty good compromise.
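These estimates are easy to reproduce. The sketch below assumes a 4000-word list, a fast unsalted hash at 1e9 guesses per second, and that the attacker finds the passphrase after searching half the keyspace on average; `average_crack_seconds` is a made-up helper name.

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def average_crack_seconds(vocab_size: int, words: int,
                          hashes_per_second: float = 1e9) -> float:
    """Expected time to hit a passphrase, searching half the keyspace."""
    return (vocab_size ** words) / 2 / hashes_per_second


for n in (4, 5, 6):
    secs = average_crack_seconds(4000, n)
    print(f"{n} words: {secs / SECONDS_PER_DAY:.1f} days, "
          f"{secs / SECONDS_PER_YEAR:.1f} years")
```

Each extra word multiplies the work by 4000, which is why the jump from four words to five or six matters so much more than it looks.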

EDIT: Although, I don't like straight-up Shannon entropy as a measure of password strength.

I gave the bcrypt example as an anti-pattern, in case I didn't make myself clear.

Any password derivation scheme works brilliantly as long as you are the only one using it. The moment it becomes widespread and people begin to target it, it goes anywhere from significantly weaker to trivial to crack.

The problem with that is that password derivation is entirely reasonable and encouraged. It's what PBKDF2 does (the "KD" stands for "Key Derivation"). Securing your passphrase with a few thousand rounds of bcrypt and salting with the domain is a great way to strengthen it: you don't have to trust whatever shitty MD5 password storage mechanism the site has, and the attacker has to brute-force bcrypt to get your master passphrase.
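A minimal sketch of that kind of derivation, using PBKDF2 from the standard library rather than bcrypt. The iteration count, the output encoding, and the name `site_password` are assumptions for illustration; this is not the algorithm of SuperGenPass or any other real tool.

```python
import base64
import hashlib


def site_password(master: str, domain: str,
                  iterations: int = 100_000, length: int = 16) -> str:
    """Derive a per-site password from a secret master passphrase.

    The domain acts as the salt; the master passphrase is the only secret.
    """
    raw = hashlib.pbkdf2_hmac(
        "sha256", master.encode("utf-8"), domain.encode("utf-8"), iterations
    )
    # 32 raw bytes become 44 base64 characters; keep the first `length`.
    return base64.urlsafe_b64encode(raw).decode("ascii")[:length]
```

The scheme is only as strong as the master passphrase: anyone who knows the method and one derived password can brute-force the master offline, just more slowly than with a bare MD5.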

It is not security through obscurity. It works even when the attacker knows I use a passphrase and has a copy of my dictionary. It's about the same security as a random 8 character password with lower and upper case, numbers and choice of 10 symbols.

Maybe that's no longer secure enough; I don't know how fast password crackers are now. So use "shuf -n 5" instead.

It doesn't work when the attacker knows you use this scheme, your username and the site's domain. There's no secret in that scheme, therefore it's pretty much the exact definition of security through obscurity. Add a password to it, though, and you pretty much have SuperGenPass.

I think you replied to the wrong person here. I'm saying that the passphrase does not rely on security through obscurity. venomsnake's scheme does, though it appears he meant it to be exactly that (ie. an example).

Yeah, that was weird. You are correct in that comment, I posted the entropy equivalents in another comment in this thread (using 5 words is pretty secure).

oof. If you're trying to be secure, why oh why would you use a phrase popular or common enough that you could think of it off the top of your head. Think, people.

But you can use a phrase that is uncommon (even unique) but easy to remember. https://xkcd.com/936/

Diceware is pretty good for generating these types of passphrases http://world.std.com/~reinhold/diceware.html

Why would you use a password at all? Use keyfiles for everything encrypted and then use KeePassX with Dropbox sync, or 1Password/LastPass if you can afford it.

> Why would you use a password at all?

At all? You still need a password for your keyfile. And probably you need an OS login before that.

Call me naive, but I've never understood why it is so difficult to simply thwart brute-force attacks by setting limits on how many times a password can be entered. Wouldn't that simply defeat all efforts if, e.g., after 20 password attempts the account is locked and has to be manually unlocked through a human process, or simply locked for a certain amount of time that would seriously thwart the effectiveness of additional attempts?
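The lockout idea described here might look roughly like the sketch below. The class name, the in-memory dictionaries, and the default thresholds are all assumptions for illustration; a real service would persist this state and probably rate-limit per IP as well.

```python
import time


class LoginThrottle:
    """Lock an account for a cooldown period after too many failures."""

    def __init__(self, max_failures: int = 20, lockout_seconds: float = 900.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.lockout_seconds = lockout_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._failures: dict[str, int] = {}
        self._locked_until: dict[str, float] = {}

    def is_locked(self, username: str) -> bool:
        return self.clock() < self._locked_until.get(username, float("-inf"))

    def record_failure(self, username: str) -> None:
        count = self._failures.get(username, 0) + 1
        if count >= self.max_failures:
            self._locked_until[username] = self.clock() + self.lockout_seconds
            count = 0  # start counting afresh once the lock expires
        self._failures[username] = count

    def record_success(self, username: str) -> None:
        self._failures.pop(username, None)
```

This only slows down guessing through the login form; it does nothing once the hash database itself has been stolen.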

Oh, they certainly do that. This is mostly about offline attacks where someone has stolen a database of usernames and password hashes. If they figure out the passwords from the hashes, they can access the accounts on that website (if the theft went undetected), and they can also hope that users have the same login/password combination on other websites.

However, if you're going to use a brute-force attack on a website, the way to do it is to try a common password on every username you can think of, not every password on a single username. You also need a way to prevent them from simply blocking your IP address, like using a botnet.

From the comments:

> This isn't an online attack. Hashcat and variants work by attacking the password hashes after they've been leaked/stolen somehow. Not by attempting billions of logins per second over the Internet (if any server allowed that... someone needs to change careers).
