
Well, calculating the "true" strength is difficult to do, because even though sophisticated tools are available to aid the process, the attackers are still human, and can input their own guesses that may or may not be more accurate. If the attacker knows (or can closely guess) the password rules used to generate your password, he or she has a better chance of getting a hit.

Let's look at a password like "My first car was a 1972 Monte Carlo". The password is 35 chars: 3 upper case, 7 special (spaces), and 4 numbers. The key space is all upper and lowercase English letters, all numbers, and all special characters. That's a key space of 95 characters, over 35 places. Objectively, there are 1.66 x 10^69 possible combinations. Given that the LinkedIn password crackers are slowed down at about 9 chars, it seems like you're incredibly secure. But let's assume the attacker knows something about your password structure. Let's say they know that you use words (many people do, so it's a reasonable guess). Let's also assume the attacker knows that years are popular choices for the numbers in a password. Now instead of 35 chars, your password has 7 words and a date. We've changed the key space from 95 to about 100,000. (The exact number of words is tricky to pin down, but crackers have some good data on what the most popular ones are.) As for the date, there are really only a couple hundred interesting numbers, including all years from this and last century, as well as common patterns.

Password strength is (key depth) ^ (key length): the number of possible symbols raised to the number of positions. An uninformed attacker faces 1.66 x 10^69 possible combinations (95^35), while an informed attacker faces roughly 1.0 x 10^40 (100,000^8). Obviously, the less an attacker knows (or can guess) about your password structure, the better your password's chances of surviving a cracking attempt.

Now, you asked about your password versus a random 8 char password. Let's take a "strong" password like "1~qQ%57h". This password also has upper and lowercase letters, numbers, and symbols. For this exercise, we can assume that there is nothing predictable about it. The password strength is 95^8, or 6.6 x 10^15, obviously much lower than the longer sentence, even if the attacker knows the sentence is 7 words and a date.
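For anyone who wants to check the arithmetic, here is a minimal Python sketch of the three figures above. The function name `search_space` is made up for illustration, and the 100,000-word vocabulary is just the rough estimate from this comment, not a measured value.

```python
import math

def search_space(symbols: int, positions: int) -> int:
    """Number of candidates: possible symbols raised to the number of positions."""
    return symbols ** positions

# Inputs come straight from the comment; the vocabulary size is an estimate.
uninformed_sentence = search_space(95, 35)     # 35 printable-ASCII characters
informed_sentence = search_space(100_000, 8)   # 7 words plus a year = 8 tokens
random_eight = search_space(95, 8)             # 8 random printable characters

for label, size in [
    ("sentence, uninformed", uninformed_sentence),
    ("sentence, informed", informed_sentence),
    ("random 8 chars", random_eight),
]:
    # log2 of the search space is the equivalent strength in bits
    print(f"{label:<22} {size:.2e} candidates, about {math.log2(size):.0f} bits")
```

Even under the informed-attacker assumption, the sentence keeps a search space many orders of magnitude larger than the random 8 character password.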

Now remember, our passwords are being matched against human crackers attempting to guess the ways our passwords are most likely put together. For now, most passwords are 6-12 characters. In fact, many websites only allow passwords in that range, so it makes the most sense for crackers to go after these passwords. But it's still an arms race. If we assume that webmasters see the light and allow (or enforce) long, sentence-like passwords, the crackers will adjust. It's plausible, I think, that 5-10 years from now we'll see articles like this one that use sentence structure syntax as an attack method.

Until we discover and implement a better system that obsoletes passwords, the best we can really do is have long, complex, and unique passwords for everywhere we go, and have a system to manage them for us. I believe that something like LastPass or KeePass is the way to go for now.

*Disclaimer: This was written on a groggy Sunday morning. Do not rely on my calculations. Do not use any of the examples as passwords. Do please check my work.

Beautifully written. Also worth noting is that sites exist that only use lower(trunc(password, 8)), so your first 8 characters should be sufficiently random. For the grandparent, that leaves "my first", which is especially weak in a dictionary attack.

I don't get it. Is there a reason for some sites to actually do that? (considering that they don't store your password as plaintext)

I guess if someone stole their database it would be impossible to know your real password, but still...

Or am I missing something here?

Some sites lowercase all passwords after they are input to "help" users who hit caps lock or are otherwise challenged by case sensitivity. Then you have DES crypt (as once used by Gawker), which only uses the first 8 characters of the password. A site which uses either or both of these methods may happily let you type in a password of any length or complexity, but the version they use will have significantly lower entropy. I've even seen sites silently strip special characters.
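To make the effect concrete, here is a minimal Python sketch of what such a site effectively stores. The function `effective_password` and its parameters are hypothetical, chosen only to mimic the lowercase-and-truncate behaviour described in this thread; it is not any particular site's code.

```python
def effective_password(password: str, max_len: int = 8, fold_case: bool = True) -> str:
    """Return what a lossy site would actually hash (hypothetical normalization)."""
    truncated = password[:max_len]          # DES-crypt style: keep only the first 8 chars
    return truncated.lower() if fold_case else truncated

sentence = "My first car was a 1972 Monte Carlo"
other = "my first pet was a goldfish"

print(repr(effective_password(sentence)))
# Two very different sentences collapse to the same stored value:
print(effective_password(sentence) == effective_password(other))
```

Everything after the eighth character contributes nothing, so the long sentence is only as strong as its first two words.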

> I don't get it. Is there a reason for some sites to actually do that?

Yes it's to save space ...

No, wait.

It's so they don't use all the CPU power ...

No, not that either.

It's because the programmer didn't want to use their braincells.

Yeah, that would be it.

I've heard of sites truncating to only use the first 8-12 characters as well. So if you are going to use lots of words, put them after a highly complex first 8 characters.

One improvement: for most people, the risk is not that someone tries to crack your password, it is that someone uses rainbow tables to crack many passwords, one of which may be yours.

Rainbow tables have a degree of freedom: the function that maps hashes back to passwords. You should try to pick a password that this function will never generate. To get that, do something unique. Good options, I think, are including a foreign language word (neither English nor your native language, nor the site's language), reversing a word or a syllable inside it, and using made-up words that have a Hamming distance greater than two to any other 'obvious' word.

Short (<= 8 characters) passwords, I think, are bad choices for that reason, even if they consist of ASCII gibberish.

Disclaimer: I have never looked what kind of code commonly used rainbow tables use.

I thought GPUs killed rainbow tables? (the storage space alone makes them impractical compared to cracking realtime)

Not that that says much, but I am not aware of that. More importantly, googling for "GPU vs rainbow table" leads me to phrases such as "a fully GPU accelerated set of rainbow table tools". Or has the term changed meaning?


And a lot of systems (e.g. Linux) will reject it as a "dictionary word", even if the word doesn't appear in their dictionary. Such a password in an obscure foreign language isn't going to be cracked until someone starts doing a full-space search, which is still very difficult once you have more than 8 characters, yet it will still be rejected.

Can you explain more about this rainbow table function? From what I understand, rainbow tables are simply precomputed hashes of common passwords. What you're saying is that we should use passwords that aren't in a rainbow table, which by definition implies that the passwords are not common.

Rainbow tables are a clever way to implement a time/space trade-off for inverting a hash function in general: you do a lot of precalculation up front so that later lookups are cheap. See Wikipedia; the core idea is explained under "hash chains" on the Rainbow Table page.
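A toy Python sketch of the hash chain idea may help. Note the assumptions: this is a plain hash chain with a single reduction function (a real rainbow table uses a different reduction per chain position), the password space is a tiny 4 lowercase letter toy, and every name here (`reduce_hash`, `build_table`, `lookup`) is made up for illustration.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase
PW_LEN = 4  # toy space of 26^4 passwords so the demo runs instantly

def hash_pw(password: str) -> bytes:
    return hashlib.sha256(password.encode()).digest()

def reduce_hash(digest: bytes) -> str:
    """Map a hash to *some* password in the toy space (not an inverse)."""
    n = int.from_bytes(digest[:8], "big")
    chars = []
    for _ in range(PW_LEN):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)

def build_table(starts, chain_len):
    """Store only (chain end -> chain start); the middle is recomputed on demand."""
    table = {}
    for start in starts:
        p = start
        for _ in range(chain_len):
            p = reduce_hash(hash_pw(p))
        table[p] = start
    return table

def lookup(target: bytes, table, chain_len):
    """Walk forward from the target hash until a stored chain end is hit."""
    h = target
    for _ in range(chain_len):
        candidate = reduce_hash(h)
        if candidate in table:
            # Replay the chain from its start to find the matching password.
            p = table[candidate]
            for _ in range(chain_len):
                if hash_pw(p) == target:
                    return p
                p = reduce_hash(hash_pw(p))
            # False alarm (chains merged); keep walking.
        h = hash_pw(candidate)
    return None

# Demo: one chain of 50 links stored as a single table entry.
table = build_table(["pass"], chain_len=50)
secret = "pass"
for _ in range(20):                 # pick the password 20 links into the chain
    secret = reduce_hash(hash_pw(secret))
recovered = lookup(hash_pw(secret), table, chain_len=50)
print(recovered == secret)
```

The trade-off is visible in the demo: one stored pair covers 50 passwords, at the cost of up to 50 hash evaluations per lookup. It also shows why the reduction function matters: a password it never produces can never appear in any chain.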

Besides, rainbow tables are supposed to be pointless because everyone's supposed to be using salt with their passwords...
