Hacker News

> 10% more security (6.55 vs. 5.95 bits per character)

That's not how this works. By your logic having a password consisting of 1,2,3,4 is only twice as secure as having just 1,2.




That's absolutely how bits of entropy work.


However symbol frequency is also significant for entropy.

Do you think 1 in 25 four-character passwords contains a backtick?

If you were brute forcing an ASCII password (no whitespace), would you naively cycle from ! up to ~ for each character?
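A naive cycler of that kind is easy to write down. A minimal sketch in Python, assuming the 94 printable non-whitespace ASCII symbols ('!' through '~'); the function name is mine:

```python
import itertools

# Printable ASCII with no whitespace: '!' (0x21) through '~' (0x7e), 94 symbols.
ALPHABET = [chr(c) for c in range(ord('!'), ord('~') + 1)]

def naive_bruteforce(target: str, max_len: int = 4):
    """Cycle every candidate from '!' upward, shortest length first."""
    for length in range(1, max_len + 1):
        for candidate in itertools.product(ALPHABET, repeat=length):
            guess = ''.join(candidate)
            if guess == target:
                return guess
    return None

print(naive_bruteforce("a1"))  # found only after exhausting all 1-char guesses
```

A smarter attacker would of course order guesses by likelihood instead of cycling the alphabet, which is exactly why symbol frequency matters for human-chosen passwords.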


The context is randomly generated passwords, so dictionary attacks (or other attacks that look at the plaintext from a Huffman encoding perspective) aren't really relevant.


That's most definitely not how security works. The strength of your password is not proportional to the number of bits of entropy it has.


The way you're phrasing this may be misleading.

The strength of a password or passphrase grows as 2 raised to its number of bits of entropy.

That's an exponential proportion, rather than a linear one. But a proportion all the same.

Example:

Given mixed-case alphanumeric (62 characters) and an 8-character password length, the number of combinations is:

    62^8 = 218,340,105,584,896 (keyspace -- 218 trillion)
    l(62^8)/l(2) = 47.6 (bits of entropy)
A 10-character password (if randomly chosen from the same character set) has 62^10 ≈ 8.4 × 10^17 possible combinations (about 3,800x more), and 59.5 bits of entropy, 11.9 bits more. 2^11.9 ≈ 3,800.
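The arithmetic above can be checked in a few lines of Python (a minimal sketch; the helper name is my own):

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Bits of entropy for a password drawn uniformly at random."""
    return length * math.log2(alphabet_size)

keyspace_8 = 62 ** 8            # mixed-case alphanumeric, 8 characters
bits_8 = entropy_bits(62, 8)    # ~47.6 bits
bits_10 = entropy_bits(62, 10)  # ~59.5 bits

print(f"{keyspace_8:,} combinations, {bits_8:.1f} bits")
print(f"10 chars: {bits_10:.1f} bits, {2 ** (bits_10 - bits_8):,.0f}x harder")
```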


In the context of randomly generated passwords, it's absolutely ok to think about it in terms of the logarithmic relationship between 1) entropy per symbol times number of symbols and 2) strength of the password.

He said 10% stronger (which I took to mean 10% more entropy), not 10% more time to crack.


> He said 10% stronger (which I took to mean 10% more entropy), not 10% more time to crack.

Hence the problem?

Yes, measuring "strength" by "bits of entropy" is technically correct (the best kind of correct...).

It's also exponentially misleading... possibly the worst kind of misleading?

Just look at the question: "Is there even a reason to include special characters in passwords? They add 10% more to security...". I don't know about you, but to me that doesn't really portray an understanding of the fact that such a password takes roughly twenty-eight times longer to crack (2^(8 × 0.6) ≈ 28) at a length of merely 8 characters, not merely 10% longer.
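Worked out, with the per-character figures from the quoted comment:

```python
# Crack-time factor from the extra entropy: 6.55 vs. 5.95 bits per character,
# over an 8-character randomly generated password.
extra_bits = (6.55 - 5.95) * 8  # 4.8 extra bits in total
factor = 2 ** extra_bits
print(f"{factor:.0f}x longer to crack, not 10%")  # prints "28x longer to crack, not 10%"
```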


I mean, counting in bits of entropy, with the understanding that the scale is logarithmic, is the standard way of discussing such matters. It's essentially the basis of the information theory that underlies this type of work.

Edit: And the point of his argument is that more symbols drawn from a smaller alphabet can be equivalent if the total entropy is equivalent.


According to KeePass2, the password "12" contains 7 bits of entropy, but "1234" contains only 5 bits of entropy.

Is that right?


I wouldn't trust it. If you use the "Hex key - 128-bit" preset, it returns a different number of bits every time you click it. Here are 3 samples:

    3f38ba8a6ce3aa800f007c2e431df7fd  124 bits
    9339bf587ee11b12d207df846a879cf4  129 bits
    8ca4354a9038df590fecec1f964062fd  121 bits


Due to missing or repeated characters from the set of the hex alphabet?
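That would be consistent with an estimator based on the character frequencies within the string itself. A sketch of such an estimator (a guess at the mechanism, not KeePass's actual algorithm):

```python
import math
from collections import Counter

def intra_string_entropy(s: str) -> float:
    """Shannon entropy of the string's own character frequencies, times length."""
    n = len(s)
    counts = Counter(s)
    per_char = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return per_char * n

for key in ("3f38ba8a6ce3aa800f007c2e431df7fd",
            "9339bf587ee11b12d207df846a879cf4",
            "8ca4354a9038df590fecec1f964062fd"):
    print(f"{key}  {intra_string_entropy(key):.0f} bits")
```

Note, though, that this kind of estimate can never exceed 4 bits per character (128 bits total) for a 32-digit hex string, yet one of the samples above was reported as 129 bits, so KeePass is evidently doing something more involved.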


which doesn't make sense.

I randomly generated an 8-character alphabetical (all lower case) password, "jraxxhwr". According to KeePass it has 32 bits of entropy, but the entropy should be log2(26^8) ≈ 37.6 bits, because the search space is all 8-character lowercase strings. There's no way you can reduce the search space from 37.6 bits to 32 bits unless you have an oracle that says which characters I used.


It does make sense, because the keepass entropy estimate presumably (like the excellent zxcvbn) tries to approximate the empirical distribution, not the theoretical uniform one.

In theory, "68703649" and "12345678" are equally likely to be pulled from the hat, but in practice one is a much better password than the other. You can reduce the search space by trying the passwords with higher (empirical) probability first.
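The effect can be sketched by ordering guesses by empirical frequency. The tiny frequency table here is hypothetical, standing in for real leaked-password statistics:

```python
# Hypothetical empirical counts, standing in for real leaked-password data.
empirical_counts = {
    "12345678": 2_900_000,
    "11111111": 200_000,
    "68703649": 1,        # looks random; essentially never seen in leaks
}

# A cracker tries the most probable plaintexts first.
guess_order = sorted(empirical_counts, key=empirical_counts.get, reverse=True)
print(guess_order)  # ['12345678', '11111111', '68703649']

# Under the uniform model, either 8-digit string costs up to 10^8 guesses;
# under the empirical model, "12345678" falls on guess #1.
print(guess_order.index("12345678") + 1)  # 1
print(guess_order.index("68703649") + 1)  # 3
```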


> the keepass entropy estimate presumably […] tries

KeePass sources are available [0], you can see the specific algorithms it uses in [1].

[0]: https://sourceforge.net/projects/keepass/files/KeePass%202.x...

[1]: https://fossies.org/windows/misc/KeePass-2.42.1-Source.zip/K...


Thanks. I've looked at the code, and it does not seem to try to estimate the empirical distribution (it doesn't appear to use dictionaries, for example).

Then maybe the discrepancy comes from the number of distinct glyphs within certain character categories, or from repeated characters?



