encode_base64((u_int8_t *) encrypted + strlen(encrypted), ciphertext,
4 * BCRYPT_BLOCKS - 1);
Edit: I think I know what's going on with the key length, too. It just so happens that len(base64.decodestring('x'*72)) == 54 -- just below the key size limit. But I don't see where the key is getting base64 decoded in the code. Digging...
Edit #2: The b64d('x'*72) == 54 behavior I saw was coincidental. What happens is that subkeys can be up to 72 bytes long, but key bytes past the 56th will not affect all bits of ciphertext. This is valid and by design:
/* Schneier specifies a maximum key length of 56 bytes.
* This ensures that every key bit affects every cipher
* bit. However, the subkeys can hold up to 72 bytes.
* Warning: For normal blowfish encryption only 56 bytes
* of the key affect all cipherbits.
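To make the 72-byte figure concrete: Blowfish has 18 32-bit P-array subkeys (18 * 4 = 72 bytes), and the key schedule starts by XORing the key into them, cycling the key if it is shorter. Here is a rough sketch of only that first step, assuming the standard Blowfish description; `key_words` is a name made up for illustration, and the encryption rounds that finish the key schedule are left out.

```python
P_ENTRIES = 18  # Blowfish P-array: 18 subkeys of 32 bits each, 72 bytes total


def key_words(key: bytes) -> list[int]:
    """Return the 18 32-bit words XORed into the P-array, cycling the key.

    Hypothetical helper for illustration; not taken from the bcrypt source.
    """
    if not key:
        raise ValueError("key must not be empty")
    words = []
    pos = 0
    for _ in range(P_ENTRIES):
        word = 0
        for _ in range(4):
            # Big-endian packing; wrap around to the start of a short key.
            word = (word << 8) | key[pos % len(key)]
            pos += 1
        words.append(word)
    return words


# Only the first 72 key bytes are ever read, so byte 73 changes nothing.
assert key_words(b"x" * 72) == key_words(b"x" * 72 + b"y")
# Byte 72 (the last one that fits) still matters.
assert key_words(b"a" * 71 + b"b") != key_words(b"a" * 72)
```

So anything past 72 bytes is ignored at this stage, which matches the "subkeys can hold up to 72 bytes" wording in the comment.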
As to why, I've never encountered a software implementation that precisely matched its design paper. Usually has a lot to do with publication deadlines. Papers don't change, software does.
Yeah, I thought about that. But I tested it and that wasn't it:
>>> hashpw(chr(ord('x')+128), s)
> subkeys can be up to 72 bytes long
That makes a lot more sense.
The author of this assinine blog post only contacted me a few days ago and obviously couldn't be bothered to wait for a response before proceeding to imply malicious intent for what is clearly a trivial difference between academic paper and practical implementation. I hope he retracts it.
Assuming that this was not intentional -- and considering that the paper clearly supports this -- then it is not totally harmless. The additional likelihood of collision is increased by ~4.16% (8 / 192) which could be a Big Deal (TM) given the right circumstances, although it's still infinitesimally small.
But what worries me more is that you say "this assinine [sic] blog post". The author of the post found what could be an issue, even if it's not in your code, and you write it off completely. That is simply not an appropriate way to handle issues with cryptographic software.
The guy found a design bug/implementation discrepancy in a bit of software with high security requirements.
He didn't know how to evaluate the security impact or what to do about it. But he knew enough to know he should take it seriously. This is commonly referred to as "being paranoid" and is a natural reaction given the situation. Commendable even.
How many people before him spotted this and didn't speak up because they (rightfully) figured they'd be ridiculed by the OpenBSD hyenas?
And, I disagree with your security analysis. Truncating an otherwise secure 192 bit hash to 185 bits is not dangerous.
I agree, especially given the relative sizes involved. But until recently there was debate about the safety of truncating hash values in general. Noting that NIST didn't specify any form of truncated SHA-1, people would say there was no guarantee that the "entropy was evenly distributed".
You are talking about percentages of quantities denominated in atoms-in-the-planet-core.
This is a common fallacy. There is no relationship between the cardinality of the possible hash values and the number of atoms in some planetary body.
In many (perhaps most) contexts, the critical security factor is the square root of that number. Shall we then look for another smaller planetary body with which to consider that number?
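For anyone who wants the square-root numbers spelled out, here is a quick back-of-the-envelope sketch. It assumes the truncation drops 8 bits (192 down to 184), which is the figure used elsewhere in this thread; `birthday_work` is just an illustrative name.

```python
import math

FULL_BITS = 192       # untruncated output size discussed above
TRUNCATED_BITS = 184  # assumption: 8 bits dropped by the truncation


def birthday_work(output_bits: int) -> int:
    """Rough number of hashes needed before a collision becomes likely."""
    return math.isqrt(2 ** output_bits)


full = birthday_work(FULL_BITS)
truncated = birthday_work(TRUNCATED_BITS)

assert full == 2 ** 96
assert truncated == 2 ** 92
# Dropping 8 output bits cuts the generic collision work by a factor of 2**4.
assert full // truncated == 16
```

Whether a generic collision search is the relevant attack for a password hash is a separate question.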
You can get advice about truncating SHA512 from Ferguson and Schneier; that book is 5 years old. Since we appear to agree about the important bit, I'm disinclined to argue about how many atoms are in the core or whatnot. I fully own up to not actually having counted them. :|
Seriously? One doesn't have to hang around those lists long to see how they treat those they disagree with.
> You can get advice about truncating SHA512 from Ferguson and Schneier; that book is 5 years old.
Sure, but that wasn't written in the late 1990s when bcrypt was presented and we certainly can't expect the blogger to have read up on the modern advice. Bcrypt doesn't even use SHA-512, it uses the block cipher Blowfish in a suboptimal custom mode to construct its own 192-bit narrow-pipe hash function. When Microsoft rolled their own password hash functions with DES just a few years earlier (LanMan, NTLM) it was thoroughly pwned by Schneier, Mudge, Wagner, L0pht, etc. But you know this!
As for the last part of your comment, we appear to be searching for reasons to disagree with each other, when in fact we agree that this finding has no impact on the security of bcrypt. Let's not find artificial reasons to disagree.
I was wrong about bcrypt being a 192-bit narrow-pipe hash. While some parts of it are, reading the paper again the real work happens in the key expansion function having an internal state size of 33344 bits.
No, commendable would have been waiting a reasonable amount of time for the author to respond, and not including suggestions that the bug (that they didn't analyse the impact of) was an intentional backdoor.
> ... ridiculed by the OpenBSD hyenas?
Cheap, ad hominem, etc.
> Noting that NIST didn't specify any form of truncated SHA-1
Not SHA-1, but SHA-224 and SHA-384 are truncated hashes specified by NIST.
> In many (perhaps most) contexts, the critical security factor is the square root of that number
Not in this case, unless you have corpora of ~2^80 bcrypt hashes and want to find a collision with one of them.
All that said! Bcrypt is still secure by all means and better off than damn near anything else out there.
Edit: note, such theoretical flaws in bcrypt have not even been proposed. Didn't want to make this seem worse than it is.
But instead of spending time talking about the interesting finding, this blogger clubbed their own reputation to death on it. I'd like to presume they were just spectacularly bored, but to steal some of the author's own words, that's "vastly too big a discrepancy to be explainable by a simple inadvertent bug".
I'll leave the rest of the analysis as to the effect of the problem to your much-more-knowledgeable hands.
No it didn't. And I quote:
"Except that the problem is not in the python wrapper."
It explicitly says very much the opposite, pointing out that the problem isn't the wrapper. I can't understand how you could read the blog post and come away with such an impression.
OK, so maybe someone accidentally introduced an off-by-one error into the python wrapper. Except that the problem is not in the python wrapper. You can find bcrypt test vectors on the web, and they are all 60-byte strings.
That is a very big discrepancy between the actual behavior of the code and the description given in the literature. It's vastly too big a discrepancy to be explainable by a simple inadvertent bug.
Now, some people might say I'm being excessively paranoid, but I don't think so. The higher the stakes in the internet security game get, the more incentive there is for attackers to try all kinds of sneaky and nefarious tricks to introduce weaknesses into people's defenses, and one of the easiest ways to do that is to publish some plausible-looking open-source security code that actually has a hidden weakness built in to it [!!!] and hope that nobody notices.
Give me a break. And lest the tl;dr's think I'm cherry picking: this stuff is the bulk of the post.
Ed: fixed ems.
Let me highlight some things from your quotes.
1. "Except that the problem is not in the python wrapper"
- This directly says that the Python author is not being malicious---how can he be since he isn't at fault for the error?
The essayist continues to explain why the Python author is not at fault.
2. "You can find bcrypt test vectors on the web, and they are all 60-byte strings."
- This bug is wide-spread.
3. "That is a very big discrepancy between the actual behavior of the code and the description given in the literature. It's vastly too big a discrepancy to be explainable by a simple inadvertent bug."
- The bug is too large to be an inadvertent bug introduced independently by every author.
4. "Now, some people might say I'm being excessively paranoid, but I don't think so. <snip>"
Here the author is only defending his rationale for looking for bugs in an open source project. He isn't implying that he has now found one that shows the authors are malicious.
Edit: thx :).
That's a pretty weird design decision to use a 128 bit salt and then chop bits off the actual hash value to make the result a "more manageable" length.
The 128 bit salt wastes 4 bits in the base64 encoding. The 31 character base64 discards 8 of the 192 bit output bits (31/4*24 = 186).
If they'd just used "only" a 126 bit salt they could base64 encode it in 21 chars with no wasted space. That would allow them to store the full 192 bits in 32 chars with no wasted space.
So they threw away 8 hash output bits in order to save 2 salt bits.
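The arithmetic above is easy to check. A small sketch, assuming unpadded base64 at 6 bits per character and assuming `BCRYPT_BLOCKS` is 6 in the C source (so `4 * BCRYPT_BLOCKS - 1` is 23 bytes); the helper names are made up for illustration.

```python
import math


def b64_chars(bits: int) -> int:
    """Characters needed to hold `bits` in unpadded base64 (6 bits per char)."""
    return math.ceil(bits / 6)


def wasted_bits(bits: int) -> int:
    """Unused bits left over in the last base64 character."""
    return b64_chars(bits) * 6 - bits


BCRYPT_BLOCKS = 6  # assumption about the C source, not shown in this thread
stored_hash_bits = (4 * BCRYPT_BLOCKS - 1) * 8  # 23 bytes are encoded

assert (b64_chars(128), wasted_bits(128)) == (22, 4)  # 128-bit salt
assert (b64_chars(126), wasted_bits(126)) == (21, 0)  # hypothetical 126-bit salt
assert (b64_chars(192), wasted_bits(192)) == (32, 0)  # full 192-bit hash
assert stored_hash_bits == 184                        # 8 of 192 bits dropped
assert b64_chars(stored_hash_bits) == 31              # the 31 chars actually stored
```

With a 7-character `$2a$12$` prefix, 22 + 31 + 7 gives the 60-character strings seen in the test vectors.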
Now for the request (sigh, you knew it was coming): at my day job, I have to develop on Windows and it's not feasible to install Visual Studio 2008 Express on my web server. Is there any other practical way to get py-bcrypt running in that environment?
If you can create a build environment on your workstation, IIRC there is a bdist_win (or similar) setup.py target that will build a .msi package that you can load on your webserver. That's about the limit of my Windows Python knowledge right there :|
I also don't know if that would pose a significant security threat - sure, you would be taking one character off of the number of characters that need to be brute forced, but it is only one. I'm not informed enough to give an accurate opinion.
I do know, though, that jumping to conclusions before a thorough explanation is provided is silly... Hence why I'm not suddenly jumping to the use of cryptacular or others.
Reducing a key by 8 bits reduces the key space by 255/256, regardless of key size.
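A quick way to see that the fraction does not depend on the key size, using exact rational arithmetic (the sizes in the loop are arbitrary examples):

```python
from fractions import Fraction


def fraction_removed(key_bits: int, dropped_bits: int = 8) -> Fraction:
    """Share of the original key space that disappears after truncation."""
    remaining = Fraction(2 ** (key_bits - dropped_bits), 2 ** key_bits)
    return 1 - remaining


# Same answer for every key size: 255/256 of the space is gone.
for bits in (64, 128, 192):
    assert fraction_removed(bits) == Fraction(255, 256)
```

Put the other way round, the remaining space is 1/256 of the original, whatever the starting size.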
If Openwall and py-bcrypt are using JtR code for actually validating them, that's a questionable bit of software engineering. JtR may not be doing the same type of input validation that one would want in your authentication code. More evidence for this suspicion is the input length disparity the blogger Rondam describes.
I usually include a verbose, independent scheme string next to encrypted db columns, so that a) the data is self-documenting for future owners, with enough detail to work around forward breakage/compatibility, and b) multiple methods can live in the database during upgrades.
Have an extra column describing the hashing scheme used for the password?
Isn't that what the $2a$12$ is for?
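For reference, the prefix can be pulled apart with a few lines of Python. This is a sketch assuming the usual 60-character layout (`$`, scheme, `$`, cost, `$`, then 22 salt characters and 31 hash characters); `split_bcrypt` and `BcryptFields` are hypothetical names, and the sample string is a placeholder, not a real hash.

```python
from typing import NamedTuple


class BcryptFields(NamedTuple):
    scheme: str  # e.g. "2a"
    cost: int    # log2 of the number of key-expansion rounds
    salt: str    # 22 base64 characters
    digest: str  # 31 base64 characters


def split_bcrypt(hashed: str) -> BcryptFields:
    """Split a bcrypt string into its self-describing fields (illustrative only)."""
    empty, scheme, cost, rest = hashed.split("$", 3)
    if empty != "" or len(rest) != 22 + 31:
        raise ValueError("not a 60-character bcrypt string")
    return BcryptFields(scheme, int(cost), rest[:22], rest[22:])


# Placeholder value with the right shape; NOT an actual bcrypt output.
sample = "$2a$12$" + "s" * 22 + "h" * 31
fields = split_bcrypt(sample)
assert len(sample) == 60
assert (fields.scheme, fields.cost) == ("2a", 12)
```

That covers the algorithm version and cost; a separate column would only add something if you also store hashes that do not carry such a prefix.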