Note the title is wrong but the linked response from Facebook describes what they actually do quite succinctly. So if you're coming straight to the comments, do take the 15 seconds to click through and read it!
You don't have to wonder if it "seems like going too far" because you can calculate almost exactly the reduction in the search space / entropy.
But don't go on the theoretical key space reduction alone. You have to look at the kinds of passwords that people actually choose and ask whether an online attack, which gets on the order of 10 tries against a password before the account is locked, would actually be even marginally more effective because each submission is technically trying 4 passwords instead of 1.
The reality is the 4 additional tries that Facebook is giving you for free are not passwords that would be in the attacker dictionary in an attack which is allowed only 10-100 tries.
Whereas probably around 1% of the time, in other words for billions of logins per year, the 4 tries result in a valid login for a user who knows the right password but simply failed to type it correctly.
Another key point to consider is if you give users these 4 extra tries for free on each submit, you cut down drastically (power law on the types of errors that users make) on the number of failed login attempts overall. When this happens you can now reduce the account locking threshold without catching a larger percentage of your real users in the "your account has been locked" trap. This results in a significant net improvement for security against online attack.
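To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes each submission tests 4 candidate passwords in total, and the lockout thresholds of 10 and 5 are illustrative values only (10 is taken from the "order of 10 tries" figure above). The function names are made up for this example.

```python
import math

def max_bits_lost(candidates_per_submission: int) -> float:
    """Upper bound on the entropy lost when every submission tests this
    many candidate passwords instead of one."""
    return math.log2(candidates_per_submission)

def passwords_tested_before_lockout(lockout_threshold: int,
                                    candidates_per_submission: int) -> int:
    """Worst-case number of distinct passwords an online attacker gets to
    test before the account is locked."""
    return lockout_threshold * candidates_per_submission

print(max_bits_lost(4))                          # 2.0 bits at most
print(passwords_tested_before_lockout(10, 4))    # 40
print(passwords_tested_before_lockout(5, 4))     # 20
```

So the theoretical cost is at most 2 bits, and halving the lock threshold (if the drop in failed logins allows it) would give back half of the attacker's extra candidates. Whether those extra candidates are ever passwords an attacker would want to try is the separate, empirical question.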
If an attacker knew it consisted of only [A, a] at random it would take 2^50 guesses. But now it only takes a single guess, just all A's. It's a toy example, but Facebook definitely isn't just giving four extra guesses for free.
I'm sorry, this is incorrect. Facebook does not store additional hashes. If the provided string does not match, they also try the following 3 variations:
1) String with case inverted -- effectively undo capslock
2) String with first character lowercased, if it was uppercase
3) String with last character removed
So your example, a single guess of all 'A's would not match. You can try it yourself if you like!
EDIT: So it's actually 3 extra tries, not 4 extra tries.
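For anyone who wants to see the three variations spelled out, here is a minimal Python sketch of the list above. This is only my reading of the description, not Facebook's code, and `login_variants` is a hypothetical name.

```python
def login_variants(submitted: str) -> list[str]:
    """Return the extra candidates tried when the submitted string itself
    does not match (a sketch of the three rules listed above)."""
    variants = []
    # 1) Case inverted: undoes an accidental caps lock.
    variants.append(submitted.swapcase())
    # 2) First character lowercased, only if it was uppercase.
    if submitted[:1].isupper():
        variants.append(submitted[0].lower() + submitted[1:])
    # 3) Last character removed.
    if submitted:
        variants.append(submitted[:-1])
    return variants

print(login_variants("pASSWORD1"))   # ['Password1', 'pASSWORD']

# The toy example: one guess of fifty 'A's yields only three neighbours,
# so it cannot match an arbitrary mix of 'A' and 'a'.
print(len(login_variants("A" * 50)))   # 3
```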
Seems very reasonable to me. Similarly, I just had a friend comment that they like that Facebook tells them when they try to log in with an old password. But technically that makes it less secure for people who reuse passwords (ie everyone, to a first order approximation).
It's not case-insensitive. Try it. I knew they had the case-flipped version but I didn't know about the others. Why would a user's password have an extra character on the end anyway?
>Why would a user's password have an extra character on the end anyway?
That seems the strangest of them all: "If the login fails, try stripping a character off the end and see if that matches". I can understand the "First character inadvertently capitalized one" somewhat, but the CAPS LOCK thing seems unforgivable. I just did that yesterday with Apple and it simply suggested to me caps lock might be on. If your test for that is (password didn't match AND all alpha characters in the submission are capital letters) you don't even need to store a second hash for it.
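A minimal Python sketch of that test, for concreteness. The function names and the hint wording are hypothetical, and this is not a claim about how Apple implements it.

```python
from typing import Optional

def looks_like_caps_lock(submitted: str) -> bool:
    """True if the submission contains letters and all of them are uppercase."""
    letters = [c for c in submitted if c.isalpha()]
    return bool(letters) and all(c.isupper() for c in letters)

def login_hint(password_matched: bool, submitted: str) -> Optional[str]:
    """Suggest caps lock only when the login failed and the input is all caps."""
    if not password_matched and looks_like_caps_lock(submitted):
        return "Caps lock might be on."
    return None

print(login_hint(False, "HUNTER2"))   # Caps lock might be on.
print(login_hint(False, "Hunter2"))   # None
```

Note this only uses the submitted string and the pass/fail result, so nothing extra has to be stored or hashed.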
I know there will always be trade-offs between security and UX but, if this image is accurate, that seems like going too far in favor of usability. No matter how small, multiplying the numerator in the probability of matching a password is a dangerous decision.
One additional character may be for "I fatfingered a key as I pressed Enter".
You don't need to store a second hash for any of these; just transform the sent password (invert the case of all characters, lowercase the first character, delete the last character), rehash it and compare it to the stored hash of the correct password.
Whilst it does increase the numerator, it also allows Facebook to decrease the number of obviously wrong password attempts permitted before taking additional precautions. Since these are overwhelmingly likely to be user error, I think it's a perfectly reasonable tradeoff.
How can they get away with not storing at least a second hash? I don't see how an all-caps password can be matched to an any cased password, unless you only store the hash to the all-caps password.
Note that this behavior is OS specific. Windows will invert case (shift gives lowercase), while macOS will be all-caps (shift doesn't do anything).
Frankly I can't remember the Windows approach ever being useful to me, and more often than not it bites me in the ass because I habitually hit something like shift-i to type "I" and end up with a lowercase i instead.
Drafters (the guys who draw engineering drawings) love it because you very rarely but just enough have to type lowercase letters (such as units, like 3mm) whilst all other text must be in capitals. Thus, you just use shift to shift it to lowercase for those rare moments and no other times.
Thanks! Case inverted makes sense. They can invert the case before hashing and submitting. If any of the four hashes match the stored hash, they get in.
I think what was meant was that they don’t store a second hash in order to allow chopping off a character at the end. I agree with you that they would need to store extra hashes for allowing all-caps and some of the other cases.
No extra hashes needed for an all caps version or inverted case version of a password, just perform those transformations on the user input, hash, and compare the result to the single stored hash.
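Here is a rough Python sketch of that idea: one stored salted hash, several transformations applied to the input at login time. The choice of PBKDF2, the iteration count, and the names `hash_password`, `TRANSFORMS` and `verify` are all assumptions for illustration; nothing here is known about Facebook's actual scheme.

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256 stands in for whatever slow hash the site really uses.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

# Transformations applied to the *submitted* string, tried in order.
TRANSFORMS = [
    lambda s: s,                       # exactly as typed
    str.swapcase,                      # undo caps lock
    lambda s: s[:1].lower() + s[1:],   # undo auto-capitalised first letter
    lambda s: s[:-1],                  # drop a stray trailing character
]

def verify(submitted: str, salt: bytes, stored_hash: bytes) -> bool:
    """Hash each transformed candidate and compare it to the single stored hash."""
    for transform in TRANSFORMS:
        candidate_hash = hash_password(transform(submitted), salt)
        if hmac.compare_digest(candidate_hash, stored_hash):
            return True
    return False

salt = os.urandom(16)
stored = hash_password("Password1", salt)   # the only thing kept in the db
print(verify("pASSWORD1", salt, stored))    # True  (case inverted)
print(verify("Password1]", salt, stored))   # True  (extra trailing character)
print(verify("PASSWORD1", salt, stored))    # False (all caps is not a variant)
```

Adding or dropping a variant is just a change to the list of transformations; the stored hash never has to be touched.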
> That seems the strangest of them all: "If the login fails, try stripping a character off the end and see if that matches".
Not really.
(The other ones aren't strange either, but I admit I have never heard of this practice.)
If you have the password stored in Notepad and there's a space behind it. Or you already have a space behind it, and then entered it. Or it is a DOS document you copy/paste from. Quite common. But I suppose the other 2 are more common.
One wonders how they discovered these common mistakes. The only way to do a systematic study would be to keep some un-hashed passwords around and compare.
No, you could think up 20 different plausible "common mistakes", add fixes for those and roll out to some subset of users, and then just count the improvement rate of each fix to get the best four.
Actually, you don't have to keep unhashed passwords around. You can just return the wrong password in a hidden field. When the form is posted, if the hidden field is set, you can do some analysis. This might involve storing some 'intermediate state' of the Levenshtein algorithm, for example, to find the shortest edit distance. You would only store that information anonymously, without storing the passwords. (I'm not saying they do this, that's just off the top of my head.)
Edit: Or just keep the (wrong, correct) pairs for users where correct is in the top 20 most common passwords.
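For reference, the edit distance mentioned above is easy to compute once you have a (wrong, correct) pair in hand. A plain Python sketch of the standard dynamic-programming version follows; how such pairs would actually be collected and anonymised is left open, as in the comment.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))   # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]                    # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]

print(levenshtein("Password1]", "Password1"))   # 1: one stray trailing character
print(levenshtein("pASSWORD1", "Password1"))    # 8: caps lock flips every letter
```

The second case shows why plain edit distance alone would not surface the caps lock mistake: it is one keystroke for the user but eight substitutions for the algorithm.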
I noticed this issue on Wells Fargo as well. I did some research and learned it is due to being backwards compatible with phone banking. No shift key on the telephone.
My outside impression is that security is taken seriously enough at Facebook to hire highly competent people to run it. They pay out bug bounties, and run internal red team exercises to test their own ability to detect and react to a compromise. https://threatpost.com/how-facebook-prepared-be-hacked-03081...
I think we can extend them enough benefit of doubt to rule out storing user passwords in plaintext.
During my new grad interview there I was made privy to some very questionable security decisions that they mentioned linked to PCI-DSS, but nothing on this level.
Edit: PCI compliance rules out plaintext storage so it seems unlikely.
It's worth noting that plaintext storage might not be the case if they store the fourth and only the fourth character separately at encryption time.
The only people who can positively rule out plaintext storage are Facebook engineers, who are generally competent, especially with something as simple as storing password hashes.
The password is not case-insensitive. It looks to me like they store 4 versions of the password (one for each variant) and check all 4 in parallel to see if there is a match. That is not bad. It would be worse if they kept the password in the clear and checked whether the input matched one of the 4 variants calculated on the fly, because that would mean the passwords are stored in clear text somewhere :(
Why store 4 versions? They just store the hash of the correct one and try hashing up to 4 versions of whatever you enter. As soon as one of those corresponds to the hash, you're in.
Facebook has been doing this for years and I assumed that this is exactly what they do. It saves on db space. Three extra hashes for each user adds up with that many users. And if they decide they need a new variant, they just need to release a new auth module, not recalculate hashes the next time the user logs in. Or if they deem that one of these variants is no longer worthwhile (for security or UX reasons), again, they are not modifying the db.