Authentication Cheat Sheet (owasp.org)
307 points by colund on Feb 2, 2015 | 150 comments



"Applications should enforce password complexity rules to discourage easy to guess passwords." - ARGH!

To clarify, to avoid downvotes for a non-'productive' comment, I firmly disagree since this will probably result in me having to pick a password that's harder to remember than I otherwise would. It might also make it more awkward to type quickly, making shoulder-surfing easier.

(Note that this is probably not i18n-friendly, either)


At my company this typically results in users printing or writing the password down.

I wonder if the guys who create these heuristics/recommendations ever had contact with humans. I believe that their research is thorough in their area of expertise (info sec), however, it sounds like they are only considering data per se, ignoring human behavior variables. There's little value in enforcing hard-to-break passwords while also encouraging users to write them down.

What I believe: Info sec researchers should team up with HCI people.


There's not much wrong with writing passwords down. Printing them is less desirable.

It's a hundred times better to have a difficult password on a post-it on a monitor than it is to have an easily guessable password. Who do you suspect is going to hack you? Ask that question honestly and you'll know how best to thwart them.


I understand your point, and I agree that the biggest threats aren't physically nearby. In that scenario ("remote" attacks), of course, there are bigger problems than written/printed passwords.

However, at the enterprise level, physically visible passwords are a big problem. Imagine a less-than-happy worker, about to leave the company, having the opportunity to get coworkers' passwords. In such a scenario, less strict rules (let's say, rules that didn't make people write the passwords down) would have been beneficial.

And there's another point: the "perception" of IT security rules. If they ask too much of people (think "non-IT people"), they might create an image of overzealousness/"overcomplication". I wonder if this doesn't make people less compliant with security rules in the long term.


> Who do you suspect is going to hack you?

According to my experience:

1. People I know in real life

2. People who execute phishing attacks

Your strategy is harmful in the first case, though irrelevant in the second


The thing wrong with writing passwords down is that writing passwords down makes 2FA into 1FA. A password, when stored outside of someone's head, is a token, not a password.

If you really do 2FA, though, you should actually relax your password requirements. The most important attribute of a password used in a 2FA scheme is memorability, to make sure the user doesn't write it down (and thereby remove a factor.) Even a dictionary word works, as long as it's not one that's written down on e.g. the user's employee profile, like their mother's maiden name. Generating one or two dictionary words would be fine.

Keep in mind, the majority of 2FA security is in the token. As long as you verify the token first, the only power the password needs is to distinguish the device owner from someone who stole the device, or has snuck onto it. It doesn't need to protect against automated attackers; that's what the token (plus rate-limiting) is for.


> It's a hundred times better to have a difficult password on a post-it on a monitor than it is to have an easily guessable password. Who do you suspect is going to hack you?

It depends on the threats you face. Generally, most attacks are from insiders.


I'm trying to do that actually. I organize the Boston Security Meetup where we have 150 attendees who come to Google Cambridge to listen to cybersecurity talks. I also organize UX Boston forum for user experience designers. I hope to get more security people interested in UX Design to understand the human aspects of keeping people safe.

https://www.meetup.com/boston-security-meetup

https://www.meetup.com/uxboston


> At my company this typically results in users printing or writing the password down. I wonder if the guys who create these heuristics/recommendations ever had contact with humans.

While I absolutely understand your sentiment, I think you might be conflating extreme password requirements with reasonable password requirements.

The article linked suggests that a strong password is 10 characters (that's the whopper), and three of four complexity requirements (capital, special, number, lower). That's not unreasonable. In fact, the only really difficult part of that is the 10 characters bit.

Switch that to 8 characters and you're golden.

Even better, have a five minute lockout and/or email unlock functionality after, say, ten failed attempts -- and you're doing great.
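
For illustration, the lockout described above could be sketched roughly as follows. This is a minimal in-memory sketch, not a production design: the class name `LoginThrottle`, the injectable `clock`, and the choice to lock per account (rather than per IP) are all assumptions made for the example; only the ten-attempt and five-minute figures come from the comment.

```python
import time
from dataclasses import dataclass, field
from typing import Callable

MAX_FAILURES = 10          # failed attempts allowed before locking (from the comment)
LOCKOUT_SECONDS = 5 * 60   # five minute lockout (from the comment)


@dataclass
class LoginThrottle:
    """Hypothetical helper: counts failed logins and locks an account temporarily."""
    clock: Callable[[], float] = time.monotonic
    failures: dict = field(default_factory=dict)      # account -> failure count
    locked_until: dict = field(default_factory=dict)  # account -> unlock time

    def is_locked(self, account: str) -> bool:
        until = self.locked_until.get(account)
        if until is None:
            return False
        if self.clock() >= until:
            # Lockout expired: clear state so the user starts fresh.
            del self.locked_until[account]
            self.failures.pop(account, None)
            return False
        return True

    def record_failure(self, account: str) -> None:
        count = self.failures.get(account, 0) + 1
        self.failures[account] = count
        if count >= MAX_FAILURES:
            self.locked_until[account] = self.clock() + LOCKOUT_SECONDS

    def record_success(self, account: str) -> None:
        # A successful login resets the counter.
        self.failures.pop(account, None)
        self.locked_until.pop(account, None)
```

A login handler would call `is_locked` before checking the password at all, then `record_failure` or `record_success` depending on the outcome.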

I deal with web application security assessments on a daily basis, and the current status (as a general rule) is abysmal. Passwords won't fix most of those problems, but making sure that users can't set "password" as their password can at least improve one potential issue.


How about just measure entropy based on some criteria (using things from different sets MIGHT count as entropy for each new unique set) and letting the end user decide what goes in to the password and how long it is?


Would a raw entropy evaluation help mitigate a human factor like just using "passwordpassword" instead? Would that even be a problem?


No link handy, but I'm sure someone recently wrote an open source effective-entropy calculator, which avoids enforcing stupid entropy-reducing, complexity-increasing rule sets while still preventing anyone from using stupidly easy-to-guess passwords. Others do things like check against a normalised dictionary (i.e., I, 1 and L are all considered the same character). I really don't know why, in this day and age of complex software with libraries for even trivial functionality, this practise isn't more widespread. People are still writing dumb complexity rules like it's 1980. This should really be the new "don't do your own crypto".
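
A normalised dictionary check of the kind mentioned above might look something like this sketch. The I/1/L equivalence comes from the comment; the other substitutions in `LOOKALIKES` and the tiny `COMMON_PASSWORDS` set are assumptions chosen for illustration, and a real deployment would load a much larger list.

```python
# Map look-alike characters onto one canonical form before comparing.
# Only the I/1/L grouping is from the thread; the rest are illustrative guesses.
LOOKALIKES = str.maketrans({
    "1": "l", "i": "l", "!": "l",
    "0": "o", "@": "a", "$": "s", "5": "s", "3": "e",
})

# Stand-in list; assumed to be replaced by a real "most common passwords" file.
COMMON_PASSWORDS = {"password", "12345", "test", "letmein"}


def normalise(text: str) -> str:
    """Lower-case the text and collapse look-alike characters."""
    return text.lower().translate(LOOKALIKES)


# Normalise the list once so lookups compare like with like.
BLOCKLIST = {normalise(p) for p in COMMON_PASSWORDS}


def is_blocked(candidate: str) -> bool:
    """True if the candidate matches a common password after normalisation."""
    return normalise(candidate) in BLOCKLIST
```

With this, `P@ssw0rd` and `LETME1N` are rejected along with their plain spellings, while an unrelated string such as `djwipvbs` passes.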


This is a step in the right direction, but down a dead end. Like trying to reach the moon by climbing up a tree.

Taking a step back, what we are doing is essentially: you may choose your own password! But, not really. We are going to tell you what not to. After you try it. And it's not a human, or equivalent AI: it's a (dumb) algorithm.

If you are going to let someone choose a password, ask yourself: why? Is convenience that much more important than security? Putting what many people will perceive as onerous constraints on passwords greatly reduces that convenience. But users will still try to make their passwords as low-entropy as possible, so security suffers nonetheless.

Or is security so important, to hell with convenience? Then why not just generate a PW for them? Here, print this out, since you were going to, anyway. Click "save this password", since you were going to, anyway.

Cut out the middle man and save everyone time.

Now, instead of forcing your users to do the try-to-cheat-the-entropy-algorithm dance, you can show them what a good password can look like. String four words together: happy (or less annoyed) users, good entropy, happy admins.


I agree entirely. In fact, this is precisely how passwords were implemented in a project I've only just finished.

In addition to the frustrations you've outlined, consider a further case where a user's username is their email address (this was true in the project I mentioned). How is the user most likely to behave in this scenario? Are they going to take the time to generate a new password that satisfies your constraints, or are they going to keep things familiar and use the same password that they use for their email? After all, the usernames are the same - so why not keep things easy to remember? Well, I'm sure that's convenient, but I absolutely DO NOT want your email password, even bcrypted/&c., in my DB. No thanks.

Clearly, then, the best choice is to generate something for the user; something that they can reset using a classic 'forgot password' system, but which they never have to define themselves. Four diceware generated words, plus spaces, each no longer than two syllables, seems to do just fine.
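
As a rough sketch of generating such a passphrase server-side: the snippet below picks four words with Python's `secrets` module. The 16-word `WORDS` list is a hypothetical stand-in so the example is self-contained; the real Diceware list has 7,776 words, which is where the entropy figure in the usage note comes from.

```python
import math
import secrets

# Hypothetical stand-in list (short, at most two syllables each).
# A real deployment would load the full Diceware word list instead.
WORDS = [
    "apple", "river", "candle", "forest", "monkey", "silver", "window", "yellow",
    "garden", "paper", "rocket", "tiger", "button", "pencil", "lemon", "castle",
]


def generate_passphrase(words=WORDS, count=4) -> str:
    """Join `count` words chosen uniformly at random with a CSPRNG."""
    return " ".join(secrets.choice(words) for _ in range(count))


def passphrase_entropy_bits(list_size: int, count: int) -> float:
    """Entropy of a passphrase whose words are drawn uniformly and independently."""
    return count * math.log2(list_size)
```

With the full 7,776-word list, `passphrase_entropy_bits(7776, 4)` comes to a little under 52 bits; the toy list above gives only 16 bits and is not suitable for real use.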



> On window.load, [..], it'll fetch zxcvbn.js, which is [..] 700k (330k gzipped)

As a client side solution, that download size seems excessive for a password strength meter.

I understand it contains a dictionary of passwords but that is larger than most JS frameworks. Perhaps a server-based XHR-based solution would be better.


Think about what you've just said. How insecure is sending the password (hopefully over an HTTPS connection!) to a remote server?


How else do you allow someone to use a password to login? I suppose you could run the hash locally if they have JavaScript-- but if they don't, then what? (Edit: Good point, all the hash achieves is that the user's entry isn't sent in clear text -- of course, the hash itself then becomes the password for the purposes of authorization.)


Running a hash locally is equally useless. You've effectively turned the hashed value into the password itself, achieving nothing.

There are secure key-exchange schemes that don't require sending over the raw password, but this isn't an example of one.


Just about every signup and login form does this (and yes preferably over TLS only). What is the problem with it?

The alternative is browser-side encryption of the password before sending but that will get @tptacek rightfully punching you in the face for even mentioning it.


It's only for the signup page so that really isn't a problem.


"This should really be the new "don't do your own crypto"."

oauth / openid for the win

Unless your company's "secret sauce" is user authentication and management, you probably shouldn't be doing it.


You still almost certainly need to implement a backup system. It is rarely a good idea to force a new user to enter his facebook/google/ms credentials to use your service. Even if 99% of your potential userbase has such credentials they may not want to connect them to your service.

That's even before getting into the question of whether such systems make users more likely to fall for phishing attacks, by conditioning them to enter their credentials somewhere other than the website that issued them.


Using OAuth also doesn't give you a free pass. Plenty of apps/sites/organizations mess up OAuth implementation, and OAuth doesn't solve things for corporate/enterprise users either.

It's also not suitable for certain demographics, which puts you back at square one (rolling your own).


Having recently had to implement improved password security for a customer who wouldn't read XKCD #936, the internationalization thing was a pain in the ass.

They wanted at least 10 characters, and at least one uppercase, one lower, one digit, and one special character. Easy enough with .NET's built-in membership stuff by setting:

passwordStrengthRegularExpression="(?=.{10,})(?=(.*\p{Lu}){1,})(?=(.*\d){1,})(?=(.*\W){1,})"

The "\p{Lu}" part handles uppercase characters even in Unicode chars, but Javascript has no equivalent, so I couldn't do client-side validation of that. Should be validating on both ends anyway, but it's still a pain.

The real part I hated was having to keep track of users' last N passwords to make sure they didn't re-use them. Since everything's hashed and salted, I just kept a table of previous hashes by user. Seems simple, but MS didn't see fit to include a HashPassword(string plainTextPassword, byte[] userSalt) method in the membership provider, so I had to reverse engineer their password-hashing method to check when they change passwords if it's something that's been used before.

Then I realized that they could just change their password N+1 times in about a minute, then re-use their expired password anyway, so we wound up having to set a minimum age of N weeks before a password could be reused as well.

The whole problem is an exploding requirements nightmare that could easily be solved by saying "Must be >32 characters and don't write it down anywhere, idiot."

The worst part is as much as I hate these types of requirements, I now perfectly understand why these systems are the way that they are.


How did you handle password++ iteration? For example P@ssword01, P@ssword02, P@ssword03 hash differently.


There's an easy way to measure entropy non-heuristically: see how well the password compresses. Then, to measure whether a new password is a variation of an old one, you just compress their concatenation and see if you get less entropy than expected.
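
A crude version of this cross-compression idea can be tried with `zlib`, as in the sketch below. It is only an approximation: `zlib` is nowhere near an ideal compressor, it mostly detects literal repeated substrings, and on strings this short the fixed stream overhead dominates, so the function names and the overhead correction here are assumptions made to get a usable comparison rather than a true entropy measurement.

```python
import zlib


def compressed_size(text: str) -> int:
    """Length in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


# Fixed cost zlib adds to every stream, measured on empty input.
STREAM_OVERHEAD = compressed_size("")


def shared_bytes(old: str, new: str) -> int:
    """Bytes saved by compressing both passwords together instead of separately.

    A larger value suggests the new password reuses material from the old one.
    """
    separate = compressed_size(old) + compressed_size(new) - STREAM_OVERHEAD
    return separate - compressed_size(old + new)
```

Comparing `shared_bytes("P@ssword01", "P@ssword02")` against the value for two unrelated strings shows the incremented password sharing noticeably more with its predecessor; any rejection threshold would need tuning.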


That requires storing the old password in plain text (or something reversible to plain text, like lossless compressed formats). That is typically not recommended as old passwords can give clues to the current password.


Nah; you can require that they enter the old password to change their password, check that they've entered the right one, and then use that (while you're still holding onto it) to do the cross-compression check.


If you have both the old and new password in plain text you don't need to do a compression-based check, just a raw iterative approach would accomplish the same thing with very low overhead (in either CPU or memory accesses).

Compression systems typically need a lookup table of some kind and have more overhead than just the raw comparison.


As mentioned in another comment, I'd probably do a Levenshtein distance between the old and new passwords, and reject if they crossed some threshold. However, only knowing the plaintext of the immediately-preceding password as they enter it to authorize the change, it wouldn't do much to stop them from doing:

PasswordA SomeOtherPassword1 PasswordB SomeOtherPassword2 PasswordC SomeOtherPassword3

Just iterate on every other change, and you've beaten the requirement.
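
The distance check itself is simple to sketch. The following is a standard dynamic-programming Levenshtein implementation; the `too_similar` helper and its threshold of 3 are hypothetical choices for illustration, not something specified in the thread.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def too_similar(old: str, new: str, threshold: int = 3) -> bool:
    """Reject the new password if it is fewer than `threshold` edits from the old one."""
    return levenshtein(old, new) < threshold
```

As the comment points out, this only helps against the immediately preceding password, since that is the only plaintext available at change time.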


That's assuming what you want is to find passwords that are "the same" in a literal sense, rather than "the same" as in sharing a common generation algorithm that biases certain outputs.

For example, if a user's first password is "first123!@#" and their second password is "second456$%^", there are no letters in common—but those two passwords, when joined together, are very intracompressible (by an ideal compressor)—and likewise, an attacker who knew that the first was a previously-used password would be very likely to try the second. That both properties apply is not a coincidence of this particular password; the intra-compressibility of a set of plaintexts, and the predictability of unknown plaintexts of the set from known ones, are equivalent measures of informational entropy.


I would guess that it didn't.. just another example of the exploding requirements.


How do you handle (Do you?):

* not more than 2 identical characters in a row (e.g., 111 not allowed)
* Name/Username in password (Name: Chuck Norris, username: ChuckNorris, Password: ChuckNorris#1)

These are reasons why I don't look forward to doing this and also why I'm leaning towards G+/FB/twitter/etc authentication in an app I'm planning.


Didn't in either case, because they weren't in the requirements, and the first one, while well-meaning, just further decreases entropy. I got into an email fight with our network security over trying to use a 40-character password LastPass generated that happened to have 2 identical chars in a row, and not being allowed. Not more than one identical character in a row is more secure than not more than 2, apparently.

For the second, I'd probably just do something like compute the Levenshtein distance between the username and password, and reject it if it passed some threshold.


> and at least one uppercase, one lower, one digit, and one special character

Not to nitpick, but they wanted at least 3 of those 4. Is that possible with a regular expression or are we now into the custom validator territory?


It's possible with a regex. Here's one spectacularly awful way: enumerate through all the permutations of 3 out of 4 character classes. Assume the classes are A, B, C, and D just to simplify the syntax here:

    (.*A.*B.*C.*)|(.*A.*B.*D.*)|(.*A.*C.*B.*)|(.*A.*C.*D.*)|(.*A.*D.*B.*)|(.*A.*D.*C.*)|(.*B.*A.*C.*)|(.*B.*A.*D.*)|(.*B.*C.*A.*)|(.*B.*C.*D.*)|(.*B.*D.*A.*)|(.*B.*D.*C.*)|(.*C.*A.*B.*)|(.*C.*A.*D.*)|(.*C.*B.*A.*)|(.*C.*B.*D.*)|(.*C.*D.*A.*)|(.*C.*D.*B.*)|(.*D.*A.*B.*)|(.*D.*A.*C.*)|(.*D.*B.*A.*)|(.*D.*B.*C.*)|(.*D.*C.*A.*)|(.*D.*C.*B.*)
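
For comparison, a custom validator sidesteps the enumeration entirely. The sketch below counts which character classes are present, using Unicode categories so that non-ASCII uppercase letters are recognised; the function names and the definition of "special" as anything non-alphanumeric are assumptions for the example.

```python
import unicodedata


def character_classes(password: str) -> set:
    """Return which of the four classes (upper, lower, digit, special) appear."""
    classes = set()
    for ch in password:
        category = unicodedata.category(ch)
        if category == "Lu":
            classes.add("upper")
        elif category == "Ll":
            classes.add("lower")
        elif category == "Nd":
            classes.add("digit")
        elif not ch.isalnum():
            # Assumption: anything that is not a letter or digit counts as special.
            classes.add("special")
    return classes


def meets_policy(password: str, min_length: int = 10, min_classes: int = 3) -> bool:
    """Length check plus an 'at least N of 4 classes' check."""
    return len(password) >= min_length and len(character_classes(password)) >= min_classes
```

Setting `min_classes=4` gives the stricter four-of-four rule discussed in this subthread.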


Well, they wanted 4 of 4, actually, and yes, I had to do custom validation. Everything except the upper-case character worked in JS, so everything "just worked" if I took that part out, but that wasn't an option.


My bad, I thought you were referencing the OWASP site specifically.


I think context is important. If it's a bank password, I don't mind coming up with something complex. If it's for something that requires me to have an account, but I don't care at all whether or not someone else gains access to the account, I have no problem using a much simpler password and would be rather annoyed if I had to do something more complex.


Then you might as well do the RMS thing and set your password to be blank :). Passwords are currently a necessary evil. My hope is that we will ditch them sooner or later. For now, use a password manager to prevent really bad things that will result from what you are doing.


I agree on that as well. Enforced - NO (only minimum length should be enforced imho). Encouraged - Yes. Just show a good password strength meter to the user and fine-tune it to your security requirements.

If you enforce all kinds of weird password rules to the user, he will have to write the password down somewhere, because one couldn't possibly remember all passwords. And for non-technical users that means some random pieces of paper or post-it notes. On the other hand, encouraging them to come up with something that is strong makes it more likely that they will invent a password that they can remember, thus making it more secure.


"Just show a good password strength meter to the user and fine-tune it to your security requirements."

That's the approach I follow in my projects.

I offer a Javascript (window.crypto powered) Diceware passphrase generator and use zxcvbn to let users know the relative strength of the password they've supplied.

I've also been planning an experiment to see if the width of a password field results in users choosing longer passwords, because more empty space. :)


> If you enforce all kinds of weird password rules to the user, he will have to write the password down somewhere, because one couldn't possibly remember all passwords.

I'm going to use the appeal to authority argument, and bring over Schneier to argue that writing your passwords down isn't bad: https://www.schneier.com/blog/archives/2005/06/write_down_yo... (and other posts since, google "schneier writing passwords down" or something)

Remembering your password isn't necessarily more secure, especially since it easily leads to password re-use, which is even worse.


A personal pet peeve of mine is websites that happily inform me my 20-character long, randomly generated by a secure algorithm with an extra heaping of entropy, password is not "strong enough."

I just love generating 3 or 4 different COMPLETELY RANDOM passwords with KeePass because your stupid password rules were written by people who wouldn't know entropy if a dictionary open to 'en' hit them square in the jaw.


I see your point, but given the number of 'test', 'password' and '12345' passwords we see whenever there's a leak, that could be an issue.

Maybe just a minimum length? I too get annoyed when there are specific complexity requirements, like 'must include an uppercase letter' even though I've used a 20-character long password including numbers and punctuation.


The worst for me is when there is a strong complexity requirement, but some other stupid limitation, like only specific special characters can be used. I usually resort to something like "ThisIsBullshit!1" or something similar (though I started using lastpass, also 2fa where available)

I did come across a site that had a cuss word filter, then wouldn't let me change my password... lol. No mention of why it was an invalid password.

If you must have certain complexity requirements, spell them out. Personally, if it's over 8-10 characters (with leading and trailing whitespace trimmed off), I'll take anything... convert to UTF8 bytes before 1-way hashing...

As for using SCrypt, if it takes a modern CPU 1/2 second on a process to hash a password, then that's ripe for a DDOS against your authentication server, which is where failed attempt counts, and locking for X minutes comes in.


If the problem is that people are using "password" as a password, why not just ban that example specifically? So you can't use "password", but you can use "djwipvbs". Assuming you're using bcrypt with an appropriate work factor, wouldn't "djwipvbs" be an acceptable password?

More generally, you could ban the top X passwords from those "most frequently used passwords" lists.


Consider the sources of those leaks. They're usually from places like adobe.com, where my own password is 1234 because I just had to make an account to download some trial software or whatever.


Why are you not already using a password manager? They have existed in fantastically-useful form for probably well over five years now.

If a password is easy to remember, it is easy to guess, and if you reuse a password its likelihood of being compromised increases dramatically.

There is no simple solution for this problem. Password managers make the best of a crappy and likely unavoidable situation.


Because I cannot install applications on my work PC. Which means that I have to sit there carefully copying long strings from my phone to my PC if I want to log into a site at work.

Which is frankly more effort than I can be bothered with.


While in some ways I agree that these password rules enforcements are not ideal (they are hardly scientific, they seem much more "common sense" to me), I think making you use a harder-to-remember password is part of the point. The current received wisdom on passwords is that if it's easy to remember, it's easy to guess.

That said, I'm not crazy about the inherent paternalism of this sort of thing. I think allowing weak passwords with a warning is, in most instances (not including corporate IT and situations where you're requiring that the user protect your private information rather than their own), a preferred way to go. Informing people when they are making a weak password should at the very least let them make a choice about how much they care about their own security.


At Lavaboom, we simply check against the 10k most used passwords (in memory), but we plan to move to 1 million soon (on disk; account creation is infrequent enough that the slower access speed won't matter).

Our problem is that we SHA the passwords on the client side, so each password is 256 bits long. The resulting hashtable (or bloom filter) is still a reasonable size for disk storage, though.


We SHA256 the passwords on the client side at Userify (SSH key management for EC2) as well (bcrypt is too slow in mobile browsers), and then bcrypt on the server side the resultant hash. (We don't cache it, though.)

Even in the event of a TLS leakage, we still never see your original password, and the server doesn't end up doing any more work. It's not perfect, but I definitely agree it's a great step forward.


That's exactly what we do, too, except we've bravely switched to scrypt at some point.


With a bloom filter that's still just 2MB of memory for a 99.99% accuracy.
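
The 2MB figure can be sanity-checked with the standard Bloom filter sizing formulas. The sketch below assumes "99.99% accuracy" means a 0.01% false-positive rate and that the filter holds 1 million entries; the class and function names are illustrative, and the double-hashing scheme derived from SHA-256 is one common construction, not necessarily what anyone in the thread uses.

```python
import hashlib
import math


def bloom_parameters(n_items: int, false_positive_rate: float):
    """Optimal bit count and hash count for a Bloom filter."""
    bits = math.ceil(-n_items * math.log(false_positive_rate) / (math.log(2) ** 2))
    hashes = max(1, round(bits / n_items * math.log(2)))
    return bits, hashes


class BloomFilter:
    """Minimal Bloom filter over byte strings (illustrative only)."""

    def __init__(self, n_items: int, false_positive_rate: float):
        self.bits, self.hashes = bloom_parameters(n_items, false_positive_rate)
        self.array = bytearray((self.bits + 7) // 8)

    def _positions(self, item: bytes):
        # Double hashing: derive all positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.bits for i in range(self.hashes)]

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.array[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.array[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))
```

For 1 million items at a 0.01% false-positive rate, `bloom_parameters` gives roughly 19.2 million bits (about 2.4 MB) and 13 hash functions, which is in line with the estimate above.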


It depends on your threat model of course, but in most situations an attacker that is physically near you is less concerning than an attacker that can be anywhere in the world.

In most cases I'd rather you had a twelve character random string written on a post it attached to your monitor than the password "password" not written down anywhere.


It is more palatable if the rules provide for phrases, say 20 characters or longer. Then you can do a few substitutions for word separators and the like.

And yes, I agree that it is seriously annoying.


If you answer to any kind of external or security compliance regime, that compliance is usually built around NIST guidance. They are big about strong passwords and MFA.


Exactly. If my password is 20 lowercase characters, why would it be more secure as 10 characters using uppercase, numbers, and symbols? (i.e., 26^20 >> 95^10).
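
The arithmetic behind that comparison is easy to check. The helper below (a hypothetical name) converts alphabet size and length into bits, under the assumption that every character is chosen uniformly at random, which is exactly the assumption human-chosen passwords tend to violate.

```python
import math


def search_space_bits(alphabet_size: int, length: int) -> float:
    """Bits of entropy for `length` characters drawn uniformly from the alphabet."""
    return length * math.log2(alphabet_size)


lowercase_20 = search_space_bits(26, 20)  # about 94 bits
printable_10 = search_space_bits(95, 10)  # about 66 bits
```

So 20 random lowercase letters give a search space roughly 2^28 times larger than 10 random printable ASCII characters.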


Dictionary attacks work very well on passwords like this.


"An application should respond with a generic error message regardless of whether the user ID or password was incorrect."

I really don't like this advice (although I see why they put it in there).

I often use different email addresses for different services so that I can determine who sells on email addresses (depending on how much I trust them), and quite often I can't remember which email address I signed up with (was that mojang@...com or minecraft@...com).

At least if I see "user not recognised", I know to try a different email address.


There was a decent article (I think it was on HN) a while ago that argued against this type of generic error message. The basic idea is that you can very easily discover whether the email is valid or not by attempting to create an account with that email (in most cases). It's trivially easy to either verify that the email you are trying to use is valid, or even build a database of valid email addresses to crack by attempting to create accounts. So why bother with generic error messages at all. It is not really buying you anything on the security end and it seems like it is sacrificing some usability.


https://kev.inburke.com/kevin/invalid-username-or-password-u... Could not agree more on this. Best practice should evolve over time, and the wiki should be updated!


Agreed. I actually just fought and won this battle at work. If you don't want to expose the specific error when logging in, you must not leak usernames/emails through the signup process either. Otherwise, it's just security theater.


Yes, and it is very hard to show a generic error during sign up.

"There is an error with your data..." WHAT ERROR? I've typed 10 fields.

I think a better approach is to rate limit wrong usernames / emails during sign in.


It's not a difficult problem to solve. Just let the sign-up succeed whether an account already exists for the email or not. Then say "You will receive an email with instructions to complete your registration".

If there was no existing account, send an email with the text: "An account was created for this email address at Example.com. If you requested this account, you may activate it by clicking here: https://..."

If there was an existing account, send: "An attempt was made to create a new account at Example.com for this email address, but you already have an account. If you have forgotten your password, visit https://..."
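
The flow described above could be sketched as follows. Everything here is a toy stand-in: `SignupService`, the in-memory sets, and the `outbox` list that replaces a real mailer are assumptions for the example, and the email bodies are shortened paraphrases of the ones quoted above.

```python
from dataclasses import dataclass, field

GENERIC_RESPONSE = "You will receive an email with instructions to complete your registration."


@dataclass
class SignupService:
    """Toy sign-up flow that never reveals whether an address is registered."""
    active: set = field(default_factory=set)    # addresses with an activated account
    pending: set = field(default_factory=set)   # addresses awaiting activation
    outbox: list = field(default_factory=list)  # (recipient, body) pairs; stands in for a mailer

    def sign_up(self, email: str) -> str:
        address = email.strip().lower()
        if address in self.active:
            body = ("An attempt was made to create a new account for this "
                    "email address, but you already have an account.")
        else:
            self.pending.add(address)
            body = ("An account was created for this email address. "
                    "If you requested it, follow the activation link.")
        # Only the mailbox owner learns which case applied.
        self.outbox.append((address, body))
        # The HTTP response is identical in both cases.
        return GENERIC_RESPONSE
```

The caller sees the same response either way, so probing the form with a list of addresses reveals nothing; the distinction exists only in the email.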


Yes, this was the alternative we discussed. We opted to go with the simpler (and very common) signup process, which does admittedly leak some information.


That works for sign-up, but not for the far more common case of sign-in.


If you use this approach, the idea is then that you use a generic login failure message. This prevents leaking of any account information, but provides a different (poorer, IMO) user experience.


This is exactly the approach we took.


So just keep trying addresses until you get a reset link sent to you. It's really unacceptable for any service to leak its user list in the way you suggest.

EDIT: x1j7xJuzX in the sibling subthread has it right for email addresses. It's true that separate usernames would be difficult to handle in a user-friendly manner without leaking, but with a valid email address a separate username is probably unnecessary. It doesn't help the user interact with the site. If users interact with each other, they can just choose non-unique display names. To prevent impersonation, just use display name plus some other invariant account property to generate a hash that is displayed alongside the display name.


Doesn't almost every service leak users at sign up? There are ways around it but I don't think they're very common.


How do you avoid 'leaking the user list' if you enforce unique usernames?


If you use email as username, you can make the sign up give the same "check your email to confirm your account" message for a new account and an existing account. This works well for new users and those who have already signed up but perhaps forgot, and leaks no information to someone who doesn't have access to that email account.

I'm not sure how to do this just for usernames, but usernames are less sensitive than emails anyway.


When registering for an account, duplicate usernames will flag a warning. This is enough to give hackers info whether or not a username exists.


"Maximum password length should not be set too low, as it will prevent users from creating passphrases. Typical maximum length is 128 characters."

Why would you ever have a maximum password length at all? bcrypt or (god forbid) your secure hashing algorithm of choice doesn't care about input length, and has a fixed output length to stick in a database. Why on earth would you limit the password length beyond anything so insanely large (1024, etc) to not even matter?


Hashing passwords takes a lot of CPU resources. Django limited their password lengths to a maximum of 4,096 characters because of this: https://www.djangoproject.com/weblog/2013/sep/15/security/
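
A length cap of this kind can be enforced before any hashing work is done. The sketch below uses the standard library's `hashlib.pbkdf2_hmac`; the function names are hypothetical, the cap is applied to the UTF-8 encoded bytes (an assumption), and the iteration count is just the 10,000 figure quoted elsewhere in this thread rather than a current recommendation.

```python
import hashlib
import hmac
import os

MAX_PASSWORD_BYTES = 4096  # cap taken from the figure above; applied to encoded bytes
ITERATIONS = 10_000        # figure quoted in this thread, not a current recommendation


def hash_password(password: str, salt: bytes = None, iterations: int = ITERATIONS):
    """Return (salt, digest), refusing oversized input before doing any work."""
    encoded = password.encode("utf-8")
    if len(encoded) > MAX_PASSWORD_BYTES:
        raise ValueError("password too long")
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", encoded, salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = ITERATIONS) -> bool:
    """Constant-time comparison against a stored digest."""
    try:
        _, digest = hash_password(password, salt, iterations)
    except ValueError:
        return False  # oversized input can never match a stored hash
    return hmac.compare_digest(digest, expected)
```

Rejecting the request outright, rather than silently truncating, keeps the behaviour predictable for users with long passphrases.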


You could still just truncate. (But what is the use case for a 4 kilobyte password anyway... keys aren't even that long)

Certain companies (Fidelity.com, looking at you!) enforce a ridiculously SHORT password max length. 12 characters? I'm laying odds that they're not hashing passwords at all.

And then we come to the companies that disallow characters that could be used for SQL injection. Those companies frighten me the most..


The Django developers are wrong to blame PBKDF2 for this slowness. It takes just 1 second with an unoptimized Python PBKDF2 to hash a 1MB password, and probably 0.1 second or less with a native implementation. If they claim it takes 1 full minute, they must be doing something seriously wrong, like using a crappy parsing or serialization mechanism to pass a 1MB string around higher-level modules.

  $ time python -c 'import pbkdf2; print pbkdf2.crypt("a"*1000000,"XXXXXXXX")'
  $p5k2$$XXXXXXXX$hmAHZehesTpLs.pM3G4mKlHZI6/FMj.Y
  
  real	0m1.233s
  user	0m1.221s
  sys	0m0.012s


By default that uses 400 iterations, the current recommended is 10,000. Try it again with 10,000.


Indeed: 28 seconds with 10,000 iterations. I assumed Django was iterating 400 times. I was wrong. Thanks for correcting.


Django have been upping the iterations in each release, it's 15,000 in 1.7 and by 1.9 will be 24,000. 500 password checks saturate a core i5 4670 for 25 seconds: http://tech.marksblogg.com/passwords-in-django.html#why-are-...


Because `bcrypt` only accepts 72 bytes (so abstractions ignore everything after 72 characters).

Example: http://3v4l.org/4lGu3


bcrypt has a maximum input length of 72 bytes:

https://www.usenix.org/legacy/events/usenix99/provos/provos_...


I suppose one could split the input into 72 byte chunks, hash each chunk and then combine again. Either store that, or rinse-and-repeat until you end up with one final hash.


> I suppose one could split the input into 72 byte chunks, hash each chunk and then combine again

This sounds like a recipe for screwing something up and opening yourself up to some kind of weird attack. I'm not enough of a crypto expert to know what the downside of this would be but it's more mucking around with crypto than I would be comfortable with.


You should first SHA-512 the password (to get a uniformly-64-byte string) and then apply bcrypt to that.
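
A rough sketch of that pre-hash step in Python 3, standard library only. The helper names are made up for illustration, and the actual bcrypt call is left out because it needs a third-party package. One caveat: a raw digest can contain NUL bytes, and some bcrypt implementations reportedly stop reading at the first NUL, so the digest is often encoded before being passed on (note that base64 of a 64-byte digest is 88 characters, which is over the 72-byte limit again).

```python
import hashlib

BCRYPT_MAX_BYTES = 72  # input limit mentioned above


def naive_bcrypt_input(password: str) -> bytes:
    """What effectively reaches bcrypt when a library silently truncates."""
    return password.encode("utf-8")[:BCRYPT_MAX_BYTES]


def prehashed_bcrypt_input(password: str) -> bytes:
    """SHA-512 the whole password first: always 64 bytes, so nothing is cut off."""
    return hashlib.sha512(password.encode("utf-8")).digest()


a = "x" * 72 + "first ending"
b = "x" * 72 + "second ending"
print(naive_bcrypt_input(a) == naive_bcrypt_input(b))          # True: the difference is lost
print(prehashed_bcrypt_input(a) == prehashed_bcrypt_input(b))  # False: the difference survives
```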


> bcrypt or (god forbid) your secure hashing algorithm of choice doesn't care about input length

bcrypt itself accepts a maximum key size of 56/72 bytes (depending on stage) as per http://en.wikipedia.org/wiki/Bcrypt#User_input

To a user it may not matter (they won't know what is being truncated) but from a systems design POV you should limit the unnecessary. Why let users POST 1MB text strings to your server if you're just going to discard them?


Such a low maximum length does not make a lot of sense, but say a limit of 1024/2048 seems reasonable. The amount of time it takes to compute a hash is proportional to the length of the input and you do not want to facilitate a DOS attack.
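
A minimal sketch of such a cap, checked before the input ever reaches the (deliberately slow) hash. The constant and the function name are assumptions; the 1024 figure is just the lower of the two values suggested above.

```python
MAX_PASSWORD_BYTES = 1024  # assumed cap, taken from the 1024/2048 suggestion


def password_length_ok(password: str, max_bytes: int = MAX_PASSWORD_BYTES) -> bool:
    """Reject empty or oversized input; length is measured in UTF-8 bytes."""
    return 0 < len(password.encode("utf-8")) <= max_bytes
```

Measuring bytes rather than characters keeps the bound meaningful for non-ASCII passwords, where one character can take several bytes.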


To avoid an attack where a user submits a passphrase so absurdly long that the auth system crashes?


Every time you introduce a password constraint, you've reduced the potential password complexity. I absolutely hate arbitrary password requirements. "not more than 2 identical characters in a row"? WTF? Stop with this nonsense.


"not more than 2 identical characters in a row"? WTF? Stop with this nonsense.

This is OT, but there's an interesting snippet in "The Secret Life of Bletchley Park" [1] about decoding Enigma messages used by the Italian Navy in the Med.

One of the female operators had a set of messages from one Italian operator who sent a message once a week on a regular basis. They had determined that the first letter was an 'L'. She looked at the keyboard, saw that 'L' was neatly placed under the right hand and guessed that he was sending a test message consisting of nothing but 'L's tapped out in quick succession. Voila! She hit the jackpot.

From this insight, all dial wirings and movements of the Italian machines could be quickly deduced.

So, repetitive plain text can be a security issue.

[1] http://www.amazon.com/The-Secret-Life-Bletchley-Park/dp/1845...


That's a vulnerability for cyphers and has no application to modern password systems. If a password were all Ls up to the minimum then certainly that would be a bad idea, but having two Ls in a row because your password happens to contain or be a derivative of a word that has two Ls has no bearing on how secure the password is.

    sha256(LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL+mysalt) = 57c70b4fddd06c94c9a7b41d9884591bb1d487fb78df723b11bc4892e879f46e
    sha256(LRpSdU$EnD1ZrJJ2QyVHPycN*DZtrHm&YdH%%28f4ih+mysalt) = 29cd0708db0fb7350e17349012a6e728b357ef733e85f401fc757e6565ef5e80
Neither of those hashes would give an attacker the slightest bit of insight into the user's password even if the attacker suspected the first letter of each were an L.


> having two Ls in a row ... has no bearing on how secure the password is.

At least some password cracking programs are built to anticipate human tendencies, which I would guess includes repeating characters. If I were designing a password cracker, I would target human-created passwords and not random passwords. For example, I would have the program guess 123456 before it guesses R%Vg9~\


The other complexity rules rule that out, though.

If I have a password 10 characters long with at least one uppercase, one lowercase, 1 digit, and 1 special character then having one of those repeated won't make it any less secure. Rigidly enforcing that rule doesn't make sense, it's saying that "R%Vg9~\LL" is less secure than "R%Vg9~\".


Sure, but in analyzing a password for acceptable entropy, one should be smart enough to distinguish between:

LLLLLLLLLL

and

8x~3uLLx&#@_o

But most people who write password analysis are doing some really quick and dirty checks like [name/email not in password], [password exceeds X chars], [password contains at least 1 of these chars], etc. If you're going to introduce some other check, it should have the nuance to provide some allowances. I've had my auto-generated, 20-char digit/char/symbol PW from keepass get rejected for such things.


> I've had my auto-generated, 20-char digit/char/symbol PW from keepass get rejected for such things.

Huge pet peeve of mine. Really? "(uJgP6h9=8Uc6x?}#B6Q" isn't enough for you?


> Really? "(uJgP6h9=8Uc6x?}#B6Q" isn't enough for you?

Not after you've posted it on HN. That's only half joking...the biggest vulnerability in any password system is the humans involved. Security advisors should design around the natural behavior of their users, not try to force users into acting unnaturally. Otherwise, users will figure out how to introduce vulnerabilities that get around the constraints imposed upon them (the oft-cited writing passwords down).


Memo: ATTN All Employees

The password "(uJgP6h9=8Uc6x?}#B6Q" (no quotation marks) has been scientifically determined to be the most complex password. Please make sure to change every password to this new password within 24 hours.

Signed, The Mgt.


Obviously not, there is no e that could be replaced with a 3.


> One of the female operators had a set of messages from one Italian operator who sent a message once a week on a regular basis.

That was the most important mistake from the Italian operator.

> So, repetitive plain text can be a security issue.

The only thing that should be discouraged is a password that contains only one repeated character, since that is probably part of many dictionaries. Any variant (LLLLLLLLLLM) would be pretty secure, the longer the better.


That doesn't mean constraints like 'no repeated characters' are a good idea. It gives the attacker significantly more information about the plaintext if they know they can rule out all strings with duplicated characters.


And isn't mmmdG0tKtN#mmmmmmmmmmmmm more secure than dG0tKtN#mm?


You say "reduced the potential password complexity", I say "reduced the minimum potential password complexity".


It still reduces the potential brute force search space. Instead of forcing a clever brute forcer to search all of the horribly insecure passwords with no special characters and repeating characters, you're telling them up front that they can cut certain strings out of their search space.

I can see both sides of the argument, but often the password complexity rules result in users writing down their passwords on sticky notes. You could make the argument that if an attacker is at the desk, you're already compromised, but still it's probably better to just enforce a policy of reasonable password complexity no matter what it is. They have javascript password complexity indicators on many sites now, I think that should become standard.


> It still reduces the potential brute force search space.

I may be playing Devil's advocate, & these may be the ramblings of a fool but...

The space of possible passwords with "N characters" is many, many times larger than the space with "1 to N-1 characters" combined. In fact, doesn't that make the reduction reasonably insignificant?
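
To put a number on that, here is a quick check in Python. The alphabet size of 95 (printable ASCII) and the length of 8 are assumptions picked for illustration. For an alphabet of size A, all the shorter lengths combined come to roughly 1/(A-1) of the length-N space.

```python
def count_exact(alphabet: int, length: int) -> int:
    """Number of strings of exactly `length` characters."""
    return alphabet ** length


def count_shorter(alphabet: int, length: int) -> int:
    """Number of strings with 1 to length-1 characters, all lengths combined."""
    return sum(alphabet ** k for k in range(1, length))


ALPHABET = 95  # assumption: printable ASCII
N = 8
ratio = count_exact(ALPHABET, N) / count_shorter(ALPHABET, N)
print(f"{ratio:.2f}")  # 94.00
```

So forcing exactly-N-character passwords hands the attacker only about a 1% saving on the length dimension.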


I've very rarely seen a constraint that meaningfully reduced the password space.

No 2 identical characters in a row is a terrible constraint.

No more than 2 identical characters in a row is a reasonable constraint.

No more than 3 is a really good constraint.
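
How much of the space a run-length rule removes can be counted directly. This is a sketch with a hypothetical helper name. It only measures the raw search space, so it says nothing about how much likelier humans are to pick the excluded passwords.

```python
def count_max_run(alphabet: int, length: int, max_run: int) -> int:
    """Count strings of `length` chars with no run of identical chars longer than `max_run`."""
    if length < 1 or max_run < 1:
        raise ValueError("length and max_run must be at least 1")
    # runs[j] = number of strings so far that end in a run of exactly j+1 identical chars
    runs = [alphabet] + [0] * (max_run - 1)
    for _ in range(length - 1):
        total = sum(runs)
        # Either start a new run with a different character, or extend the current run by one.
        runs = [total * (alphabet - 1)] + runs[:-1]
    return sum(runs)


ALPHABET, LENGTH = 95, 10  # assumption: 10 printable-ASCII characters
for limit in (1, 2, 3):
    kept = count_max_run(ALPHABET, LENGTH, limit) / ALPHABET ** LENGTH
    print(f"max run {limit}: {kept:.4%} of the space remains")
```

With these assumed parameters, "no 2 identical characters in a row" cuts roughly 9% of the space, and the looser limits cut far less.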


A big one I've seen is more related to the TLS cheat sheet [1] they link to on that page.

Many sites will send session tokens over http because they don't set the "secure" cookie flag. It's a simple thing to do, and prevents a malicious ARP poison or DNS attack from potentially hijacking an account.

You'd be surprised how many sites are vulnerable to such attacks. Reddit, parts of Ebay, several university websites, and many other sites still are vulnerable to session hijacking.

I think people writing web libraries need to start building "sane defaults" concerning security. All cookies should be secure by default, and only those who know what they are doing should turn them off. It's not that much extra overhead, and the potential benefits outweigh the increased processing and bandwidth.

1: https://www.owasp.org/index.php/Transport_Layer_Protection_C...


Great point. Setting "Secure" and the poorly named "HttpOnly" are key to cookie security.
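
For reference, a minimal sketch of setting both flags with Python's standard `http.cookies` module. The cookie name `session` and the helper name are assumptions; most frameworks expose the same two switches under their own configuration names.

```python
from http.cookies import SimpleCookie


def build_session_cookie(token: str) -> str:
    """Build a Set-Cookie value with the Secure and HttpOnly flags set."""
    cookie = SimpleCookie()
    cookie["session"] = token
    cookie["session"]["secure"] = True    # only ever sent over HTTPS
    cookie["session"]["httponly"] = True  # not readable from page JavaScript
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


print(build_session_cookie("abc123"))
```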

One issue we ran into: our site runs behind a load balancer. We receive HTTPS connections into the load balancer, but the internal connection between the load balancer and the actual websites was HTTP only. So when we tried to set SECURE on the cookies, the application framework we were using, trying to be "helpful", unset the SECURE flag because it detected that the connection from its perspective was not secure (even though from the browser's perspective it was).

Keep in mind that the connection between load balancer and web-servers was never on the internet, in fact it never left a virtual machine farm (a single room essentially). So it is justifiable doing HTTP internally and HTTPS externally (and also makes certificate management easier).

We finally had to hack away a bit on the framework to get it to set secure regardless of the connection type.


> Many sites will send session tokens over http because they don't set the "secure" cookie flag. It's a simple thing to do, and prevents a malicious ARP poison or DNS attack from potentially hijacking an account.

Or you can of course enforce HSTS so that HTTP never gets used.


> not more than 2 identical characters in a row (e.g., 111 not allowed)

Why? If my password is id8FK38f@&&#d is it inherently less secure if 111 appears in the middle of it somewhere?


Next revision:

> no prime numbers, no more than 2 even or odd numbers in a row (e.g., 644 not allowed), no sequence of 2 or more characters may repeat more than 2 times (e.g. aabaa not allowed), no ascending or descending sequences longer than 2 characters in a row (e.g. 123 or cba not allowed)


These are just rules for the sake of having rules. It's downright silly and honestly it makes OWASP look like a joke to have something this ridiculous on their domain.


That was actually sarcasm (reductio ad absurdum), not an actual quote.

I think I've seen password restrictions similar in spirit to those, though.


I honestly couldn't tell. I think that in itself says something...


I agree. Arbitrary rules like this bug me, especially since they reduce the number of passwords that a hacker has to try.


Some of the suggestions are bad. Why are they enforcing English characters? Like a-z? For example, on Github I write щ and then it wants me to write a lowercase letter. WTF? It is lowercase! And more secure than an English letter!


Do people REALLY brute force passwords? Do people REALLY brute force all lowercase, all latin combinations up to 20 characters before trying symbols, uppercase and numbers?

I am very skeptical that the '3/4 complexity rules' approach is making systems meaningfully more secure. I've had all kinds of passwords, but I've never lost them to brute force. Every time it was because someone got inside a company and made off with the database.

If complexity rules don't add anything, they should be discarded in the name of usability.


> Do people REALLY brute force passwords?

Yes. Source: I have, multiple times.

> Do people REALLY brute force all lowercase, all latin combinations up to 20 characters before trying symbols, uppercase and numbers?

Remotely? No. Locally? Yes.

If I have some hashes, I am going to be doing every combination of characters at least up to 8 digits. If they're using something bad like 3DES or MD5 then I can go up to 20 digits and check everything.

Remotely you're typically using a common password dictionary, which is just a few thousand passwords people often use. If you had a botnet you might be able to do every combination (I don't).

> I am very skeptical that the '3/4 complexity rules' approach is making systems meaningfully more secure.

So am I. I am more of an XKCD-sentence password fan myself...

I really like entropy calculators. They're much more useful than broad generic requirements that actually reduce the set of potential passwords.

The entropy calculators that give people "prods" (so there are no requirements, just a traffic light system) are absolutely wonderful.
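
A very naive sketch of such a traffic-light check, in Python. The thresholds, the pool sizes, and the function names are all assumptions. Real estimators also account for dictionary words and keyboard patterns, which this one does not, so it will overrate things like "Password1!".

```python
import math
import string


def estimate_entropy_bits(password: str) -> float:
    """Naive estimate: length * log2(size of the character pools in use)."""
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += 26
    if any(c in string.ascii_uppercase for c in password):
        pool += 26
    if any(c in string.digits for c in password):
        pool += 10
    if any(c in string.punctuation for c in password):
        pool += len(string.punctuation)
    if any(c == " " for c in password):
        pool += 1
    if any(c not in string.printable for c in password):
        pool += 100  # arbitrary allowance for non-ASCII letters such as "щ"
    return len(password) * math.log2(pool) if pool else 0.0


def traffic_light(password: str, amber: float = 40.0, green: float = 60.0) -> str:
    """Return a prod, not a rejection: 'red', 'amber' or 'green'."""
    bits = estimate_entropy_bits(password)
    if bits >= green:
        return "green"
    if bits >= amber:
        return "amber"
    return "red"
```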

> I've never lost them to brute force. Every time it was because someone got inside a company and made off with the database

Which still requires brute forcing assuming the passwords are stored correctly. If it is something modern like PBKDF2 with a decent number of iterations (e.g. 10K rounds) it can be a nightmare even if your password is "just" 8 digits.

> If complexity rules don't add anything, they should be discarded in the name of usability.

Agreed. They aren't even based on any research, someone in the 1980s just thought they "sounded" secure. Nobody has spent any time actually researching this, we just repeat the same tired advice from thirty years ago because "common sense" tells us it is a good idea.


If I had the dump of the hashed password table (even salted), I'd start by trying all lowercase passwords to match the hashes.


Just a hypothetical, but what if an application started encouraging users to enter a "login sentence" instead of a password. i.e.: "Please enter a sentence that you'll be asked to remember each time you login." Obviously, the standard constraints of length and complexity (albeit slightly altered) can be enforced.

It's much easier for me to remember "Please close the window, I'm cold." than it is for me to remember "XSDJd94*(lo03X.._".

The "horse battery staple" XKCD comes to mind.


Simple does this and it's fantastic but perhaps only because it's the only service I use that requires this.

In other words, while it's useful from a password complexity standpoint to utilize longer passwords, and while this makes remembering them easier, it doesn't change the fundamental issue that managing many unique passwords is difficult for most people.

This leads to the same general mistakes (that password manager application users don't face):

1. Reusing a password (in this case a better one, but the problem remains)

2. Writing all of their passphrases down somewhere.

In general, I still prefer the use of pass phrases versus passwords, but this still requires a password manager for most people to be able to adopt safe password habits.

(Thinking of this grammatically it's also interesting that we still refer to a group of jumbled characters as a pass'word.')


That could be an interesting idea, but I'm wondering if the user wouldn't expect an unobfuscated input box to type in his sentence. Or if he would expect : "Please close the window, I'm cold." to be the same as "Please close the window, I'm cold" (no end period here)


If you ate all the whitespace and punctuation and uppercase'd it as part of your password input routine, it wouldn't be all that much worse than existing shorter passwords yet make life a lot easier for valid users.
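
A minimal sketch of that input routine in Python. The function name is made up, and it only strips ASCII whitespace and punctuation, so typographic quotes and other Unicode punctuation would need extra handling.

```python
import string

# Translation table that deletes ASCII whitespace and punctuation.
_DROP = str.maketrans("", "", string.whitespace + string.punctuation)


def normalize_passphrase(phrase: str) -> str:
    """Strip whitespace and punctuation, then uppercase, before hashing."""
    return phrase.translate(_DROP).upper()


print(normalize_passphrase("Please close the window, I'm cold."))  # PLEASECLOSETHEWINDOWIMCOLD
print(normalize_passphrase("please close the window im cold"))     # PLEASECLOSETHEWINDOWIMCOLD
```

The same normalization has to run at both signup and login, otherwise the stored hash will never match.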

Another interesting strategy, not discussed so far, is everyone in almost all lives and professions has some pool of weird technical names that can be concatenated together for a password. I'm partial to "unpopular yet cool integrated circuit ID codes of the 80s". I thought I invented that idea and was introduced to an old programmer using concatenated library calls from multiple languages (so his passwords were concatenations like a java soundex algo library call followed by some obscure fortran matrix manipulation) My mom knows the long legal names of some obscure real estate cases / judgments / citations / forms.


I use the same passphrase, with something extra based on a few rules.*

www.example.com: !ThisIsMyPassword!6eec9b9f

news.foobar.com: !ThisIsMyPassword!9ef61f95

Problems:

- It can be a pain to use on a mobile device.

- Some sites limit the length of the password

- Need rules for sites that force you to update your password

*Not my true password nor the rule that I use.


> The correct response does not indicate if the user ID or password is the incorrect parameter and hence inferring a valid user ID.

ARGH. This is a usability nightmare - moreso when the recovery system implements the same rule.

"Okay, I had an account on this website, which email address was it again?"

try logging in a few times

"Hm.. I must have forgotten the password. Off to reset!"

go through the recovery process

recovery page indicates an email will be sent

email never comes

"Wait, so are they being 'really secure', or is email just broken right now?"

wait a couple hours

forget about the site


The trick here is to not go to the reset page but to the signup page. Almost always, there is a message that indicates that the email id is already taken.

(which is why I think that indicating specifically, that the userid was incorrect or the password was, is better UX.)


  The application may return a different HTTP Error code depending on the authentication attempt response. It may respond with a 200 for a positive result and a *403* for a negative result.
I would say a 401 - Unauthorized with proper WWW-Authenticate header.

403 means Forbidden, which applies when you try to access a resource without permission / authorization.

Also, in their Password Storage Cheat Sheet [https://www.owasp.org/index.php/Password_Storage_Cheat_Sheet], they seem to recommend:

  Select:
    PBKDF2 [*4] when FIPS certification or enterprise support on many platforms is required;
    scrypt [*5] where resisting any/all hardware accelerated attacks is necessary but support isn’t.
    bcrypt where PBKDF2 or scrypt support is not available.
AFAIK, things are not so binary :

* https://news.ycombinator.com/item?id=3724560

* http://security.stackexchange.com/questions/4781/do-any-secu...

* http://security.stackexchange.com/questions/26245/is-bcrypt-...


There are different interpretations of what 401 should be used for. The spec only handles WWW-Authenticate authentication, which is pretty limited and not universally used (Bearer auth is occasionally used for APIs but Basic auth is pretty rare -- especially in end-user-facing parts of the web). The problem is likely that when the status codes were defined nobody thought people would ever need to build their own login forms.

I agree that it is more useful to use 401 to indicate that some form of authentication is required or has failed, and 403 to indicate that you are authenticated but not allowed to access something (which is what the spec emphasizes).

IOW, 403 should be "Unauthorized", 401 should be "Unauthenticated". Sadly the spec mixes those two meanings in various places.
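
The split described above, as a tiny Python sketch. The function name and its two boolean inputs are hypothetical; how an application decides "authenticated" and "authorized" is up to it.

```python
from http import HTTPStatus


def auth_status(authenticated: bool, authorized: bool) -> HTTPStatus:
    """Map the two separate questions onto 401 / 403 / 200."""
    if not authenticated:
        return HTTPStatus.UNAUTHORIZED  # 401: we don't know who you are
    if not authorized:
        return HTTPStatus.FORBIDDEN     # 403: we know who you are, and the answer is no
    return HTTPStatus.OK                # 200


print(int(auth_status(False, False)), int(auth_status(True, False)), int(auth_status(True, True)))  # 401 403 200
```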


I usually set WWW-Authenticate to None or WebForm (to prevent browsers from popping up the basic auth dialog).

And I agree that the spec is ambiguous on those points.


Simplicity should be a primary goal in the methods used to protect systems. Just because the methods to protect are easy doesn't mean it's easy to crack. For instance, a decent size password and lockout and you're set as far as brute force attacks. They are not going to guess a 10 letter password in 5 tries. After x tries, make them reset. Two factor auth for really important stuff. Isn't that pretty much it?

I believe we're seeing more successful attacks from the use of security techniques that are unnecessarily complex and not completely understood (or partially implemented) by most engineers than because passwords aren't long enough.


Password complexity rules are stupid. The only thing that matters is the total entropy. "Entropy too low" is the only error a user should receive when coming up with a password.

Those complexity rules are the result of an entire industry blindly following the best practices of an old unix DES crypt function. It's dumb and it should stop.

http://security.stackexchange.com/questions/33470/what-techn...


Going a step further, I don't even understand why we need any kinds of un-skippable errors because of this.

If I wanted a single number, say, '1' as my password, why shouldn't I be able to use it? It is my account and my responsibility, why does everyone feel the need to enforce something on others.

A simple warning would suffice.


A compromised account affects more than just the account owner. In many scenarios it's possible to escalate privileges and take over an entire service - your password might be all that stands between a malicious user and everyone's accounts.


I can see how this may be a problem for invite-only systems, but a place where anyone can create an account? Plz...

In a scenario like GitHub's organization (where one user has privileges over many shared resources), for example, its owner still carries the responsibility of safety of all its resources.

If it's possible to escalate privileges, then that's the place that needs fixing.


He forgot an important modern rule on authentication: don't do it.

If you can get another system to do it for you (Persona, OpenID, Github, Google, Facebook, or Twitter), it's more secure for the end user. They have features such as two factor authentication and fraud detection, they manage password resets for you, and the end user is more likely to already have an account.

Many developers don't agree with this on a moral level, as you are giving power to a third party. However, developers are developers, and if you do it yourself you're bound to do at least one thing wrong.


Might be a sample of one, but I don't use any services that require me to provide a google / github / twitter account. I have very little trust in what I am authorizing the service to do on my behalf. Perhaps this wariness has come from my experiences with linkedin, where I have been negatively surprised on more than one occasion.


Isn't 2FA the best approach? I'm just asking.

A problem here where I work is that every application must have a different password and it must change every 90 days. Consequently everyone has a spreadsheet with his passwords written down because nobody could possibly remember them all.

It seems to me that with 2FA, one simple password is adequate. Two independent devices need to be compromised and brute force is ineffective since the turn around time is at least several seconds between tries.


One simple password is never adequate as that then trains the user to continue doing that across other sites - regardless of their use of 2FA.

I've found the best solution for me is to use a password manager (personally, I use LastPass) and enable MFA/2FA across everything that allows it.


This doesn't touch on commercial authentication managers and how horribly they can be implemented. There's no authorization cheat sheet either.

They also make assumptions like "When multi-factor is implemented and active, account lockout may no longer be necessary." Sure, until someone finds a simple hole in one of the factors and the rest become trivially brute-forced, sniffed, phished, etc. The chain is only as strong as the weakest link.


I don't really like this page. It's a good effort.. but. (no ands!).

- Most things are just a flyby, such as "hey look, here's a paragraph that tells you what MFA is", but it doesn't tell you how to use it.

- Password rules are outdated: "use caps, 10 char, numbers, etc!". The horse staple blabla has been the new standard for years now... and is way better.. generating the password for the user is often not a bad idea

- no mention of upcoming techs like FIDO/U2F


This is not an accusation, but more of a request for more information:

"Passphrases shorter than 20 characters are usually considered weak if they only consist of lower case Latin characters."

This goes against the concept of diceware generated passwords of 4-6 short words doesn't it? Where in this equation am I getting it wrong? I've been approaching passwords like this for a while now.
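
Some arithmetic that may help here, as a Python sketch. It assumes every word or letter is picked uniformly at random and uses the standard diceware list size of 7776 (6**5) words. The helper name is made up.

```python
import math


def random_choice_bits(pool_size: int, picks: int) -> float:
    """Entropy in bits of `picks` independent uniform choices from `pool_size` options."""
    return picks * math.log2(pool_size)


DICEWARE_WORDS = 7776  # 6**5, five dice rolls per word
for words in (4, 5, 6):
    print(words, "words:", round(random_choice_bits(DICEWARE_WORDS, words), 1), "bits")
print("20 random lowercase letters:", round(random_choice_bits(26, 20), 1), "bits")
```

Each diceware word is worth about 12.9 bits regardless of how many letters it has, so the strength of such a passphrase depends on the word count, while the quoted rule measures character count.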


"An application should respond with a generic error message regardless of whether the user ID or password was incorrect. It should also give no indication to the status of an existing account."

If so.. then how to respond on user registration page when someone tries to open a new account with username / email address of an existing account?


    cat registrationError.php

    <?php echo "we don't want your kind here. go away." ?>


and if you use email addresses as user names, verify they actually own the email address. Apple doesn't do this, and I'm amazed at how many people sign up for iTunes with email addresses I control.


OTP passwords don't fight client-side malware lol.


Some of this is good advice, but there's a BIG point on the 'Password Storage Cheatsheet' that's linked and referenced by the above article, that I don't think is solid. https://www.owasp.org/index.php/Password_Storage_Cheat_Sheet...

Specifically, it suggests storing passwords using: `return [salt] + HMAC-SHA-256([key], [salt] + [credential]);`

It goes on to note that the key must be protected as private, strongly random, and stored outside the credential store, while the salt (can we please call it a nonce?) can be safely stored against credentials.

I'm still not comfortable with this construction. Stick with their earlier advice and use scrypt, bcrypt, or PBKDF2. That's my order of preference too, which differs from theirs somewhat, but that's a minor quibble; all three are reasonable.

The problem with their construction is that HMAC-SHA-256() is designed to be fast, and so attackers have the opportunity to make a lot of guesses quickly. The secrecy of the key helps over a straight SHA256(...), but not a lot, for the following reasons:

1 - It assumes an attacker who compromises the credential store won't also have the key.

2 - It assumes an attacker is unable to recover the key.

1) is a valid assumption for certain classes of attack. If your key is stored in an environment variable or something, while credentials are stored in a database, an attacker who compromises your database via a SQL injection won't have the key. But the problem is that an attacker who compromises the application may. If I have full remote code execution on the server (and you have a bad day then, passwords aside) I'll have the key. Or maybe an attacker has an arbitrary file read (not quite as bad a day), and you store your key in a flat file on disk. Or an attacker can cause your application to generate a stack trace (disclosing runtime details) and view the key...

You get the idea - there's lots of potential ways to get that key regardless how you store it, and everything hinges on that. Once they have the key, they can mount dictionary or brute force attacks against credentials just as they would against `SHA256([salt]+[credential])`

2) is a valid assumption only if the attacker doesn't know a single password's plaintext value and the key is sufficiently long and random. If I know the password to my own account (and I likely do) or any other account (let's say one user on the site uses a password that was recovered in another breach) this scheme fails.

Suppose my salt is "SALTYSALT" (okay, so my PRNG sucks; also a detail to be wary of) and my password is ye olde "PASSWORD" (yah, this sucks too.. but it's my crappy password. Maybe I added it just to observe the resulting HMAC value?) Now I can just try calculating `HMAC-SHA-256("A", "SALTYSALT" + "PASSWORD")`. If that doesn't match, try `HMAC-SHA-256("B", "SALTYSALT" + "PASSWORD")` and so on. The one thing I don't really know that would help is the length of the key. If it's long enough (we'd want at least 256 bits), and strongly random, I may have a difficult time. If it's short (and they make no recommendation on its length, just that you "Generate the key using cryptographically-strong pseudo-random data"), I'm going to crack the key, and then I'm back to attacking all the other credentials.

I MIGHT be convinced of this being reasonable if it can be ensured that all the HMAC calculations are done in something like an HSM or TPM which generates a large key internally and doesn't expose it, even to the application. But that's probably not the scenario we're talking about here. Even then, you've got nothing to lose by using an adaptive algorithm rather than the HMAC construct. For anything else, it's far safer (and easier really!) to use scrypt, bcrypt, or PBKDF2. So just do that.
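
The key-guessing attack above can be sketched in a few lines of Python. The two-letter key, the uppercase-only key alphabet, and the helper names are assumptions chosen so the demo finishes instantly. A long random key would not fall to this loop.

```python
import hashlib
import hmac
import itertools
import string
from typing import Optional


def keyed_hash(key: bytes, salt: bytes, credential: bytes) -> bytes:
    """The construction under discussion: HMAC-SHA-256(key, salt + credential)."""
    return hmac.new(key, salt + credential, hashlib.sha256).digest()


def recover_key(target: bytes, salt: bytes, known_password: bytes, max_len: int = 2) -> Optional[bytes]:
    """Brute-force a short key from one known (salt, password, hash) triple."""
    alphabet = string.ascii_uppercase.encode("ascii")
    for length in range(1, max_len + 1):
        for candidate in itertools.product(alphabet, repeat=length):
            key = bytes(candidate)
            if hmac.compare_digest(keyed_hash(key, salt, known_password), target):
                return key
    return None


stored = keyed_hash(b"QZ", b"SALTYSALT", b"PASSWORD")  # hypothetical weak key
print(recover_key(stored, b"SALTYSALT", b"PASSWORD"))  # b'QZ'
```

Once the key is recovered, every other stored value is only one fast HMAC per guess away, which is the point about preferring an adaptive algorithm.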


> Some of this is good advice, but there's a BIG point on the 'Password Storage Cheatsheet' that's linked and referenced by the above article, that I don't think is solid.

I read it as recommending both (adaptive hashing and "local parameterization"). Even if you use a separate device (an HSM, for example) to encrypt the passwords (I'd encrypt instead of HMAC), you should indeed not give up on adaptive hashing.

Here's some good commentary on this by Solar Designer (user solardiz on Reddit): http://www.reddit.com/r/netsec/comments/26d52c/yescrypt_pass...


"I read they recommend to use the both (adaptive hashing and "local parameterization")."

I read their text as recommending A or B, based on the intro where they state "Two approaches facilitate this, each imperfectly."

"As even if you utilize separate device (HSM for example) for encrypting the passwords (I'd encrypt instead HMAC), you should indeed not give up on adaptive hashing."

Right, this is the sort of thing that led me to hedge by saying I MIGHT be convinced if an HSM were involved; it still gives me pause, for sure.



