
The definitive guide to forms based website authentication - mmare
http://stackoverflow.com/questions/549/the-definitive-guide-to-forms-based-website-authentication
======
mdemare
Well, this is weird. I created this question when StackOverflow was just out
of beta, hoping to steer it toward broader questions - guides, if you wish.
This question really took off, but the format didn't, and SO mostly became a
stack of incredibly specific questions and answers.

And now somebody, but not me, has submitted this question to HN. Under my
name. I'm puzzled...

~~~
mceachen
It's not under your name, unfortunately. mdemare != mmare

~~~
pserwylo
But it indeed seems to be somebody who is claiming to be the author of the SO
post (perhaps for karma?).

They only created the HN account today in order to post this story, and they
chose a username which is obviously a reference to Michiel de Mare from the SO
post.

Whereas according to mmare's profile, he's been around on HN for a good while
now.

I agree that it's strange.

------
UnoriginalGuy
I'm normally highly sceptical of anything which is essentially a how to guide
on security of, well, anything but I have to say whoever this author is they
absolutely know their stuff.

Normally security advice is just 1980s circle-jerking of the same meaningless
"sound good" concepts (e.g. "At least one upper-case, number, special
character") but actually, no, not in this case.

Instead he is giving advice which is modern, which is based on how people
actually use these systems, and also the common mistakes developers make while
building them (e.g. not hashing forgotten password keys).

He even linked to NIST Special Publication 800-63 and THEN talked about login
attempts over time. This dude is just incredible. I literally couldn't have
written a better article than this.

~~~
moe
To provide a counterpoint, the section about the "Remember Me"-cookie is
rather terrible (I stopped reading after that).

It's not fundamentally flawed but rather inelegant (and potentially expensive)
to store a magic number server-side for each session. You can implement the
same thing more easily by handing out tamper-proof (HMAC) cookies containing
the start- and end-time, and storing only the last_logout-timestamp for each
user on the server-side.

Any cookie where _expire_at < now_ or _created_at < last_logout_ is to be
rejected at validation time.
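
A minimal sketch of the scheme described above, not a vetted implementation. The names `SECRET_KEY`, `issue_cookie`, `validate_cookie` and the `last_logout_for` lookup are made up for illustration; the only server-side state assumed is one last_logout timestamp per user.

```python
import base64
import hashlib
import hmac
import json
import time

# Assumption: in practice this is a long random secret loaded from configuration.
SECRET_KEY = b"replace-with-a-long-random-secret"


def _sign(body):
    """HMAC-SHA256 tag over the encoded cookie body."""
    return hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()


def issue_cookie(user_id, lifetime_seconds, now=None):
    """Return a tamper-evident cookie carrying the user id and validity window."""
    now = int(time.time()) if now is None else now
    claims = {"uid": user_id, "created_at": now, "expire_at": now + lifetime_seconds}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    return body + "." + _sign(body)


def validate_cookie(cookie, last_logout_for, now=None):
    """Return the user id if the cookie is authentic and still valid, else None.

    last_logout_for is a hypothetical callable: user id -> last_logout timestamp.
    """
    now = int(time.time()) if now is None else now
    body, sep, tag = cookie.rpartition(".")
    if not sep:
        return None
    # Constant-time comparison of the received tag against the expected one.
    if not hmac.compare_digest(tag.encode(), _sign(body).encode()):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    if claims["expire_at"] < now:
        return None  # expired
    if claims["created_at"] < last_logout_for(claims["uid"]):
        return None  # issued before the user's last logout
    return claims["uid"]
```

Logging out (or changing the password) then amounts to setting the user's last_logout to the current time and, for a password change, issuing a fresh cookie.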

~~~
jahewson
The problem for most mere mortals with the HMAC scheme is that there is a very
real possibility of the server being compromised and the secret key being
stolen. In this case an adversary could generate valid cookies for any user.
However, with the magic number scheme as long as only the _hash_ of the random
number is stored (we should add a per-user salt too) then the entire session
database could be compromised, but an attacker cannot do anything with it.

Also, though less of an issue with SSL, the HMAC approach is subject to replay
attacks until _expire_at < now_.

EDIT: The HMAC approach also lacks any method to invalidate cookies manually,
or automatically e.g. when the user changes their password. This means a
compromised account is open to attack until _expire_at < now_, and there's
nothing you can do about it other than blocking the account for that duration,
which now means that each request needs to do a database lookup to see if the
account is blocked. You could generate a per-user secret key, but now you have
a database lookup again, so you might as well use the magic number scheme.

~~~
moe
If your server is compromised you tend to have bigger problems than session
forgery. Why would an attacker bother to fabricate http-sessions after he
already gained access to your database and, in most cases, source-code?

Replay-attacks work the exact same whether you use this scheme or store a
magic number server-side. There's no difference whatsoever.

For a password-change you update the last_logout timestamp and hand the user a
new cookie (since his current one was just invalidated).

------
DenisM
As a rule most security advice on stack overflow is dangerously wrong. It's
just not a good topic for the site, because consensus is often wrong on such
complicated questions.

I don't see anything obviously wrong with this particular article (aside from
challenge response or SSL choice - one should just always use SSL, and if you
can't, then seek professional advice), however I am still apprehensive of the
hive mind.

~~~
oakwhiz
There was some information in there about SRP being patented that I thought
was misleading. It is patented, but it's freely licensed.

~~~
dfox
The main problem with SRP being mentioned at all is that it has no meaningful
security value in web application context. It makes sense when client does not
entirely trust server, which makes no sense when you deliver client as bunch
of .js files from the same "untrusted" server.

~~~
oakwhiz
Very true, however, most users are willing to download such things as native
programs (e.g. installing web browsers) from unauthenticated sources over
unencrypted connections. If you are in a position to inject .js resources,
then the user's security would be compromised anyway.

EDIT: SRP is going to be integrated into TLS soon anyway, so we might as well
hold our breath for that.

------
optimusclimb
2 points by sreeix 160 days ago | flag | discuss
<http://news.ycombinator.com/item?id=4047424>

316 points by moonlighter 457 days ago | flag | comments
<http://news.ycombinator.com/item?id=2859234>

~~~
mceachen
One url is terminated by a /. HN isn't doing URL normalization (by hitting the
URL and only recording what is at the end of the 302-redirect-chain)

------
dochtman
Added a mention of/link to Mozilla Persona.

IMO, it's the easiest way to handle authentication today, fully decentralized,
secure, and with nice privacy guarantees. With it, you don't have to care
about user names (just use email addresses), passwords and secure storage
thereof, it mostly just works (and once it gets linked into the big email
providers in December or so, almost everyone will already have an account).

~~~
rodolphoarruda
how does it prevent that sniffing data issue from happening when you are not
using SSL? Or you just cannot use Persona without SSL?

~~~
callahad
Presuming you're using session cookies, Persona is no less secure than any
other reasonable authentication system when used without SSL.

It also has the nice property that what Persona transmits over the wire -- the
proof of identity -- is only valid for 120 seconds. Sniffing it in real time
would temporarily allow you to masquerade as another user on that specific
site, but any sort of delay and you're locked out.

This is a huge improvement over, say, transmitting passwords, which could
grant access to an account for months or years.

------
jicktroyat
In the article they talk about the 500 worst passwords of all time. Here is a
gist listing those passwords. <https://gist.github.com/4033452>

Might be useful for some of you.

~~~
ktf
As an X-Files fanboy, I was pleased to see "trustno1" on that list!

------
y0ghur7_xxx
Where I work we use something simple like kerberos/basic/digest/custom http
header authentication on our apps, and then put Apache with mod_auth_form in
front of it (or ISA server).

I even wrote an authentication reverse proxy[1] in java in my spare time, so I
can use that to publish my apps, and have SSO across all of them (until
BrowserID becomes mainstream that is). This way I centralized the cookie auth
problem, and don't need to care about it in every app.

[1]<http://p.r0xy.it/>

------
jonalexr
Regarding website authentication, I've been looking for some feedback on a new
auth scheme.

Instead of using a standard password (all characters are allowed, min 5
characters, common passwords not allowed), you're able to login with a 4 digit
passcode. I know someone just cringed at that thought, but the idea
centralizes around improving user experience on the website.

First, all normal precautions would be taken (no common digit patterns - 1234,
1111, 2222, etc). There would also be a limit of two attempts before the
passcode is reset. The reset procedure would be them receiving a new passcode
via SMS, and them having to reply "yes" before the account is unblocked. The
passcode is also reset every month, and a new one is sent via SMS to your
phone (you can reply to change the passcode to something else).

Now for the issues I would need to address before this is even a possibility:

1) Users on the website login with their phone number, so one obvious attack
would be someone cycling through all possible phone numbers with the same
passcode (for example 8237). One suggestion in the article was detecting
average error rates and comparing them to see if the entire website login
should be throttled.

2) If someone somehow gets a hold of the database, all passcodes would be
easily crackable. Now usually this would be a huge issue, but this is because
normally people could use the email/password combination to login to other
websites the user might use. Since they're using 4 digit passcodes, this
wouldn't apply.

3) Someone could write a script to try phone number/passcode combinations
until the entire website has their passcode reset, but this would fall under
1) where the error rates would exceed the normal limits and the logins would
be throttled.

4) What would be an appropriate way to throttle? I mentioned it twice above,
and in the article it was referring to a timeout, but the user experience of
this would negate all benefits of a 4 digit passcode. Someone could keep
trying combinations, and keep throttling the site every day. I could block the
IPs, but what if those IPs were also sources of legitimate traffic, so that
blocking them stops users from logging in/signing up?
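
For a sense of scale on issues 1) and 3), here is a rough back-of-the-envelope sketch. It assumes passcodes are spread uniformly over all 10,000 four-digit values (the excluded patterns make the real keyspace slightly smaller) and that the attacker gets the two attempts per account mentioned above; `expected_hits` is a made-up helper name.

```python
def expected_hits(accounts, keyspace=10_000, attempts_per_account=2):
    """Expected number of accounts opened by one sweep over every phone number.

    Assumes uniformly distributed passcodes and distinct guesses per account,
    so each account falls with probability attempts_per_account / keyspace.
    """
    return accounts * attempts_per_account / keyspace


if __name__ == "__main__":
    for accounts in (1_000, 100_000, 1_000_000):
        print(accounts, "accounts ->", expected_hits(accounts), "expected hits per sweep")
```

Under those assumptions a single sweep is expected to open a fixed fraction of all accounts, and it also forces a passcode reset on nearly every other account, which is the denial-of-service concern raised in 3).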

Thoughts?

~~~
darklajid
Sorry, nothing personal.

But this 'new' approach feels like last decade online banking - and it wasn't
a good idea at that point.

In addition: Limiting user input and forcing password resets is, in my world,
directly acting against your idea of 'improving user experience'.

If I am allowed to use a password of my choosing, I'll probably come up with
something that is memorable and reasonably secure (depending on the context, I
admit). If you force me to follow random, voodoo rules (just digits, at least
one digit and one upper-case letter, more than x but LESS THAN y chars) I'm
going to sigh, come up with something like 'YeahRight123' and I'm going to add
a mental note to never trust this service fully. If I'm not leaving right
away, that is. Resetting a password regularly (oh.. I hate everything
noticeable SOX forces upon us)? Cool, you just motivate me to make my password
'cool123' - 'cool234' etc. (with variations for 'clever' password checks. If I
cannot keep a prefix, I'll juggle different parts and keep the same, crappy,
useless, insecure password, because .. I cannot be bothered to follow
arbitrary idiot rules)

Your idea follows the worst practices in terms of restricting the keyspace and
auto-resetting the password at arbitrary times, starting out weak already (4
digits..).

I wouldn't sign up with some 'security' in place that follows your suggestion.

~~~
jonalexr
It's great to get different perspectives on the concept. I agree that
enforcing rules and resets does impact user experience.

What if the user was to authenticate once via SMS (we send them a code and
they enter it within a reasonable time period), and once they do, they're
authenticated for an infinite amount of time. This way they don't need to
remember a passcode, and just need to have their phone on them when accessing
the website from a new computer - a similar experience to two factor auth.

------
eze
I hope this gains traction before it's closed as subjective or such...

------
thefsb
it's mostly good. NIST abolished their algo for password entropy estimation
some time ago. i do not much like any password strength tests, most of which
rate any number of terrible passwords as strong. as such i think they give a
false sense of security. maybe consider cracklib.

as DenisM said, always use SSL for all traffic if security matters and don't
trust SO for security advice.

~~~
debacle
The only really useful password strength test would be one that said "A stock
Thinkpad would be able to brute force this password in $x hours and $y
minutes."

Might make people think twice about that six character password.
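
A rough sketch of such an estimate. The guess rate is purely an assumption (it depends enormously on the hash function and the hardware), and `brute_force_time` is a made-up helper that only covers exhaustive search over one fixed length and alphabet.

```python
def brute_force_time(length, alphabet_size, guesses_per_second):
    """Worst-case (hours, minutes) to try every password of exactly `length` characters."""
    total_candidates = alphabet_size ** length
    seconds = total_candidates // guesses_per_second
    hours, remainder = divmod(seconds, 3600)
    return hours, remainder // 60


if __name__ == "__main__":
    # Assumption: one million guesses per second against an offline hash.
    hours, minutes = brute_force_time(length=6, alphabet_size=26, guesses_per_second=1_000_000)
    print(f"Brute force in at most {hours} hours and {minutes} minutes")
```

With that assumed rate, six lowercase letters fall in roughly five minutes; dictionary attacks would be faster still, so this is an upper bound on the attacker's effort, not a strength guarantee.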

~~~
bigiain
How about a response that says "we just googled that combination of email
address and the md5 hash of that password, it's been listed in at least 7
different database disclosures, including the Gawker one, the Sony one, and 5
different pr0n site compromises. We suggest using a different password here."

;-)

------
frasierman
Quick note about CAPTCHAs... A more accurate rate is $1.50 per 1000, and
that's even a tad expensive.

If you buy in bulk, it's much cheaper.

Source: Security researcher.

------
hayksaakian
Two things that stood out to me:

Given that the most common 50 passwords are known, why not reject them
outright? Simply state to the user: your password is too easy to guess.

Passwords should always allow spaces in order to allow people to use easier to
remember passwords, a la xkcd.

<http://preshing.com/20110811/xkcd-password-generator>
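
Both points can be sketched in a few lines. The set below is a tiny illustrative sample, not the real top-50 list, and `check_new_password` is a hypothetical helper name.

```python
# Illustrative sample only; a real deployment would load a full list of common passwords.
COMMON_PASSWORDS = {"123456", "password", "qwerty", "letmein", "trustno1"}


def check_new_password(candidate):
    """Return (accepted, message). Spaces are allowed, so passphrases work."""
    if candidate.strip().lower() in COMMON_PASSWORDS:
        return False, "Your password is too easy to guess."
    return True, ""


if __name__ == "__main__":
    print(check_new_password("Password"))
    print(check_new_password("correct horse battery staple"))
```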

------
duncans
> I see multiple, severe problems with this old question from 2008 and I am
> tempted to delete it outright -- primarily because the most highly voted
> answers read more like blog rants than actual "answers".

<http://meta.stackoverflow.com/questions/95172/old-problematic-question-edit-or-delete>

~~~
papsosouid
It is worth noting that Atwood is the one saying that, and I would trust a
random 3rd grade student's opinion on the subject over his. Notice how he is
unable to provide any actual criticism of the answer, just "I don't like it"?
That is his way of saying "I don't know what I am talking about, so someone
else criticize it and then I'll jump in and say I was going to say that".

------
danielwozniak
It says if you're going to use captcha, use reCaptcha because it is "by
definition hard for ocr". I think it is completely mistaken.

Two words are shown for reCaptcha, one that is "by definition" ocr easy and
one that is hard. You don't need to "solve" the one that is hard. In fact, you
can put anything for the hard one. You only need to solve the part that is "by
definition" ocr easy.

------
novaleaf
I'm jaded, but the first thing I thought when I read this was:

"If I asked this question, 5 minutes later it would be closed as subjective"

------
madjar
The first answer mentions a couple of times that any token given to the user
(for remember-me login or password reset) should be hashed in the database.

Would it be possible to replace the whole storing by signing the token with
some private key, so that the validity of the token can be checked without
having to compare it to some stored value?

~~~
jahewson
Yes, you could use an HMAC for this, however you need to keep the private key,
well... private, which in practice is not easy. If the server is compromised,
an attacker could steal the secret key and use it to generate signed cookies
for any user. This method is also subject to replay attacks for the duration of
the token's validity, though that is less relevant with SSL.

Whereas if only token hashes are stored in the database, then the entire
database could be stolen and nobody can use it to generate valid cookies.

EDIT: Also, if an account goes rogue you have no way to invalidate its
cookies, so you'll have to do a lookup for each request to see if the account
is blocked.
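
For comparison, a minimal sketch of the stored-hash variant. The dict stands in for a database table, and every name here is invented for illustration. A plain SHA-256 is used on the assumption that the token itself is long and random, so there is nothing to brute-force; that reasoning does not carry over to user-chosen passwords.

```python
import hashlib
import secrets

# Hypothetical stand-in for a database table: token hash -> user id.
remember_me_tokens = {}


def _digest(token):
    """Hash of the token; only this value is ever stored server-side."""
    return hashlib.sha256(token.encode()).hexdigest()


def issue_token(user_id):
    """Create a random token, store only its hash, and return the token for the cookie."""
    token = secrets.token_urlsafe(32)
    remember_me_tokens[_digest(token)] = user_id
    return token


def lookup_token(token):
    """Return the user id for a presented token, or None if it is unknown or revoked."""
    return remember_me_tokens.get(_digest(token))


def revoke_user(user_id):
    """Invalidate every outstanding token for one account."""
    stale = [h for h, uid in remember_me_tokens.items() if uid == user_id]
    for token_hash in stale:
        del remember_me_tokens[token_hash]
```

A stolen copy of the table contains only hashes, which cannot be presented as cookies, and revoking an account is a delete rather than a per-request blocklist check.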

------
led76
What do people think of services like <https://www.loginprompt.com/>?
(provides logins as a service for your startup)

Isn't this sort of security something we wish we didn't have to learn? And for
people who don't take the time maybe it's best to let a third-party handle it.

~~~
callahad
> _Isn't this sort of security something we wish we didn't have to learn?_

Absolutely. Time spent on your auth scheme is time you're not spending on
building your product. (And half-assing your auth scheme generally comes back
to bite people.)

That said, outsourcing it to a centralized provider may not be the best idea
for business, user, or security reasons. So it's a balance.

Of course, I'm biased: I work on the Persona team at Mozilla, where we're
trying to build a simple, secure, fully decentralized, and open source
authentication system that fits that niche rather nicely, but the points above
stand: you have to figure out the opportunity cost of your chosen solution.
There's no universal answer.

~~~
cedricd
100% agree with you. I love the concept of Persona, but it has a serious cold-
start problem. If I could implement it and nothing else on my site, I would,
but unfortunately the reality today is that most users don't know it.

------
nsxwolf
I still have no idea what a "Remember me" checkbox is when I encounter one. It
certainly doesn't seem to be a "keep me logged in" function. I don't know if
it has something to do with form autofill, because my browser seems to do that
whether it is checked or not.

Can anyone demystify this for me?

~~~
pygy_
"Remember me" checkboxes and form auto-fill are unrelated.

The form auto-fill behavior is part of the browser UI, and can be
configured in its option menu.

A "remember me" checkbox sets an identifying cookie with a late expiry
date. Until that date, and unless you log out, the web site will recognize you
(the server keeps a registry of which ID number corresponds to which user). No
need to authenticate on connection because the cookie is sent with each
request.

Without the remember me option, the expiry time is short, say 30-60 minutes,
but it may be renewed as long as the user is active. If you're inactive for a
longer period, the cookie will be discarded, and the site will not recognize
you anymore.

When you log out, the session reference is deleted on the server, and,
optionally, the cookie is cleared in the browser.
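
The behaviour described above can be sketched roughly as follows. The lifetimes are assumptions, the dict stands in for the server-side registry, and the function names are invented for illustration.

```python
import secrets
import time

SHORT_LIFETIME = 30 * 60        # assumption: 30 minutes, renewed while the user is active
LONG_LIFETIME = 30 * 24 * 3600  # assumption: 30 days when "remember me" is checked

# Server-side registry: session id -> {"user", "expires", "sliding"}.
sessions = {}


def log_in(user, remember_me, now=None):
    """Create a session and return the id that would be sent as the cookie value."""
    now = time.time() if now is None else now
    lifetime = LONG_LIFETIME if remember_me else SHORT_LIFETIME
    sid = secrets.token_urlsafe(32)
    sessions[sid] = {"user": user, "expires": now + lifetime, "sliding": not remember_me}
    return sid


def current_user(sid, now=None):
    """Return the user for a cookie value, or None if unknown, expired or logged out."""
    now = time.time() if now is None else now
    entry = sessions.get(sid)
    if entry is None:
        return None
    if entry["expires"] < now:
        del sessions[sid]
        return None
    if entry["sliding"]:
        # Short sessions are renewed on every request from an active user.
        entry["expires"] = now + SHORT_LIFETIME
    return entry["user"]


def log_out(sid):
    """Delete the server-side session reference."""
    sessions.pop(sid, None)
```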

~~~
skeletonjelly
> The form auto-fill behavior is part of the browser UI, and can be
> configured in its option menu.

The website can dictate whether or not auto-fill is allowed.

------
dools
Shouldn't this be called the definitive guide to _session_ based
authentication?

------
duncans
> if an attacker got his hands on your database, he could use the [persistent
> login cookie] tokens to log in to any account

If an attacker gets his hands on your database, it's kind of game-over
already.

~~~
DenisM
I don't mean to be rude, but you clearly don't understand this subject.
Databases leak, not least of all due to human errors. Half the effort in
computer security goes to preventing the leaks, and the other half goes to
mitigating the consequences of such leaks. Hashing the passwords, salting the
hashes, the entire md5/sha/pbkdf/bcrypt/scrypt debacle, all of these things
are there _only to mitigate the consequences of a database leak that is
presumed to happen at some time in the future_.
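
A small sketch of what that mitigation looks like in practice, using PBKDF2 from the Python standard library. The iteration count is an assumption to be tuned for your hardware, and `hash_password` / `verify_password` are invented names; bcrypt or scrypt would serve the same purpose.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumption: tune upward until hashing is as slow as you can afford


def hash_password(password):
    """Return (salt, digest) to store; the password itself is never stored."""
    salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)
```

If the table leaks, the attacker holds salts and slow digests rather than passwords, so every guess costs `ITERATIONS` hash rounds per user.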

------
criswell
I love the attention to usability in the first answer.

~~~
mmishra
I am also amazed at the way StackOverflow manages such a huge knowledge base.
Information is so nicely organised and unwanted content automatically gets
trimmed out in the end.

It is such a beautiful product. I like it particularly for the way they broke
the rules of conventional forums (Yahoo forums, Google Groups) for technical
discussions.

------
nicolaus
Or use Mozilla Persona. <http://www.mozilla.org/en-US/persona/>

------
zobzu
It's full of good info, but most of the time now, i'd just put persona and be
done with it

------
bjhoops1
Fantastic resource. Just what I was looking for.

------
WayneDB
Why do maximum security sites always disable auto-complete for username and
password?

That seems less secure to me. If I always have to type in my password, chances
are that I'll choose a password that can be easily remembered or I'll be
forced to write it down somewhere.

(Personally, I use plugins to get around this anyway. My computer, my rules.)

~~~
jonknee
Probably to prevent people accidentally saving a login on a shared/public
computer.

~~~
bigiain
The curious thing about this "solution" is that it's pretty fundamentally
broken. If you're authenticating to any important site by typing a
username/password into a computer you don't "trust", you're doing it wrong.

The subset of computers in between "people I don't trust also use this
computer" and "this computer could easily have had a key logger or root kit
installed" must be vanishingly small.

If you don't own it (or trust the person who owns it enough to satisfy your
personal security requirements), then any username/password you type into it
should be considered "possibly compromised" no matter what measures the
website has taken to protect you. Two factor auth helps, but still has the
problem that 2/3rds of your auth credentials could be compromised (the
attacker could end up knowing your gmail username & password, leaving only the
six digit auth-code to brute force, which I _hope_ google have sensible
protection in place for). Single use passwords also help, but both tfa and
single use passwords don't protect against an attacker who 0wns the machine
seeing and recording everything that happens in your current session -
including I suspect for a sufficiently skilled attacker (or perhaps even a
script kiddie with an off the shelf tool), complete access to the post SSL
decrypted data inside a trojaned browser (if I can modify the browser, none of
the httponly or secureonly flags for your session cookies are safe, sure,
JavaScript can't extract them, but the browser code can… And it could be
exporting them in real time to the bad guy, or piggybacking proxied
instructions to empty your bank account via Western Union while you check your
credit card balance)

~~~
jonknee
I don't disagree, but if you run a site of any size you will quickly realize
that users will do all sorts of crazy things against the best practices for
security. One of the top (if not the top) search requests Google gets is
"facebook login". What do you want to bet that a lot of those requests are
coming from a shared computer?

