
Life in a post-database world: Using crypto to avoid database writes - ComputerGuru
https://neosmart.net/blog/2015/using-hmac-signatures-to-avoid-database-writes/
======
jsnell
Last time I used this approach I ended up regretting it. The problem is that
the links become ugly, unwieldy for the users to handle, and occasionally
cause problems that are rather hard to debug. For example the base64 encoding
I was using for the authentication token included '.' as one of the
characters. Well, turns out that the linkification code in gmail will ignore a
'.' at the end of the link (which is kind of sensible). So 1/64 of my
validation links ended up being invalid.

The supposed benefit of the scheme is that you don't need to evolve your DB
schema as you add or remove validation fields. That's rubbish. If you're happy
with storing data with no schema in the links, you should be equally happy
storing it without a schema in the DB. (For example a token field, an
expiration time field, and a json blob containing the actual payload
corresponding to the token).

~~~
tinco
This is why people use base58 or base36 for binary data in URLs.

There is a really good reason to store these things in the database though,
and that's on-demand authorization retraction.

~~~
bjt
You can still do that on-demand retraction as long as the hash is fed
something from the database that you can change.

IIRC, Django uses both password and last login timestamp inside the hash. If
you needed to invalidate someone's token you could just add 1 millisecond to
their last login time.
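
A minimal sketch of that idea in Python (the field names and key are made up for illustration; this is not Django's actual implementation). Changing any signed input, such as bumping the last-login timestamp, invalidates all previously issued tokens:

```python
import hashlib
import hmac

SECRET_KEY = b"hypothetical server-side secret"

def make_token(user_id: str, password_hash: str, last_login: str) -> str:
    """Sign the fields; changing any of them invalidates old tokens."""
    msg = f"{user_id}|{password_hash}|{last_login}".encode()
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()

def verify_token(token: str, user_id: str, password_hash: str, last_login: str) -> bool:
    expected = make_token(user_id, password_hash, last_login)
    return hmac.compare_digest(token, expected)

# Issue a token, then "bump" last_login by a millisecond to revoke it.
t = make_token("alice", "bcrypt$abc", "2015-01-01T00:00:00.000")
assert verify_token(t, "alice", "bcrypt$abc", "2015-01-01T00:00:00.000")
assert not verify_token(t, "alice", "bcrypt$abc", "2015-01-01T00:00:00.001")
```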

------
oautholaf
There's really a strong reason to do this for things like user session tokens,
where using crypto and avoiding database updates could easily remove a
substantial portion of your database traffic.

But for things that are less common, like password resets and email
validation, consider the downsides:

a) Using the database gives you a built-in (depending on schema) audit log of
these events, where a signed token does not.

b) If the key is stolen in the scheme described, the person in possession of
the key can hack into all your accounts. If you use database state, someone
needs to have access to the database.

c) (Also mentioned above) You will almost certainly be able to have a smaller
token if it refers to database state, which has its own advantages: fewer
copy-and-paste errors, etc.

There's certainly advantages to crypto tokens/cookies, and they're the right
call in some circumstances, but there's downsides to consider as well.

~~~
adventured
Spot on. The article disregards that you really need a log of activity to
check reset requests against.

Should I allow an abusive user to send 37 reset codes in 15 minutes to an
email address they don't own (or even if they do own it)? Absolutely not. How
else do you keep track of that activity without storing it in a database of
some sort to check against?

~~~
riquito
For short lived things like these you can store the info you need in a memory
based storage.

------
Justsignedup
Problems I see:

\- Anyone with access to the secret token can now generate their own urls
without fail or evidence. With a DB you need to at least insert a row and thus
leave evidence (especially if you are auditing db activity)

\- No way to individually invalidate anyone. Example: What if ONE account's db
tokens were exposed. Should be able to invalidate those tokens, not ALL
tokens.

\- Not necessarily easier than one "expiring token" mechanism that is re-used
throughout the system.

\- I don't feel great about exposing all my private security checks in url
parameters. Why give users clues on how I do security? I can provide that in a
better, more meaningful way.

\- No way to easily invalidate all tokens of specific nature. Example: If you
want to invalidate all password reset tokens, all 2-factor authorization
tokens, all confirm email tokens after any password reset, all systems must
know of each other rather than go to the token table and set "valid" on all
those tokens to "false" for everything. The hash would need to account for
what change triggers this hash invalidation rather than just someone picking
what to reset from a dinner menu as a separate concern.

~~~
hvidgaard
> Anyone with access to the secret token can now generate their own urls
> without fail or evidence. With a DB you need to at least insert a row and
> thus leave evidence (especially if you are auditing db activity)

And they wouldn't be valid. Invalidating tokens is much simpler when you don't
need information from the DB than when every check means reading from the DB.

> No way to individually invalidate anyone. Example: What if ONE account's db
> tokens were exposed. Should be able to invalidate those tokens, not ALL
> tokens.

Of course you design it in such a way, that you can change one of the inputs,
which will by definition invalidate the tokens.

> Not necessarily easier than one "expiring token" mechanism that is re-used
> throughout the system.

The point is to separate this boilerplate logic from your core business logic.

> I don't feel great about exposing all my private security checks in url
> parameters. Why give users clues on how I do security? I can provide that in
> a better, more meaningful way.

Any serious security system assumes that an attacker already knows 100% of the
workings of the system. You should not rely on obscurity, but on your secret -
in this case the crypto key.

> No way to easily invalidate all tokens of specific nature.

Again, if this is a requirement, then make such a global parameter an input to
the token generation. It's still much simpler than doing it "the old fashioned
way".

~~~
pwf
> Of course you design it in such a way, that you can change one of the
> inputs, which will by definition invalidate the tokens.

I was looking at using JWT to avoid a database read when authenticating a
request, but in order to get some sort of variable per-user value I'd have to
hit the database to get it, no? Doesn't that kind of defeat the purpose?

~~~
hvidgaard
Said variable could be in memory, but the point isn't to save DB reads; it's
to avoid saving data you don't need to. This is a technique to implement the
same feature without having to store, in your core business data, additional
information that is relatively useless outside of this password reset or user
signup. Of course you may want to save this for BI or audit purposes, but then
you can (and should) use an appropriate separate system.

------
michael_storm
_In other words: a maintenance nightmare, filled with security gotchas,
difficult to scale, and hard to write._

So is the alternative the author describes (except for scaling). HMACs for
temporary credentials are clever and simple, if you're already versed in
crypto primitives. But to anyone else, they're magic security sauce: "But why
can't I just use MD5? Wait, no, that's broken somehow - what about SHA? What
do I include in the HMAC? Do I have to use a new key every time?"

The solution trades implementation complexity for arcane knowledge. The author
is so confident that his solution is superior because he already has that
arcane knowledge.

That said, a well-implemented HMAC-based password reset flow is probably more
bulletproof than a well-implemented DB-based flow. So if you're implementing
AWS, go for HMAC. If you're implementing a blog, go with whatever you feel
like. Just implement it well.

~~~
sk5t
Are you arguing in favor of developer laziness? An average programmer should
be able to understand the use cases for HMAC within a couple hours or so,
given nothing but the Wikipedia page and an IDE or REPL, and knowing about
MACs might prevent him from cooking up all manner of broken security protocols
in the future.

~~~
skybrian
How confident should the average developer be that they are able to create a
secure crypto implementation in a couple of hours, based on some
possibly-dubious advice they read on Wikipedia? I would argue that they
shouldn't be very confident in this.

The first thing you learn about crypto is that it's harder than it looks and
best left to experts. Knowing when it's actually okay is difficult. At the
very least, I'd expect that they ask for a review from someone who does know
what they're doing, and how many developers who don't work at a big Internet
company have a spare security expert handy?

~~~
vidarh
Nobody is suggesting the developer should roll their own crypto. Pretty much
every language out there has mature packages for HMACs. What you need to know
is a few basic things about how to use them to verify the integrity of a
message.

------
aaronlevin
Armin Ronacher wrote about this back in 2013: [My Favorite Database is the
Network](http://lucumr.pocoo.org/2013/11/17/my-favorite-database/)

------
thegeomaster
It's a nice article and generally a sound approach, but reading it has led me
to think about something else: imagine an Internet where the data about each
user (what would normally be entries in different tables on the servers) is
stored, encrypted and authenticated, on the user's computer, as opposed to
being stored on the server. Yes, I know, this goes against the 'cloud'
principle we see around us and introduces a bunch of issues with syncing this
data and it being available everywhere, which is the point of SaaS-like
websites. But on the other hand, small websites could use this approach to
drastically cut their costs and maintenance work they need to do. Also, a
compromise of a server would expose the keys used for encryption, but then the
attackers would need another piece of the puzzle, and that's stored on the
user's computer.

Don't get me wrong, I'm not advocating this principle or implying it is needed
and doable with the current technologies (I don't think we have a way of
reliably storing such long-term data within the user's browser), but it's an
interesting train of thought. It doesn't obviate the need for the server to
store any sensitive information, just some of it. User info, password hashes,
credit card numbers, maybe even social graphs depending on the use case, could
all be stored this way. There will always be websites for which this doesn't
make a lot of sense, but a lot of the other ones, mainly small players, could
benefit from this.

It's curious how, when you describe it like that, it sounds that your data is
being held hostage by a third party. After all, the server doesn't allow you
to see what's in the encrypted opaque blobs, it only allows a controlled set
of operations to be done with this data, and the only thing they do is further
mutate this opaque blob. But it is no less transparent than the current state
of affairs, namely servers keeping the data for themselves.

I'm interested to hear HN's opinion on this, flaws in the approach I've
missed, etc.

~~~
theseoafs
> imagine an Internet where the data about each user (what would normally be
> entries in different tables on the servers) is stored, encrypted and
> authenticated, on the user's computer, as opposed to being stored on the
> server.

Trying to authenticate solely on the client side isn't as fruitful as you
might be imagining, since it introduces potential security issues. Without any
server-side authentication using only information (i.e. a blob) that the
server has access to, it becomes very easy for clients to spoof/pretend to be
other clients.

Consider that if this kind of system were easy/fruitful to implement, a lot of
businesses would have done it already to avoid increased development/server
costs on their end.

~~~
thegeomaster
If the data is encrypted and authenticated with a server-known key, you
couldn't change the data and end up with the server accepting it. How would a
spoofing scenario work in this case? If you can steal another user's blob, you
can pretend to be them, but if you can steal that blob, you can also steal a
cookie if the website uses cookies for authentication. Maybe I am
misunderstanding you.

One issue I can think of is users spoofing older blobs (reminiscent of a
replay attack). I believe this could be mitigated if the server keeps a single
256-bit hash of each user's state (this would also make client-side HMACs
redundant), or a 64-bit ever-increasing integer which is then included in the
state. That's far smaller than any user state.
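
A sketch of the replay mitigation described above (the in-memory store is a stand-in for whatever the server actually uses): the server keeps only a 32-byte digest of the latest blob it issued per user, and rejects any older blob a client replays.

```python
import hashlib

# Server-side: one 32-byte digest per user instead of the full state.
current_digest = {}  # hypothetical in-memory store

def issue_blob(user: str, blob: bytes) -> None:
    """Record the digest of the latest state blob handed to the client."""
    current_digest[user] = hashlib.sha256(blob).digest()

def accept_blob(user: str, blob: bytes) -> bool:
    """Reject any blob that isn't the latest one we issued."""
    return current_digest.get(user) == hashlib.sha256(blob).digest()

issue_blob("alice", b"state-v1")
issue_blob("alice", b"state-v2")              # newer state supersedes v1
assert not accept_blob("alice", b"state-v1")  # replayed old blob rejected
assert accept_blob("alice", b"state-v2")
```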

------
legulere
Why the scheme in case-study 2 can be bad:

User 1 registers username "alice". User 2 registers username "alice". User 1
verifies the account. User 2 tries to verify the account. Assuming the
programmer didn't forget to check a second time whether the account name
exists, there's some error.

This might not happen often, but you still need to write good error handling
for that case.

~~~
TJSomething
Add a salt. It doesn't even have to be that large, given how unlikely a
collision is. Alternately, adding the IP address as a hashed parameter should
distinguish them well.

~~~
legulere
Do I understand you right that you should add something to the user-generated
usernames? That seems to me like something users would reject.

------
ris
This is the way I've always done password resets and activation emails, and it
never seemed particularly non-obvious.

In fact I seem to remember django's default out-of-the-box password reset
mechanism does it more or less this way.

~~~
ars
I've also always done it this way. I didn't realize that was anything unusual.

It just seemed so obvious.

------
lumpypua
An iPad app I like and wanted to build on top of uses signed requests. Except
the signing occurs on the iPad and is verified by the server. Oops. I
disassembled the app and took the secret key to generate my own requests.

------
doomrobo
Cool approach, but I don't see how it could be used to make one-time links.
You can't invalidate a valid immutable datum; you can only wait until it
expires. To make a link expire immediately after being clicked, you need to
store _something_ server-side.

~~~
mschwaig
If you include the current password hash when calculating the hash for the
reset link and you also include a random salt in your password hash, the reset
link will become automatically invalidated because the password hash will
change when a successful reset is performed.

~~~
raziel2p
I think the point was, if I request a password reset X times for the same
email within the duration of the expiration time, there'll be X amount of
valid password reset URLs that can potentially be bruteforced. It doesn't
matter if the other links are invalidated as long as one of them works.

This is solved by rate limiting, I suppose. Feels like something that
should've been included in the article.

~~~
aptwebapps
> I think the point was, if I request a password reset X times for the same
> email within the duration of the expiration time, there'll be X amount of
> valid password reset URLs that can potentially be bruteforced.

What does this have to do with whether you're storing tokens or using a hash?

~~~
raziel2p
Sorry for the late reply. In the apps I have, I delete any existing tokens for
the given e-mail address when a new one is requested, so only one is valid at
any given time.

------
Tinned_Tuna
Yea... this has been around for years.

JSP and ASP.NET have allowed for this kind of shenanigans in their "view
state" mechanism (albeit, security is a configuration option away...). It's
not hard to extend it out to things flying back and forth to the user.

As for usability, these sort of things should be wrapped up in a nice
container class; HMAC taken care of, and (probably) a key-value API presented.
No fuss, no muss.

If there's no such library, creating one yourself (which is what this post
appears to endorse) could definitely pose a security risk to any project
without sufficient expertise.

Find an existing, tested, reviewed implementation that provides the API you
need, and stick with it.

------
ratsbane
These things can be useful fraud-detection signals. If you don't log password
reset requests then you're missing a chance to catch suspicious behavior. Even
if you're capturing that data but not using it now, at least having it gives you
that potential.

Also, unless you plan on making the valid window very short, you probably want
to mark reset requests as used so someone's old email doesn't get used again.

It's not a completely crazy idea, but the minimal amount of trouble the extra
tables require doesn't seem like a good enough reason to give up the logging
and expiration benefits.

~~~
bad_user
Logging things in a relational database for the purpose of catching suspicious
behavior is a bad idea.

You also don't need to mark reset requests as used, if the generated token was
composed of the user's old hashed password or email, as that token will be
invalid as soon as the user does the reset.

~~~
adventured
It's worth noting the parent didn't say anything about relational databases.
Even if they did, I don't see how that would be a problem.

How else can you keep track of whether there is a wave of reset abuse
targeting a user / email, if not through saving it to some sort of data
store?

E.g. after 5 reset attempts in 15 or 30 minutes, prevent any further reset
attempts for the next X amount of time; either outright, or based on a
signature of the request.

~~~
bad_user
" _Some sort of data store_ " does not imply the normal solution that people
deploy to solve this problem.

There is a big difference between defining a table in your database that keeps
track of resets, or using a queue of messages (event sourcing even?) with
filters applied for detecting abnormal behavior.

The former is a dirty solution, dirty because you're storing junk that
shouldn't be stored in a relational database, and it doesn't take care of
other, much more important kinds of attacks on your system. Whereas the latter
is extensible.

So let's say that in addition to limiting the number of resets one does, you
also want to limit the number of failed login attempts to 10 per hour. You may
also want to limit users jumping between IP addresses; you don't want to be
too strict about it, because mobile connections, but you do want to prevent
multiple active sessions that use the same user credentials. You may also want
your system to evolve based on taking averages over users' past activity.

Now where does that data go? Of course it's in "some kind of data store", but
that says nothing, because the log files stored on disk are also a data store.

------
francescolaffi
you should have a look at JSON Web Token [http://jwt.io/](http://jwt.io/)

~~~
vidarh
Interesting, but massive overkill when you're transferring assertions between
two parties where only one party (the server) is allowed to create the
assertions.

In the examples in the article, the JWT header is just plain cruft, because
you're unlikely to be switching encodings often (and if you decide to,
including a single much shorter token as a "stand-in" for the bloated JSON
data would be much better than using JSON).

The payload also represents a lot of extra overhead unless you intend to
transfer more than just a single-level dictionary.

It's kind of comical that they present it as "compact" given that probably
something like 30% of the length of the presented example is unnecessary.

------
EGreg
About the only thing useful here is having the session id be validated before
being used. PHP is notorious for allowing the client to select any session id
they want.

I've read "the network is my favorite database" a while ago. Similar concept.
But, the problem is that the urls are really long and wouldn't fit into an
SMS, for example. If you are trying to get users started with your service,
the last thing you want to do is throw more friction at the process.

It's a bit like Microsoft shuttling ViewState back and forth in the old ASP,
instead of just storing sessions in the database. Signing things is good for
validation, to prevent giving out resources to a client which wasn't given
valid credentials. But that stuff should be stored in cookies, not links.
Links should be simple and to the point. And if that means having the links
contain a token the server can use to look up additional info, tinyurl-style,
then so be it. That lookup is fast, and once you did it, you can cache it in a
hash based cache like memcache. Better experience for the end user and better
security for you.

In short - I agree with the author's premise about guarantees etc. but when
giving humans links, they should be simple. Store all the authentication crap
in cookies.

To bring my point home, I'll just remark that inviting someone by sms or email
not only motivates them to sign up; having them click a unique link also
verifies their mobile number or email address in one shot. One click and they
have an account.

------
ryan-allen
I love this technique. I copied it from Amazon a while back and have been
using it to avoid storing state in the database ever since.

Also, Rails has done something similar with their sessions, ages ago, with the
whole state being stored in the client with a secret on the server.

It's such a cool idea, and now there's the JWT [1] thing that you can use,
which I use all the time these days.

[1] [http://jwt.io/](http://jwt.io/)

------
njohnson41
Using HMAC'd tokens to authenticate requests is pretty common in the form of
cookie-based session variables (at least in most non-PHP web frameworks, e.g.
Flask). This is really the same technique, except for email requests, where
the data is embedded in a URL instead of a cookie.

Not to say it's not useful, just that it's not really novel. You also still
have to be careful of replay attacks, which the author briefly addresses.

~~~
tedunangst
It's both common and a common source of mistakes. Every web framework I know
of that uses HMAC'd sessions has had at least one glaring vulnerability that
could have been avoided by using the old school opaque token database lookup
technique.

------
plorkyeran
I tend to mistrust using crypto for things that can reasonably be done without
crypto. Even if the initial implementation is perfect, it's too easy for a
future maintainer to make a simple mistake that introduces major security
flaws. The place to use this sort of technique is where it enables a
fundamentally different architecture for your application, and not just where
it's a small performance optimization.

~~~
adevine
While this is true, to my mind it's almost "6 of one/half dozen of the other".
Using the examples given, there have been many security flaws where a
generated token like a password reset token (one stored in the DB) is not
truly random, or not as random as the implementer envisioned.

Also, keeping systems secure when they need to support use cases that
inherently try to subvert some measure of security (like "forgot password") is
difficult, and I find the difficulty is much more likely to be in the overall
workflow than flaws in the "generateToken" and "validateToken" methods
themselves.

------
sytse
I loved the article and all the use cases are valid. However I don't think the
first two use cases save that much effort.

1\. In Rails apps with the devise gem the password reset token is just a
column, not a whole other table.

2\. Email activation: having the additional non-activated records is not a big
problem; 20% extra records in my users table is pretty acceptable.

~~~
bryanlarsen
It's not the extra database columns, it's the removal of a state field. The
addition of a single boolean state field essentially _doubles_ your API
surface, because now you have two types of users that may try to do things,
and it's highly unlikely you have saturation testing for unverified users.

~~~
tomjen3
Actually Rails has (or had; I haven't looked at it seriously since the
load-and-execute-all-the-yaml fiasco) a relatively simple before filter (it
may have been called before_filter, actually) that you use to filter out users
who are not signed in. It wouldn't be difficult at all to change that to check
that the user is signed in and verified, and to update the single place where
an unverified user is allowed in (likely /verify) to not use that filter.

Django has native support for unverified users.

That said it is still a good point.

~~~
bryanlarsen
But that's not the only place you use users. What about batch jobs? What about
every place where you get a list of users for the admin for one reason or
another? Did you cover and test those? Devs seriously underestimate how much
the addition of a state field impacts their code, especially once you get
the combinatorial explosion of several state variables interacting...

------
Kenji
>HMAC256("userId=johnnysmith&expirationTime=1356156000&oldBcryptHash=$oldBcryptHash&clientIpAddress=$clientIpAddress",
$mySuperSecretKey)

Maybe I'm wrong, but he can still use this password reset link even if he
remembered his login and logged back in: the ID stays the same, the token
might not have expired, the bcrypt hash stays, the IP stays. That, of course,
is suboptimal.

~~~
sk5t
I don't understand what problem you are describing. A user who remembers his
momentarily-forgotten password can of course change or reset it regardless.
The article quite sensibly describes using the old password hash as a nonce
without pestering the database with extraneous records.

~~~
Kenji
It would be nice to stop all reset tokens if the user has logged in after the
reset tokens were made. But you're right, it's not really a problem unless one
chooses to make it one ;)

~~~
sk5t
One could overcome this by tracking the user's last password-driven login date
and rolling it into the MAC; alternately, just use a short expiration time on
the reset token and don't sweat it.

~~~
morgante
But that's just introducing additional state (last login date), so if you're
starting to do that there's not much point in the HMAC approach.

~~~
yeukhon
I think it all comes down to what benefits your users' security.

If you want to expose to users their account's activity, like a history of
recent password resets or recent login activity (the things a typical FB user
would see in the account activity page), that can be a really nice security
feature. So maybe after all the HMAC approach works just fine with extra state
added. But I think the point the author is making is that the application can
simply do HMAC(.....) and verify it in a few lines of if/else branching,
whereas the traditional password reset code would probably take more lines and
more conditions. I had written code that did exactly what the author dislikes,
and trust me, that code wasn't pretty; I spent many hours optimizing it. Edge
cases and test cases littered my codebase.

------
dmos62
Loved reading this. Most original article I've recently read.

~~~
danielschonfeld
+1!

This was a really refreshing read on HN. Wish there were more articles like
this one utilizing crypto for modern programming applications.

------
meesterdude
Great article!

While I don't think I'll be implementing any of the solutions outlined in the
article (because the database does not represent a pain point for me) I do
think this is a fascinating technique and is something worth keeping in the
toolbox as a solution.

------
jakozaur
One other use case is session ids. Pretty much, you can sign:

user id|timeout|user custom property

Having a custom property (a per-user number) gives you a way to kill sessions.
E.g. if a session is compromised, just increment the number and regenerate the
session for the valid user.

------
hurin
_In other words: a maintenance nightmare, filled with security gotchas,
difficult to scale, and hard to write._

I'm not sure I follow this. To change a password you need to write to the db
anyways, it's one extra check to set token used=1 or remove it. Why is that a
huge deal?

 _Now you don’t have to worry about clearing old bitrot from the database or
worrying about when to expire non-verified accounts._

It's also possible to use Redis or another fast store with timeouts. The
problem seems rather exaggerated.

------
randerson
This is all fine and well until you need to measure "how many users reset
their passwords" or "what is the bounce rate of users confirming their email
addresses". This may be useful for avoiding sensitive data, but it's not a
practical substitute for writing to the database, nor is it really that much
effort to create two database tables.

~~~
morgante
That sort of data belongs in an analytics or event database (ie. one with
entirely different requirements from your normal database).

Usually when I implement a stateless approach like this I still fire off a log
event every time we generate a password reset.

------
steven2012
If I understand the article properly, what the author is suggesting is that
instead of writing rows into the database that contain information about
users, you create a hash over some secret + user-supplied information, and if
it matches, that's considered validation.

The reason why I completely disagree with this is because this is embedded in
code. All your developers will have access to this code, and you have people
joining and leaving your company all the time. If this scheme is used to
protect more important information and if it leaks, then it could cause chaos.

The difference is that all developers generally have access to code, but only
some well-trusted people have database access in prod. If I care about
security, that is a more secure way to do things.

~~~
baby
You could make the secret of the MAC an environment variable.

PS: I don't understand why you are getting downvoted, your comment is more
relevant than most of the comments here who don't understand a MAC.

------
jjarmoc
Lately I see this approach used more and more. I really don't understand what
the big gain is. Sure, you save yourself a few database round trips, but is
that really that big a deal?

Compare that savings with the additional size of URL parameters (increasing
network overhead), in addition to the processing overhead of
encryption/signing/validation. Have we really made a performance gain? I don't
have any hard numbers, but I'm willing to wager that the gains are fairly
negligible.

From a security standpoint, I'm not fond of this approach, and I don't think
its benefits outweigh the added risks. Here's why:

\- Encryption (including simple signing) is really easy to get wrong in subtle
ways. A simple mistake like using SHA256() instead of HMACSHA256() breaks the
system badly.

\- A corollary to the first point is that using a strong key is vital. The
article says nothing of the criteria a strong key should meet, and a poor key
could enable additional attacks. While I'm not aware of anything allowing for
HMAC key recovery in better than brute force, a weak key is still problematic:
if your key is a random 10 character ASCII string, it falls to brute force in
minutes at most. If it's 256 random bits, you're much better off.

\- This scheme introduces a significant number of anonymous user-supplied
inputs to the application which didn't previously exist. Each of these is a
potential place for things to go wrong, which can give rise to other
vulnerabilities (XSS, SQLi, etc) depending on what you're doing with that
input. You'll need to be sure that you validate the signature prior to doing
anything with these unverified inputs, in addition to performing sanity checks
on their contents prior to signing. Part of secure application design is to
minimize attack surface and complexity - this does the opposite.

\- Once a signature is created, it can't be revoked. I mean, you could store
the signature in the database and validate its presence on receipt, or include
in your signed values a nonce that's stored in the database, but that kind of
defeats the goal of avoiding database storage. This is something the article
tries to address by signing over things like an expiry time, the user's old
password digest, etc. But that's insufficient as well (I'll get into why
below).

\- Even if you get everything right, key exposure is fatal. You've introduced
a single point of failure in the system that didn't previously exist, and a
fairly catastrophic one. If the app ever leaks that key, your only option is
to rotate it (assuming you detect the exposure), which invalidates all
existing signed values.

Those are the broad points; let's look at the article's examples:

    
    
      http://myapp.com/resetPassword?userId=johnnysmith&expirationTime=1356156000&token=%SECURITYHASH%
    

Okay, so we have an irrevocable reset token for johnnysmith valid until
12/22/2012. The article doesn't mention where the key comes from. Since the
key is the only thing johnny doesn't know, he can attempt as many keys as he
wants offline. If he gets a hit, he can now create a valid token for any
account he wants. _NOT GOOD._ If the key is sufficiently long and randomly
generated, this becomes difficult to the point of infeasibility, but if the
key is weak, so too is the overall security.
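For concreteness, here's a sketch of how a token like that might be generated and checked; the function names are my own, and note the expiry check and the constant-time comparison, both of which are easy to forget:

```python
import hashlib
import hmac
import time

def make_token(user_id: str, expiration_time: int, key: bytes) -> str:
    msg = f"userId={user_id}&expirationTime={expiration_time}".encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()

def verify_token(user_id, expiration_time, token, key, now=None):
    current = time.time() if now is None else now
    if current > expiration_time:
        return False  # token expired
    expected = make_token(user_id, expiration_time, key)
    # compare_digest avoids leaking the match position via timing.
    return hmac.compare_digest(expected, token)
```

A naive `==` comparison instead of `compare_digest` would open a timing side channel on top of everything else.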

As the article notes, the token can also be reused multiple times, even if
johnny has already used it. They offer a means of fixing this:

    
    
      HMAC256("userId=johnnysmith&expirationTime=1356156000&oldBcryptHash=$oldBcryptHash&clientIpAddress=$clientIpAddress", $mySuperSecretKey)
    

Okay, so with the bcrypt digest covered under the hash, a change of password
will cause the signature to no longer match. But now, on verification, we have
to pull the user's hash before calculating the HMAC, so we've added a database
read back, which reduces the perceived benefits.

Furthermore, this token is still valid if Johnny requests a password reset,
and while waiting for the email to arrive remembers that he changed his
password from 'Password123' to 'Password456' and logs in. There's then a
window of time where that token remains valid (until expirationTime arrives)
and can be abused. The article doesn't address this case at all. I suppose
they'd say that on issuing a token, you should disable login to Johnny's
account until he's reset it, but how would you do this? Well, you'd set a flag
in the database on Johnny's record, and we're kind of back where we started,
with both a database read and a write, but with added cryptographic overhead
and attack surface too.

The next example talks about account registration. It states that you can
verify a user's email account by sending them a link to:

    
    
      http://myapp.com/createAccount?email=johnnysmith@gmail.com&token=%SECURITYHASH%
    

This doesn't concern me as much as password recovery, since account creation
is likely to be fairly anonymous anyhow and not nearly as security sensitive.
Still, it's not technically true that the token validates your receipt of the
email. Instead, it validates your receipt of the email, or your knowledge of
the key and values used to generate the token. In practice, this is likely a
small difference, but a truly random token stored in the database would
require database access or visibility into the email to obtain, whereas this
requires only knowledge of the key instead.

Case Study 3 talks about one-time-use and expiring resources. There's not
enough detail here about their scheme for me to really critique, but they
point out AWS/S3 use of HMAC signing as an example. Except that AWS/S3 signing
isn't about one-time use or expiration. Instead, it's about authenticity.

AWS operates with an ACCESS_KEY and SECRET_KEY which are a pair of tokens
related to each other. Requests with 'params' are signed as:

    
    
      Signature = HMAC(ACCESS_KEY + params, SECRET_KEY)
    

Signature, ACCESS_KEY, and params are sent to the server. SECRET_KEY is the
HMAC signing key. On the server side, AWS presumably does a DB lookup for your
ACCESS_KEY, obtains the corresponding SECRET_KEY, and validates that the
signature matches. This authenticates that the requestor has knowledge of the
SECRET_KEY, but it implies the existence of a database lookup. If you think of
the ACCESS_KEY as a username, you'll see that this is really authenticating
each request, as well as preventing tampering. You can issue keypairs that are
good for only one use, or that expire after a set time period, etc., but
there's nothing implicit in the HMAC scheme that causes this. Interestingly,
this scheme almost certainly requires _storing the token in a server-side
database_, the exact thing the author wants to prevent.
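That flow can be sketched in a few lines; the key store dict below stands in for AWS's server-side credential database, and the keys themselves are made up:

```python
import hashlib
import hmac

# Server-side store mapping ACCESS_KEY -> SECRET_KEY. Consulting it
# is itself the database access the scheme was meant to avoid.
KEY_STORE = {"AKIAEXAMPLE": b"example-secret-key"}

def sign_request(access_key: str, params: str, secret_key: bytes) -> str:
    return hmac.new(secret_key, (access_key + params).encode(),
                    hashlib.sha256).hexdigest()

def verify_request(access_key: str, params: str, signature: str) -> bool:
    secret_key = KEY_STORE.get(access_key)  # server-side lookup
    if secret_key is None:
        return False
    expected = sign_request(access_key, params, secret_key)
    return hmac.compare_digest(expected, signature)
```

Nothing here expires or limits reuse; any such policy lives in the key store, i.e. in the database, not in the HMAC itself.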

This can be done reasonably well, but it's a minefield to navigate when we
have a tried and true solution. What's so bad about database storage of
tokens? The post kind of starts with the assertion that they should be
avoided, but I see no clear explanation as to why.

~~~
jjarmoc
Okay, wow... forgive my typos and grammatical errors. Rage replying :)

------
engendered
While encryption is obviously a tool every developer should be using, this
seems a little hand-wavy and built against a rather contrived strawman. It is
a little info-mercially -- you fuss, you muss...

Is a table and a couple of columns actually difficult? Is this, or has this
_ever_ been, a problem for anyone? When did we enter the "post-database"
world?

This all makes plenty of sense if you buy the strawman, otherwise it seems
like some pretty strong over-reaching.

