
How Long Should I Make My API Key? - okket
https://blog.learnphoenix.io/how-long-should-i-make-my-api-key-833ebf2dc26f
======
ecesena
I see some confusion both in the article and in the comments.

If we're talking about an API _key_ , then the issue isn't collision and
there's no relationship with hashing or uuids except the recommended final
length.

A key should be a random 128-bit string (or 192, or 256 if you need additional
security).

If you need to generate a key from a passphrase, use a key derivation
function, not a hash just because the output length happens to be the same.
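
A rough sketch of what that could look like in Python, assuming PBKDF2-HMAC-SHA256 from the standard library as the dedicated function. The name derive_key, the iteration count, and the 16-byte output are illustrative choices of mine, not recommendations from the article:

```python
import hashlib
import os


def derive_key(passphrase: str, salt: bytes, length: int = 16,
               iterations: int = 600_000) -> bytes:
    """Stretch a passphrase into `length` bytes of key material.

    The iteration count is an assumed work factor; tune it for your hardware.
    """
    return hashlib.pbkdf2_hmac(
        "sha256",
        passphrase.encode("utf-8"),
        salt,
        iterations,
        dklen=length,
    )


# A fresh random salt per key, stored alongside the derived key.
salt = os.urandom(16)
```

Calling derive_key("some passphrase", salt) twice with the same salt gives the same 128-bit key; a different salt gives an unrelated one.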

I don't know whether there's any proof of randomness for UUIDs. Without one,
the risk is that the bit distribution is uneven in order to provide
uniqueness, which could leak some information to an attacker and so reduce
the real bit strength of the key.

~~~
justusw
The default random number generator in most programming languages is not
cryptographically secure. As you suggested, it is not always wise to use UUIDs
for API keys if you are not sure about the nature of the PRNG behind them.

Python for example uses os.urandom for random UUID generation (uuid4), as can
be seen in the source:
[https://github.com/python/cpython/blob/3.5/Lib/uuid.py#L603](https://github.com/python/cpython/blob/3.5/Lib/uuid.py#L603)

The standard library guarantees that os.urandom is "[...] suitable for
cryptographic use."
([https://docs.python.org/3.5/library/os.html#os.urandom](https://docs.python.org/3.5/library/os.html#os.urandom))

I'm not sure about other languages, so it might always be a good idea to look
it up in the source code.

~~~
nitrogen
In Ruby you should use the SecureRandom module, but even that has had issues
in the past.

[http://ruby-doc.org/stdlib-2.3.1/libdoc/securerandom/rdoc/in...](http://ruby-doc.org/stdlib-2.3.1/libdoc/securerandom/rdoc/index.html)

[https://news.ycombinator.com/item?id=11624890](https://news.ycombinator.com/item?id=11624890)

~~~
MichaelGG
Seems like it still has issues? The bug is still open. Embarrassing but
entertaining.

------
henrikschroder
This is not very well thought out.

1) If you are generating API keys, those are presumably used to make API
calls, and every such call has to verify that the API key exists so it can
deny or allow the request, which means that doing a lookup on any API key has
to be an _extremely_ cheap operation.

So do the lookup after generating each and check for collisions. If you hit
the jackpot and actually managed to randomly create a duplicate, generate
another. Problem solved, and you don't have to worry about the collision
frequency of your hash algorithm as a weird workaround for skipping a single
extra lookup at API key creation.

2) Use GUIDs. It's what they're for. It's in the name. Globally Unique
Identifier. This is the problem they solve.
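
For point 1, the generate, look up, retry loop might look something like this in Python. The in-memory set stands in for whatever key store you really use, and issue_key is a made-up name:

```python
import secrets


def issue_key(existing_keys: set, nbytes: int = 16) -> str:
    """Generate a random key, retrying in the (astronomically unlikely)
    event that it already exists in the store."""
    while True:
        # token_urlsafe draws nbytes from the OS CSPRNG and base64url-encodes them.
        candidate = secrets.token_urlsafe(nbytes)
        if candidate not in existing_keys:
            existing_keys.add(candidate)
            return candidate
```

With a real database you would rely on a unique constraint on the key column and retry on a constraint violation instead of checking first.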

~~~
rspeer
GUIDs are Microsoft's implementation of UUIDs. UUIDs are long enough (128
bits) that you can generate them randomly and be confident in never getting a
collision.

That's what UUID4 is -- all but six bits (that's 122 bits) are completely
random. Checking for UUID4 collisions is the height of paranoia. It's
basically checking whether your PRNG broke.
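
If you want to see the six fixed bits for yourself, here is a small Python sketch. The bit offsets follow the RFC 4122 layout, and fixed_bits is just a helper name I picked:

```python
import uuid


def fixed_bits(u: uuid.UUID) -> tuple:
    """Return the (version, variant) fields of a UUID as integers."""
    version = (u.int >> 76) & 0xF  # 4 bits, always 0b0100 for uuid4
    variant = (u.int >> 62) & 0x3  # 2 bits, always 0b10 for RFC 4122 UUIDs
    return version, variant


u = uuid.uuid4()
version, variant = fixed_bits(u)
random_bits = 128 - 4 - 2  # everything else comes from the random source
```

So a uuid4 carries 122 random bits, provided the generator behind it is a good one.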

So your second recommendation is basically the same as the article's, and
contrary to the first. Comparing UUIDs is not that expensive, but as expensive
as it has to be for a key that will never collide.

~~~
justusw
Comparing UUIDs is cheap, but avoiding non-constant-time compares is a
necessity. So the usual for loop with an early exit when two bytes don't match
up is poison. A web application could become susceptible to timing attacks.

This is a good article that helped me understand and mitigate the
vulnerability: [https://codahale.com/a-lesson-in-timing-attacks/](https://codahale.com/a-lesson-in-timing-attacks/)
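
In Python the standard library already has a constant-time compare, so a minimal sketch (keys_match and leaky_equal are hypothetical names) would be:

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """The 'poison' version: returns as soon as a byte differs,
    so the running time depends on how much of the key matched."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def keys_match(supplied: str, stored: str) -> bool:
    """Compare in time that does not depend on where the inputs differ."""
    return hmac.compare_digest(supplied.encode("utf-8"), stored.encode("utf-8"))
```

Both functions return the same answers; only the timing behaviour differs.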

------
jnbiche
Why would you use a hash function here? What are we hashing? The username and
password? If so, that means you can't revoke the API key without revoking the
password. Since you're storing the username and password anyway, an API key
should just be a long random string, generated using a CSPRNG, base64-encoded.
Also, by generating the API key yourself, you're not depending on the security
of the user's password (which you would if you're hashing it).

~~~
BillinghamJ
If you want to support communication over non-secure connections, you'd want
to provide a "secret" key, and an ID for that key. Then you'd create an HMAC
hash with the secret key, rather than actually sending the key.

Doing that securely and protecting against replay attacks is non-trivial
though.
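
To make that concrete, a very rough Python sketch of signing a request with the secret and sending only the key ID plus the signature. The message layout, the function names, and the 300-second window are all assumptions of mine, not any particular service's scheme:

```python
import hashlib
import hmac


def sign_request(secret: bytes, key_id: str, method: str, path: str,
                 body: bytes, timestamp: int) -> str:
    """HMAC-SHA256 over the parts of the request we want to protect."""
    message = "\n".join([
        key_id,
        method,
        path,
        str(timestamp),
        hashlib.sha256(body).hexdigest(),
    ]).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, key_id: str, method: str, path: str,
                   body: bytes, timestamp: int, signature: str,
                   now: int, max_skew: int = 300) -> bool:
    """Reject stale timestamps, then compare signatures in constant time."""
    if abs(now - timestamp) > max_skew:
        return False
    expected = sign_request(secret, key_id, method, path, body, timestamp)
    return hmac.compare_digest(expected.encode("ascii"), signature.encode("utf-8"))
```

The timestamp check only narrows the replay window. To actually stop replays inside that window the server would also have to remember recently seen signatures or nonces, which is part of why this is non-trivial.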

~~~
jessaustin
This seems like something we're not supposed to do... why not just use tls or
ssh?

~~~
vertex-four
TLS client authentication (which I think is what you're getting at) is a pain
in the ass to implement clients for properly - OpenSSL is dreadful to use, and
for many HTTP client libraries you'll need to use it (or a very thin wrapper)
to set X.509 client certs, if you even can. On the other hand, there are decent
libraries and plenty of information on how to generate and verify an HMAC, and
avoid replay attacks.

~~~
jessaustin
No that's not what I'm getting at. I'm saying the design of secure protocols
is best left for smart people. Every time you post on HN, you're using cookie
auth, but you don't have to worry about replay attacks because everything runs
inside TLS. Any environment that has HMAC libraries also has the libraries one
would need to do the right thing.

~~~
nitrogen
HMAC is useful for expiring, limited-use tokens without having to store every
single token, just the base key. It's used by CloudFront, for example.
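
A toy Python version of the idea, to show why nothing per-token needs to be stored. The "resource|expiry|mac" format and the function names are invented for illustration and are not CloudFront's actual signed-URL format:

```python
import hashlib
import hmac


def make_token(base_key: bytes, resource: str, expires_at: int) -> str:
    """Issue a token that names a resource and an expiry time."""
    payload = f"{resource}|{expires_at}"
    mac = hmac.new(base_key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{payload}|{mac}"


def check_token(base_key: bytes, token: str, now: int) -> bool:
    """Verify a token using only the base key: recompute the MAC, then check expiry."""
    try:
        resource, expires_str, mac = token.rsplit("|", 2)
        expires_at = int(expires_str)
    except ValueError:
        return False
    payload = f"{resource}|{expires_str}"
    expected = hmac.new(base_key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode("ascii"), mac.encode("utf-8")):
        return False
    return now <= expires_at
```

The server keeps base_key and nothing else; any change to the resource or the expiry invalidates the MAC.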

~~~
jessaustin
Oh definitely, just pointing out that TLS client certs are used only rarely.

------
jwatte
"Hashing" and "random keys" are different. The author keeps mentioning
"hashing" but it's "number of true random bits" that matters.

If the generation of those keys uses pseudo random data, time, etc, the risk
of collision goes up.

Read 16 bytes from /dev/urandom and stuff them into base64encode and you have
a pretty good random API key in readable ASCII.
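
In Python that is about three lines. This sketch assumes the URL-safe base64 alphabet and drops the "=" padding; new_api_key is just a name I made up:

```python
import base64
import os


def new_api_key(nbytes: int = 16) -> str:
    """Read nbytes from the OS CSPRNG and return them as unpadded base64url text."""
    raw = os.urandom(nbytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
```

On Python 3.6 and later, secrets.token_urlsafe(16) does essentially the same thing.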

~~~
Bino
I usually try to avoid base64 when encoding data/passwords/keys for users, due
to its large/"odd" charset. I would recommend using binary to hex (base16)
notation or base32.

~~~
jwatte
Why is it bad to use case sensitive alpha plus digits plus two punctuation?
The article already uses 62 chars (minus the two punctuation.)

If you encode fewer bits per character, you get a longer key but no more
security.

Raw binary: 16 bytes
Base64: 22 bytes
Base32: 26 bytes
Base16: 32 bytes
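
Those lengths are easy to check in Python, assuming the padding characters are stripped:

```python
import base64

raw = bytes(16)  # any 16 bytes will do; only the length matters here

lengths = {
    "raw": len(raw),
    "base64": len(base64.b64encode(raw).rstrip(b"=")),
    "base32": len(base64.b32encode(raw).rstrip(b"=")),
    "base16": len(raw.hex()),
}
```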

The only reason to use base 32 is codes that a human may enter (promo,
verification, serial) to remove case and O/0 ambiguities.

The only reason to use base 16 is laziness (some APIs generate this for you.)

API keys are presumably copied from the generation page and pasted into the
local app config file.

------
katrielalex
You need the API key to be long enough to be unguessable. Otherwise, I could
get free stuff by just guessing someone else's key.

This is significantly longer than what you need to avoid a collision. The idea
of having a three-character key is just crazy.
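
To put rough numbers on the difference, here is a Python sketch using the usual birthday approximation for collisions and a simple union bound for guessing. The function names and the sample figures are mine:

```python
import math


def collision_probability(n_keys: int, space: int) -> float:
    """Approximate chance that n_keys random keys contain at least one duplicate."""
    return -math.expm1(-n_keys * (n_keys - 1) / (2 * space))


def guess_probability(n_keys: int, space: int, guesses: int = 1) -> float:
    """Upper bound on the chance that `guesses` attempts hit any issued key."""
    return min(1.0, guesses * n_keys / space)


three_char_space = 62 ** 3  # three characters from a 62-symbol alphabet
full_space = 2 ** 128       # a 128-bit random key
```

With 1,000 issued keys, guess_probability(1000, three_char_space) is roughly 0.4% per guess, while the same call with full_space is negligible for any realistic number of guesses.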

~~~
abrookewood
I thought he was just using that as an example to see what the risk of
collision was - not actually suggesting that they be used.

------
IgorPartola
I am confused about the hash function involvement in this. I just use a random
number generator to give me one of the 62 possible values. Why do I need to
hash something?
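
For reference, what I do looks roughly like this in Python. It assumes the secrets module as the random source, and the 22-character length is an arbitrary choice that works out to about 131 bits:

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 26 + 26 + 10 = 62 symbols


def random_key(length: int = 22) -> str:
    """Pick each character independently and uniformly from the 62-symbol alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```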

~~~
BillinghamJ
[https://news.ycombinator.com/item?id=12327215](https://news.ycombinator.com/item?id=12327215)

