

Ask YC: Which crypto hash should I use for a unique id? - siculars

I am considering which hash to employ as a unique key for a system I'm developing, and I am interested in knowing what the community's thoughts are. I know SHA1 and MD5 are heavily used in this area.

My chief concerns are:
computation cost (should be language agnostic),
length of output hash (bytes),
ease of implementation,
collision resistance (think billions or many billions of keys)

http://www.hashemall.com/ is a fun site I found while kicking this around...
======
aristus
Your chief concerns are not that important: the major hash functions a) are
very cheap to run, b) are already implemented in every language you can think
of, and c) have accidental collision odds far below one in a trillion, even
across billions of inputs.
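
For a rough sense of scale, the accidental collision odds can be estimated with the standard birthday-bound approximation. This is only a sketch: it assumes hash outputs behave like uniformly random values (no attacker crafting inputs), and the function name `collision_probability` and the figure of ten billion keys are illustrative choices, not anything from the thread.

```python
import math

def collision_probability(num_items: int, hash_bits: int) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1)).

    Assumes outputs are uniformly random, so it says nothing about
    deliberately crafted collisions.
    """
    exponent = num_items * (num_items - 1) / 2 ** (hash_bits + 1)
    # expm1 keeps precision when the exponent is extremely small.
    return -math.expm1(-exponent)

# Ten billion keys is an assumed workload, chosen to match "many billions".
for name, bits in [("MD5", 128), ("SHA1", 160), ("SHA256", 256)]:
    print(name, bits, collision_probability(10**10, bits))
```

Under these assumptions, even a 128-bit digest gives odds many orders of magnitude below one in a trillion at ten billion keys, and each extra bit of output roughly halves them again.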

MD5 is not recommended for digital signatures since it's been broken. SHA1 is
under suspicion. But in most cases _other than_ digital signing of untrusted
data, collisions are not the end of the world. Just use SHA1 or SHA256 and
move on to more important problems.
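
As a minimal sketch of "just use SHA1 or SHA256", here is how a content-derived key might look with Python's standard `hashlib` module. The helper name `content_id` and the default of SHA256 are assumptions made for illustration; most other languages ship an equivalent in their standard library.

```python
import hashlib

def content_id(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of data, usable as a deterministic key."""
    return hashlib.new(algorithm, data).hexdigest()

key = content_id(b"hello world")
print(key)

# Raw digest sizes in bytes, relevant to the "length of output" concern.
print(hashlib.sha1().digest_size, hashlib.sha256().digest_size)
```

Storing the raw digest (`.digest()`) instead of the hex string halves the key size: 20 bytes for SHA1 and 32 bytes for SHA256.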

------
apgwoz
Why not use GUIDs (<http://en.wikipedia.org/wiki/Globally_Unique_Identifier>)?
They're actually intended to be used as ids, while cryptographic hashes are
designed for different purposes.

~~~
siculars
You have to generate a GUID for every piece of data you want to store, but the
GUID doesn't have any real connection to that data other than the link you
create. The hash is always reproducible given the same inputs.
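
A small sketch of that difference, assuming Python's standard `uuid` and `hashlib` modules; the `record` bytes are made up for illustration:

```python
import hashlib
import uuid

record = b"same input bytes"  # hypothetical piece of data to be keyed

# A random GUID is new on every call and carries no information about the data.
guid_first = uuid.uuid4()
guid_second = uuid.uuid4()

# A hash can be recomputed from the data alone, with no lookup table.
hash_first = hashlib.sha1(record).hexdigest()
hash_second = hashlib.sha1(record).hexdigest()

print(guid_first, guid_second)
print(hash_first == hash_second)
```

With the GUID approach the mapping from data to id has to be stored somewhere; with the hash it can be rederived on any machine that has the same bytes.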

~~~
apgwoz
Yes, the hash is always reproducible given the same inputs, but you never
specified this as a requirement in your question. Only that you want a unique
identifier.

I don't know for what purpose you need these identifiers, but I have to wonder
why you want to use an identifier that is going to change when the data
changes. If it's going to be thrown in a relational database, this sounds like
a nightmare.
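
To make the concern concrete, here is a short sketch of how a content-derived key moves when the data is edited. The record layout and the helper name `key_for` are hypothetical:

```python
import hashlib

def key_for(record: bytes) -> str:
    """Derive a key from the record's bytes (SHA256 chosen arbitrarily)."""
    return hashlib.sha256(record).hexdigest()

original = b"name=Ada;city=London"
edited = b"name=Ada;city=Paris"  # same logical row after an update

print(key_for(original))
print(key_for(edited))
print(key_for(original) == key_for(edited))  # False: the key changed with the data
```

Any foreign key that pointed at the old digest would have to be rewritten after such an update, which is the relational-database headache described above.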

------
Harkins
SHA1.

MD5 has been broken: researchers have deliberately crafted documents with
different contents and the same MD5. So collisions are a problem if you're
hashing documents from untrusted sources.

~~~
siculars
Yeah, I read that somewhere. MD5 is out.

