

Comparing Hash Algorithms: Md5, Sha1 or Sha2? - Kop
http://www.not-implemented.com/comparing-hash-algorithms-md5-sha1-sha2/

======
lucb1e
No, no, no.

First off, MD5 is said to be insecure. Sure, but only for some purposes. This
very recently appeared on the security stackexchange website[1]: Why does some
popular software still use MD5?

In the SHA-1 chapter he mentions that you can truncate SHA-2 outputs if you
are concerned about storage space. No! It may be entirely safe or even
desirable in the case of SHA-2, but how can you be sure? It might very well be
designed differently. Never invent your own crypto. Another recent thread on
the security stackexchange mentioned[2]: _"The fact that you need to ask this
question is the answer itself - you do not know what is wrong with stacking
these primitives, and therefore cannot possibly know what benefits or
weaknesses there are."_

Then the article goes on about SHA-2: _"Sha-256 should be chosen in most
cases, including hashing your user passwords"_ Oh God no. Use bcrypt, scrypt,
PBKDF2--I don't care. Don't invent your own crypto. Hashes are what you use
to store passwords, but general-purpose hashing algorithms weren't designed
for password storage! Hashing algos are supposed to be fast (see the SHA-3
competition, which looked for algorithms that were especially fast in
hardware implementations as a complement to SHA-2).
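To make the recommendation concrete, here is a minimal sketch of password hashing with PBKDF2 (one of the options named above) using Python's standard library. The salt size and iteration count are illustrative assumptions, not tuned recommendations:

```python
import hashlib
import hmac
import os

# Illustrative iteration count; real deployments should tune this
# to their own hardware.
ITERATIONS = 100_000

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Hash a password with PBKDF2-HMAC-SHA256 and a fresh random salt."""
    salt = os.urandom(16)  # unique random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Re-derive the digest and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)
```

Note that the work factor lives in the iteration count, not in the choice of underlying hash — that is exactly what plain SHA-256 lacks.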

And then lastly, testing only a single implementation (the one in .NET) is
hardly a good comparison. It doesn't matter much how many hashes you can do
per millisecond, especially since secure _password_ hashing algorithms are
supposed to run slowly.

All in all, have a look at this if you want to know the real answer:
[http://security.stackexchange.com/questions/211/how-to-
secur...](http://security.stackexchange.com/questions/211/how-to-securely-
hash-passwords)

[1] [http://security.stackexchange.com/questions/33108/why-
does-s...](http://security.stackexchange.com/questions/33108/why-does-some-
popular-software-still-use-md5)

[2] [http://security.stackexchange.com/questions/33531/why-
improv...](http://security.stackexchange.com/questions/33531/why-improvising-
your-own-hash-function-out-of-existing-hash-functions-is-so-bad)

~~~
Kop
Thanks lucb1e for pointing it out -- I edited the article to reflect this and
I apologize for (temporarily) spreading the misconception that SHA-like
functions are good for hashing passwords.

The intention for my article was to analyze the different hash functions, and
benchmark them in .NET, and not to discuss anything related to passwords
(which would require a blog post of its own, and it's a topic I hadn't
extensively researched). I got a little carried away mentioning passwords.

------
Systemic33
| "including hashing your user passwords"

Using a general-purpose hash algorithm for password storage is just plain
wrong. Use something designed for password hashing instead, such as BCrypt.

------
wereHamster
> Sha-256 should be chosen in most cases, _including hashing your user
> passwords_.

NO, NO, NO, NO. How many times does that need to be repeated. Don't use SHA to
hash passwords!

------
aeon10
I thought a hash being relatively slower makes it better.

Because then it would take more time to brute force. I mean, even without any
theoretical collisions, near-instant hashes wouldn't be good: such a hash
would fall faster than a slower one!

~~~
masklinn
> I thought a hash being relatively slower makes it better.

Depends on the workload, and the article does not explain anything there so
it's completely useless.

For "slow hash" workloads such as password hashing you want a slow hashing
function, but none of the three hashes listed is acceptable for password
hashing anyway so their comparison is not really relevant (they can be part of
a password hashing strategy e.g. as the crypto hash function of PBKDF2 through
HMAC[0], but in any case you'll want to size your iteration based on the
acceptable time spent hashing in the normal workflow).

For things like checksumming, you do care about throughput, and you care that
payloads are hard to craft (so it's not feasible to alter the file but keep
the checksum unchanged). Depending on how important the checksumming is, you
may accept tradeoffs: if you only want to be resistant against accidental
corruption, CRC32 might be enough (very fast but easy to fake); if you want
to protect users against malicious attacks (e.g. a distro package repository),
you'll want a good cryptographic hash, as validation is more important than
raw throughput. MD5 can work there, maybe more on the CRC32 side or as an
early validator (e.g. with a fallback hash, so MD5 is used to weed out some
stuff fast and potential false positives are eliminated through a second hash
function)
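As a rough illustration of the two ends of that tradeoff, in Python (the payload is a placeholder):

```python
import hashlib
import zlib

data = b"example package contents"

# CRC32: very fast, fine for catching accidental corruption,
# but trivial for an attacker to forge.
crc = zlib.crc32(data)

# SHA-256: slower, but collision-resistant, so an attacker cannot
# feasibly craft a different payload with the same checksum.
sha = hashlib.sha256(data).hexdigest()
```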

For hash maps, you want speed and spread across buckets.
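A quick sketch of what "spread across buckets" means, using Python's built-in (non-cryptographic) `hash`; the key names and bucket count are arbitrary:

```python
# Illustrative: a hash map only needs keys to land evenly across
# buckets; cryptographic strength is irrelevant here.
NUM_BUCKETS = 8
keys = [f"user-{i}" for i in range(1000)]

buckets = [0] * NUM_BUCKETS
for key in keys:
    buckets[hash(key) % NUM_BUCKETS] += 1

# With a decent hash, no single bucket ends up wildly over-loaded.
```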

To provide flexibility, core cryptographic hashing functions aim for two
goals: 1. throughput and 2. collision resistance.

[0] if you want to hash user passwords (and you should), there are currently
three known good choices, all with variable workloads (meaning you can
increase the number of "rounds" without decreasing the entropy, making the
passwords as hard to crack as you can afford): PBKDF2, Bcrypt and Scrypt.
The first two are CPU-hard (the variable workload deals with the amount of
computation only); the last one is also memory-hard (it takes a variable
amount of memory, configured when hashing, making it much harder to
parallelize on GPUs or ASICs). The point of variable workloads is that as
CPUs improve you can add more rounds to keep up: for instance the original
PBKDF2 RFC recommended "1000 iterations" in 2000, the current standard is
generally 10000~20000, and OWASP recommends 64000. Generally speaking you
want to give it as much time as you can, on your production hardware, without
user disruption; shooting for at least 250ms is a good idea in most systems,
which'll only rarely need to "login" users and so only rarely apply this
delay.
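That kind of sizing can be done empirically; a rough calibration sketch, assuming PBKDF2-HMAC-SHA256 and a ~250ms budget (the starting count and target are illustrative):

```python
import hashlib
import os
import time

def pbkdf2_ms(iterations: int) -> float:
    """Time one PBKDF2-HMAC-SHA256 call at the given iteration count."""
    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"benchmark-password", salt, iterations)
    return (time.perf_counter() - start) * 1000.0

# Double the iteration count until one hash costs roughly the target
# on this machine (run this on production hardware, not a laptop).
iterations = 10_000
while pbkdf2_ms(iterations) < 250.0:
    iterations *= 2
```

Re-running this periodically (and re-hashing on next login) is how the count keeps up with hardware.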

