
Lifetimes of cryptographic hash functions - tswicegood
http://valerieaurora.org/hash.html
======
ReidZB
I think SHA-2 should be "minor weakness discovered" (if not outright
"unbroken"), not "weakened".

At the outset of the SHA-3 competition, everyone was nervous about SHA-2: it
appeared as though a good attack was inevitable, what with the cryptanalytic
attacks on SHA-1.

But as the competition went on, things got calmer. The attacks against SHA-2
that were so expected simply weren't coming [1]. So now the status quo is that
SHA-2 seems pretty darn safe, and the real focus of the SHA-3 competition
shifted: not necessarily toward a direct, performance-equivalent replacement
for SHA-2, but toward a design sufficiently different that SHA-2 attacks
wouldn't apply to it. And Keccak is just that: quite different.

Anyway, my point is that SHA-2 is mislabeled. Honestly, I think cryptographers
currently recommend it more than any other hash function; SHA-3's software
performance is rather... lacking.

[1] Some may argue that this is because cryptographers were focused on the
SHA-3 candidates, but I'm not so sure

~~~
thirsteh
BLAKE2 is a very good alternative if you want software performance:
[https://blake2.net/](https://blake2.net/).

Just stop what you're doing and look at scrypt, bcrypt, or even
PBKDF2-HMAC-SHA512 if you're considering anything that involves the words
"passwords" and "fast hash function."
([http://throwingfire.com/storing-passwords-securely/#notpassw...](http://throwingfire.com/storing-passwords-securely/#notpasswordhashes))

~~~
moondowner
+1 for bcrypt and related solutions.

[http://yorickpeterse.com/articles/use-bcrypt-fool/](http://yorickpeterse.com/articles/use-bcrypt-fool/)
[http://codahale.com/how-to-safely-store-a-password/](http://codahale.com/how-to-safely-store-a-password/)

------
pbsd
Why is SHA-2 orange? As far as I know, besides length-extension, there's no
known weakness in the full hash function.

~~~
chacham15
[https://en.wikipedia.org/wiki/Comparison_of_cryptographic_ha...](https://en.wikipedia.org/wiki/Comparison_of_cryptographic_hash_functions)

or more specifically:
[http://dx.doi.org/10.1007/978-3-642-38348-9_16](http://dx.doi.org/10.1007/978-3-642-38348-9_16)
and [http://eprint.iacr.org/2010/016.pdf](http://eprint.iacr.org/2010/016.pdf)

~~~
harshreality
The attacks are on reduced-round versions of SHA-256 and SHA-512. They do not
describe any attacks faster than (dumb) brute force against the full SHA-256
or SHA-512.

I think labeling SHA-256 as orange is highly misleading. The SHA-2 family of
hashes is going nowhere unless the partial-round attacks get a lot closer to
the full-round versions. They're still about 20+ rounds away.

------
josephagoss
If the SHA-2 family has weaknesses, and SHA-2 is used for generating Bitcoin
blocks, whoever breaks it first will be an overnight millionaire. Just make
sure you break blocks slowly (about 20 a day max) to avoid raising suspicion
that the hashing is compromised. Sell as much as possible and then release
your paper.

~~~
qnr
Indeed, bitcoin is like a cryptography competition with a ridiculously huge
prize.

1. Break SHA2 -> control bitcoin generation ($2500 per generated block at
current prices)

2. Break ECDSA -> unlock any addresses that have ever sent money on the
blockchain

3. Break ECDSA+SHA2+RIPEMD160 -> break ALL addresses, even those that have
never sent money.

Incidentally, the difference between 2 and 3 is why it is not recommended to
reuse bitcoin addresses.

~~~
eterm
Break any and the value of bitcoins will crash hard.

You can't take millions out of a system that doesn't actually have millions
worth of liquidity backing it. The total worth of all bitcoins is actually far
smaller than their market capitalization, because the trade of them is based
on the assumption that many if not most coins are dead coins.

~~~
qnr
Well, MtGox monthly volume is over 1 million BTC [1], or $100 million at
current prices. I think you should be able to take out at least a few million
without raising suspicion.

In the case of a SHA2 compromise in particular, generating, say, 20% of daily
blocks would hardly be noticed (and could easily be explained as a new
shipment of ASICs). That's about 720 BTC, or $72,000, daily.

[1]
[http://bitcoincharts.com/markets/mtgoxUSD.html](http://bitcoincharts.com/markets/mtgoxUSD.html)

------
ceautery
The "slashdotter reaction" column is priceless!

------
Scaevolus
Use of SHA-1 for digital signature generation has been deprecated by NIST
since 2011. It's disallowed after 2013, which is important for software aiming
for government use.

[http://csrc.nist.gov/publications/nistpubs/800-131A/sp800-13...](http://csrc.nist.gov/publications/nistpubs/800-131A/sp800-131A.pdf)

------
mistercow
>[1] Note that 128-bit hashes are at best 2^64 complexity to break; using a
128-bit hash is irresponsible based on sheer digest length.

Can a short hash which has not been weakened be lengthened by taking two
hashes and concatenating?

    
    
        fixedSalt = "blah"
        longerHash = (salt, input) ->
            hash(salt + input) + hash(salt + fixedSalt + input)
    

Edit: Never mind. Obviously an attacker would only have to break the first
half of the resulting hash.

But is there any valid way to lengthen a too-short hash? Not that it's of
practical importance; I'm just curious academically.

~~~
sdevlin
This is actually a pretty interesting question. The answer, at least for
Merkle-Damgard hash functions (MD5, SHA-1, SHA-2, etc) is that concatenating
(or "cascading") hash functions doesn't really improve the strength of the
resulting construction.

Merkle-Damgard hash functions look like this:

    
    
      function MD(M, H, C):
        for M[i] in pad(M):
          H := C(M[i], H)
        return H
    

For message M, initial state H, and compression function C. In other words,
pad the message, break it into blocks, and use the compression function to
"mix" each block into the hash function's state. The final state is the
result.
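The Merkle-Damgard loop above can be sketched in runnable form. This is a toy only, assuming truncated SHA-256 as a stand-in compression function and a simplified pad (real MD strengthening also appends the message length):

```python
import hashlib

BLOCK = 8  # toy block size in bytes

def C(block: bytes, state: bytes) -> bytes:
    # Stand-in compression function: SHA-256 of state || block, truncated to 8 bytes.
    return hashlib.sha256(state + block).digest()[:8]

def pad(msg: bytes) -> bytes:
    # Simplified padding; real MD strengthening also appends the message length.
    msg += b"\x80"
    return msg + b"\x00" * (-len(msg) % BLOCK)

def md(msg: bytes, state: bytes = b"\x00" * 8) -> bytes:
    # Pad the message, break it into blocks, mix each block into the state.
    padded = pad(msg)
    for i in range(0, len(padded), BLOCK):
        state = C(padded[i:i + BLOCK], state)
    return state
```

Note that the final state is returned directly, which is exactly what makes length-extension possible in this design.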

Antoine Joux showed that for any MD hash function, generating many collisions
is not much more difficult than generating one. Here's how:

    
    
      1. Take the initial state H[0].
      2. Find two single-block messages that collide under C with H[0] as the input state. Call the result H[1].
      3. Now find another pair of single-block messages that collide under C with H[1] as the input state. Call the result H[2].
      4. Iterate as needed.
    

Here's the trick: each single-block collision you find is actually doubling
your set of colliding messages. This is because for each block of the message,
you can select either of two candidate blocks. So if the effort required to
find a collision in a b-bit hash function is 2^(b/2), the effort to find 2^n
such collisions is only n * 2^(b/2).
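Joux's trick can be demonstrated end to end with a toy 16-bit compression function (truncated SHA-256 here, purely for illustration): chaining three single-block collision searches yields 2^3 = 8 distinct messages that all share one digest.

```python
import hashlib
from itertools import product

def C(block: bytes, state: bytes) -> bytes:
    # Toy 16-bit compression function: truncated SHA-256 of state || block.
    return hashlib.sha256(state + block).digest()[:2]

def find_collision(state: bytes):
    # Birthday search over single 2-byte blocks for a collision under C.
    seen = {}
    for i in range(1 << 16):
        block = i.to_bytes(2, "big")
        h = C(block, state)
        if h in seen:
            return seen[h], block, h  # two distinct blocks, same next state
        seen[h] = block
    raise RuntimeError("no collision found")

# Steps 2-4 above: chain three collisions, doubling the message set each time.
state = b"\x00\x00"
pairs = []
for _ in range(3):
    b1, b2, state = find_collision(state)
    pairs.append((b1, b2))

def toy_md(blocks, state=b"\x00\x00"):
    for b in blocks:
        state = C(b, state)
    return state

# Any choice of one block per pair hashes to the same value: 2^3 = 8 messages.
messages = list(product(*pairs))
digests = {toy_md(m) for m in messages}
```

Three birthday searches (roughly 3 * 2^8 work here) bought us eight colliding messages instead of two.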

What does this mean for cascaded hash functions? Well, consider a hypothetical
construction that simply concatenates a 128-bit hash function with a 160-bit
function. A plan of attack might look like this:

    
    
      1. Find 2^80 colliding messages under the 128-bit function. This should take roughly 80 * 2^64 ~= 2^70.3 units of "effort".
      2. Evaluate each message under the 160-bit function. There's probably a collision in there somewhere. This will take around 2^80 work, dwarfing what we did in the first step.
    

The effort to find a collision under both functions is thus only about what it
takes to find a collision under the stronger of the two.
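As a quick sanity check on the arithmetic above: 2^80 Joux collisions under a 128-bit function cost about 80 * 2^64 compression calls, and the 2^80 birthday search under the 160-bit function dwarfs that.

```python
import math

# Step 1 cost: n * 2^(b/2) with n = 80 collisions, b = 128 bits.
step1 = 80 * 2**64
assert math.isclose(math.log2(step1), 70.32, abs_tol=0.01)

# Step 2 (birthday search under the 160-bit function) dominates: 2^80 >> 2^70.3.
assert 2**80 / step1 > 600
```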

Shameless plug: we will have a few problems exploring the consequences of Joux
collisions in set seven of the Matasano crypto challenges.

~~~
mistercow
Ah, so in fact the naive concatenating solution I gave, in addition to being
just as easy because the attacker only has to break half of it, is actually
even _easier_ because the attacker has two targets to collide with.

What about non-concatenative methods? Could you do something like "shift and
xor"? Say you have a 2-byte hash function, so that:

    
    
       hash(salt + input)             :  0x3AF9
       hash(salt + fixedSalt + input) :  0x8034
    

then you shift one by four bits and xor like so

    
    
           3AF90
       xor 08034
         = 32FA4
    

And then you could iterate that to extend to as many bytes as you want? And
then maybe you'd want to xor the first and last byte together so that nothing
from the first hash remains - although now I'm thinking intuitively, which
generally seems to be a bad idea with cryptography.

~~~
sdevlin
> Ah, so in fact the naive concatenating solution I gave, in addition to being
> just as easy because the attacker only has to break half of it, is actually
> even easier because the attacker has two targets to collide with.

I wouldn't say it's easier. Remember that we need to find a single message
that generates a collision under _both_ hash functions. So our strategy is to
generate a massive number of collisions for the shorter function and hope that
there's one pair of messages in there that collide under the longer function.

> What about non-concatenative methods?

I think this will again boil down to finding a single message that generates a
collision under both hash functions. It won't matter too much whether you XOR
the hashes or concatenate them.

~~~
mistercow
Hmm. It still seems to me that the simple concatenative method could be broken
at least slightly more quickly, since each function can be collision-tested
separately, but my brain is vaguely gesturing at comprehension of why xoring
is still weak.

I feel like we should be teaching the principles of crypto to young children
so that we end up with some humans that can grok it as easily as the rest of
us do algebra. But there are a great many things I would want young children
to be taught if I were made the Benevolent Dictator of All School Boards.

------
jgale
I guess 2004 was a crazy year for cryptography.

~~~
pbsd
Indeed it was. Wang's breakthrough work [1,2] broke most of the common hash
functions at the time, and later also SHA-1 [3]. The SHA-3 competition was
motivated by this streak of new successful attacks.

[1] [http://eprint.iacr.org/2004/199](http://eprint.iacr.org/2004/199)

[2]
[http://www.iacr.org/cryptodb/archive/2005/EUROCRYPT/2868/286...](http://www.iacr.org/cryptodb/archive/2005/EUROCRYPT/2868/2868.pdf)

[3]
[http://link.springer.com/chapter/10.1007%2F11535218_2](http://link.springer.com/chapter/10.1007%2F11535218_2)

------
liamzebedee
Sorry, why is RIPEMD-160 deprecated? I've been unable to find any supporting
information as to why.

~~~
rgbrenner
It appeared after SHA-1, receives less attention than SHA-1, and is slower
than SHA-1... so why use it at all?

~~~
thirsteh
Many use it because it wasn't developed by the NSA. It is a default in
TrueCrypt, for example.

~~~
nly
It's used in Bitcoin as well (after SHA-2) to shorten up addresses. Come to
think of it, I think Tor uses it for .onion addresses... which they shorten to
80 bits (of preimage resistance!)

------
tudorconstantin
Confession time: I still have some apps with salted md5 hashed passwords.

~~~
thirsteh
You shouldn't be using any of the functions on that page directly, anyway:
[http://throwingfire.com/storing-passwords-securely/#notpassw...](http://throwingfire.com/storing-passwords-securely/#notpasswordhashes)

~~~
zanny
But it does matter, because if you can exploit a weakness in the hash function
you can figure out the salt, strip it, and then use your precomputed
dictionary.

~~~
thirsteh
That's not really how it works. Either way, you shouldn't be using just a
regular "hash function" anyway. Even basic constructions like PBKDF2 use HMAC,
where SHA1 and even MD5 are still pretty safe to use (although not very
computationally expensive).
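For reference, Python's standard library exposes exactly this construction via `hashlib.pbkdf2_hmac`; the password, salt, and iteration count below are illustrative placeholders, not recommendations.

```python
import hashlib
import hmac
import os

salt = os.urandom(16)        # per-user random salt
iterations = 200_000         # tune so one derivation takes ~100 ms on your hardware
password = b"correct horse"  # placeholder

# Derive a 32-byte hash with PBKDF2-HMAC-SHA256 for storage alongside the salt.
stored = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)

# Verification: re-derive with the stored salt and compare in constant time.
candidate = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
assert hmac.compare_digest(stored, candidate)
```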

------
dkokelley
For more information on the 'weakened' state of SHA-2, see
[http://eprint.iacr.org/2004/207](http://eprint.iacr.org/2004/207)

(Full text PDF:
[http://eprint.iacr.org/2004/207.pdf](http://eprint.iacr.org/2004/207.pdf))

------
lmm
It'd be nice to have Whirlpool in the list - I remember when it was seen as
the great new hope for a good hash, but I haven't heard anything about it in
recent years.

------
wldlyinaccurate
The Expert/Programmer/Non-expert reactions at the bottom are priceless.

------
Ricapar
The "reactions" table at the bottom made my morning.

------
islon
2004 was a bad year for cryptographic hash functions...

------
rorrr2
It's missing the most important ones: Scrypt, PBKDF2, Bcrypt.

[https://en.wikipedia.org/wiki/Scrypt](https://en.wikipedia.org/wiki/Scrypt)

[https://en.wikipedia.org/wiki/Bcrypt](https://en.wikipedia.org/wiki/Bcrypt)

[https://en.wikipedia.org/wiki/PBKDF2](https://en.wikipedia.org/wiki/PBKDF2)

Scrypt is an absolute nightmare to brute-force, even for short passwords.

[http://i.stack.imgur.com/sOMvu.png](http://i.stack.imgur.com/sOMvu.png)

~~~
ReidZB
These aren't cryptographic hash functions _exactly_, though, at least not in
the sense that a cryptographer would think. I mean, they _will_ fit just about
any definition of a cryptographic hash function you can think of, but really
it's not that useful to label them as such. Instead, they're usually called
key derivation functions.

On top of that, even if we were to include these in a hash function list,
they're decidedly not the most important ones. Most important for password
storage and other key derivation, perhaps, but the applications of hash
functions are far more general. The preimage resistance of scrypt is reliant
on SHA256, for instance.

~~~
16s
Cryptographic hash functions must be efficient to compute. Those examples
(scrypt, bcrypt, etc.) were designed to be difficult to compute. They are
password hash functions, not cryptographic hash functions. Two totally
different things with different purposes.

~~~
ReidZB
Well, in complexity theory (which theoretical cryptography uses heavily), an
efficiently computable function is one that has a polynomial-time algorithm to
compute it.

I mean, wouldn't you say that scrypt is efficient to compute? For instance, is
5 seconds not a relatively quick function evaluation? Compare that to super-
polynomial-time attacks, some of which wouldn't succeed before our Sun burned
out and Earth died. And if you ramp up the security parameters to an insane
degree, the user can no longer compute the function themselves. That's the
reason for the "efficient to compute" clause in most definitions.

So, while you're right that the fact that KDFs are designed to be much slower
than hashes is what really separates them, that doesn't disqualify KDFs from
(_technically_) being cryptographic hash functions. At least, not if you view
the definition in a theoretical sense, which is the appropriate way to do so.
Still, I agree with your premise; in a practical sense, KDFs shouldn't feel
like they are cryptographic hashes, since their purpose is markedly different.

