Hacker News new | past | comments | ask | show | jobs | submit login

H(m) -> SHA1(SHA1(m)) is not the same hash function as H(m) -> SHA1(m).



Right, but if SHA1(m) = SHA1(n), then SHA1(SHA1(m)) = SHA1(SHA1(n)) also.

The converse is not true. There will be cases where SHA1(SHA1(m)) = SHA1(SHA1(n)), but SHA1(m) != SHA1(n). (Or does SHA1 somehow guarantee that this will not happen?)

So it seems to follow that the chances that H(a) = H(b) when H(m) -> SHA1(SHA1(m)) must be higher than when H(m) -> SHA1(m).

Am I missing something?


I think you're going to need to better explain the logic behind graf 2 before I can answer you.

But on the off chance that we can end this thread gracefully, I'll point out that SHA256(SHA(256(m), m) is SHAd256(m), and considered by Ferguson to address security concerns in straight SHA256.


It boils down to this:

If SHA1(n) = SHA1(m), we know SHA1(SHA1(n)) = SHA1(SHA1(m)). (this is a consequence of SHA1 being deterministic)

Also, even if SHA1(n) != SHA1(m), there is still a chance that SHA1(SHA1(n)) = SHA1(SHA1(m)). (this is a consequence of collisions: SHA1(a) may equal SHA1(b) even if a != b.)

So the probability that SHA1(SHA1(n)) = SHA1(SHA1(m)) must be more than the probability that SHA1(n) = SHA1(m).


You realize that despite the fact that I still don't follow the logic here, SHA collisions have no meaningful impact on the security of PBKDF, right?


Actually I didn't, but that more or less answers my original question. Now I realize why you (or rather, your character) said that it didn't matter that md5 is broken in this case.

In general though, I'm still convinced that putting a hash function in a loop will increase the probability of a collision.

Or maybe I misunderstood your point about the loop? I assume by putting a loop around the hash you mean something like this:

  key = SHA1(password)
  1000 times do
    key = SHA1(key)
  return key


Look I don't want to be dense here --- and there's a good chance that's exactly what's happening --- but it sounds to me like you're talking about collisions between SHA1, SHA1(SHA1), SHA1(SHA1(SHA1)), etc.

Rename the functions. SHA1(SHA1) is now SHA1', SHA1(SHA1(SHA1)) is now SHA1''.

Explain why I care if there's an m and an n such that SHA1'(m) = SHA1''(n)?

They're two different hash functions.


No, you're still missing the point :)

He's saying that SHA1'(m) = SHA1'(n) implies that SHA1"(m) = SHA1"(n).

Let's do this with a very simple, silly hash function.

Let h(x) = (x/2)%10 (in integer arithmetic, e.g, 5/2 = 2, so h(5) = (5/2)%10 = 2%10 = 2.) So then let h2(x) = h(h(x)) = (h(x)/2)%10, and h3(x) = h(h(h(x))) = (h2(x)/2) % 10.

Here's a table:

  x:  h  h2  h3
  0   0  0   0
  1   0  0   0
  2   1  0   0
  3   1  0   0
  4   2  1   0
  5   2  1   0
  6   3  1   0
  7   3  1   0
  8   4  2   1
  9   4  2   1
  10  5  2   1
  11  5  2   1
  12  6  3   1
  13  6  3   1
  14  7  3   1
  15  7  3   1
  16  8  4   2
  17  8  4   2
  18  9  4   2
  19  9  4   2
  20  0  0   0
  21  0  0   0
  ...
Hopefully this table shows that collisions are much more likely for h2 than for h, and more likely for h3 than for h2. The range of h3 is smaller than the range of h2 which is smaller than the range of h.


Right, that's what I had in mind.

Of course, it's a bit of a jump to assume that the result is the same when you use a cryptographic hash function instead of a trivial one.

But regardless of the hash function, the same problem will occur unless there are no collisions for input between 0 and 2^160 - 1 (inclusive). From what I can tell, SHA1 can't guarantee this.




Registration is open for Startup School 2019. Classes start July 22nd.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: