
From Collisions to Chosen-Prefix Collisions Application to Full SHA-1 [pdf] - lelf
https://eprint.iacr.org/2019/459.pdf
======
nneonneo
There are, broadly, two kinds of collisions. In the first kind, you can pick
an arbitrary prefix P shared between the collision pair; the algorithm spits
out a pair of (essentially random) blocks of data C1, C2 in the middle that
are different but yield the same hash value (H(P+C1) = H(P+C2)). In the second
kind, you can pick two arbitrary (equal-length) prefixes P1, P2, and the
algorithm outputs collision blocks C1, C2 such that H(P1+C1) = H(P2+C2).
Because of the Merkle–Damgård construction common to MD5 and SHA-1, you can
add arbitrary (equal) suffixes without affecting the collision.

With the first kind, you don't have much control over the collision blocks,
and the prefixes and suffixes have to be the same. This can still be useful -
file formats can be carefully rigged so that the differences in the collision
blocks are noticed somehow (e.g. misaligning length fields, changing testable
field values, introducing different code). With MD5, these collisions take
just seconds to compute; with SHA-1, the one known example (Shattered) took
2^63 hash evaluations to compute.

With the second kind, you can do a lot more evil, since you can pick two
completely arbitrary prefixes. This was abused (for example) to create a rogue
CA certificate from an ordinary signed website cert. With MD5, this takes
several hours with a good GPU and CPU; with SHA-1, it was previously thought
to take around 2^77 evaluations - more than 16,000 times harder than the first
type of attack.

This paper describes a method that would take only 2^69 evaluations for the
second kind of attack on SHA-1, only 64x harder than the first type (in the
worst case). This would be feasible for a big institution like Google or a
government.

All of which reminds us that, yet again, SHA-1 deserves to be laid to rest.

------
hinkley
Around 2010 I had to play Benevolent Dictator and refuse to support SHA-1
signatures on software that could potentially kill people.

Every time another article like this comes out I experience two emotions.
First, a bit of smugness mixed with a lot of relief that I won that argument.
Second, a bit of shock when people say "it's time to stop using SHA-1" as if
that hasn't been the recommendation for 10 years.

It's dead, dead, dead. Now is not the time to have discussions about
upgrading. Now is the time to start throwing people under the buss for
dragging their feet on upgrades.

------
twoodfin
Some nice practical detail: They estimate using the methods of previous
research that the new technique brings the price on EC2 of your very own SHA-1
chosen-prefix collision to somewhere between $1.2MM and $7MM.

------
JackC
Does this have any implications for the urgency of replacing SHA-1 in git?

I recently asked a question on the pip issue tracker[1] about whether it would
make sense for pip to _start_ accepting sha-1 git refspecs when running in
--require-hashes mode, on the grounds that there's more upside to encouraging
use of --require-hashes than there is downside risk of sha-1 collision
attacks. This surely doesn't help that argument, but I'm not sure if it hurts
it either.

[1]
[https://github.com/pypa/pip/issues/6469](https://github.com/pypa/pip/issues/6469)

~~~
latchkey
Previously:
[https://news.ycombinator.com/item?id=13719368](https://news.ycombinator.com/item?id=13719368)

------
floatingatoll
The Git SHA1 to SHA256 transition plan is here, though it may not be finalized
yet:
[https://github.com/git/git/blob/master/Documentation/technic...](https://github.com/git/git/blob/master/Documentation/technical/hash-
function-transition.txt)

------
CiPHPerCoder
First SHA1 was shattered. [https://shattered.io](https://shattered.io)

Now it's reduced to shambles.

It's time to stop using SHA1. (HMAC-SHA1 is still okay.)

~~~
tptacek
It's OK in the sense that it's not a security emergency, but, don't use HMAC-
SHA1 (or HMAC-MD5, which is also in the same sense "ok").

~~~
wolf550e
When do you expect it would be feasible to forge HMAC-SHA1 or HMAC-MD5?

~~~
nneonneo
All current collision attacks assume you know the intermediate hash values
(IHVs). With a typical HMAC forgery scenario, you don’t know the secret key so
you don’t know the IHV corresponding to the inner hash application, which
means existing collision attacks are not applicable.

If you do know the secret key, collisions are easy, but probably pointless
since you can already forge HMAC signatures for anything you want.

Still, even if there’s no way to break HMAC-SHA1 (or even HMAC-MD5) faster
than the birthday attack, you should still prefer to pick a better hash
algorithm for safety.

~~~
a1369209993
Also, you shouldn't use a different hash algorithm for HMAC than for other
hashing, because that increases the amount of security-critical code you have,
which directly increases the chance that there's a bug in your security-
critical code.

~~~
tptacek
It's pretty unlikely that you're going to have a bug in your SHA1 and SHA2
cores, for whatever that's worth to you. Virtually nobody implements them by
hand.

~~~
kbwt
Anecdotally I know of one very popular application that has a bug in their
SHA-1 implementation causing it to effectively only perform a single round.

~~~
wolf550e
Would that make it only interoperable with itself, and not have the security a
correct implementation of SHA1 provides? Is it a security bug (i.e. is there
anything worth stealing by breaking their not-quite-SHA1-hash)?

~~~
kbwt
The application is using a proprietary client/server protocol, so it already
lacks lacks any kind of interoperability.

In this specific case, it's unclear whether the bug has direct security
implications. The broken SHA-1 is used on some user-controlled data that gets
XORed onto the server's decryption of a user-specified payload before being
passed into an RC4 key schedule. It's certainly plausible that this might
produce a server-assisted privacy compromise of other users' sessions.

------
tzs
I didn't see any date in that paper. Is it new? (The URL suggests 2019, but
that could just be when it was put on that server, not when the paper was
written).

~~~
lelf
It is,
[https://eurocrypt.iacr.org/2019/program.html](https://eurocrypt.iacr.org/2019/program.html)

