Well, according to that reference, it's hardened against a specific, previously known attack. Do you have any information on whether that also protects against the different, new attack which was just published?
Not so much the specific attack, as the broad class of attacks. I think this new work is in that same broad class but I am not a mathematician.
The idea in Marc Stevens' anti-collision work is that some inputs are "disturbance vectors" which do unusual things to the SHA-1 internals, and we want to detect those and handle that case specially since there is almost no chance of that happening by accident. It has a list of such vectors found during his research.
This paper doesn't talk about "disturbance vectors" but it does discuss ideas like "Boomerangs", which I think end up being similar. I just don't understand the mathematics well enough to know whether that means "the same" or not.
Hardened sha1 does detect this new attack. Easy to test: Check their pair of files into a git repo and see that they have different checksums, while sha1sum(1) generates the same for both.
No, you and joeyh are incorrect about the test (but correct about the result). As can be seen in the output, SHA1(bar) = f1d2d2f924e986ac86fdf7b36c94bcdf32beec15 but git_SHA1(bar) = 257cc5642cb1a054f08cc83f2d943e56fd3ebe99. Why is there a difference? Not because of hardened SHA1. Hardened SHA1 essentially always produces identical outputs to SHA1:
> git doesn't really use SHA-1 anymore, it uses Hardened-SHA-1 (they just so happen to produce the same outputs 99.99999999999...% of the time).[1]
There's essentially no chance that the string "foo\n" fell into that tiny probability of difference. The reason there's a difference is that before git hashes something, git does various processing on it (maybe appending and prepending various things), and those additions broke the carefully created collision. But a chosen-prefix attack might mean those additions can be accounted for, and a collision could still be found.
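To make the difference concrete, here is a minimal Python sketch of the two computations for the string "foo\n". It assumes the usual git blob format, where the content is prefixed with the header "blob <size>\0" before hashing; the function name `git_blob_sha1` is made up for this example and is not a git API.

```python
import hashlib

def git_blob_sha1(content: bytes) -> str:
    """SHA-1 of content the way git hashes a blob: header + content.

    Assumption: the blob header is b"blob <decimal length>\\0".
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

content = b"foo\n"

# Plain SHA-1 over the raw bytes, which is what sha1sum(1) reports.
plain = hashlib.sha1(content).hexdigest()
# SHA-1 over header + bytes, which is what git reports for a blob.
blob = git_blob_sha1(content)

print(plain)  # f1d2d2f924e986ac86fdf7b36c94bcdf32beec15
print(blob)   # 257cc5642cb1a054f08cc83f2d943e56fd3ebe99
```

The prepended header alone is enough to change the digest, so two files that collide under plain SHA-1 will generally not collide as git blobs unless the collision was built with that header in mind.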
How would an attack on a git repo work? You create a repo with identical hashes but different content and next time the user clones from scratch they get your modified version?
Yeah, my thoughts about git are similar. Look at the two messages they have as an example:
Key is part of a collision! It's a trap!yE'NsbK#ދW]{1gKmCx's/vr|
-pJO_,1$)uB1qXv#U)9ESU;p~0G:Y
ݕbBIjFra눰3&t'lB_!h5M([,˴QMK#|o5pv|i,+yYpݍD7_Rf\'GUZ,ϵdvAYAugV=Lk8_E 2
+nolBtxXoQt&+?Y3LP:'Qt(,ۛuԪWJm:A"M6<|B4kVv̨ޠA=M+m%殺j5N|EMA\Ed-
s&@u@:a?pq^Xf0U?R}
They have the same sha1sum, but in all practicality it's nonsense since both messages are pure trash. You couldn't have malicious C code that would have the same hash as non-malicious C code in this example.
Dump your garbage string behind a // or inside an #if 0, restrict the garbage string character set to characters which will not disturb that, and your compiler will whistle while it works.
That's exactly what a chosen prefix attack means. You choose the arbitrary prefixes. Then the garbage is inserted. Then (due to SHA1's Merkle–Damgård construction) you append a postfix that's mostly arbitrary (but the same in both files).
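The prefix / garbage / common-suffix structure can be illustrated with a toy hash. The sketch below is not SHA-1 and not the published attack: it is a made-up Merkle-Damgård hash with a 16-bit internal state (all names, sizes, and the brute-force search are assumptions for illustration), small enough that the "garbage" blocks can be found by exhaustive search instead of cryptanalysis.

```python
import hashlib

BLOCK = 4                 # toy block size in bytes (assumption)
STATE = 2                 # toy 16-bit chaining state, so collisions are cheap
IV = b"\x00" * STATE

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated SHA-256 of state + block."""
    return hashlib.sha256(state + block).digest()[:STATE]

def absorb(state: bytes, data: bytes) -> bytes:
    """Run the chaining state over block-aligned data."""
    assert len(data) % BLOCK == 0
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def toy_md_hash(message: bytes) -> bytes:
    """Merkle-Damgård: pad with 0x80, zeros, and the message length."""
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % BLOCK)
    padded += len(message).to_bytes(BLOCK, "big")
    return absorb(IV, padded)

def chosen_prefix_collision(p1: bytes, p2: bytes):
    """Find one 'garbage' block per prefix so the chaining states meet."""
    s1, s2 = absorb(IV, p1), absorb(IV, p2)
    # Table of states reachable from prefix 1 with one extra block.
    table = {}
    for a in range(2 ** 16):
        blk = a.to_bytes(BLOCK, "big")
        table[compress(s1, blk)] = blk
    # Search for a block after prefix 2 that lands on one of those states.
    for b in range(2 ** 32):
        blk = b.to_bytes(BLOCK, "big")
        match = table.get(compress(s2, blk))
        if match is not None:
            return p1 + match, p2 + blk
    raise RuntimeError("no collision found")

# Two different, freely chosen prefixes of equal block-aligned length.
m1, m2 = chosen_prefix_collision(b"good", b"evil")
suffix = b" shared trailer, identical in both files"

print(toy_md_hash(m1) == toy_md_hash(m2))                    # True
print(toy_md_hash(m1 + suffix) == toy_md_hash(m2 + suffix))  # True
```

Once the chaining states are equal after the garbage blocks, every later block is processed identically in both messages, which is why the shared suffix can be almost anything.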
I think active projects would detect this fine - but what if that commit was pushed to lpad and everyone ended up pulling it to local because it's a dependency of a dependency of a dependency in NPM?
Or what if it's a really obscure library for parsing like... pyramidal jpeg2000s, are the library consumers going to be checking the source? Heck, most people already don't check download checksums unless their downloader does it automatically.
Hmmm, does the garbage string actually have to survive long?
If there's a followup CL to "delete a garbage string that accidentally made it into the repo", which doesn't actually fix whatever else was added, would that get you anywhere?
If you could push up a commit that computed to the same hash as the last tagged release in a repo... I'm not certain, the tag might end up referencing the new object? Certain versions of git (e.g. maybe git for windows) may also react in different ways.
In theory you might get people building software packages for distros to build your malicious version, you may also just temporarily shut down the ability for anyone to check out the version (basically denial of service for making?) but the time window would be weird.
You'd probably be most successful modifying the original repo - either by being the creator of the software or gaining their trust. However, it would have to be a rather powerful SHA1 attack for the commit to still be valid syntax, hard to detect, and make a meaningful malicious change.
I see that Git doesn't actually use SHA-1 any more, it uses "hardened SHA-1": https://stackoverflow.com/questions/10434326/hash-collision-...