
Why it’s harder to forge a SHA-1 certificate than to find a SHA-1 collision - thedg
https://blog.cloudflare.com/why-its-harder-to-forge-a-sha-1-certificate-than-it-is-to-find-a-sha-1-collision/
======
agwa
We're lucky that hash collision attacks have a relatively simple mitigation
like this. (Although you have to trust CAs to follow the rules and implement
it properly, and events of the last few years indicate that CAs need to have
as few things to screw up as possible.)

However, we're not always going to be so lucky. The next major transition in
digital certificates could very well be to post-quantum crypto due to
advancements in quantum computing. Under that scenario, attackers will be able
to simply compute a CA's private key and sign arbitrary certificates. There
will be no mitigation short of clients ceasing to trust pre-quantum certs. But
clients won't be able to do that unless servers are using post-quantum certs,
and server operators won't want to do that if it would mean cutting off legacy
clients that don't support post-quantum certs.

The solution to this first mover problem is to set a hard deadline after which
legacy certs are retired. This forces clients and server operators to act.
Pushing back the SHA-1 deadline at the 11th hour, as CloudFlare proposes, sends
a dangerous message that such deadlines don't have to be taken seriously. That
message will come back to haunt the Internet.

~~~
EGreg
Can someone explain to me in plain language / layman's terms how a quantum
computer is supposed to reverse cryptographic hash functions? What would be
the process EXACTLY?

~~~
agwa
It has nothing to do with "reversing" hashes. The attacker would use the
quantum computer to determine the CA's private key (e.g. by factoring the RSA
modulus using Shor's Algorithm), and would then be able to sign any hash they
want. No need to attack the hash function; in fact hashes remain secure under
quantum computing.
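
To make that concrete, here is a toy sketch in Python. The numbers are tiny
textbook-RSA values chosen purely for illustration; for a real 2048-bit CA
modulus, the factoring step is exactly what Shor's algorithm on a large quantum
computer would make feasible:

      # Toy illustration: brute-force factoring stands in for Shor's algorithm.
      n, e = 3233, 17                # tiny textbook RSA public key (n = 53 * 61)

      # The step a quantum computer makes feasible: factoring n.
      p = next(f for f in range(2, n) if n % f == 0)
      q = n // p

      # Knowing the factors yields the private exponent immediately.
      phi = (p - 1) * (q - 1)
      d = pow(e, -1, phi)            # modular inverse (Python 3.8+)

      # Now "sign" any hash value h: signature = h^d mod n.
      h = 2790                       # stand-in for an arbitrary certificate hash
      sig = pow(h, d, n)
      assert pow(sig, e, n) == h     # verifies under the CA's public key

Note that the hash function is never attacked; the forgery comes entirely from
recovering the key.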

~~~
tromp
> in fact hashes remain secure under quantum computing.

Hashes will see their security cut in half, in terms of the effort needed to
find a pre-image. (EDIT: security in bits = log of #evaluations needed)

E.g. finding a SHA256 pre-image, which amounts to a search over a space of
2^256 candidates, can be sped up using Grover's algorithm, to roughly 2^128
hash evaluations.
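
The arithmetic is easy to check in plain Python:

      import math

      classical = 2**256                 # pre-image search space for SHA256
      grover = math.isqrt(classical)     # Grover needs ~sqrt(N) evaluations
      print(math.log2(classical))        # 256.0 bits of security classically
      print(math.log2(grover))           # 128.0 bits under Grover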

~~~
mmozeiko
That's not half. It's the square root: halving an effort of 2^n only gets you
2^(n-1), whereas Grover brings it down to 2^(n/2).

~~~
droffel
The effective number of bits of entropy is cut in half.

~~~
mmozeiko
Well, he said "in terms of the effort needed to find a pre-image". That effort
won't be halved.

------
TazeTSchnitzel

      $ curl -s https://blog.cloudflare.com/content/images/2015/08/white.jpg | md5
      ccf22bc377846166ed65cd3cd58d2e3d
      $ curl -s https://blog.cloudflare.com/content/images/2015/08/brown.jpg | md5
      810cac197d97da7b216c7883be523495
      $ curl -s https://blog.cloudflare.com/content/images/2015/08/black.jpg | md5
      6bede506abffe08d0c2406d92fbff393
    

Let me guess, the CloudFlare CDN is recompressing the images? :D

~~~
pcl
If you download the files to disk, you'll get the right values:

        $ curl -s https://blog.cloudflare.com/content/images/2015/08/white.jpg > file && md5 file
        MD5 (file) = b69dd1fd1254868b6e0bb8ed9fe7ecad

I've seen this same sort of thing happen with curl in other contexts, but I've
never tracked down the details. I assume it has something to do with file
stream handling in certain versions of curl. You'll see a discrepancy if you
look at the output of piping to 'wc' vs. examining the downloaded file.

~~~
schoen
I ran the same curl command twice in a row and got files with different sizes!
I think it actually does have to do with Cloudflare rather than with curl in
this case.

~~~
richardwhiuk
Apparently the CDN sometimes strips the EXIF data.

------
KirinDave
It's worth noting that SHA1 is still suitable for use in HMAC, even on older
hardware; HMAC's security is not significantly compromised by SHA1's known
weaknesses.

You can move to more modern algorithms, but there isn't a pressing need to
remove SHA1 implementations for that application.
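
For reference, a minimal HMAC-SHA1 sketch using Python's standard library (key
and message are placeholders); known collision attacks on SHA-1 do not
translate into forgeries against this construction:

      import hashlib, hmac

      key = b"a-randomly-generated-high-entropy-key"    # placeholder
      tag = hmac.new(key, b"message to authenticate", hashlib.sha1).hexdigest()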

~~~
jud_white
I have a question I haven't been able to find an answer to, hopefully someone
here can help.

Why is HMAC+(hash) considered secure, while being considerably faster than say
bcrypt with a cost of 12? For example, if a service used a user provided
password to validate a "secret" (what would normally be the signed message),
is that less secure than bcrypt? If so, what makes guessing the secret used in
HMAC difficult?

~~~
aidenn0
Presumably you will use a key-stretching algorithm before applying HMAC.

[edit]

So yes, directly using a password in HMAC is a bad idea, and less secure than
using some function designed for deriving keys from passwords.

You are confusing two things about SHA-1. Assuming a 100% secure cryptographic
hash function that is as fast as SHA-1, you should _not_ use it directly for
hashing passwords (though it can be part of a larger construction like
PBKDF2).

This is because the number of passwords an attacker can check per second in an
offline attack is tied to the speed of the hash function, and bcrypt (like
PBKDF2, scrypt, and argon2) is one of several ways of slowing the hashing
process down.

Similarly, it is likely that MD5 used in PBKDF2 would be roughly as secure for
protecting passwords as SHA-256, because the known weaknesses of MD5 would not
be of assistance in a dictionary attack against a hashed password.

If you have a high-quality key then HMAC is secure without needing the hash
function to be slow, so if you wanted to use a password with HMAC, you would
first use a KDF to generate a high-quality key from the password, and _then_
use the key in hmac. This is similar for any cryptographic tool that wants a
high-quality key as input (i.e. most of them).

[edit2]

A shorter answer is that HMAC is secure and fast because its input key already
has sufficient entropy. bcrypt is slow because its whole point is to make it
difficult to attack a key with low entropy.
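
A minimal sketch of that two-step construction in Python (the parameter
choices are illustrative, not a recommendation):

      import hashlib, hmac, os

      password = b"hunter2"                    # low-entropy user input

      # Step 1: stretch the password into a high-quality key (PBKDF2 here;
      # bcrypt, scrypt, or argon2 would serve the same role).
      salt = os.urandom(16)
      key = hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)

      # Step 2: use the derived key with HMAC; HMAC itself can stay fast.
      tag = hmac.new(key, b"payload to authenticate", hashlib.sha256).hexdigest()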

~~~
jud_white
Thanks for the varying levels of explanation (thanks to viraptor too). I think
part of the reason I was confused is because GitHub's web hook setup allows
for a supplied shared secret which, based on what I understand from above, is
not as secure as it could be unless the user ensures the shared secret has
sufficient entropy. If I'm still not getting it please let me know. Thanks
again.

~~~
aidenn0
A quick 30-second scan of the webhooks docs suggests that is correct; if you
used e.g. 12345 as your secret, you would be susceptible to a dictionary
attack from anybody who was able to record a message. On the order of tens of
billions of keys per second can be tested with a multi-GPU setup.
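
Back-of-the-envelope, taking that figure at face value (Python as a
calculator):

      rate = 2 * 10**10                  # ~tens of billions of guesses/sec
      print(10**5 / rate)                # 5-digit secret like 12345: microseconds
      print(2**128 / rate / 3.15e7)      # 128-bit random key: ~5e20 years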

I suspect that the web hooks typically run over TLS, so recording the
plaintext of a request would be a challenge in and of itself.

~~~
KirinDave
If your shared secret is vulnerable to brute forcing, it's vulnerable to brute
forcing. An easy fix for this: generate your shared secret by hashing (or
salting and hashing) a low-entropy password.

As a general rule though, HMAC is used with randomly generated secrets. I
don't know why GitHub doesn't just tell you the secret.

Amazon's implementation is much more correct.
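
For illustration, a hypothetical version of that setup in Python; the SHA-256
digest and the verification interface are assumptions for the sketch, not
GitHub's or Amazon's exact API:

      import hashlib, hmac, secrets

      # Generate a high-entropy shared secret up front:
      secret = secrets.token_hex(32)     # 256 bits of randomness

      # Then verify each delivery in constant time:
      def verify(body: bytes, received_hex: str) -> bool:
          expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
          return hmac.compare_digest(expected, received_hex)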

------
ben174
Odd, I'm getting completely different MD5s from the three example images.

      ~  curl -s https://blog.cloudflare.com/content/images/2015/08/white.jpg | md5
      ccf22bc377846166ed65cd3cd58d2e3d
      ~  curl -s https://blog.cloudflare.com/content/images/2015/08/brown.jpg | md5
      810cac197d97da7b216c7883be523495
      ~  curl -s https://blog.cloudflare.com/content/images/2015/08/black.jpg | md5
      6bede506abffe08d0c2406d92fbff393

~~~
davesque
CloudFlare is probably mucking around with the payload to save some bytes.

------
viraptor
I don't get the issue with serial numbers. For the browser, it's a completely
opaque random number - it doesn't matter what it is or how it changes.

Is the issue here that they're talking about CA collisions and need the
"authority key identifier" extension to match? This shouldn't matter when
colliding with service certificates.

------
ademarre
Nice write-up, but it's slightly misleading or confusing to not explain that
Nat McHugh's image collisions were chosen-prefix attacks. The post makes it
sound like the images were the product of some unexplained collision, and then
goes on to explain how chosen prefix can be used to forge certificates.

~~~
dsp1234
_to not explain that Nat McHugh's image collisions were chosen-prefix
attacks._

In the paragraph right before those images, it says:

"This was cleverly demonstrated by Nat McHugh, who used a chosen-prefix hash
collision"

~~~
ademarre
Thanks for pointing that out. It was updated after initial publication.

Originally it said:

> _This was cleverly demonstrated by Nat McHugh, who used the broken hash
> function MD5 to create two images with the same hash..._

------
kazinator
Let me guess: birthday paradox? It's harder to find someone in the room who
has the same birthday as you than to find a pair of people who have the same
birthday.
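
The numbers bear that out; a quick check in Python with the usual 23-person
room:

      k, days = 23, 365
      # Someone shares *your* birthday (analogue of hitting a fixed target):
      same_as_you = 1 - (1 - 1/days) ** k              # ~0.061
      # *Some pair* shares a birthday (analogue of finding any collision):
      p_distinct = 1.0
      for i in range(k):
          p_distinct *= (days - i) / days
      some_pair = 1 - p_distinct                       # ~0.507

For an n-bit hash the same effect means finding some collision takes about
2^(n/2) work, while matching a specific target takes about 2^n.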

------
bracewel
Guh, until CF/FB can provide some data showing that users with no upgrade path
are genuinely going to be affected by this, and not just connections MITM'd by
some crappy AV or other random middlebox, the LV proposal seems like a pretty
silly idea...

[https://www.cabforum.org/pipermail/public/2015-December/0064...](https://www.cabforum.org/pipermail/public/2015-December/006495.html)

~~~
specialk
There is a great Cloudflare article [1] that talked about this when they first
suggested LV certs.

> The seemingly good news is that globally, SHA-2 is supported by at least
> 98.31% of browsers. Cutting 1.69% off the encrypted Internet may not seem
> like a lot, but it represents over 37 million people.

There is also an interesting discussion in Security Now #538 [2]; a transcript
of the show is available as well [3]. Skip to page 2 of 39, just where Leo
says "Yeah". Android 2.2 and Windows XP SP2 are on the list of things that
don't support SHA-2. These devices are used particularly in the developing
world. It sends the wrong message, to the developing world in particular, if
we don't support HTTPS for them. It encourages websites in regions where it
isn't 1.69% of their users but maybe 5% to just not enforce TLS. TLS with a
SHA-1-signed LV cert is better than no security at all.

Facebook also has a cool server add-on that dynamically serves LV certs to the
clients that need them, which is very promising. If it is in production at
Facebook, it is bound to be good.

[1] [https://blog.cloudflare.com/sha-1-deprecation-no-browser-lef...](https://blog.cloudflare.com/sha-1-deprecation-no-browser-left-behind/)

[2] [https://www.grc.com/securitynow.htm](https://www.grc.com/securitynow.htm)

[3] [https://www.grc.com/sn/sn-538.pdf](https://www.grc.com/sn/sn-538.pdf)

~~~
yuhong
The Android one is an error from a GlobalSign page that was later corrected.
Android uses OpenSSL, I think, and has supported SHA-2 certificates since 1.0.

------
yuhong
I was proposing restricting this to manual issuance of OV certs, given that
automated issuance is the source of most attacks.

------
leereeves
Can an attacker bypass this by using the serial number and validity period
from an already issued certificate?

