So these researchers collected ~200 million TLS handshakes and found a few hundred that were miscomputed, which they suspect was caused by bit errors.
However, I do not believe modern computational devices are so unreliable. If I computed 200 million TLS exchanges on my home PC over a few days, I wouldn't expect a single one to be miscomputed. Servers with ECC memory ought to be another order of magnitude more reliable.
So why do we see such high rates of miscomputation?
The research doesn't necessarily imply that any typical device has such a high failure probability. From the paper:
> The three private keys revealed by the 11 faulty [RSA] signatures in our [passively observed] data were associated with three certificates that were served from four different IP addresses associated with Baidu. [...]
> After we disclosed to Baidu, they informed us that the traffic we observed was between the clients and Baidu’s golang-based L7 load balancer BFE which offloads cryptographic operations like signature generation to a hardware accelerator. [...] Based on the temporal pattern of signature errors we observed, we hypothesize that the errors may have been due to a single failing hardware component which then passed vulnerable signatures through the unprotected software implementation.
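For context on why a single miscomputed signature is enough to reveal a private key: with RSA-CRT signing, a fault in one half of the computation produces a signature that is still correct mod one prime but wrong mod the other, so a passive observer can factor the modulus with a single gcd (the classic Boneh-DeMillo-Lipton/Lenstra attack). A minimal sketch with toy parameters of my own choosing (tiny primes, unpadded message, real RSA uses neither):

```python
# Toy demonstration of key recovery from one faulty RSA-CRT signature.
# The primes and message below are illustrative values, not real crypto.
from math import gcd

p, q = 1000003, 1000033          # toy primes (far too small for real use)
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse (Python 3.8+)
m = 12345                          # stand-in for the padded message

# CRT signing: compute mod p and mod q separately, then recombine.
sp = pow(m, d % (p - 1), p)
sq = pow(m, d % (q - 1), q)
q_inv = pow(q, -1, p)

def crt(sp, sq):
    return sq + q * ((q_inv * (sp - sq)) % p)

s_good = crt(sp, sq)
assert pow(s_good, e, n) == m      # a correct signature verifies

# Fault: a bit flip corrupts only the mod-p half of the computation.
s_bad = crt(sp ^ 1, sq)

# s_bad is still correct mod q but wrong mod p, so s_bad^e - m is
# divisible by q and not by p; one gcd recovers a prime factor of n.
recovered_q = gcd((pow(s_bad, e, n) - m) % n, n)
print(recovered_q == q)            # → True: the observer has factored n
```

This is why the researchers could recover keys purely from captured traffic: no active fault injection is needed, just one bad signature on the wire.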
Yeah, it's just a coincidence that most of the key leaks identified on a US university campus network are from Baidu. Just a random hardware error, nothing suspicious at all...
Peter Gutmann mentions in the cryptlib documentation that the extra check makes RSA signatures 10% slower, so I assume it's tempting to patch it out (or not add it in the first place).
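The check in question is simply verifying your own signature with the public exponent before releasing it, so a faulted CRT computation never leaves the process. A minimal sketch (toy key, illustrative function name, not cryptlib's actual API; the CRT fast path is stubbed with a plain modular exponentiation):

```python
# Verify-after-sign countermeasure: re-check the signature with the
# public key before release. Toy parameters for illustration only.
p, q = 1000003, 1000033
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))

def sign_checked(m, faulty=False):
    s = pow(m, d, n)            # stand-in for the fast CRT signing path
    if faulty:
        s ^= 1                  # simulate a hardware bit flip
    if pow(s, e, n) != m % n:   # the extra check Gutmann describes
        raise RuntimeError("fault detected; signature withheld")
    return s

m = 424242
assert pow(sign_checked(m), e, n) == m     # good signature passes
try:
    sign_checked(m, faulty=True)
except RuntimeError as exc:
    print(exc)                  # faulty signature never leaves the process
```

Since verification uses the small public exponent, it costs one cheap extra exponentiation on top of the expensive private-key one, which is consistent with the ~10% figure.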