
An Almost-Secret Algorithm Researchers Used to Break Thousands of RSA Keys - williamkuszmaul
https://algorithmsoup.wordpress.com/2019/01/15/breaking-an-unbreakable-code-part-1-the-hack/
======
schoen
This describes research published in 2012 by Arjen Lenstra et al.
([https://eprint.iacr.org/2012/064.pdf](https://eprint.iacr.org/2012/064.pdf)),
which relied on a scalable n-way GCD algorithm that Lenstra's team thought
best not to explain to readers (in the hope that the attack wouldn't be
quickly replicated for malicious purposes). Coincidentally, another team
(Nadia Heninger et al., [https://factorable.net/](https://factorable.net/))
published extremely similar research in a similar timeframe, without
withholding details of that team's GCD calculation approach.

The Heninger et al. paper explains quite a lot about where the underlying
problems came from, most often inadequately seeded PRNGs in embedded devices.
As the linked article mentions, other subsequent papers have also analyzed
variants of this technique and so there's not much secret left about it.

If people are interested in learning about the impact of common factors on the
security of RSA, I created a puzzle based on this that you can try out. It
also includes an explanation of the issue for programmers who are less
familiar with the mathematical context:
[http://www.loyalty.org/~schoen/rsa/](http://www.loyalty.org/~schoen/rsa/)

Notably, my puzzle uses a small set of keys so you can do "easy" pairwise GCD
calculations rather than needing an efficient n-way algorithm as described
here (which becomes increasingly relevant as the number of keys in question
grows).
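
For intuition, the pairwise version really is just a few lines. A toy sketch
in Python (tiny 17-bit primes purely for illustration; real RSA primes are
512+ bits, and the moduli here are made up):

```python
import math
from itertools import combinations

# Toy moduli: the first two "keys" accidentally share the prime 65537.
moduli = [
    65537 * 65539,   # key 0
    65537 * 65543,   # key 1: shares a prime with key 0
    65551 * 65557,   # key 2: unrelated
]

def find_shared_factors(mods):
    """Pairwise GCD: O(n^2) comparisons, fine for a small set of keys."""
    broken = {}
    for (i, a), (j, b) in combinations(enumerate(mods), 2):
        g = math.gcd(a, b)
        if 1 < g < a:
            # g is a prime factor of both moduli, so both keys fall.
            broken[i] = (g, mods[i] // g)
            broken[j] = (g, mods[j] // g)
    return broken

print(find_shared_factors(moduli))   # keys 0 and 1 are factored; key 2 survives
```

The quadratic number of pairs is exactly why this only works for a small key
set, and why the large-scale studies needed the fast n-way algorithm.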

~~~
ikeyany
I attempted to do your puzzle but the link was blocked by my employer's
firewall:

> URL:
> [http://www.loyalty.org/~schoen/rsa/](http://www.loyalty.org/~schoen/rsa/)

> Block reason: Violence/Hate/Racism

So... what exactly is loyalty.org all about?

~~~
casefields
"www.loyalty.org itself is the server for the web site of Californians for
Academic Freedom, the group I founded to oppose the California loyalty oath,
which is still a non-negotiable requirement for anyone who wants to work for
the State of California -- including student employees of the University of
California. "

[http://www.loyalty.org/~schoen/](http://www.loyalty.org/~schoen/)

That's the most controversial thing about the page. In my view it's not a big
deal to have that stance but some people don't like others who rock the boat
when it comes to the status quo.

~~~
schoen
My activism on this issue for the past decade and a half has basically
consisted of corresponding for a few minutes each year with some new person
who objects to signing the loyalty oath. I doubt that the site blocking has
anything to do with this.

~~~
wlll
I was curious about the contents and contentious points of the oath, so I went
looking, and one of the files you link to is a 404:

[http://www.loyalty.org/oath.txt](http://www.loyalty.org/oath.txt)

~~~
cogburnd02
[http://web.archive.org/web/20000831114425/http://www.loyalty...](http://web.archive.org/web/20000831114425/http://www.loyalty.org:80/oath.txt)

------
phw
This idea has since been applied to several other domains. Last year we had a
look at all archived RSA keys of Tor relays: [https://nymity.ch/anomalous-tor-
keys/](https://nymity.ch/anomalous-tor-keys/)

We found several thousand relays that shared prime factors (most were part of
a research project), ten relays that shared moduli, and 122 relays that used a
non-standard RSA exponent, presumably in an attempt to manipulate the Tor
network's distributed hash table, which powers onion services.

------
lipnitsk
Another group did a talk on this at DEF CON 26 last year:
[https://research.kudelskisecurity.com/2018/08/16/breaking-
an...](https://research.kudelskisecurity.com/2018/08/16/breaking-and-reaping-
keys-updated-slides-and-resources/)

They analyzed over 340 million keys from the web.

> As part of the presentation given at DEF CON 26, one of the outputs was
> Kudelski Security’s Keylookup application. On this site, you can submit your
> own public keys and have them tested against our dataset. We will let you
> know if your key is vulnerable to Batch GCD and ROCA attacks. If your key is
> in our database, we will be able to give you an answer immediately, if it is
> not, you may have to wait a bit until the tests complete.

>
> [https://keylookup.kudelskisecurity.com/](https://keylookup.kudelskisecurity.com/)

~~~
tptacek
I don't think ROCA is related to the vulnerability in the article?

~~~
lipnitsk
Correct, but the main point of their talk and research was focused on Batch
GCD processing of hundreds of millions of keys, with ROCA analysis done in
addition.

------
truantbuick
What are the characteristics of those who generated an RSA key sharing a prime
factor? Can they be linked back to a few bad CSPRNG implementations?

What are practical steps to be responsible about it?

It's contrived, but I just imagine that if I'm generating some particularly
important keys, I should somehow find a way to give /dev/urandom a kick of
some kind. Even if that were possible, it's more likely to make things
worse than better. Still, it makes me a little paranoid to even hear about
theoretical weaknesses -- especially like collision attacks. I have no idea
how long it takes for the CSPRNG to get properly seeded. Does it take a
microsecond after booting? 10 minutes? A day?

~~~
z3t4
Some RNGs use the time of day in milliseconds as a seed; I guess those are
easy to brute-force. I guess it's all about the size of the seed and its
randomness!?
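
To see just how easy: a toy Python sketch of recovering a
millisecond-timestamp seed by exhaustive search (the seed value and search
window here are made up for illustration; a full day is only ~86.4 million
candidates):

```python
import random

# The "victim" seeds a PRNG with milliseconds since midnight (a made-up
# value here) and derives a 128-bit secret from it.
ms_since_midnight = 49_731_882
secret = random.Random(ms_since_midnight).getrandbits(128)

def recover_seed(observed, window_start, window_end):
    """Attacker: replay every candidate timestamp until the output matches."""
    for seed in range(window_start, window_end):
        if random.Random(seed).getrandbits(128) == observed:
            return seed
    return None

print(recover_seed(secret, 49_731_000, 49_732_000))  # finds 49731882
```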

~~~
schoen
This is probably the most famous issue about that phenomenon:

[https://people.eecs.berkeley.edu/~daw/papers/ddj-
netscape.ht...](https://people.eecs.berkeley.edu/~daw/papers/ddj-
netscape.html)

You could say that our understanding of PRNGs has improved a bit since then.

A recent thread about brute-forcing PRNG states in a game:

[https://news.ycombinator.com/item?id=18880528](https://news.ycombinator.com/item?id=18880528)

~~~
rincebrain
I would have thought [https://github.com/g0tmi1k/debian-
ssh](https://github.com/g0tmi1k/debian-ssh) would be the most famous issue in
many people's memories of poor (read: absent) PRNG use. ;)

------
jMyles
A lot of people over the years have gotten the message that "use /dev/urandom
and forget about it" is the final, end-all, be-all for secure random number
generation.

In fact, this ideology (and that's what it is - an ideology) has been
trumpeted right here on HN, in some cases by people who repeatedly seem to
comment on topics that they don't fully understand. Security is hard, but
there's also a high reputational value on being perceived as an authority on
the topic. As a result, there are some nuggets of "wisdom" that require
asterisks next to them, including this one.

Even though "just use /dev/urandom" is _almost always_ true, it isn't _always_
true. In fact, the universe of cases where some form of blocking entropy is
needed (and again, this is a very tiny set) is growing, not shrinking.

[https://security.stackexchange.com/questions/186086/is-
alway...](https://security.stackexchange.com/questions/186086/is-always-use-
dev-urandom-still-good-advice-in-an-age-of-containers-and-isola)

~~~
aidenn0
Part of the problem is that neither /dev/urandom nor /dev/random on linux do
what most people want. /dev/urandom _never_ blocks even right after you've
spun up a fresh machine that has not seeded the PRNG, while /dev/random blocks
very conservatively, to the point where it is not useful for certain things.

The proper approach for high-volume random numbers is probably to seed a
userspace PRNG from /dev/random but that's extra work, particularly in a
concurrent program.

~~~
tptacek
Please do _not_ seed a userspace RNG from /dev/random. Most major crypto RNG
attacks trace back to userspace RNGs. If you can trust /dev/random to provide
a seed for your userspace RNG, then by definition you can also _from that
point on_ trust urandom as well.

(getrandom(..., 0) is probably the right long-term solution).

~~~
baby
BTW, I've read that before (don't use userspace RNG). Can you point me to a
few problems that arose because of userspace RNG?

~~~
tptacek
Yeah, next time we're drinking.

~~~
baby
Now I remember about forking and VM cloning issues.

Also, you live a bit far from me, but see you at Black Hat maybe :D

------
cbhl
PuTTYgen, being a Windows application, assumes that a mouse is connected and
prompts the user to move the mouse to generate entropy during key creation.

By comparison, ssh-keygen documents the SSH_USE_STRONG_RNG environment
variable -- but then recommends against its use (!) since it can cause key
generation "to be blocked until enough entropy is available".

~~~
hackcasual
> "to be blocked until enough entropy is available"

Which is fine. The idea that entropy is a consumable resource is a crypto
myth that needs to die.

~~~
tlb
It's not consumable, but it can be in short supply right after boot. So
/dev/random should block _until_ there's sufficient entropy, but never after
that.

~~~
cbhl
If I recall correctly, at one point, the network interface was used as a
source of entropy. Then someone demonstrated that sending the right sequence
of network packets to a machine would let you control the key that got
generated. So they removed it.

Then folks discovered -- in production -- that some cloud computing
environments just don't get _any_ other new entropy after boot, and so
instances would hang on generating SSH host keys.

Some folks went to /dev/urandom. Other folks decided to seed instances with
entropy from another computer (with fancy names like "cloud entropy service").
And then someone had to decide how _that_ machine gets entropy (like plugging
in an FM radio into the mic jack).

~~~
justinclift
> some cloud computing environments just don't get any other new entropy after
> boot

For environments like that, I think Haveged is the general approach these
days. Latest dev code (revived project) is now here:

[https://github.com/jirka-h/haveged](https://github.com/jirka-h/haveged)

It's the (officially blessed, I think) continuation from the original Haveged:

[http://www.issihosts.com/haveged/](http://www.issihosts.com/haveged/)

------
userbinator
IMO the title is a bit clickbaity --- I thought there had been a recent
breakthrough in integer factorisation, when in fact this is really
attributable to insufficient randomness.

~~~
schoen
A bigger problem for many readers might be that this is a new post, mainly
summarizing research from 2012. So it's not really appropriate to say
"(2012)", but it also doesn't announce new discoveries since that time.

------
gtsteve
On this topic, I recall reading that earlier computers (of the '80s) used
radio tuners tuned to static to generate random numbers. Of course, you need some
way to randomly choose the frequency but then what comes out should be pretty
unpredictable.

I believe Random.org uses an approach similar to this. What is so special
about this approach that we couldn't install it as a card in a desktop for
example?

~~~
amlozano
Modern desktops have had RNGs built into their chips since Ivy Bridge. Beyond
that, they have plenty of decent sources for random numbers such as network
traffic. Similarly, mobile devices can use sensor noise to create random
numbers pretty well.

The devices people are concerned with are things like embedded devices and
sometimes virtual devices.

------
daedalus2027
hi, I attempted to recreate the algorithm in the post:

[https://github.com/daedalus/misc/blob/master/testQuasiLinear...](https://github.com/daedalus/misc/blob/master/testQuasiLinearGCD.py)

------
sempron64
This is very interesting. I wonder what key generators were used to create the
insecure keys.

~~~
schoen
See
[https://factorable.net/weakkeys12.extended.pdf](https://factorable.net/weakkeys12.extended.pdf),
which identifies some of the responsible devices. (Two
groups of researchers published independently on this issue at nearly the same
time.)

~~~
lixtra
TLDR:

Devices that create TLS and SSH keys just after boot when there is not enough
entropy.

~~~
loeg
That's not a great takeaway.

The takeaway is: Linux's /dev/random and /dev/urandom interfaces are both
broken, in the sense that neither is reliable for _embedded_ developers during
certain early boot conditions. Some of that was maybe worse in 2012 than
today, but the fundamental interface properties have not changed.

Tl;dr: Use getrandom() instead of /dev/[u]random. Do not use GRND_RANDOM. Do
not use GRND_NONBLOCK.
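
In Python, for example, that advice might look like this sketch (os.getrandom
wraps the getrandom(2) syscall and is available on Linux with Python 3.6+;
the urandom fallback is an assumption I'm adding for portability, not part of
the advice above):

```python
import os

def strong_random_bytes(n):
    """Prefer getrandom(2) with flags=0: it blocks only until the kernel
    pool has been seeded once at boot, then never again -- the semantics
    key generation actually wants. No GRND_RANDOM, no GRND_NONBLOCK."""
    if hasattr(os, "getrandom"):       # Linux, Python 3.6+
        return os.getrandom(n, 0)
    return os.urandom(n)               # portable fallback (assumption)

print(strong_random_bytes(32).hex())  # 32 bytes of key material
```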

~~~
Scoundreller
But maybe better pre-2009:

> Surprisingly, modern Linux systems no longer collect entropy from IRQ
> timings. The Linux kernel maintainers deprecated the use of this source in
> 2009 by removing the IRQF_SAMPLE_RANDOM flag, apparently to prevent events
> controlled or observable by an attacker from contributing to the entropy
> pools.

> Although mailing list discussions suggest the intent to replace it with new
> collection mechanisms tailored to each interrupt-driven source [21], as of
> Linux 3.4.2 no such functions have been added to the kernel.

> The removal of IRQs as an entropy source has likely exacerbated RNG problems
> in headless and embedded devices, which often lack human input devices,
> disks, and multiple cores. In that case, the only sources of entropy—if
> there are any at all—may be the time of boot.

~~~
X6S1x6Okd1st
Removing something from an entropy pool because it could be used by an
attacker seems really odd to me.

~~~
schoen
It may have been inspired by arguments like
[https://blog.cr.yp.to/20140205-entropy.html](https://blog.cr.yp.to/20140205-entropy.html)
(although that one was published five years later than the change we're
talking about).

~~~
clarry
That would be rather misguided though. It's not like a device is going to know
the exact time the interrupt handler routine would run, hash it, and go back
"whoops, better not fire an interrupt yet because it won't be on the right
tick!"

------
sublupo
> The authors hint in a footnote that at the heart of their computation is an
> asymptotically fast algorithm, allowing them to bring the running time of
> the computation down to nearly linear; but the actual description of the
> algorithm is kept a secret from the reader

How could something like that pass peer review? Their claim effectively can't
be reproduced.

~~~
andreareina
If I'm reading the numbers right, even using the slow way you'd expect to
break on the order of tens of keys per day. Thus the claim that "two out of
every one thousand RSA moduli [...] offer no security" is easily verified. The
secret algorithm doesn't compute secret data that can't be verified.
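
For reference, the fast approach is public by now (Bernstein-style batch GCD,
as used by the Heninger et al. team): build a product tree over all the
moduli, then push remainders back down. A toy Python sketch with the same
kind of tiny illustrative primes (not real key sizes):

```python
import math

def products(mods):
    """Product tree: leaves are the moduli, the root is their product."""
    tree = [list(mods)]
    while len(tree[-1]) > 1:
        level = tree[-1]
        tree.append([level[i] * level[i + 1] if i + 1 < len(level) else level[i]
                     for i in range(0, len(level), 2)])
    return tree

def batch_gcd(mods):
    """Batch GCD via product and remainder trees: quasi-linear work
    instead of O(n^2) pairwise GCDs."""
    tree = products(mods)
    rems = tree.pop()                  # start at the root: [product of all]
    while tree:
        level = tree.pop()             # push remainders down one level
        rems = [rems[i // 2] % (n * n) for i, n in enumerate(level)]
    # gcd(N_i, product of all other moduli) = gcd(N_i, (P mod N_i^2) // N_i)
    return [math.gcd(n, r // n) for n, r in zip(mods, rems)]

# First two moduli share the prime 65537; the third is unrelated.
mods = [65537 * 65539, 65537 * 65543, 65551 * 65557]
print(batch_gcd(mods))   # [65537, 65537, 1]
```

Each modulus gets its GCD against the product of all the others in one pass,
which is what makes millions of keys tractable.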

------
sliken
Anyone know of a service
([https://factorable.net/keycheck.html](https://factorable.net/keycheck.html)
was discontinued) or code that would help identify particularly weak keys?

~~~
sliken
Actually factorable.net does have code. Planning to analyze the 1200 user keys
I have around to see if any are weak. From what I can tell you can just
convert public keys into hex and feed them into their program.
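
For anyone trying the same thing, a sketch of the "convert to hex" step for
OpenSSH-format keys (field layout per RFC 4253: a base64 blob of
length-prefixed fields — key type, public exponent e, modulus n; the exact
input format their program expects is a separate question):

```python
import base64
import struct

def rsa_modulus_hex(pubkey_line):
    """Extract the RSA modulus as hex from an OpenSSH 'ssh-rsa AAAA...' line."""
    blob = base64.b64decode(pubkey_line.split()[1])
    fields = []
    while blob:
        # Each field is a 4-byte big-endian length followed by that many bytes.
        (length,) = struct.unpack(">I", blob[:4])
        fields.append(blob[4:4 + length])
        blob = blob[4 + length:]
    keytype, e, n = fields
    assert keytype == b"ssh-rsa"
    return n.hex()   # may carry a leading 00 byte (mpint sign convention)
```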

~~~
anomalroil
Yes, but it will only batch-factor the 1200 keys you'd be feeding it with. For
such factorization attacks to work best, you need a dataset as large as
possible.

------
incompatible
Could you use a similar idea to go after bitcoin keys? If so, you may not be
able to crack any particular key, but you could steal the bitcoins from the
ones you did crack.

~~~
schoen
As sowbug and tptacek mention, there are no common-factor attacks against
Bitcoin keys. However, that doesn't mean that the PRNGs used in cryptocurrency
implementations aren't a concern. The most famous incident was probably this

[https://lists.linuxfoundation.org/pipermail/bitcoin-
dev/2018...](https://lists.linuxfoundation.org/pipermail/bitcoin-
dev/2018-April/015873.html)

(you can find subsequent journalism about the effects of this if you're
interested).

There have also been other cryptocurrency PRNG attacks that weren't as high-
profile as this issue.

~~~
tptacek
Java SecureRandom: a userland CSPRNG!

~~~
pvg
This is JS. The (current-ish) Java ones use the OS-provided facility.

------
skookumchuck
After decades of problems with seeding RNGs, why isn't there an electronic
circuit that gets a seed from quantum noise or something like that? The
circuit could be part of the CPU or support chips.

After all, amplifiers are always trying to increase the signal-to-noise
ratio, and the reliability of digital circuits rests on suppressing noise. A
dedicated circuit could do the opposite: amplify the noise and sample it.

~~~
tatersolid
There is. RDRAND for x86/x64 has been in all Intel/AMD for several years.

Most ARM SoC have some equivalent device, but they are nonstandard and require
driver support.

Even the TPM chips in basically every desktop, laptop, and server for over a
decade have hardware RNG. Again driver support is needed.

The problem is cheap “blue plastic boxes” may not have a hardware RNG, nor
will Virtual machines or containers. Writing code to figure out what RNG is
available and how to use it is a nightmare so few people do it.

This is why most security people say “use the OS CSPRNG always”. That way
user-space code doesn’t have to carry all the platform specifics with it. And
presumably integrating the hardware RNG can be done once at the OS layer.
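
In Python, for instance, "use the OS CSPRNG" boils down to the secrets
module, which delegates to the kernel:

```python
import secrets

# secrets delegates to the OS CSPRNG, which is where RDRAND, a TPM, or
# other hardware sources get mixed in -- the kernel's problem to
# integrate, not each application's.
key = secrets.token_bytes(32)        # raw key material
session = secrets.token_hex(16)      # 16 random bytes as 32 hex chars
pick = secrets.randbelow(1_000_000)  # uniform integer in [0, 1000000)
print(len(key), len(session), pick)
```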

------
betolink
I don't know why this is still a problem these days, let's just use lava lamps
for entropy: [https://blog.cloudflare.com/lavarand-in-production-the-
nitty...](https://blog.cloudflare.com/lavarand-in-production-the-nitty-gritty-
technical-details/)

------
zde
> But since they both used the same program to generate random prime numbers,
> there’s a higher-than-random chance that their public keys share a prime
> factor

BS

