NSA in P/poly: The Power of Precomputation (scottaaronson.com)
286 points by evanb on May 23, 2015 | 34 comments

But maybe the right lesson to draw is that mod-p groups and elliptic-curve groups both seem to be pretty good for cryptography, but the mod-p groups are way less good if everyone is using the same few prime numbers p, and the elliptic-curve groups are way less good if everyone is using the same few elliptic curves.

I don't know about the mod-p analysis, but that would be an... idiosyncratic... conclusion to draw about modern elliptic curve cryptography. The trend seems very much in the opposite direction, towards standardization and reliance on a few very carefully designed curves (optimized for performance on modern CPUs, and to avoid implementation pitfalls that give rise to vulnerabilities with older curves).

In fact, drawing any kind of conclusion about the diversity of fields in use seems a little strange. Yes, if you're going to use a 1024-bit prime group, you definitely want to be using one very few others use. But just don't use 1024-bit prime groups. It doesn't much matter how many people use your (properly selected) 2048-bit group; no amount of precomputation puts it in reach of attackers.

> towards standardization and reliance on a few very carefully designed curves

That was the subject of djb's recent talk[1]. He brings up the very good point that we don't know that a curve wasn't designed to have some weakness that its creator has kept hidden. This includes the case of a supposedly "random", "unbiased" generation method, which only moves the attack point back one step.

[1] https://news.ycombinator.com/item?id=9568659

djb and Tanja Lange created this site[1] that catalogs the commonly used curves and lists which ones are thought to be safe.

[1]: http://safecurves.cr.yp.to/

Yes, we are also moving rapidly away from the NIST curves, which are not what I'm referring to when I talk about carefully designed curves.

I would say there's a balancing problem.

The best practices change over time, but old literature, and especially old online resources, have lots of inertia. We can call it the stagnating force of cargo culting, or the dark side of PageRank. The literature I remember reading clearly stated that in DH the security lies in the data used for key exchange and not the parameters. The use of well-known parameters is in fact encouraged. You state the same:

> Yes, if you're going to use a 1024-bit prime group, you definitely want to be using one very few others use. [...] It doesn't much matter how many people use your (properly selected) 2048-bit group; no amount of precomputation puts it in reach of attackers.

Using standard, known-secure parameters is clearly a winner here. So: the best practice is to use provided DH parameters. What do people deploying applied cryptography do? They use the parameters available from their chosen crypto library: parameters that were set as defaults years earlier.

Around the year 2000, it was computationally (relatively) expensive to generate your own DH group. And, because using your own group makes it harder to bootstrap trust, it would have added complexity to an already complex mechanism. Hence, the best practice of using a known DH group made even more sense.

The fact that the Logjam mitigation strategy includes the recommendation to generate your own DH group flies in the face of established DH best practices. (Can we just assume that modern crypto libraries will catch a faulty RNG and refuse to expose clearly vulnerable parameters? For the sake of brevity, at least?) Those deploying applied crypto in the field are now faced with conflicting information. The established and easily discovered (PageRank[tm]) literature states that the best practice is to use known DH groups. The same literature also states that for compatibility reasons, using >1024-bit groups is not recommended. Now contrast this with the Logjam paper: those who require compatibility are suddenly in need of their own privately generated DH parameters. So suddenly the best practice is to ignore established best practice?

That's a nasty conflict.

Of course the sane and secure route was to simply make sure all servers were using ECDHE. But if you need to serve clients that are using ancient browsers, you probably still have to accept the known 1024-bit DH groups too.

Generating DH parameters is about as expensive as generating an RSA key pair. It was never too expensive to generate DH parameters for your server once, or on every restart, or daily. It still is too expensive to generate DH parameters per-connection. If Apache had always done what it does now (generate new DH params on service start, of the same size as the RSA key), it would have been secure and not computationally expensive for anyone except embedded platforms that don't run Apache anyway.

The client does not care whether the 1024-bit DH prime is common or not, so even if you have to use 1024-bit DH params, make sure they're new, so the NSA can't have precomputed against them.
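For a sense of why per-connection generation was considered expensive: generating DH parameters means hunting for a large "safe prime" by trial and error. A rough pure-Python sketch of that search, with a demo-sized bit length (real libraries like OpenSSL do essentially this, with more checks and much larger sizes):

```python
import secrets

def is_probable_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % small == 0:
            return n == small
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def generate_dh_group(bits):
    """Find a safe prime p = 2q + 1 with q also prime. This
    trial-and-error search is what makes parameter generation so
    much slower than per-connection key generation."""
    while True:
        # Odd candidate with the top bit forced, so p has full length.
        q = secrets.randbits(bits - 1) | (1 << (bits - 2)) | 1
        p = 2 * q + 1
        if is_probable_prime(q) and is_probable_prime(p):
            return p, 2  # generator 2 is conventional for safe primes

p, g = generate_dh_group(256)  # demo-sized so this finishes quickly
print(p.bit_length())
```

At 2048 bits the same search takes seconds to minutes, which is why it belongs at service start, not per connection.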

What's concerning is this idea, pretty clearly expressed in the blog we're commenting on, that it's a good idea to generate DH parameters on the fly. I don't think that's true at all.

It's a good idea not to use 1024 bit params. That's the change that's needed.

Attempting to generate params (or RSA keys or whatever) on the fly just exposes you to another class of bugs.

Yes, how does he go from DUAL_EC_DBRG being backdoored to using the same few elliptic curves being an issue? It's not accurate or productive to lump e.g. Curve25519 or Goldilocks448 in with DUAL_EC_DBRG and issues with 1024-bit DH primes.

I feel like I'm misinterpreting what he meant, but can't see what other point he could be making.

You mean DUAL_EC_DRBG.

He's correct that if people generated their own ECC curves instead of using standardized curves, then standardizing maliciously chosen curves would cease to be an attack vector.

That doesn't of itself imply that the pros of standardizing curves do not outweigh the cons, but it is a con of standardizing curves.

What are the motives for retaining the 1024-bit static dhparams? Wasn't it older Java that had trouble with larger dhparams? Nevertheless, couldn't this be mitigated somewhat by having openssl[1] dynamically generate dhparams? Isn't that what they do for ECDHE? 1024-bit DHE would still be crackable by the NSA, but with dynamic params it would take much longer, since they wouldn't get to precompute anything.

[1] Do schannel and other ssl implementations use static dh parameters by default, like openssl does, or do they dynamically generate them? And do they default to 1024 bit or are they smarter about it and default to match the RSA key size?

Java, before version 8, couldn't handle DH parameters larger than 1024 bits, so that may be a motive. But I suspect a large part of it has to do with the fact that OpenSSL left DH parameter configuration (also ECDH curve selection) up to application developers, who don't really have the qualifications to make an intelligent choice. As a result, some applications make the administrator provide their own DH parameters file (and if you don't, no DH for you), others just hard code parameters (usually 1024 bits long, or 512 bits if the export flag is set) into their source code, and others dynamically generate them (and may or may not let you specify the length to generate). Unfortunately, this means that it's not a question of what OpenSSL can do to mitigate this, but what every single server application using OpenSSL has to do.
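Python's standard library, itself an OpenSSL wrapper, illustrates the point: it ships no finite-field DH parameters of its own, and the application or administrator must supply a parameters file or FFDHE ciphersuites simply aren't offered. A minimal sketch (the file path is hypothetical):

```python
import ssl

# Server-side TLS context backed by OpenSSL.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)

# Finite-field DHE only works if the application loads parameters
# itself; there is no safe built-in default to fall back on.
# ctx.load_dh_params("dhparam.pem")  # administrator-provided file

print(hasattr(ctx, "load_dh_params"))
```

So the parameter choice (and its bit length) ends up in every application's hands, which is exactly how 1024-bit defaults lingered.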

Generating dhparams every time is very expensive: it involves finding a large prime, which can take some time. ECDHE only requires generating a random N-bit number (any number), which is a pretty much instant operation.

You're conflating private keys and parameters. The random number that ECDH requires you to generate is the private key, just as FFDH requires you to generate a random private key (which is also fast). If you were actually generating your own random curves (which TLS doesn't allow) it would be much more expensive.
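To make the distinction concrete: given a fixed, pre-agreed group (p, g), the per-connection work in finite-field DH is just one random exponent and a couple of modular exponentiations per side. A toy sketch with a demo-sized prime (real deployments use 2048-bit-plus safe primes; this one is for illustration only):

```python
import secrets

# Demo-sized group parameters: a 61-bit Mersenne prime and generator 3.
# These stand in for a real 2048-bit group; do NOT use sizes like this.
p = 2305843009213693951  # 2^61 - 1
g = 3

def keygen():
    """Per-connection key generation: one random number, one modexp."""
    x = secrets.randbelow(p - 2) + 1   # private key: a random exponent
    return x, pow(g, x, p)             # public key: g^x mod p

alice_priv, alice_pub = keygen()
bob_priv, bob_pub = keygen()

# Both sides derive the same shared secret g^(xy) mod p.
assert pow(bob_pub, alice_priv, p) == pow(alice_pub, bob_priv, p)
print("shared secret established")
```

The expensive prime search only happens when the group (p, g) itself is generated; nothing above involves it.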

That's because with ECDHE you're accepting a much more extensive set of static parameters.

Whenever the NSA deliberately lets people and companies in the USA continue to use a broken system for "secure" communications or transactions, they are putting all of them at risk instead of helping protect them -- in direct contradiction with the NSA's official charter.

Many other nations have the financial and technical ability to perform the same attacks, and the inclination to do so. And soon, other entities will, too.

Am I correct in assuming that if I have generated a 2048 bit dhparam.pem file which I pass to my web server, I am not using one of those "same few primes"?

Yes. And even if you did happen to be sharing a 2048 bit prime number with every other server in the world, it would still be essentially impossible for the NSA to crack in our lifetime (without massive advancements in quantum computing at least). 1024 bit is feasible for a powerful nation-state; 2048 bit is not.

Quick question: how many more times infeasible is it? The article mentions roughly 7.5 million times from 512 to 1024; what is the rough jump required for 2048?

My math fu is not up to this at the moment.

2048-bit keys/primes are around 1 billion times harder than 1024.
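Both figures are consistent with the standard L-notation cost estimate for the general number field sieve. A quick back-of-envelope check (asymptotic formula only, ignoring constant factors, so treat the ratios as rough):

```python
from math import exp, log

def gnfs_cost(bits):
    """Heuristic GNFS running time:
    L(n) = exp((64/9)^(1/3) * (ln n)^(1/3) * (ln ln n)^(2/3))."""
    ln_n = bits * log(2)
    return exp((64 / 9) ** (1 / 3) * ln_n ** (1 / 3) * log(ln_n) ** (2 / 3))

r1 = gnfs_cost(1024) / gnfs_cost(512)
r2 = gnfs_cost(2048) / gnfs_cost(1024)
print(f"512 -> 1024: {r1:.1e}")   # roughly 10^7: the article's ~7.5 million
print(f"1024 -> 2048: {r2:.1e}")  # roughly 10^9: "around 1 billion"
```

The jump keeps growing with each doubling, which is why 2048-bit groups are considered out of precomputation's reach.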

I believe so

> A third solution is to migrate to elliptic-curve cryptography (ECC), which as far as anyone knows today, is much less vulnerable to descent attacks than the original Diffie-Hellman scheme. Alas, there’s been a lot of understandable distrust of ECC after the DUAL_EC_DBRG scandal, in which it came out that the NSA backdoored some of NIST’s elliptic-curve-based pseudorandom generators by choosing particular elliptic curves that it knew how to handle.

But maybe the right lesson to draw is that mod-p groups and elliptic-curve groups both seem to be pretty good for cryptography, but the mod-p groups are way less good if everyone is using the same few prime numbers p, and the elliptic-curve groups are way less good if everyone is using the same few elliptic curves. (A lot of these things do seem pretty predictable with hindsight, but how many did you predict?)

And that's why we need browsers to support curves other than NIST's P-256 and such. I know Chrome intends to, and I imagine Firefox isn't far behind. What's Microsoft's plan for the Edge browser regarding this? I haven't seen them say anything about it in all of their recent Edge-related posts.

Can anyone explain to me why some implementations only support 512-bit or 1024-bit parameters? Aren't the algorithms the same for all sizes? Why can't a given implementation handle arbitrarily large parameters?

Probably because for performance reasons, much of this is hard coded. Buffers of a fixed length, etc.

"Performance reasons" is one of the better excuses to use if you want to force a committee to approve a weakened version of a standard. It has the air of "meeting everyone's needs" and it is unlikely that complaints about the potential weakening (or downgrade attack) will be listened to.

Probably a good idea to see if the performance issues are real, though. Modular arithmetic is really expensive; to do the two exponentiations used in (non-elliptic) 2048-bit DH, you'll be needing 6 million cycles on an Intel chip at an absolute minimum. That's 5 milliseconds in wall time (at a mobile-standard 1.2GHz).

Imagine that you've got to load code from 8 different domains for a website. Then your CPU is working flat out for 40ms. Probably more like 50ms on ARM.

That's a lot of time.

Don't try to discredit those with perf concerns when the operation they're complaining about is incredibly expensive; they have genuine concerns.
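The back-of-envelope numbers above are easy to sanity-check. A quick, hedged measurement of one 2048-bit modular exponentiation in pure Python (CPython's bignum pow() is slower than OpenSSL's hand-tuned code, so treat this as an upper bound, not the cost a real TLS stack pays):

```python
import secrets
import timeit

# A random odd 2048-bit modulus and a 2048-bit exponent; no particular
# structure is needed just to measure modexp cost.
p = secrets.randbits(2048) | (1 << 2047) | 1
x = secrets.randbits(2048)

# A DH handshake needs two such exponentiations per side; time one.
per_call = timeit.timeit(lambda: pow(2, x, p), number=20) / 20
print(f"{per_call * 1000:.2f} ms per 2048-bit modexp")
```

Whether a few milliseconds per handshake matters depends entirely on connection volume, which is the crux of the disagreement here.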

I'm not trying to imply that all of these concerns are fake, just that performance is one of the easier places to hide bullrun-style sabotage. All claims should be checked, of course.

The link in my earlier post to djb's talk also discusses this issue, if you're interested.

As someone who has written an embedded webserver... including the underlying TCP/IP layer and the driver for the "supposedly NE2000 compatible" Realtek chip... on a Z80 clone using only about 4k of flash and ~1.5k-2k of RAM, I am sympathetic to real performance limitations. (That device handled only one minimum-size packet on only one socket at a time.)

That said, if you have a 1.2GHz chip, you have enough CPU for crypto. 40ms is a trivial cost for crypto, especially as you only use DH and pubkey to negotiate a symmetric key that isn't going to cause the same kind of CPU load.

There isn't anywhere close to a real performance limitation on that kind of platform, and I would regard any complaint about the performance on a >1GHz CPU as highly suspicious. When you have 1/100 or even 1/1000 the CPU cycles, that's something else entirely.

Let's presume that 5ms is a lot. Does fixing the bit size cut this down significantly? If not, then it's better to just handle all parameter sizes with a single codebase.

Absolutely not

What's the cost of increasing, let's say, the key size for a webpage serving SSL content? Merely adding SSL has a non-negligible cost for sites with some traffic.

Then you're forgetting all the dedicated hardware that needs to deal with that encryption. Sometimes it's a smartcard, or a security token, sometimes it's a mobile phone.

So because costs are hypothetically increasing, you prefer to live in a fantasy land where 512-bit keys are actually sane?

I am not arguing that there isn't a cost associated with crypto. For almost all uses[1], the price of crypto is part of the cost of making something that connects to the internet. If you leave it out, you're creating an attractive nuisance and a potential liability for someone. If you use bad defaults like 512-bit crypto, I suggest that any claim of a product being "secure" or "using SSL" is a lie.

> smartcard

A smartcard isn't plugging into the internet on its own. Whatever reads the card can wrap everything in proper crypto.

> mobile phone

You have far more CPU than you need.

[1] The exceptions are limited, such as a device that literally cannot do the crypto (I'm thinking of an old 1MHz micro). Note that these devices shouldn't be directly on the internet, either.

A random thought: Perhaps excellent implementations explicitly limit themselves to a specific key size, like 1024 bits, because then the implementation can ensure constant-time computations. We know that there are exotic timing attacks, and these would be more difficult to thwart if we have to use general implementations to cope with arbitrary length keys.

Is that a reasonable argument for the inertia over key sizes?

I don't think so. The push for constant-time implementations is fairly new, whereas the key size issue has a long history. What I hear all the time in practice is speed and backwards compatibility.
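For what it's worth, the structure the grandparent comment alludes to does exist: a Montgomery-ladder exponentiation performs the same length and shape of operation sequence for every exponent of a fixed bit length, which is what a fixed key size makes easy to guarantee. A pure-Python sketch of the shape only (CPython branches and bignums are not actually constant-time):

```python
def ladder_pow(g, x, p, bits):
    """Montgomery-ladder modular exponentiation. For a fixed `bits`,
    every iteration does exactly one multiply and one square,
    regardless of the key bits; only the operand roles swap."""
    r0, r1 = 1, g % p
    for i in reversed(range(bits)):
        if (x >> i) & 1:
            r0, r1 = (r0 * r1) % p, (r1 * r1) % p
        else:
            r0, r1 = (r0 * r0) % p, (r0 * r1) % p
    return r0

# Sanity check against the built-in modexp (17 bits covers 123456).
assert ladder_pow(3, 123456, 1000003, 17) == pow(3, 123456, 1000003)
print("ladder matches pow()")
```

Supporting arbitrary key sizes doesn't preclude this; the ladder just runs for however many bits the parameters have.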

Hopefully now OpenSSH (and the SSH RFCs) will drop "diffie-hellman-group1-sha1" from the default list of allowed KexAlgorithms.

