
Socat: “the hard coded 1024 bit DH p parameter was not prime” - mrb
http://www.openwall.com/lists/oss-security/2016/02/01/4
======
mrb
It irks me that in security advisories that fix a _possible_ backdoor—like
here—sometimes no root cause analysis is done or communicated to the public.
Who chose this parameter? Who wrote the code? Who committed it? So I did a
little sleuthing...

Here is the commit introducing the non-prime parameter (committed by Gerhard
Rieger who is the same socat developer who fixed the issue today):
[http://repo.or.cz/socat.git/commitdiff/281d1bd6515c2f0f8984f...](http://repo.or.cz/socat.git/commitdiff/281d1bd6515c2f0f8984fc168fb3d3b91c20bdc0)

The commit message reads: _" Socat did not work in FIPS mode because 1024
instead of 512 bit DH prime is required. Thanks to Zhigang Wang for reporting
and sending a patch."_ So a certain Zhigang Wang presumably chose this prime.
Who is he?

Apparently he is an Oracle employee involved with Xen and Socat. Here is a
message he wrote:
[http://bugs.xenproject.org/xen/bug/19](http://bugs.xenproject.org/xen/bug/19)

So why has Gerhard seemingly not asked Zhigang how he created the parameter?

~~~
matthewaveryusa
I'm pretty sure that when you generate a prime you're using the Miller–Rabin
primality test in which case you only probabilistically choose a prime.

In fact, the is_prime functions in openssl don't prove that a number is prime.
They only check that it is prime with an error probability of at most 2^-80.
I'm not sure what the implications are though.

See
[https://www.openssl.org/docs/manmaster/crypto/BN_generate_pr...](https://www.openssl.org/docs/manmaster/crypto/BN_generate_prime.html)
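
For readers who have not seen the test, here is a minimal Miller-Rabin sketch in Python. It is an illustration only, not OpenSSL's code; the function name `is_probable_prime` and the default of 40 rounds are assumptions chosen so that the standard worst-case bound of 4^-rounds lines up with the 2^-80 figure above.

```python
import random


def is_probable_prime(n: int, rounds: int = 40, rng=None) -> bool:
    """Miller-Rabin test.

    False means n is certainly composite. True means n survived
    `rounds` random bases; a composite survives one round with
    probability at most 1/4.
    """
    rng = rng or random.Random()
    if n < 2:
        return False
    # Cheap trial division also guarantees n >= 17 below.
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True
```

Note the asymmetry: a `False` answer is a proof of compositeness, while a `True` answer is only probabilistic.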

~~~
defen
2^-80 is an incomprehensibly tiny number. Malice or incompetence are both FAR
more likely.

~~~
makmanalp
You know, I never know what to make of that logic - what if that tiny
probability was exactly this one time? It's not like we saw it happen twice,
and it could happen at some point. To my gut it seems you can't really know
until you have other positive or negative observations.

I wonder if someone has compiled a list of very improbable events that have
been observed.

~~~
makmanalp
Wow - this has got to be my most downvoted comment. I can't edit the original
anymore, so here's my update:

I guess the harsh reaction came from the fact that I didn't define the scope
very well: My question wasn't in reference to M-R specifically, just in
general. I understand that in this case it makes sense to look at likelier
causes (see Sharlin's response).

My point was that it's interesting to look at what happens (or what our
reaction is) when very very very improbable events do happen. It seems weird
to go with the assumption that because something is extremely unlikely that it
won't happen.

When I roll a die 20 times, I get a particular arrangement of numbers. Given
the total number of arrangements possible, that particular arrangement is
extremely unlikely, yet I just got it.

A guy got struck by lightning 7 times
([https://en.wikipedia.org/wiki/Roy_Sullivan](https://en.wikipedia.org/wiki/Roy_Sullivan)).
The odds of any person getting struck by lightning are 1 in 10000. Seven times
in a row is 1 in 2^93. But then when you start drilling down, you see that
he's a park ranger, and that he's out while lightning happens, which makes the
probability that he'll get struck much higher.

If I had phrased the question to you asking what the likelihood of _any given
person_ in the world being struck seven times was, you could have calculated
the former and said 2^-93 is such a small probability that it's not worth
thinking about - and yet here is Roy Sullivan, so there's some sort of
conflict in my logic. What's wrong with the former calculation?

Why is it that for any given person the probability is 2^-93 but for Roy it's
somehow different, even though he is a "given person"? Is it that the 1 in
10000 number was wrong? But then if we look at all the people who never once
got struck, it seems about right. If we inflate that number to 1 in 100 to
make Roy likelier to get 7 in a row, then it seems everyone also should be
getting shocked more often at least once or twice.

Or maybe it's that somehow the probability changes when we have more
information and those two numbers and situations are not comparable on an
absolute scale. Maybe if you get hit twice then you're much likelier to get
hit again because you're probably in some dangerous location - but how was I
to know to factor this in? It seems that it's very much about how you
calculate the probability. Who knows what other hidden factors could be wildly
affecting the true value of the probability?

That also makes me think - is there even such a thing as the "true" or
inherent probability of an event happening?

edit: Or maybe it's the law of large numbers - given enough "trials" or in
this case lightning events with people around, even something with an absurdly
small probability is bound to happen eventually. But then why do we never
factor that in and always just call it a day with 10000^7?

~~~
Jach
You've touched upon an old conflict.
[http://lesswrong.com/lw/oj/probability_is_in_the_mind/](http://lesswrong.com/lw/oj/probability_is_in_the_mind/)

~~~
bonoboTP
There's no need to refer people to lesswrong for this. The interpretation of
probability as a subjective degree of knowledge/evidence is well-known. For
example: [http://plato.stanford.edu/entries/probability-interpret/](http://plato.stanford.edu/entries/probability-interpret/) or
[http://plato.stanford.edu/entries/epistemology-bayesian/](http://plato.stanford.edu/entries/epistemology-bayesian/)

No need to spam links to that cult site.

------
baby
Q: How does p not being a prime => backdoor?

A: p not being a prime means two things:

* subgroup confinement attacks (where you send a public key made with a fake generator g) should be able to take place if the code is weak -> this is because there must be low order subgroups.

* the generator g might not be of great order. This can be easily tested if you know how to factor p: the order of the multiplicative group (Zp)* is Euler's totient function of p. If you know the order of the multiplicative group then you have an algorithm to find the order of your generator: you try all the divisors of the group's order and take the smallest one that works.

Unfortunately, if you don't know how to factor p then you can't easily do
that.

Another question is: How can they know it's not a prime if they don't know the
factorization of p? We have efficient provable tests for that: they tell you
if p is prime or not and nothing else.
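
The order-finding procedure described above can be sketched in Python for toy moduli. This is an assumption-laden illustration: the helper names `euler_phi` and `element_order` are made up here, the modulus is passed as a known factorization `{prime: exponent}`, and the divisor search is a plain loop that only works for small numbers.

```python
from math import gcd


def euler_phi(factorization: dict) -> int:
    """Euler's totient of n, given n as {prime: exponent}."""
    phi = 1
    for prime, exp in factorization.items():
        phi *= prime ** (exp - 1) * (prime - 1)
    return phi


def element_order(g: int, factorization: dict) -> int:
    """Smallest d > 0 with g**d == 1 (mod n); d must divide phi(n)."""
    n = 1
    for prime, exp in factorization.items():
        n *= prime ** exp
    if gcd(g, n) != 1:
        raise ValueError("g must be invertible modulo n")
    phi = euler_phi(factorization)
    # Try the divisors of the group order in increasing order.
    for d in range(1, phi + 1):
        if phi % d == 0 and pow(g, d, n) == 1:
            return d
    raise AssertionError("unreachable: g**phi == 1 by Euler's theorem")


print(element_order(5, {23: 1}))        # prime modulus 23
print(element_order(2, {5: 1, 7: 1}))   # composite modulus 35
```

With the composite modulus 35 the group has 24 elements but no element has order 24, which is one concrete way a non-prime p can leave g with a smaller order than expected.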

~~~
schoen
> We have efficient provable tests for that: they tell you if p is prime or
> not and nothing else.

I think this is a typo -- the efficient general-form tests are _probable_
rather than _provable_.

[https://en.wikipedia.org/wiki/Primality_test#Probabilistic_t...](https://en.wikipedia.org/wiki/Primality_test#Probabilistic_tests)

~~~
openasocket
There are exact tests that run in polynomial time, like
[https://en.wikipedia.org/wiki/AKS_primality_test](https://en.wikipedia.org/wiki/AKS_primality_test)

~~~
schoen
I hate to say that a polynomial-time test isn't efficient because many people
use that as the very _definition_ of efficient, but my understanding is that
AKS is incredibly impractical because the polynomial is ginormous, even though
its asymptotic behavior is nice. So if you actually wanted to know if, say, a
1024-bit number was definitely prime, you wouldn't be able to run AKS on it in
a "reasonable time" on a real computer.

~~~
baby
I haven't actually studied to what extent the AKS tests are do-able. I always
figured there would be no problem running one for a 1024-bit prime. Found
this on SO: [http://cs.stackexchange.com/questions/23260/when-is-the-aks-...](http://cs.stackexchange.com/questions/23260/when-is-the-aks-primality-test-actually-faster-than-other-tests)

Also, to further the discussion on probable vs provable: the probable tests
are enough in our case because they tell us _provably_ if an integer is not a
prime (which is what we care about), but only _probably_ if an integer is a
prime (which we don't care about here).

~~~
taejo
This is not conclusive, but the best deterministic primality test (an AKS
variant by Pomerance and Lenstra) is 6th power. 1024^6 is quite large.

------
swalsh
I always wonder if when things like this get found, there's someone in the NSA
going "Wow they finally found it, only took them x years"

~~~
phkahler
> I always wonder if when things like this get found, there's someone in the
> NSA going "Wow they finally found it, only took them x years"

I would guess they were aware of the problem whether they created it or not.
So yes.

~~~
duaneb
> I would guess they were aware of the problem whether they created it or not.

While it's smart to assume so, it's also pretty laughable to think that the
NSA has an exhaustive list of encryption vulnerabilities.

~~~
tankenmate
> While it's smart to assume so, it's also pretty laughable to think that the
> NSA has an exhaustive list of encryption vulnerabilities.

True, but in this case I'm sure they have enough hardware to factor any widely
deployed primes used in crypto or semi crypto comms software. After all, that
is half of the NSA's job description.

~~~
Lozzer
I expect you have enough hardware to factor widely deployed primes. Compound
numbers might be a different story.

------
oxguy3
Title is misleading -- this appears to be an issue with a tool called socat,
not with OpenSSL. That's a world of a difference in the severity of the issue.

~~~
yeukhon
I found the first line confusing and misleading until I read yours.

> In the OpenSSL address implementation the hard coded 1024 bit DH p parameter
> was not prime.

It should have been worded "In Socat, the DH p parameter used by the OpenSSL
implementation was hardcoded and was not a prime."

~~~
LukeShu
The original phrasing is correct, but confusing if you aren't familiar with
socat. Socat fundamentally works by giving it a pair of addresses; they are
referring to socat's implementation of addresses starting with "OPENSSL:"
(caps-insensitive), AKA "OpenSSL addresses".

~~~
yeukhon
I see. You are right, and yes I may have jumped the gun.

------
agwa
This is a vulnerability in socat's TLS support. It has nothing to do with
OpenSSL (besides the fact that OpenSSL provided a footgun API by leaving it to
application developers to supply DH parameters).

~~~
ChuckMcM
"Footgun API" - I like that!

It's an interesting challenge though, I wonder if the person who picked the
constant the first time understood the ramifications of it being prime or not.
And if they did, how hard they worked to validate its primality.

------
madars
Very nice asymmetric backdoor!

If you happen to know the factorization and the factors are not too large
(e.g. two 500-bit factors + some chaff), then you can just use Pohlig-Hellman
algorithm to solve the DLP modulo each individual factor, combine the results
and recover the shared Diffie-Hellman secret.

But without this trapdoor information (and, say, if p was chosen to be a Blum
integer), computing the Diffie-Hellman shared secret is as hard as factoring
that modulus (see
[https://crypto.stanford.edu/~dabo/abstracts/DHfact.html](https://crypto.stanford.edu/~dabo/abstracts/DHfact.html)).
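
To make the "solve modulo each factor, then combine" step concrete, here is a toy Python sketch. It is not a full Pohlig-Hellman implementation: brute force stands in for the per-factor discrete-log solver, the modulus is a product of two tiny primes chosen for illustration, and every name (`dlog_and_order`, `combine`, `recover_exponent`, the constants) is hypothetical.

```python
from math import gcd


def dlog_and_order(g, h, p):
    """Brute-force x with g**x == h (mod p), plus the order of g mod p.

    Assumes gcd(g, p) == 1 and that h is a power of g modulo p.
    """
    g %= p
    h %= p
    acc, x, found = 1, 0, None
    while True:
        if found is None and acc == h:
            found = x
        acc = acc * g % p
        x += 1
        if acc == 1:          # g**x == 1, so x is the order of g
            return found, x


def combine(a1, m1, a2, m2):
    """Smallest x >= 0 with x == a1 (mod m1) and x == a2 (mod m2)."""
    lcm = m1 // gcd(m1, m2) * m2
    for x in range(a1, lcm, m1):
        if x % m2 == a2:
            return x
    raise ValueError("inconsistent congruences")


def recover_exponent(g, h, p, q):
    """Recover an exponent for h = g**secret (mod p*q) from the factors."""
    xp, order_p = dlog_and_order(g, h, p)
    xq, order_q = dlog_and_order(g, h, q)
    return combine(xp, order_p, xq, order_q)


# Toy parameters: the "backdoored" modulus N = P * Q with P, Q secret.
P, Q = 1019, 1031
N = P * Q
G = 2
alice_secret, bob_secret = 123456, 654321
A = pow(G, alice_secret, N)   # public value sent by Alice
B = pow(G, bob_secret, N)     # public value sent by Bob

# Whoever knows P and Q recovers a working exponent from A alone...
recovered = recover_exponent(G, A, P, Q)
# ...and with it the shared secret, without either private key.
shared = pow(B, recovered, N)
print(shared == pow(A, bob_secret, N))
```

The recovered exponent need not equal Alice's secret; it only has to agree with it modulo the order of G, which is enough to reproduce the shared value.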

------
mabbo
Someone should really write a set of unit tests available in every conceivable
language, marked as "Please copy this unit test into your test base, and use
it to verify all your primes are prime".

You can even make it a bit fuzzy- Miller-Rabin uses random numbers, right? So
make it that every time the unit test is run it generates new random values.
Your test won't be deterministic, but it will fail at least some of the time
which should be enough to raise an alarm of a problem.

~~~
digler999
I'm aware that factoring a composite into primes is one of the most difficult
computational problems, but isn't it cheap to determine if a number is prime?
a^(p-1) == 1 mod p (where == means "congruent to"). With modular
exponentiation isn't it simple to compute the modulus?

~~~
cyphar
The little Fermat test requires you to test all "a"s in order to remove false
positives. For most numbers, testing a few thousand bases is enough but you
have Carmichael numbers that break the test.

There are better primality checks, but they all have downsides (either slow
but provable or fast but probabilistic). Finding prime numbers can be shown to
be easier than finding factors (see: how GIMPS checks for primality), but that
doesn't make it easy.
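
A short Python sketch of the Fermat check and of how a Carmichael number defeats it. The helper name `fermat_passes` is made up for this illustration; 561 = 3 * 11 * 17 is the smallest Carmichael number.

```python
from math import gcd


def fermat_passes(n: int, a: int) -> bool:
    """True if a**(n-1) == 1 (mod n). A False result proves n composite."""
    return pow(a, n - 1, n) == 1


print(fermat_passes(97, 2))   # 97 is prime, so every base passes
print(fermat_passes(91, 2))   # 91 = 7 * 13, base 2 exposes it
print(fermat_passes(91, 3))   # but base 3 is fooled by 91

# 561 is composite, yet every base coprime to it passes.
coprime_bases = [a for a in range(2, 561) if gcd(a, 561) == 1]
print(all(fermat_passes(561, a) for a in coprime_bases))
```

So a failed Fermat check is a cheap certificate of compositeness, but a passed one proves nothing, which is why tests such as Miller-Rabin are used instead.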

------
wanderfowl
Nice catch. Now, the real question is who committed it, and how they came up
with the number.

If it is a backdoor, it's pretty smart, because it's very deniable as a
"stupid mistake". And if it's a stupid mistake, it's extra stupid, because
this committer will have trouble convincing the world that that's all it was.
At the very least, somebody needs to be going through all this person's
commits with a fine-toothed comb.

The methods of handling situations like this in the face of a known threat are
going to be interesting. You hate to ban or hinder a good programmer from a
project, but once possibly-bitten, twice shy.

~~~
jcoffland
Aren't large non-primes usually created by multiplying two large but smaller
primes together? Factoring is then the challenge. Or is there more to this
that I'm missing?

~~~
wanderfowl
The question is whether they:

a) knew it was non-prime, and used it to weaken the crypto

b) knew it was non-prime, and used it because they didn't think it needed to
be prime (which is a massive sin of ignorance)

c) grabbed 1024 bits of rand() and didn't check if it was prime (again,
stupid)

d) grabbed some rand and checked the prime-ness using a bad method

e) used a "prime number generator" that produced bad output

I agree that making non-prime numbers is not terribly difficult, but the
question of how they got the number is only interesting in that it gives info
about why.

~~~
caf
f) Used a machine with bad RAM that flipped a bit.

This case could actually be tested for - see if any of the one-bit differences
from the number used are prime.
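
A sketch of that test in Python, shown on a toy value. The function names are made up for illustration, and trial division is only a stand-in: for a 1024-bit number you would pass a probabilistic test (e.g. Miller-Rabin) as the `is_prime` argument instead.

```python
def is_prime_trial(n: int) -> bool:
    """Trial division; only suitable for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def prime_single_bit_flips(n: int, bits: int, is_prime=is_prime_trial):
    """Return every value that differs from n in exactly one of the
    low `bits` bit positions and passes the supplied primality check."""
    return [n ^ (1 << k) for k in range(bits) if is_prime(n ^ (1 << k))]


# 15 = 0b1111: flipping one bit gives 14, 13, 11 or 7.
print(prime_single_bit_flips(15, 4))
```

An empty result for the real parameter would rule out a single flipped bit in the stored number itself, though not a flip elsewhere in the process.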

~~~
mortenlarsen
A bit could have been flipped in the software or the result of the function
(true/false), but in the number itself there appear to be no single bit flips
that make it prime (at least in the binary representation).

Edit: A single bit flip could have been used as "semi" plausible deniability
in the case of malicious intent.

------
natch
>there is no indication of how these parameters were chosen

Is there really no protocol used in projects undertaken by the security
community that would ensure that each component of the tools we rely on has a
known history?

~~~
gcommer
Well, socat has a git repo at git://repo.or.cz/socat.git so you can see the
history of the prime number. In particular, it was upgraded from a 512 bit
prime to a 1024 bit "prime" in commit 281d1bd on Jan 23, 2015.

Neither the code then nor the new code in this patch has comments indicating
how the prime was generated. (It is only mentioned in the advisory that
openssl dhparams was used for the recent patch)

------
foota
Might be an interesting exercise to search through github for large numeric
constants and find ones that aren't prime, then manually check through those
for ones that are supposed to be.

------
bryogenic
This is the reason RFC 5114 exists.

[http://tools.ietf.org/html/rfc5114](http://tools.ietf.org/html/rfc5114)

------
kachnuv_ocasek
Has anyone checked the new number is a prime?

------
schoen
Some community-vetted larger DH parameters that software developers can use:

[https://datatracker.ietf.org/doc/draft-ietf-tls-negotiated-f...](https://datatracker.ietf.org/doc/draft-ietf-tls-negotiated-ff-dhe/)

------
archgoon
Previously:

    
    
        static unsigned char dh1024_p[] = {
          0xCC,0x17,0xF2,0xDC,0x96,0xDF,0x59,0xA4,0x46,0xC5,0x3E,0x0E,
          0xB8,0x26,0x55,0x0C,0xE3,0x88,0xC1,0xCE,0xA7,0xBC,0xB3,0xBF,
          0x16,0x94,0xD8,0xA9,0x45,0xA2,0xCE,0xA9,0x5B,0x22,0x25,0x5F,
          0x92,0x59,0x94,0x1C,0x22,0xBF,0xCB,0xC8,0xC8,0x57,0xCB,0xBF,
          0xBC,0x0E,0xE8,0x40,0xF9,0x87,0x03,0xBF,0x60,0x9B,0x08,0xC6,
          0x8E,0x99,0xC6,0x05,0xFC,0x00,0xD6,0x6D,0x90,0xA8,0xF5,0xF8,
          0xD3,0x8D,0x43,0xC8,0x8F,0x7A,0xBD,0xBB,0x28,0xAC,0x04,0x69,
          0x4A,0x0B,0x86,0x73,0x37,0xF0,0x6D,0x4F,0x04,0xF6,0xF5,0xAF,
          0xBF,0xAB,0x8E,0xCE,0x75,0x53,0x4D,0x7F,0x7D,0x17,0x78,0x0E,
          0x12,0x46,0x4A,0xAF,0x95,0x99,0xEF,0xBC,0xA6,0xC5,0x41,0x77,
          0x43,0x7A,0xB9,0xEC,0x8E,0x07,0x3C,0x6D,
        };
    
      $ echo 'isprime(143319364394905942617148968085785991039146683740268996579566827015580969124702493833109074343879894586653465192222251909074832038151585448034731101690454685781999248641772509287801359980318348021809541131200479989220793925941518568143721972993251823166164933334796625008174851430377966394594186901123322297453)' | gp -q
      0
    

With fix

    
    
        xio-openssl.c
        static unsigned char dh2048_p[] = {
          0x00,0xdc,0x21,0x64,0x56,0xbd,0x9c,0xb2,0xac,0xbe,0xc9,0x98,0xef,0x95,0x3e,
          0x26,0xfa,0xb5,0x57,0xbc,0xd9,0xe6,0x75,0xc0,0x43,0xa2,0x1c,0x7a,0x85,0xdf,
          0x34,0xab,0x57,0xa8,0xf6,0xbc,0xf6,0x84,0x7d,0x05,0x69,0x04,0x83,0x4c,0xd5,
          0x56,0xd3,0x85,0x09,0x0a,0x08,0xff,0xb5,0x37,0xa1,0xa3,0x8a,0x37,0x04,0x46,
          0xd2,0x93,0x31,0x96,0xf4,0xe4,0x0d,0x9f,0xbd,0x3e,0x7f,0x9e,0x4d,0xaf,0x08,
          0xe2,0xe8,0x03,0x94,0x73,0xc4,0xdc,0x06,0x87,0xbb,0x6d,0xae,0x66,0x2d,0x18,
          0x1f,0xd8,0x47,0x06,0x5c,0xcf,0x8a,0xb5,0x00,0x51,0x57,0x9b,0xea,0x1e,0xd8,
          0xdb,0x8e,0x3c,0x1f,0xd3,0x2f,0xba,0x1f,0x5f,0x3d,0x15,0xc1,0x3b,0x2c,0x82,
          0x42,0xc8,0x8c,0x87,0x79,0x5b,0x38,0x86,0x3a,0xeb,0xfd,0x81,0xa9,0xba,0xf7,
          0x26,0x5b,0x93,0xc5,0x3e,0x03,0x30,0x4b,0x00,0x5c,0xb6,0x23,0x3e,0xea,0x94,
          0xc3,0xb4,0x71,0xc7,0x6e,0x64,0x3b,0xf8,0x92,0x65,0xad,0x60,0x6c,0xd4,0x7b,
          0xa9,0x67,0x26,0x04,0xa8,0x0a,0xb2,0x06,0xeb,0xe0,0x7d,0x90,0xdd,0xdd,0xf5,
          0xcf,0xb4,0x11,0x7c,0xab,0xc1,0xa3,0x84,0xbe,0x27,0x77,0xc7,0xde,0x20,0x57,
          0x66,0x47,0xa7,0x35,0xfe,0x0d,0x6a,0x1c,0x52,0xb8,0x58,0xbf,0x26,0x33,0x81,
          0x5e,0xb7,0xa9,0xc0,0xee,0x58,0x11,0x74,0x86,0x19,0x08,0x89,0x1c,0x37,0x0d,
          0x52,0x47,0x70,0x75,0x8b,0xa8,0x8b,0x30,0x11,0x71,0x36,0x62,0xf0,0x73,0x41,
          0xee,0x34,0x9d,0x0a,0x2b,0x67,0x4e,0x6a,0xa3,0xe2,0x99,0x92,0x1b,0xf5,0x32,
          0x73,0x63
        };
    
      $ echo 'isprime(27788893276069724796504555675597658900595616769773727063231875314156885361379100133264804184710789407128574011804155595735704837674243828066040543912171576627544718762752948158991754559261759162739343094515270757451837630913502740443023902769553802723685440839891240497710460941757089246131322686180648463540974702859210630184042730717698427486397505787974799692901205514386555272667298045803284972074823213104807295638814082142694729938965663710648170010420323923305528998108799706139846097432481556448740855888110797022123731105964852194684036975049177742094726795060211226322344210328442014189175085444396370522979)' | gp -q
      1
    

The original error could have been checked with a quick code review verifying
that the provided number was in fact prime. This should have been done when
the original patch was submitted.
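
For anyone without gp installed, here is a rough Python equivalent of the first check. The hex string is my transcription of the old dh1024_p array above (an assumption worth re-checking against the source), and a single base-2 Fermat check is swapped in for gp's isprime: if it fails, the number is certainly composite, while a pass would prove nothing.

```python
# Transcription of the old dh1024_p byte array, most significant byte first.
DH1024_P_HEX = (
    "CC17F2DC96DF59A446C53E0E"
    "B826550CE388C1CEA7BCB3BF"
    "1694D8A945A2CEA95B22255F"
    "9259941C22BFCBC8C857CBBF"
    "BC0EE840F98703BF609B08C6"
    "8E99C605FC00D66D90A8F5F8"
    "D38D43C88F7ABDBB28AC0469"
    "4A0B867337F06D4F04F6F5AF"
    "BFAB8ECE75534D7F7D17780E"
    "12464AAF9599EFBCA6C54177"
    "437AB9EC8E073C6D"
)

dh1024_p_bytes = bytes.fromhex(DH1024_P_HEX)
# OpenSSL reads such arrays as big-endian integers.
p = int.from_bytes(dh1024_p_bytes, "big")

print(len(dh1024_p_bytes), "bytes,", p.bit_length(), "bits")
print(p)  # decimal form, comparable with the argument given to isprime()

# Base-2 Fermat check: a result other than 1 proves p is composite.
certainly_composite = pow(2, p - 1, p) != 1
print("certainly composite:", certainly_composite)
```

A check like this takes milliseconds, so it is cheap enough to run in a unit test or during review of any patch that adds a hard-coded DH modulus.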

~~~
CamperBob2
Kind of interesting that the first nonprime version ended in a comma, as if it
originated as a fragment of a longer array of bytes. A trailing comma is legal
in a C initializer, but I'd expect whatever tool(s) generate those arrays to
know not to add a comma after the final entry.

~~~
bigiain
<cynical thought>That's a nice piece of plausible deniability, huh? I wonder
if the committer has a few extra bytes that he can claim were supposed to be
there which make a number that passes all the primality tests? That'd be a
nice excuse for the NSA/3PLA overlords to have given him...

~~~
cperciva
Extra bytes would have made it not a 1024-bit integer.

~~~
gertef
Which provides an excuse for why the prime was truncated, and OOPS made it
into a not-prime.

------
jasonjayr
That looks like it's for 'socat', not openssl itself ...

------
chris_wot
I'm curious, is there a list of known primes held somewhere?

~~~
nicolas314
Pi would be a good source, but they are in the wrong order.

~~~
cyphar
Pi would be a horrible source. Why would you want to use a deterministic digit
generation function to generate your entropy. Even if you always used very
large digit offsets. I can't imagine it being a remotely good idea.

~~~
Filligree
As a source of primes? OP never said he wanted them to be random, and pi does
contain every (finite) number. :P

~~~
im2w1l
That is not known to be the case, only conjectured.

~~~
Natsu
I thought it was proven to be transcendental? Or were you saying that this
property of transcendental numbers is only conjectured?

~~~
tzs
Pi is transcendental, which means that it is not a root of a non-zero
polynomial with rational coefficients.

Being transcendental does not imply that a number's expansion in a given base
must include every digit string. Consider the number 1/10^1! + 1/10^2! +
1/10^3! + 1/10^4! + ....

This number, whose decimal expansion is 0.110001000000000000000001... is
transcendental (proven by Liouville in 1844). Its decimal expansion clearly
does not contain every decimal number. It only contains the digits 0 and 1,
and after the first two places never even contains consecutive 1s.
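
As a quick illustration, the digits of that number can be generated in Python. The function name `liouville_digits` is made up here; it simply places a 1 at every decimal position that is a factorial and a 0 everywhere else.

```python
from math import factorial


def liouville_digits(count: int) -> str:
    """First `count` decimal digits (after the point) of
    1/10^1! + 1/10^2! + 1/10^3! + ..."""
    ones = set()
    k = 1
    while factorial(k) <= count:
        ones.add(factorial(k))   # digit position k! holds a 1
        k += 1
    return "".join("1" if pos in ones else "0" for pos in range(1, count + 1))


digits = liouville_digits(24)
print("0." + digits)
```

Since the gaps between factorials only grow, no digit string containing, say, a 2 or three consecutive 1s can ever appear.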

It is known that "almost all" real numbers do in fact contain in their base b
expansion every finite sequence of base b digits, each sequence occurring with
the same limiting frequency as every other sequence of the same length. These
are called "normal" numbers. Very few interesting numbers (where "interesting"
means that we have some reason to be interested in them aside from their
normality) are known to be normal, though.

~~~
chris_wot
Sorry to go off topic here, but can you give an example of a number that isn't
interesting?

~~~
tzs
By "interesting" I mean a number that arises out of something else that one
might be interested in.

For instance, consider pi. If you are interested in geometry, pi will turn up.
If you are interested in number theory, pi will turn up (e.g., it is
connected to zeta functions). If you are interested in probability and
statistics, pi will turn up. If you are interested in differential equations,
pi will come to the party.

If you somehow have never encountered pi, I can convey it to you by telling
you about one of those things. For instance, I could tell you that it is the
period of the non-zero solutions of the differential equation y'' + y = 0.

An uninteresting number would be one that has no known connection to other
things. If I have a particular uninteresting number, and I want to convey it
to you, I'll have to just tell you the number.

A random number would almost certainly be uninteresting, such as this hex
fraction, which came from /dev/urandom on my computer:
0.bfdab557104bf2d8952fb1ea0adfd732794a353d5b35d95cda927f4ad8f6dd11f11b2e968298.
It is extremely unlikely that anyone has ever seen that number before. The
only known thing interesting about it is that it was made specifically as an
example of a number that is not otherwise interesting.

------
droithomme
It's interesting that in all this time no one noticed it was not actually
prime. This lends weight to skepticism about assumptions that widely used
security critical open source code is reviewed by anyone competent at
evaluating it, even over long periods of time.

------
Mojah
Mirror here: [http://marc.ttias.be/oss-security/2016-02/msg00003.php](http://marc.ttias.be/oss-security/2016-02/msg00003.php)

------
cyphar
I think most people are glossing over the first part of the title. Why is the
DH p parameter hardcoded? Why not just generate one on each startup?

~~~
Ded7xSEoPKYNsDd
Generating those parameters is very slow. Just run
`time openssl dhparam -text -noout 1024` a few times and see for yourself.

~~~
cyphar
Then do it on first run and store it in ~/.socat_prime and do a primality
check each time it's loaded.

------
yyin
The older version I was using, 1.7.2.4, is not affected. So much for
"updates".

