GnuTLS considered harmful (2008) (openldap.org)
185 points by calpaterson on Mar 5, 2014 | 115 comments

The annoying thing about GnuTLS is that it probably would not be very widely used, except that the Debian project initiated a huge push to make software linkable with GnuTLS instead of OpenSSL, because of issues with the OpenSSL license[1]. So if you're a Debian or Ubuntu user, you're probably relying on GnuTLS a lot more than users of any other distribution, or people who compile the upstream sources themselves. (Not that OpenSSL is a panacea, but at least it gets more attention than GnuTLS.)

[1] The OpenSSL license is incompatible with the GPL, making it technically illegal to distribute binaries of GPL programs linked with OpenSSL (so Debian refuses to do so), unless the GPL program has an OpenSSL license exception.

I don't understand why NSS[1] isn't more highly regarded. It's the crypto library that both Chrome and Firefox use, and it has a comparatively good security record[2].

[1] https://developer.mozilla.org/en-US/docs/NSS [2] http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=nss

NSS was designed for use in single-user browsers and the API is completely unsuitable for more general, multi-user system-wide deployment.


RedHat has misguidedly chosen to base all of its security infrastructure on NSS, even though NSS was never designed for such a use. It is completely inappropriate for servers, multi-user workstations, etc.

It's well enough regarded to be used by Java, Flash, Networkmanager, and Libreoffice... The API is very low level, though, when all you want is a secure socket.


Some critical (to me) dependency chains on my system (Arch, KDE) seem to be

    Networkmanager → libsoup → glib-networking → GnuTLS
    Eclipse → WebkitGTK → libsoup ...
    VLC → FFmpeg → GnuTLS
    kdelibs (core KDE libs) → upower → libimobiledevice → GnuTLS
I have a feeling there are greater dependencies within GNOME distros and, as you said, Debian. Networkmanager is an especially annoying one because it uses NSS directly and GnuTLS indirectly.

If any Debian user is curious they can use the following commands to see gnutls dependents:

  $ apt-cache rdepends libgnutls27 libgnutls26 libcurl3-gnutls

  $ apt-cache --installed rdepends libgnutls27 libgnutls26 libcurl3-gnutls

The latter invocation displays only packages that are installed locally; the former prints all packages apt is aware of.

While some usage comparison between GnuTLS and OpenSSL is good, it is important to remember that each project supports different features. For example, OpenSSL does not support OpenPGP.

The problem of linking GPL code with OpenSSL was one of the reasons why Ruby is dual-licensed today, afaik.

Among others. Ruby was GPL/Ruby-licensed and is now licensed BSD/Ruby, which makes shipping OpenSSL easier.

Actually, I don't think that's generally true if you're talking about the libs. A cursory glance at the output of `apt-cache rdepends libgnutls26` suggests that wget is the only relatively "popular" package depending on it.

However, some other distros such as Arch do not have wget depending on it, so you do have a point about Debian.

The irony of what's happening here, that dogmatism about a belief is causing an inferior solution to be used, is infuriating and one of the reasons people have such a problem with dogmatic personalities like rms. It's technically illegal to use a better solution because of something as relatively unimportant as a license. Think of it like Maslow's hierarchy: having strong security is way more important to most people than having a proper copyleft license. But instead of being pragmatic, we're stuck with a ridiculously dogmatic solution that ends up harming way more than the ill it was trying to cure.

It reminds me a lot of environmentalists going crazy to ban nuclear power in the 70s before we had as clear a grasp on the impact of dumping carbon dioxide into the air.

> The irony of what's happening here, that dogmatism about a belief is causing an inferior solution to be used, is infuriating and one of the reasons people have such a problem with dogmatic personalities like rms. It's technically illegal to use a better solution because of something as relatively unimportant as a license.

Why do you jump to blame the GPL and rms, when one could just as easily fault the OpenSSL authors for using the 4-clause BSD instead of the far more common 3-clause?

> It's technically illegal to use a better solution because of something as relatively unimportant as a license.

No, it is technically illegal to distribute compiled binaries that use OpenSSL, because the OpenSSL authors wanted to retain the advertising privileges. But it is not illegal to use the software as long as it is distributed in source and compiled by the end user.

I would not call licensing unimportant. As long as software is copyrightable, licensing terms are highly important.

I believe the OpenSSL team uses the 4 clause BSD license because they rely on SSLeay, which uses the 4 clause license. And if they have to advertise the SSLeay name, they might as well advertise the OpenSSL name as well.

The thing is, relicensing isn't likely to happen any time soon, regardless of what RMS says.

To be precise, SSLeay was discontinued when its authors were hired by RSA. They're now under non-competes and couldn't change the license if they wanted to.

How would a noncompete agreement stop them from relicensing a work they did before signing the agreement?

Noncompetes are generally illegal and get thrown out if it ever goes to court, although I understand why people don't want the hassle.

The reason the GPL is annoying is that free licenses with an advertising clause have existed for a very long time and are actually widely used. A quick look at the about box of various software will usually show you a long list of mandatory acknowledgements for various open source licenses.

The problem is that the GPL willingly refuses to permit advertising clauses. Is there a cogent argument about why an advertising clause is a limitation of freedom? More often than other free licenses, the GPL puts restrictions on the use of diversely licensed software. It is an impediment. And, as we see, it has real-world consequences. There is more risk to freedom in using software with bad security than in yielding to innocuous clauses.

> The problem is that the GPL willingly refuses to permit advertising clauses. Is there a cogent argument about why an advertising clause is a limitation of freedom?

The advertising clause is not a limitation on freedom. The 4-clause BSD license is a free software license; it just happens not to be compatible with the GPL (not all free software licenses are).

The reasons for this are very practical: not only does it place additional restrictions on the software (which is not permitted by the GPL), but if multiple 4-clause BSD projects are used, each project requires its own separate advertising statement (the 4-clause license does not permit combining these into a single sentence): https://www.gnu.org/philosophy/bsd.html

> The reason the GPL is annoying is that free licenses with an advertising clause have existed for a very long time and are actually widely used.

Most modern projects using permissive licenses use 3-clause BSD, MIT/X11, or Apache, all of which are compatible with the GPL. In this day and age, choosing a 4-clause BSD license is a fairly conscious decision to make the project incompatible with the GPL.

Choosing the 4-clause BSD license is a conscious decision to continue to receive credit for all your hard work, when a proprietary software company comes along and includes your code in their product. To me this is a fair compromise for proprietary companies who refuse to open up their source code (i.e., would never touch GPL at all).

As I mention in a reply to the sibling comment, I don't fault the developer for choosing a free software license that suits their purposes. I just don't think it's fair to blame the GPL for the incompatibility that happens when a developer chooses a 4-clause BSD license.

(Also, remember that the developer could always dual-license - ie, "GPL or 4-clause BSD - if you want to use my software in proprietary code, then you have to advertise me").

It's a fair compromise for anyone. Being credited for your own work isn't as evil as RMS thinks (arguably somewhat ironic as he wants the FSF to be credited with Linux).

Huh. This is the first licensing-related thread I've read on HN in months where someone said something I found interesting and informative before I gave up and stopped reading.

Thanks for your even-keeled comments here; helpful and refreshing.

> In this day and age, choosing a 4-clause BSD license is a fairly conscious decision to make the project incompatible with the GPL.

Complaining about this seems a bit strange, since GPL is deliberately incompatible with everything else when it comes to sharing. OpenSSL's license, although kooky, is freer than the GPL in terms of who can use the stuff covered by it.

> Complaining about this seems a bit strange,

For the record, I'm not complaining. I'm just saying that it's unfair to blame the incompatibility solely on the GPL (as OP seemed to be), when the developer is the one who chooses the license for their software. (And I presume the OpenSSL authors are experienced enough to be familiar with the compatibility differences between the 3-clause vs. 4-clause BSD license).

> OpenSSL's license, although kooky, is freer than the GPL in terms of who can use the stuff covered by it.

No, both are equally free. Both of them respect the four freedoms, so they are both free licenses.

(The 4-clause BSD is arguably more permissive, but on the other hand, the GPL permits one to advertise the software without any restrictions, so it really depends on which of those two one values more. Generally the copyleft clause is what people care about more than advertising, but it's important to note both).

> A quick look at the about box of various software will usually show you a long list of mandatory acknowledgements for various open source licenses.

You have mistaken what "advertizing clause" means. The GPL requires that the about box list the copyright holders, so that can't be the types of advertising at issue.

No, the complaint is about:

   * 3. All advertising materials mentioning features or use of this
   *    software must display the following acknowledgment:
   *    "This product includes software developed by the OpenSSL Project
   *    for use in the OpenSSL Toolkit. (http://www.openssl.org/)"
If you have software which uses OpenSSL, and to promote it you send out a tweet, then the license requires you to include the above two lines in the tweet.

In practice, a project might have 20 such advertising requirements. It gets boring.

An advertising clause can be used as a weapon. Suppose I distribute "free" software to you, but require you to include a 100 page manifesto every time you make an advertisement. Is that really "free"?

"If you have software which uses OpenSSL, and to promote it you send out a tweet, then the license requires you to include the above two lines in the tweet."

No. If you have software that uses OpenSSL, and to promote it you send out a tweet that says "Use our product instead of our competitors'. We use SSL to make things secure", then you must include the above two lines.

For the clause to apply: 1. it has to be an advertisement; 2. it has to advertise the features that use OpenSSL.

hyc_symas gave essentially the same correction in a parallel post, a few minutes before you.

I pointed out that the edge cases are fuzzier than I would like. If my product is called "SecureTalk", and uses OpenSSL for secure connections, then it sounds like almost any mention of the name which might be advertising needs to include that line.

As in, "Secure Systems, the developers of the NSA-proof SecureTalk, are hiring."

Isn't that "mentioning features" of OpenSSL? If so, it needs that line. If not, why not? What does it mean to mention a feature? Can I get away with

"Secure Systems, the developers of SecureTalk, are hiring."

After all, the only reason it's secure is because it uses OpenSSL.

In this case we can consider this requirement to be a public service. Suggesting that someone believes that an app is secure because it uses OpenSSL is a somewhat common form of mockery in crypto circles. If you just announce that you are clueless about security then no one needs to bother looking at your website in the off chance that you aren't.

I didn't say it was secure "because it uses OpenSSL". I said the much more limited "and it uses OpenSSL for secure connections."

Copyright is sticky. The hypothetical "SecureTalk" program might only use 500 lines of OpenSSL, where that 500 lines was security audited by crypto experts, static code checkers, and formal program analysis, and run in a chroot'ed jail.

A clueful re-use of OpenSSL for secure connections still needs that advertising clause, even if the software really is more secure than anything else out there. In that case, the required advertisement is a false clue to experts, no?

I would say if you make the claim that the security is from more than just the use of OpenSSL then there would be no need to put in the OpenSSL notice when just talking generically about security. You might still need to if you specifically mention encrypted connections, say, if you are using OpenSSL to encrypt connections. The advertising clause can still be annoying, but I don't think it is quite as bad as you are making it out to be. At least when there are only one or two projects you are using that require them... I think the main reason they are less popular now is that it gets really awkward when you need pages and pages of advertisement clauses.

I also doubt that anything that uses OpenSSL as the primary crypto could possibly be "more secure than anything else out there". This isn't so much a slam of OpenSSL, which may overall be doing a better job of implementing TLS than anything else available right now (at least open source) but of TLS in general which is complex and not designed with current best practices. Using TLS is often an easy way to make things a lot more secure than they are without much effort and as such is often a good choice, but it is unlikely to result in the most secure thing possible. OTR is a well known alternative in chat that has a number of advantages (and some disadvantages too). Various others are under construction. Importantly, there are significant tradeoffs involved and it is often not a simple matter of X is more secure than Y.

Neither you nor I have the legal experience to really determine if there is no need. What constitutes an "advertisement"? If I am a security consultant and I develop a no-cost open source tool using OpenSSL, and I do it deliberately as a way to get my name out into the field and find clients, then is that advertising?

What constitutes "mentioning features of this software"? If I use another package for SSL and advertise that my software has SSL support, but have OpenSSL in my code for other reasons (let's say, the SHA-1 digest code), then do I need to mention OpenSSL? After all, SSL is a supposed feature of OpenSSL.

No, it's not as bad as I make it out to be, but that's in large part because we are generally lazy when it comes to the particulars of licenses. Just look at the number of GPLv2 software distributions which don't follow the letter of the license. (Section 3 assumes physical distribution, not network. GPLv3 clarified this problem.)

It's also because license holders are lazy. Enforcing the GPL takes a lot of time and effort. Many violations occur because few actively enforce the license.

If your expectations are based on what people do in a lazy world, then you are perhaps a realist (or a cynic), but it still violates the license.

The "pages and pages of advertisement clauses" affect only those who actually follow the license. These might be nitpickers like me, or organizations with lots of money that are easy pickings and worried about liability.

These also happen to be the people who are likely to give acknowledgements, especially when the license so requires it (as the GPL does).

Not quite. If you tweet and brag about SSL or crypto support, then you must credit OpenSSL. If you brag about something that is not a feature derived from/dependent on OpenSSL, then the clause is irrelevant.

So corrected.

If my project is "SecureTalk" with the tag line "the NSA will never know", and it's secure because of OpenSSL, then will I have to mention that text every time I use the word "SecureTalk" in a tweet/ advertisement?

What about "HushTalk"? "MumsTheWord"? "SafeBanking"?

If I add optional rot-13 encryption, so there are now two cryptosystems, then can I pretend that SecureTalk doesn't "really" require OpenSSL, so I don't need the advertising?

> The problem is that the GPL willingly refuses to permit advertising clauses. Is there a cogent argument about why an advertising clause is a limitation of freedom?

The GPL doesn't specifically set out to prevent advertising clauses. It is a side-effect of being incompatible with "other restrictions" - for example, a requirement that you license some third party software or patent in order to redistribute GPL-covered code. Instead of trying to specifically enumerate and disallow all such restrictions that someone might come up with, which is a fool's errand, the GPL disallows any other restrictions.

> the GPL disallows any other restrictions

As a minor quibble, section 7 of GPLv3 allows a few other restrictions. That is, there's a general blacklist, as you say, with a specific whitelist of what additional restrictions are allowed.

For example, "b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it;"

Well, GPL is far from the most free license anyway (that would probably be the WTFPL, MIT license, or 2-clause BSD license). GPL is arguably a restrictive license, albeit not one that seeks to prevent copying.

Where is the irony? Even if I substitute Alanis Morissette's definition for the dictionary definition I can not identify any irony. Technical superiority was never the primary goal of the Free Software Movement.

I also don't understand the environmental anecdote. That seems less about dogma and more about imperfect scientific knowledge. Were the environmentalists opposing nuclear energy on principle or because at the time the evidence made nuclear power look unsafe and detrimental to the health of the environment?

Rev. Dr. King had this to say about pragmatism: http://www.africa.upenn.edu/Articles_Gen/Letter_Birmingham.h...

It's not dogmatism. Debian just has a conservative interpretation of the law and the conditions of the GPL. It's fair to point out that Debian's interpretation doesn't seem to be very widely held outside the project, but believing "we should obey the law" doesn't count as dogmatism.

> The irony of what's happening here, that dogmatism about a belief is causing an inferior solution to be used

How is that irony? I don't think rms or anyone who pushes for Copyleft does so because they believe it always results in a superior solution.

I had the same response. Whenever I see "irony/ironic" I make a conscious effort to not use the dictionary definition and give the author a lot of semantic leeway. After I read the comment for the third time I still could not identify any irony.

The irony is actually that GnuTLS is panned for 'not doing it right' in the article, and here you are panning Debian for 'doing it right' when it comes to licensing. Debian is following licenses as they should be followed and not cutting corners.

I think the real lesson here is not to write your own license but to use well-known ones. There are lots of permissive licenses that are also compatible with the GPL.

4-clause BSD is actually the original BSD license, even though it's not very common nowadays. According to Wikipedia, it was first used in 1990 or before, so around the same time as the GPL v1 (1989). It's certainly not self-written.


The principle of "pick from among the most common" still holds, though, of which drafting your own license is the most extreme violation.

Why aren't posts such as the parent comment simply killed by the moderators?

There is a BSD vs GPL discussion about once every week on HN. Out of those several hundred threads and thousands of comments, has a single user been convinced about the preference of either license type? Has a single person said "Oh, sorry, I will now change my opinion and use your license of choice because your arguments are so good"?

Hate or love RMS, but can you keep it in your pants and do it elsewhere?

Man, if you thought using GPL software is painful now, you should have tried things 20 years ago.

Most software was pretty painful in 1994...

It's a shame this is being upvoted so highly when it's factually incorrect. A rebuttal can be found here: http://nmav.gnutls.org/2011/05/is-really-gnutls-considered-h...

It's rather silly that the news of a critical bug in GnuTLS that was caused by a goto somehow makes non-news and factually wrong information from 5 years ago popular.

It addresses the use of the libc str* functions but not the first point about the prototype of set_subject_alt_name.

It's also discussed later in the thread: http://www.openldap.org/lists/openldap-devel/200802/msg00100...

> You note that there's really a small number of instances of strcat() in the code. That's true, but that's because you've provided your own _gnutls_str_cat() function instead, which is also heavily used. Assuming that strlen() isn't going to SEGV on you (which depends on dumb luck) this becomes just a question of efficiency.

I also think that in the rebuttal the example is extremely poorly chosen since the code is equivalent to the much simpler

    char str[256] = "PKIX1.CRLDistributionPoints.?1.distributionPoint.fullName";
Assuming you even need str to be 256 char long, otherwise you would use 'char str[] = "...";' or 'const char *str = ' if you don't modify the string.
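
A small sketch of the three alternatives mentioned here, using the string from the rebuttal. The variable names are made up for illustration, and this says nothing about what the real code needed:

```c
/* Fixed-size buffer, filled in by the compiler; unused bytes are zero. */
static char str_fixed[256] =
    "PKIX1.CRLDistributionPoints.?1.distributionPoint.fullName";

/* Array sized by the compiler to fit the literal plus its terminator. */
static char str_sized[] =
    "PKIX1.CRLDistributionPoints.?1.distributionPoint.fullName";

/* Read-only pointer, for when the string is never modified. */
static const char *str_const =
    "PKIX1.CRLDistributionPoints.?1.distributionPoint.fullName";
```

None of these needs a runtime strcpy()/strcat() call, so there is no length arithmetic to get wrong.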

Maybe the use of the concatenation is legitimate in the real code but I cannot judge that since it appears to have changed since the article was written:


No strcat in there. Maybe it wasn't such a good idea after all? :)


Actually, I dug into the git repo to find the old code and looked to revert to a commit around the date the blog post was written. Obviously I don't intend to get any work done this afternoon. I found this commit on the same day (2011/05/10):


"eliminated last instances of strcpy() and strcat() to keep pendantics happy."

So the reason he says the problem is not here anymore is because he fixed it just before writing this blog post, more than 3 years after the openldap rant. I'm sure nmav was well-intentioned but it does weaken his rebuttal somewhat.

> A rebuttal can be found here

Only that this rebuttal completely misses the point. They misunderstood the criticism as being about buffer overflow vulnerabilities:

>> So what is the issue? Howard claims that GnuTLS makes liberal use of strcpy(), strcat() and strlen(). Those functions are known to be responsible for several attacks via buffer overflows in current programs.

while it was in fact about the nature of the data to be processed, namely that it may not be NUL-terminated strings but arbitrary binary data, for which the whole bunch of `str…` functions, and any other string processing that expects to operate on NUL-terminated strings, will fail miserably:

> Looking across more of their APIs, I see that the code makes liberal use of strlen and strcat, when it needs to be using counted-length data blobs everywhere. In short, the code is fundamentally broken; most of its external and internal APIs are incapable of passing binary data without mangling it. The code is completely unsafe for handling binary data, and yet the nature of TLS processing is almost entirely dependent on secure handling of binary data.

That rebuttal doesn't address the more important claim that the library uses NUL-terminated strings for potentially arbitrary binary data.

Not that this affects your point, but the critical bug was not caused by a goto. Rather, it was caused by a mismatch in return value semantics, where a variable was used to store a value where 0 meant success, and then later used to return a value where non-zero meant success.
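
The mismatch can be sketched roughly as follows. This is not the GnuTLS code; `parse_field`, `check_buggy` and `check_fixed` are hypothetical names, assumed only to show how one variable can end up carrying two conventions:

```c
#include <stddef.h>

/* Convention A: returns 0 on success, a negative error code on failure. */
static int parse_field(const unsigned char *buf, size_t len)
{
    return (buf != NULL && len > 0) ? 0 : -1;
}

/* Convention B: callers expect non-zero for "valid" and 0 for "invalid".
 * BUG: on a parse failure, ret still holds the negative error code,
 * which is non-zero and therefore reads as "valid" to the caller. */
static int check_buggy(const unsigned char *buf, size_t len)
{
    int ret = parse_field(buf, len);   /* 0 = success, negative = error */
    if (ret == 0)
        ret = 1;                       /* success: report "valid" */
    return ret;
}

/* Fixed: translate explicitly between the two conventions. */
static int check_fixed(const unsigned char *buf, size_t len)
{
    if (parse_field(buf, len) < 0)
        return 0;
    return 1;
}
```

A caller writing `if (check_buggy(...))` accepts the failed parse; with `check_fixed` it does not.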

If you dive into the exchange following http://www.openldap.org/lists/openldap-devel/200802/msg00100... you'll see that the criticism does not only allude to the liberal use of strcat() and strcpy() but also raises more fundamental problems with the general quality and efficiency of the code. While I do agree that digging up a 5 year old rant with a catchy tagline does not tell us everything about the project's current state, the rebuttal also misses some of the points brought up in the initial criticism.

There was an extensive discussion between Howard and nmav on that blog post a few months ago. nmav has completely deleted that discussion from the blog because he didn't like the fact that additional problems with the GnuTLS code base were pointed out in that discussion. It is quite interesting that someone who is so tied into an open source project is against keeping a public discussion available to the world.

Misremembered, the discussion was on G+ https://plus.google.com/112912252727709520367/posts/RGBXrLTh...

I was a bit curious about this quote:

"It turns out that their corresponding set_subject_alt_name() API only takes a char * pointer as input, without a corresponding length. As such, this API will only work for string-form alternative names, and will typically break with IP addresses and other alternatives."

Yes, an API designed for strings will break if you pass it a struct in_addr or something, but it should be fine with a dotted-decimal string, right?

The issue is that they designed an API taking a NUL-terminated string in the first place, when it should have been something more generic. They knew little enough of X.509 that they didn't bother to handle every case.

My understanding of RFC 3280 is pretty old, but the relevant ASN.1 type describing a subjectAltName seems to be:

    SubjectAltName ::= GeneralNames

    GeneralNames ::= SEQUENCE SIZE (1..MAX) OF GeneralName

    GeneralName ::= CHOICE {
        otherName                 [0] AnotherName,
        rfc822Name                [1] IA5String,
        dNSName                   [2] IA5String,
        x400Address               [3] ORAddress,
        directoryName             [4] Name,
        ediPartyName              [5] EDIPartyName,
        uniformResourceIdentifier [6] IA5String,
        iPAddress                 [7] OCTET STRING,
        registeredID              [8] OBJECT IDENTIFIER }

The IP address case is represented as an octet string, and the octet 0 is legitimate, making their API broken...
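
A minimal sketch of why, assuming a made-up helper `alt_name_len_as_string` that stands in for any interface taking only a `char *` (it is not a GnuTLS function). The address 10.0.0.8 is used because two of its octets are zero:

```c
#include <stddef.h>
#include <string.h>

/* The iPAddress form of 10.0.0.8: four raw octets, two of them zero. */
static const unsigned char ip_octets[4] = { 0x0a, 0x00, 0x00, 0x08 };

/* All that an API taking only a char pointer can learn about its input:
 * it has to look for a terminating NUL to find the length. */
static size_t alt_name_len_as_string(const char *name)
{
    return strlen(name);
}
```

Passing `(const char *)ip_octets` gives a length of 1, although the value is 4 octets long. An address with no zero octet would be worse, since strlen() would then read past the end of the buffer.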

That's the X.509 certificate format, right? It's not a code interface.

My point was that it's not reasonable to expect an interface that appears to be accepting a string to also accept random bytes; "10.0.0.8" isn't the same as 0x0a000008.

Maybe he means a binary representation of an IP address (4 bytes for an IPv4). In this case serializing would break.

Yes, lots of APIs that take char * will break if you pass them some arbitrary other type of data that you assume they will handle.

How about if they are implementing a public standard that states this other arbitrary type of data is valid?

It is a logical leap to assume that because the spec says other types of data are valid that means you should be able to pass arbitrary data to this function that is documented as requiring a 0 terminated string. Let's say you wanted to pass an IPv4 address. Would you expect to pass it a uint32_t pointer? A struct in_addr pointer? Host or network byte order?

Upvotes have many reasons. Sometimes it's to find out what the discussion will reveal (my own motivation here).

The rebuttal completely missed the point. https://plus.google.com/112912252727709520367/posts/RGBXrLTh...

Exactly. I came to say the same thing.

Money quote from the message thread: "I can't even find the words to express how gross this is." (http://www.openldap.org/lists/openldap-devel/200802/msg00100...) Interestingly, the conversation stays somewhat civil even after that quote, looks like professionals at work :)

Any time I see code like:

      if ((len_len + (int) strlen (str)) <= max_len)
with ints, I immediately start worrying about integer overflow leading to buffer overflow. Nobody seems to have mentioned this, though.
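
For comparison, a hedged sketch of the same kind of bounds check done entirely in size_t. The function and parameter names (`fits`, `used`, `max`) are hypothetical, and this does not show that the quoted line is actually exploitable:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `used` bytes plus the length of `str` fit in `max` bytes,
 * 0 otherwise. Nothing is cast to int, and the comparison is written as
 * a subtraction that cannot wrap because used <= max is checked first. */
static int fits(size_t used, const char *str, size_t max)
{
    if (used > max)
        return 0;
    return strlen(str) <= max - used;
}
```

The point of the rearrangement is that `used + strlen(str)` is never computed, so there is no sum that could overflow.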

The quote seems to simply be complaining about the repeated use of the subexpression "strlen(str)", by implication because it's needless and inefficient.

Except that it's not. At least on glibc, strlen() is declared "pure" to the compiler and (unless otherwise defeated by pointer aliasing) repeated calls will be optimized away.

That's not to say that this is the best way to write the code, but the concern seems poorly informed.

> (unless otherwise defeated by pointer aliasing)

In practice calling any other non-pure function, which the compiler can't see into, will defeat this optimisation, whether there's aliasing within your function or not.

It's not an efficiency concern, it's a security concern.

Common subexpression elimination isn't normally considered a security technique, so I guess I don't understand your point.

Maybe you're saying it's a "code smell" kind of thing and that being sloppy here indicates more subtle problems elsewhere? Which then hits the argument about whether this is really "sloppy" or just intentionally simple.

Shrug. My point was just that this needs better evidence. There is no demonstrated bug in the linked code, and the assertion that it is "gross" (OP) or "insecure" (you) seems poorly justified.

The basic concern is that strlen() shouldn't be used at all on the data that's passed to the given function, since that data may be binary and not, as the function assumes, a NUL-terminated string. The code is "sloppy" and the complete certificate handling seems to be sloppy. I don't want to be the judge of that, but if you read the whole post and the ensuing thread, the argument is made quite well and convincingly. And seriously, the last place I'd like to see a sloppy implementation that assumes that the given data is benevolent and does not contain a malicious payload is - guess what - a TLS library.

Yes yes yes, but the specific "money quote" above wasn't talking about that, it was talking about the number of redundant calls to strlen() in this one particular function. Extending it to mean something else by our implication makes it something rather different than a "money quote".

There's only so much an API can do with garbage input. If this function took a pointer and a length, that's not a magic fix; you could still pass it a bad pointer and/or a bad length.

The problem here is that binary input is valid according to the spec [1]. It's not malicious input in the sense that a programmer is using an interface deliberately wrong but rather in the sense that a counterparty could send you a non-garbage certificate that contains that data - which would be valid, but still break this code. That's not comparable to passing a bad pointer or a bad length.

[1] at least according to the post. The fact that gnutls added a binary interface later seems to support that reading.

Yes, the spec says the field can be string or binary. The API only handles string fields. The API should be (and was) updated with a function that handles binary fields, but there's nothing wrong with the original function.

If this was C++, and the function took a std::string, would you say it was horribly broken because you serialized a 4 byte IP address into a 4 byte std::string buffer and the function didn't handle it correctly?

Seriously, if the function deals with untrusted user input and pretends to conform to a spec, then weaseling around by saying that the function only partially conforms to the spec, will blow up in the user's face when passed other, spec-compliant input, and deferring all responsibility to the user: yes, that counts as horribly broken in my book (even though you were the one to introduce those words). That's what we have libraries for, so that me and you don't have to deal with this mess that x509 cert parsing is - I've seen enough of it that I know I don't want to go down this particular hell hole - and I've just been standing at the sideline and watching others wrestle with it.

Functions don't conform to specs; the API and library as a whole should conform to the spec. It's perfectly valid to have one function that supports strings and another function that supports binary data.

Did they have such a pair of functions?

They do now, and the first one is still using strlen(). Does the existence of the second function mean it's now ok for the first to use strlen()? You can still crash the program if you send binary data to the string function instead of the binary function.

So essentially it's an issue with C being weakly typed?

Probably helps that the message containing that quote also contained a handful of significant examples of poor coding choices along with suggestions for improvement.

No matter how you look at it, C is the wrong choice for security relevant systems software. But where is the alternative?

Java obviously, there are no security issues with java...

No wait! Ruby, that should be perfect, can't cause buffer overflow there....

Ermm.. wait, I got it, well, use Python!

I can't speak for Ruby, but in both Java and Python I've never seen a bug this embarrassingly bad. I mean, this bug simply would not have happened in a language with a working boolean type, or with exceptions (I trust the ZeroMQ guy's judgement about as far as I can throw him).
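For anyone unfamiliar with the class of bug this comment seems to be describing, here is a hypothetical sketch in C. It is not the actual GnuTLS code, and every name in it is invented. It shows how a function that mixes a yes/no answer with negative error codes gets misread by a caller that treats the result as a plain truth value.

```c
#include <stdbool.h>

#define ERR_PARSE (-1) /* hypothetical negative error code */

/* Hypothetical check: returns 1 if the issuer is a CA, 0 if it is
 * not, and a negative error code if the certificate failed to parse. */
static int check_is_ca(int parsed_ok, int ca_flag)
{
    if (!parsed_ok)
        return ERR_PARSE;
    return ca_flag ? 1 : 0;
}

/* Buggy caller: any nonzero int converts to true, so the parse
 * error (-1) is silently accepted as "yes, this is a CA". */
static bool trusted_buggy(int parsed_ok, int ca_flag)
{
    return check_is_ca(parsed_ok, ca_flag);
}

/* Fixed caller: only the exact success value counts as "yes". */
static bool trusted_fixed(int parsed_ok, int ca_flag)
{
    return check_is_ca(parsed_ok, ca_flag) == 1;
}
```

In this sketch a certificate that fails to parse is trusted by trusted_buggy() and rejected by trusted_fixed(). A language that throws on the parse failure, or that refuses to convert an int to a boolean implicitly, would not let the first version compile or run quietly.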

What do you think the Java runtime is written in? The Ruby runtime? The Python runtime?

Hint: it starts with a C.

If you build a house in a swamp, you are in trouble.

Wow, you're daft. Everything is, ultimately, getting turned into machine code. Not even Lisp Machines could get away from this fundamental fact: http://funcall.blogspot.com/2009/04/some-lisp-machine-minuti...

> Bogus objects — properly tagged words with invalid addresses that pointed at uninitialized memory or into the middle of object of a different type — which would cause the GC to corrupt memory would be left in registers or on the stack. These sort of problems were everywhere in the microcode.

Go build your cabin in the woods now.

Sure one can corrupt memory on a Lisp Machine. System software like the micro code or the garbage collector can contain bugs.

Something like a buffer overflow was very very rare. This was not a systemic problem like it is in C.

> The Python runtime?

Why, Python, of course![0]

[0] - http://pypy.org/

(I jest, your point is entirely valid)

I was making a joke....

If C is the wrong choice, what is the right choice?

In a couple years or so, my bet would be on Rust.

Java has several runtimes. The most popular one is Hotspot and written in C++.

You can write the Python runtime in something other than C:


And there are Java compilers and runtimes written in Java, and Lisp compilers written in Lisp, etc.

Actually C is an excellent choice for security relevant systems software because the issues for developing in C are well understood and can easily be mitigated by following 30 years worth of best practice patterns and using the correct development tools.

The issue is developers are not using the tools or following the best practices because they think they know better than 30 years worth of experience or get caught up in bikeshedding about ideology, licenses and which line the curly braces go on.

"Actually C is an excellent choice for security relevant systems software because the issues for developing in C are well understood and can easily be mitigated by following 30 years worth of best practice patterns and using the correct development tools."

Never mind the copious undefined behavior, the fact that C programmers sometimes struggle to figure out what a valid C expression actually does, the fact that C programmers have to choose between code bloat and using "goto" for finalization, the fact that there are no standard error handling constructs, the fact that strings are null terminated, the lack of a standardized way to determine array lengths at runtime, etc., etc., etc. Even something as simple as this:

int f(int x, int y) { return x + y; }

Can lead to undefined behavior in C:


Basically C should be at the bottom of the list of languages that programmers choose for cryptography or security software.
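To spell out the signed-addition point: x + y on two ints is undefined behavior when the mathematical result does not fit in an int (for example INT_MAX + 1). A minimal sketch of one portable way to guard against it follows; the function name checked_add is made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores x + y in *out and returns true, or returns false without
 * touching *out if the sum would overflow the range of int.
 * The range test happens before the addition, so the overflowing
 * addition itself is never evaluated. */
static bool checked_add(int x, int y, int *out)
{
    if ((y > 0 && x > INT_MAX - y) || (y < 0 && x < INT_MIN - y))
        return false;
    *out = x + y;
    return true;
}
```

The check has to come first: testing the result after the addition is too late, because the compiler is allowed to assume the overflow never happened.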

Can you link to the 30 years of best practice document? I must have missed it.

Another good read (it probably does not reflect how you want to write C code, and the rule about dynamic allocation is probably extreme if you are not writing code to fly spaceships, but I think it is good to read regardless): http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf

C and C++ are specifically discouraged as programming languages in "Cryptography Engineering" (Niels Ferguson, Bruce Schneier, Tadayoshi Kohno, 2010).

Rust seems like it might be a good choice in the not too distant future.

A brand new language is the best choice for security software?

Preferably something generated from a formal proof tool. It wouldn't be perfect (nothing is), but the mistakes would be less stupid. Which is a big jump!

There's a bit of research in this, but I don't know of anything that's ready to use.

Well, there's Coq, which (per my understanding) may or may not make the cut depending on how you define "ready to use".

I know Coq as a tool to help generate proofs. Does it generate code? Either way, it's not a verified implementation of TLS.

For that, we'd need: (1) A set of properties to verify for an implementation (2) An implementation to verify

Looking at http://coq.inria.fr/related-tools, there may be some related tools that could do the job. Still, there's a research project there.

People absolutely use Coq to generate code. I don't know what else is involved.

A few substantiating links from my search results for "coq generating code":




Oh great! Thanks. Upvoted.

In a few years, Go. Its native crypto libraries are impressive already, though not thoroughly vetted. There's nothing wrong with C though, as long as it is carefully written. The biggest asset is that every popular language can link to it.

Any updates to this 6-year-old post? I would hope that, with systems like Debian forcing a move to GnuTLS from OpenSSL (for licensing reasons), it would have since received more care.

I really don't see the point of GnuTLS. Sure, if you want to directly integrate OpenSSL with GPL code, you can't, but as it's a library, you should be using it normally anyway. OpenSSL is far more widely used, has a longer history, and I would say is better tested and understood.

For the morbidly curious, ITS#5361 which triggered the above email: http://www.openldap.org/its/index.cgi/Incoming?id=5361

Other gems - ITS#5991 http://www.openldap.org/its/index.cgi/Software%20Bugs?id=599... which required us to hack up a workaround. It also triggered ITS#5992 http://www.openldap.org/its/index.cgi/Software%20Bugs?id=599...

Discussion of the GnuTLS bug is summarized here https://www.debian-administration.org/users/dkg/weblog/42

And people still wonder that GnuTLS certificate verification bugs continue to surface?

Fixing the strlen()s is easy but the binary stuff is harder. Assuming it hasn't already been done.

If you're focusing on the strlen() you're missing the forest for the trees. The problem is someone who knew nothing about security or good programming practices decided to write a security library and somehow convinced the community at large to trust his code. Everyone's a beginner at some point but no sane person trusts their system security to code written by someone so demonstrably incompetent, and no honest beginner would attempt such an undertaking and then advertise it as production-ready or secure.

The fact that there are still certificate validation bugs in GnuTLS today indicates that the GnuTLS developers still haven't learned the essentials of X.509 certificates. Even with a rapidly deployed fix for this most recent CVE, you'd be a fool to rely on GnuTLS for anything. The code and the developers have proven themselves not to be trustworthy. Multiple times.
