The Heartbleed Bug (heartbleed.com)
1768 points by tptacek on April 7, 2014 | 528 comments

There was a discussion here a few years ago (https://news.ycombinator.com/item?id=2686580) about memory vulnerabilities in C. Some people tried to argue back then that various protections offered by modern OSs and runtimes, such as address space randomization, and the availability of tools like Valgrind for finding memory access bugs, mitigate this. I really recommend re-reading that discussion.

My opinion, then and now, is that C and other languages without memory checks are unsuitable for writing secure code. Plainly unsuitable. They need to be restricted to writing a small core system, preferably small enough that it can be checked using formal (proof-based) methods, and all the rest, including all application logic, should be written using managed code (such as C#, Java, or whatever - I have no preference).

This vulnerability is the result of yet another missing bound check. It wasn't discovered by Valgrind or some such tool, since it is not normally triggered - it needs to be triggered maliciously or by a testing protocol which is smart enough to look for it (a very difficult thing to do, as I explained on the original thread).

The fact is that no programmer is good enough to write code which is free from such vulnerabilities. Programmers are, after all, trained and skilled in following the logic of their program. But in languages without bounds checks, that logic can fall away as the computer starts reading or executing raw memory, which is no longer connected to specific variables or lines of code in your program. All non-bounds-checked languages expose multiple levels of the computer to the program, and you are kidding yourself if you think you can handle this better than the OpenSSL team.

We can't end all bugs in software, but we can plug this seemingly endless source of bugs which has been affecting the Internet since the Morris worm. It has now cost us a two-year window in which 70% of our internet traffic was potentially exposed. It will cost us more before we manage to end it.

From a quick reading of the TLS heartbeat RFC and the patched code, here's my understanding of the cause of the bug.

TLS heartbeat consists of a request packet including a payload; the other side reads and sends a response containing the same payload (plus some other padding).

In the code that handles TLS heartbeat requests, the payload size is read from the packet controlled by the attacker:

  n2s(p, payload);
  pl = p;
Here, p is a pointer to the request packet, and payload is the expected length of the payload (read as a 16-bit short integer: this is the origin of the 64K limit per request).

pl is the pointer to the actual payload in the request packet.
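For anyone unfamiliar with the macro: n2s reads a 16-bit big-endian ("network order") value and advances the pointer by two bytes. A minimal sketch of the same operation as a plain function follows; the name read_u16_be is made up for illustration and is not part of OpenSSL.

```c
#include <stdint.h>

/* Hypothetical helper, not an OpenSSL function: reads a 16-bit
 * big-endian value from *pp and advances *pp past it, which is
 * what n2s(p, payload) does to p and payload.
 * Like the macro, it performs no bounds check of its own. */
static uint16_t read_u16_be(const unsigned char **pp)
{
    const unsigned char *p = *pp;
    uint16_t v = (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);

    *pp = p + 2;
    return v;
}
```

The largest value this can return is 65535, which is where the roughly 64K limit per request comes from.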

Then the response packet is constructed:

  /* Enter response type, length and copy payload */
  *bp++ = TLS1_HB_RESPONSE;
  s2n(payload, bp);
  memcpy(bp, pl, payload);
The payload length is stored into the destination packet, and then the payload is copied from the source packet pl to the destination packet bp.

The bug is that the payload length is never actually checked against the size of the request packet. Therefore, an attacker who sends an arbitrary payload length (up to 64K) together with an undersized payload can make the memcpy() read whatever data lies beyond the storage location of the request.
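A minimal sketch of the kind of check that was missing, under stated assumptions: the function name heartbeat_reply, the flat rec/out buffers, and the constant names are illustrative and are not OpenSSL's actual code. The layout (1 type byte, 2 length bytes, payload, at least 16 bytes of padding) and the rule that oversized requests are silently discarded follow the heartbeat RFC.

```c
#include <stddef.h>
#include <string.h>

#define HB_REQUEST     1
#define HB_RESPONSE    2
#define HB_MIN_PADDING 16

/* Hypothetical handler. rec/rec_len is the record that actually
 * arrived; out/out_cap is the response buffer. Returns the number
 * of bytes written to out, or 0 if the request is discarded. */
static size_t heartbeat_reply(const unsigned char *rec, size_t rec_len,
                              unsigned char *out, size_t out_cap)
{
    size_t payload, total;

    /* Must hold at least type + length + minimum padding. */
    if (rec_len < 1 + 2 + HB_MIN_PADDING)
        return 0;
    if (rec[0] != HB_REQUEST)
        return 0;

    payload = ((size_t)rec[1] << 8) | rec[2];
    total = 1 + 2 + payload + HB_MIN_PADDING;

    /* The missing check: the claimed payload length must fit
     * inside the record that was really received. */
    if (total > rec_len || total > out_cap)
        return 0;

    out[0] = HB_RESPONSE;
    out[1] = rec[1];
    out[2] = rec[2];
    memcpy(out + 3, rec + 3, payload);
    /* Zero padding keeps the sketch short; the RFC asks for random padding. */
    memset(out + 3 + payload, 0, HB_MIN_PADDING);
    return total;
}
```

With this check, a request that claims 65535 bytes of payload but carries only one byte is dropped instead of being answered with adjacent memory.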

I find it hard to believe that the OpenSSL code does not have any better abstraction for handling streams of bytes; if the packets were represented as a (pointer, length) pair with simple wrapper functions to copy from one stream to another, this bug could have been avoided. C makes this sort of bug easy to write, but careful API design would make it much harder to do by accident.
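To make the (pointer, length) idea concrete, here is a rough sketch in C. The struct span type and both functions are hypothetical names invented for this example; nothing here is taken from OpenSSL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical (pointer, length) pair; not an OpenSSL type. */
struct span {
    unsigned char *data;
    size_t len;
};

/* Copies n bytes from src to dst only if both spans hold at least
 * n bytes. Returns 1 on success, 0 (copying nothing) otherwise. */
static int span_copy(struct span dst, struct span src, size_t n)
{
    if (n > src.len || n > dst.len)
        return 0;
    if (n > 0)
        memcpy(dst.data, src.data, n);
    return 1;
}

/* Returns the part of s that starts at offset, or an empty span
 * if offset lies beyond the end of s. */
static struct span span_skip(struct span s, size_t offset)
{
    struct span r = { NULL, 0 };

    if (offset <= s.len) {
        r.data = s.data + offset;
        r.len = s.len - offset;
    }
    return r;
}
```

If the heartbeat code had copied through something like span_copy, an attacker-supplied length larger than the received packet would have produced a failed copy rather than an over-read. The cost is one comparison per copy.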

It is indeed astonishing how simple-minded this bug is. But these bugs come in all levels of complexity, from simple overstuffed buffers to logical ping-pong that hurts your brain when you try to follow it. We need to get rid of them once and for all. If the whole world can't use a certain tool effectively, then the whole world isn't broken; the tool is bad.

Machine level languages like C and C++ aren't necessarily bad tools, even in their current states. However, I agree that they might be bad tools for the purpose of writing security libraries.

They are not bad tools, but not the best either. If you spend mental stamina on trivial things, you have less for the important ones, the ones a compiler cannot check.

This kind of tool (SSL) should be written in Ada or Haskell.

Why not Go, or JavaScript? I'm sorry, but specifying which language should be used is petty.

C and C++ are just fine, the fact that the OpenSSL guys cocked it up is not the language's fault, it is theirs. There are efficient ways to prevent this type of bug.

What are the efficient ways of preventing this kind of bug, if not type systems?

The parent had a good point and you should really try to look at Haskell before you say that kind of nonsense.

All the tools that are available for static analysis are basically extra type systems bolted on top of existing languages.

If you try to detect buffer overflows using static analysis of the Linux kernel, what you need to do is go through the source code and define invariants. Those invariants are TYPES in languages powerful enough to express them.

For example the invariant that memory, or any resource allocated must be freed can be expressed in Haskell.

In C++ it cannot be expressed. There are workarounds like RAII, but that does not give any guarantees.

If you do not think type systems and thus languages make any differences, you also cannot believe that formal verification makes any difference, because type systems are a weak form of formal verification. How "weak" depends on the language.

You should also read up on the Curry-Howard correspondence to learn something about the deep connections between types, programs, and proofs.


JavaScript would be terrible, because it's easy to hide unwanted behaviour in counter-intuitive corners of the language.

Besides the language peculiarities, a garbage collected or interpreted language is very vulnerable to side channel attacks because of the large amount of complicated behaviour that is being glossed over by the language runtime. (One example would be garbage collection rounds and timing attacks, but I'm sure smarter people would find tons of features that leak secret information. Another example is on-demand JIT'ing when code becomes hot in certain runtimes. The timing of such a JIT stall could publish information you thought secure.)

C and AsmJS are just as open to side channel attacks. AsmJS is safe, C is unsafe.

I'd take javascript over C any day.

Guarding against side-channel attacks in any language is hard. Guarding against them in Javascript is probably impossible. Whether you would take Javascript over C is irrelevant. It would still be a terrible choice for a security framework. Perhaps modern system languages, such as D or Go might be suitable.

I've felt that C makes this code easy to write because it makes doing the right thing hard. What you are describing is just a lot of work in C, compared to a language with something akin to Java's generics, which are in turn an afterthought in the ML family of languages. What we're asking for is not that complicated from a PL standpoint. A generic streams library?

Economics plays an invisible part here. Someone writing a library has a limited amount of time to implement some set of features, and to balance that against other needs, like making the code "clean"/pretty and secure. In this case, pretty code and secure code are akin. Consumers would likewise have to balance out feature needs with how likely the code is going to explode. What it comes down to is that you aren't likely to have secure, stable code in a language that doesn't inherently encourage it.

It starts to be clearer then, that the more modern, "prettier" languages offer material benefits in their efforts to be more elegant.

That's what I like about Ruby. ;)

Even in C, Go or Python, I column align any text that is remotely similar, so differences are obvious.

Clean code might be extra work, but the net work (maintenance) should amortize to less. Reducing the cognitive load of a large, supportable production codebase cannot be emphasized enough.

There's no "just use X" type of answer in security.

Sep 2013

"All versions of the open source Ruby on Rails Web application framework released in the past six years have a critical vulnerability that an attacker could exploit to execute arbitrary code, steal information from databases and crash servers."


Nov 2013

"A lingering security issue in Ruby on Rails..."


Dec 2013

"Ruby on Rails security updates patch XSS, DoS vulnerabilities"


Ruby != Rails. We do a lot of ruby, but practically no rails.

C != OpenSSL. Some [1] would argue OpenSSL is not representative at all what C can do. Maybe you should check out Redis for beauty [2] and joy [3].

On the same note C != C++ either and you can write large systems in C++ without ever using memory allocation. You can use only bounds checked functions.

And you can have large security holes if you're not careful, no matter which language you pick.

[1] http://news.ycombinator.com/item?id=7556407

[2] http://johnpwood.net/2012/07/18/the-beauty-of-redis/

[3] http://news.ycombinator.com/item?id=2275413

But rails is written in ruby. So it can't have security bugs, right?

Observing that Ruby eliminates entire classes of bugs doesn't mean that Ruby eliminates all bugs; just that your attack surface is smaller.

-smaller +different.

Sure it can and I'm not surprised it has. But if you're trying to point out flaws in ruby, at least use examples for flaws in ruby - not flaws in something written in ruby. It's not like web frameworks in other languages magically don't suffer from XSS injection attacks.

The issue at hand is a flaw in something written in C, though. I agree the point wasn't well made (is there any reason to think those errors would not have been made had the project been written in C?) but your objection isn't quite right.

The issue at hand is an error that is typical for C (unchecked out of bound memory access). It's a class of error that does usually not occur in other languages. The vulnerabilities in Rails were XSS vulnerabilities and an information leak - both classes of errors typically found in web application frameworks.

The first is an example of an error made more common by the language design, the other an example of errors typical for a class of applications. There's a fundamental difference here. There's a ton of reasons to criticize ruby and it brings its own set of flaws and problems, some rooted in the language and some rooted in its ecosystem - but the given examples just show that web applications are hard to get right. That's why this is not "a point not well made" but rather "sorry, you're attacking a strawman here".

I'm not arguing the other side. I think you are correct. I just think you needed to point to the reason the parallel construction didn't work.

There is a "just use X". If you code in a language where you can express the invariants in your code, and make the compiler check those invariants, then your code is immune to all of the vulnerabilities that we have seen in OpenSSL.

The fact that these languages don't automatically do all my system administration tasks for me is not an argument against using them.


Rails does plenty of "make life easier for the programmer" things that I would expect to increase the risk of security issues. Do you have those kind of problems for e.g. Haskell?

Haskell probably has/would have the same kind of problems, but finding examples will be a lot harder in the absence of a large, well-used web platform à la RoR.

If you worked at it, you could create this problem in Haskell. However, it is in fact the case that Haskell would be, in its own way, screaming at you; your configuration (or whatever) parser takes in some text and then returns something of type "IO Configuration"... what is that IO doing there? You don't have to be very skilled in Haskell to stop right there and have a serious think about what's going on. And in the absence of IO, or some other really obviously wrong type signature, there isn't much malicious stuff you can do in the parser layer. You could still have a vulnerability by doing something wrong when given certain configurations, but there's not much we can do about straight-up bugs. Even a proof language will let you make straight-up errors, they'll just force you to deeply, profoundly make the error instead of superficially make it... but we humans are up to the task!

No it simply does not, because the language forces you to write pure functions. The type system invites you to express invariants.

There are very fundamental connections between strong typing, program verification, and proofs.


Thus, the argument that Haskell probably has the same, is simply false.

There are large web platforms in Haskell. Yesod is probably the largest eco-system. It is clearly not as well used as RoR, but anyone can dig through large amounts of code to try to find these bugs.

What Haskell has that everyone else has are bugs/misunderstandings in how protocols are implemented. Sometimes there can be fundamental bugs in the run-time-system. However, large classes of bugs are fundamentally less likely to appear than in less safe languages.

Once you are doing functional programming a bunch of classes of problems including a bunch of classes of security problems go away.

For example, here, if the guarantee of functional programming is that a given input leads to a given output and has no memory side effects, then your attack surface area is a lot, lot smaller.

Remember - Rails is a framework for webapps. Haskell is a language. You should be comparing ruby to haskell.

Write everything in Coq.

Perhaps more reasonably: write the core of your application in Coq and have it expose a DSL for writing business logic atop this infallible core.

You get it, but the opinion that "C and C++ are fine for SSL, the OpenSSL guys just screwed up" is plain wrong.

This is a question of priorities. We have speed and security. If you choose C/C++ (non-existent automated checking of memory access) you are choosing speed first, security second.

If security is critical then you need to choose a language that makes array out of bounds access well nigh impossible. This is an easy problem -- we have languages that will give this to us.

What percentage of exploits in the wild come from array (and pointer) access out of bounds? I'd venture to say it is above 50%.

Rather than have programmers everywhere "try hard to be careful" writing this code, let them use a safer language and have a few really smart folk work on optimizing the compiler for said language to make the safety checks faster (e.g. removing provably unnecessary/redundant checks).

People think that choosing C/C++ has a better business case (i.e. better performance / scaling) because "being really careful" works most of the time. The problem is that when heartbleed (or the next array out of bounds access bug) hits, the business case's ROI no longer looks so much better than the safer path.

A better language won't eliminate all security holes but it can eliminate a huge class of them and allow engineers to focus the energy they used to spend on "being really careful about array access and pointers" on other tasks (be they security, performance or feature related).

EDIT: stating the obvious .. there are good uses for C style languages but writing large bodies of software that needs to be resistant to malicious user attacks is not one of them.

Thanks for this. How is this reading arbitrary memory locations though? Isn't this always reading what is near the pl? As in, can you really scan the entire process's memory range this way or just a small subset where malloc (or the stack, whichever this is) places pl?

The latter, and AFAIK the buffer doesn't get reallocated on every connection, so it should be unlikely that any private keys actually get dumped. However, I could be missing a way to exploit it.

Reading between the lines in the announcement, it sounds like dropping and reconnecting may cause it to read memory freed up from a prior connection. It may "just" be a matter of keep trying, or it may be a matter of opening lots of connections to consume resources, dropping them all, then connecting and seeing what was left on the beach after the tide went out.

BTW Amazon AWS/ELB is vulnerable, confirmed publicly by their support.

> It may "just" be a matter of keep trying

I gave this some thought earlier today, and expect that address space randomisation can make this bug eventually expose the server keys. You need to hit an address that has been just vacated from a (crashed) httpd worker.

Most implementations clear encryption key material on exit, but a crashed process never got to run that code.

In most systems, this will only work within the same process. Contemporary Unix kernels always allocate zeroed pages to processes, so it's impossible for a process to recover data from another unless there's a kernel bug.

If it just reads the up-to-64KB after that allocation, wouldn't you expect to see the server process segfault before too long?

Of course, servers helpfully just start themselves back up again.

As for scanning for key material, I wonder how to tell that 256-bit random data is the 256-bit random data you want.

When for instance an AES-key is being used by OpenSSL, it is put into a 'struct aes_key_st' which is not random at all but quite easily recognizable when scanning memory.

The Cold Boot attack paper by Halderman, Schoen et al. here


...discusses this in detail in chapter 6, Identifying Keys in Memory.

EDIT: fixed the reference

Well, one way is to brute iterate through every potential 256-bit string you dredge out of the canal against the known public key.

If you can dredge up 64kB of fresh data every time, that's 511,744 tests per shovelful which is quite a bit to sift through from a performance perspective but it's also a trivially parallel task.

Additionally, folk might know of even better ways to narrow that down. For example, the data representation in memory might have easy to grep for delimiters.

65505 tests as I work it out as it's unlikely to be non-byte aligned:-

256 bits is 32 bytes

If you get 64kB of payload data back each time then it can only contain 65536-31=65505 different consecutive strings of 32 bytes.
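The count above is the usual sliding-window formula, n - w + 1 byte-aligned windows of width w in n bytes. A tiny sketch, with a function name chosen only for this example:

```c
#include <stddef.h>

/* Number of distinct byte-aligned windows of width w in a buffer
 * of n bytes: n - w + 1, or 0 if the window does not fit. */
static size_t window_count(size_t n, size_t w)
{
    if (w == 0 || w > n)
        return 0;
    return n - w + 1;
}
```

window_count(65536, 32) gives 65505, matching the figure above.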

I successfully obtained the private key for my local Apache install this way once, though I'm having trouble getting anything reliable.

How many tries did it take to get the whole key?

This reminds me of what another programmer told me a long time ago when we were discussing C; "The problem with C is that people make terrible memory managers.". So true.

I agree that an abstraction for this seems to be missing, but I always have the feeling that what you're doing is covering holes in a leaking dam: you might get good at it, but you'll always have leaks.

I have always detested C (also C++) because it's so unreadable... the snippets of code you cite are just so dense ie. a function like n2s() gives pretty much no indication of what it does to a casual reader. Just reading the RFC (it is pretty much written in a C style) gives me the creeps.

The RFC doesn't mention why there has to be a payload, why the payload has to be random size, why they are doing an echo of this payload, or why there has to be padding after the payload. Nor is it clear whether this data is just a regular C struct, as the RFC makes it out to be (I didn't know you could have a struct with a variable size, but apparently the fields are really pointers, or it's just a mental model and not a real struct).

Apparently the purpose of the payload is path MTU discovery. Something that is supposed to happen at the IP layer, but I don't know enough about datagram packets. I guess an application may want to know about the MTU as well...

I'm not here to point fingers, I'm just saying C is a nightmare to me and a reason for me to never be involved with system programming or something like drafting RFC's ;-).

But if one can argue that C is a bad choice for writing this stuff, then that is not an isolated thing. "C" is also the language of the RFCs. "C" is also the mindset of the people doing that writing. After all, the language you speak determines how you think. It introduces concepts that become part of your mental models. I could give many examples, but that's not really the point.

And it's about style and what you give attention to. To me, that RFC is a really bad document. It starts explaining the requirements for exceptional scenarios (like when the payload is too big) before even having introduced and explained the main concepts and the hows and whys.

So while you may argue that this is a C problem and not a protocol problem, it is really all related.

And you may also say, in response to someone blaming these coders, that blame is inappropriate (and it is) because these are volunteers and they are donating their free time to something they find valuable. The whole distribution and burden of responsibility is, naturally, also part of the culture and how people self-organize and so on.

As someone else explained (https://news.ycombinator.com/item?id=7558394) the protocol is real bad but it is the result of more or less political limitations around submitting RFCs for approval. There is no reason for the payload in TLS (but apparently there is in DTLS) but my point is simply this:

If you are doing inelegant design this will spill over into inelegant implementation. And you're bound to end up with flaws.

Rather than trying to isolate the fault here or there, I would say this is a much larger cultural thing to become aware of.

This sort of argument is becoming something of a fashion statement amongst some security people. It's not a strictly wrong argument: writing code in languages that make screwing up easy will invariably result in screwups.

But it's a disingenuous one. It ignores the realities of systems. The reality is that there is currently no widely available memory-safe language that is usable for something like OpenSSL. .NET and Java (and all the languages running on top of them) are not an option, as they are not everywhere and/or are not callable from other languages. Go could be a good candidate, but without proper dynamic linking it cannot serve as a library callable from other languages either. Rust has a lot of promise, but even now it keeps changing every other week, so it will be years before it can even be considered for something like this.

Additionally, although the parsing portions of OpenSSL need not deal with the hardware directly, the crypto portions do. So your memory-safe language needs some first-class escape hatch to unsafe code. A few of them do have this, others not so much.

It's fun to say C is inadequate, but the space it occupies does not have many competitors. That needs to change first.

First, I do realize that rewriting the software stack from the ground up to have only managed code is a huge task. I do think that as an industry, we should set a goal of having at least one server implementation along these lines (where 'set a goal' may mean, say, grants or calls for proposals). Microsoft Research implemented an experimental OS like that, although it probably didn't have all the features a modern OS would need. I don't know if we need a new language, but we do need a huge rethink of the server architecture, and not just a piece-by-piece rewrite, which I think will founder on the interface issues that you mentioned.

Anyway, I am quite realistic about the prospect of my comment having that kind of effect on the industry - I don't suffer from delusions of grandeur. I was aiming the comment more at people who choose C/C++ for no good reason to write a user-level app; that app is nearly certain to have memory use errors, and if it has any network or remote interface, chances are they can be easily exploited. I'd like as many people as possible to understand that they can't expect to avoid such errors, any more than one of the most heavily audited pieces of software avoided them. We have had decades of exploits of this vulnerability, and yet most programmers are oblivious to it, or think only bad programmers are at risk. So just as tptacek goes around telling people not to write their own crypto, I go around telling people - with less authority and effectiveness, unfortunately - not to write C/C++ code unless they really need to.

As for the performance issues forcing OpenSSL to use C, well, we apparently exposed all our secrets in the pursuit of shaving off those cycles. I hope we are happy.

That was a very reasonable response, I can roll with that.

Just one thing: when I brought up talking directly to the hardware, I did not mean just for performance's sake. Avoiding side-channel attacks often requires a high degree of control over the generated machine code, and that is the primary reason not to do it in higher-level languages (unless they also permit that level of control).

In Java 8 the JVM knows how to compile AES straight through to AES-NI invocations, so the CPU itself is doing the hardware accelerated crypto (in constant time). It's not necessarily the case that higher level languages have to be less safe: especially not on something like a server where the overhead of JIT compilers gets amortized out.

Okay, but what about other crypto algorithms not implemented in hardware? Eventually someone has to figure out how to generate machine code that operates in constant time across code paths, and they need a language that will let them do that.
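As a small illustration of what "constant time across code paths" means at the source level, here is a sketch of a comparison that avoids data-dependent branches. The name ct_compare is hypothetical. Note the assumption baked into it: the C standard says nothing about timing, so a compiler is free to transform this loop, and the generated machine code still has to be inspected on each target.

```c
#include <stddef.h>

/* Returns 0 if the two n-byte buffers are equal, nonzero otherwise.
 * The loop always touches all n bytes and contains no branch that
 * depends on the data, unlike memcmp(), which may return at the
 * first mismatch. The volatile qualifiers are only a hint against
 * the compiler short-circuiting the loop, not a guarantee. */
static int ct_compare(const void *a, const void *b, size_t n)
{
    const volatile unsigned char *pa = a;
    const volatile unsigned char *pb = b;
    unsigned char diff = 0;
    size_t i;

    for (i = 0; i < n; i++)
        diff |= (unsigned char)(pa[i] ^ pb[i]);
    return diff;
}
```

This is the easy case. Algorithms with secret-dependent table lookups or multiplications are much harder to pin down from a high-level language, which is the point being argued here.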

I do not like the term "C/C++", especially in this context. Modern C++ makes avoiding this sort of bug as easy as doing so in the "managed" languages already discussed.

This is as much a cultural as a technical problem; C really is in the last chance saloon for this sort of problem, we have the solution to hand, but a strong cadre of developers will still only consider C for this sort of work.

Thank you, this really had to be said. You can do C in C++ (and if you do that, chances are that you're doing it wrong), but you can't do C++ in C. The ugliest and unsafe parts of C++ are invariably those coming almost untouched from C (raw pointers, casts from/to void ptrs, etc). C is close to the machine, but C++ is largely a vertical language where you can do things low-level or high-level (and yes, you can do a lot of interesting things with templates, no matter the bad rap they've got); and most of the current C++ community vastly prefers high-level-like code, for good reasons (for starters, it may be even more performant). A LOT of the sources of unsafe code go away by two simple techniques: use RAII-managed smart pointers instead of raw ones (some of them are as lightweight as a raw pointer), and prefer vector (or other container) to array.

I love C and C++, and each one has its place, but really, they're very different. Almost as much as C++ is to Java, for example.

The problem, and I think this may have been touched on somewhere else in the thread, is that C++ can be really complex to wrap. So embedding a C++ library in another, higher-level language can be very tricky. It often requires wrapping the parts of the API you want to use in C.

I'm fine with low-level libraries being written in C++, but would hope that developers expose a C API around everything.

I don't think this is true. When it comes to copying bytes from a buffer supplied from the network, there isn't a wrapper/manager class that can do this for you. Somewhere down the pipeline some piece of code has to copy the unstructured, variable length byte stream into a manageable data structure. C++ does not give any way to do this beyond the mechanisms available in C.

What you can do is lower the surface area of vulnerability. Low level byte wrangling is kept in a small subset of generic classes and functions. Application logic then only uses the safe interfaces.
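The same idea works in plain C. A rough sketch of such a small, generic layer: a bounds-checked cursor over a received buffer, so that parsing code never touches raw pointer arithmetic. The struct reader type and the reader_* functions are hypothetical names for this example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds-checked cursor over a received buffer. */
struct reader {
    const unsigned char *p;
    size_t left;
};

/* Each function returns 1 on success and 0 if too few bytes
 * remain, in which case the reader is left unchanged. */
static int reader_u8(struct reader *r, uint8_t *out)
{
    if (r->left < 1)
        return 0;
    *out = r->p[0];
    r->p += 1;
    r->left -= 1;
    return 1;
}

static int reader_u16(struct reader *r, uint16_t *out)
{
    if (r->left < 2)
        return 0;
    *out = (uint16_t)(((unsigned)r->p[0] << 8) | r->p[1]);
    r->p += 2;
    r->left -= 2;
    return 1;
}

/* Hands out a pointer to the next n bytes without copying. */
static int reader_bytes(struct reader *r, size_t n,
                        const unsigned char **out)
{
    if (r->left < n)
        return 0;
    *out = r->p;
    r->p += n;
    r->left -= n;
    return 1;
}
```

Application logic that only goes through these three calls cannot read past the end of the buffer, however large a length field the peer sends. Only this small file needs careful review.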

Thank you for pointing out that this is not an issue in C++, only in C. There are times when we need C and one can write C code and compile it with a C++ compiler. It's an option to help ease people to modern, idiomatic C++, but please don't lump the two together. C and C++ are two entirely different languages and C++ is much safer.

How about Ada? It is time tested! GNU's Ada shares the same backend as GCC so it can be pretty fast. Good enough for DoD. =P

Edit: I say this having used VHDL quite a bit. I appreciate its type strictness and ranges.

My first thought too was Ada, it's easily callable from anything that can call C afaik, and has infinitely better support for catching these sorts of bugs than C or C++ do. It's basically made for this kind of project, and yet no one in the civilian population really cares; it's a real shame. Not only does Ada make it easy to do things safely, it makes it very hard to do things unsafely.

I've been advocating Ada's use on HN for a few years now, but it always falls on deaf ears. People seem to think it's old and dead like COBOL or old FORTRAN, but it's really a quite modern language that's extremely well thought out. Its other drawback is that it's pretty ugly and uses strange names for things (access is the name given to pointer like things, but Ada specifies if you have say a record with a 1 bit Bool in it, you must be able to create an access to it, so a pointer is not sufficient).

Tony Hoare (Mr. Quicksort, CSP, etc...) has softened his stance since "The Emperor's Old Clothes", but his concern was that Ada is too complicated to be understandable and safe. I hated Pascal because the array length was part of its type... but maybe that kind of thinking is what it takes to avoid bugs like Heartbleed.

Can I suggest you take a quick look at ATS? The language itself is kind of horrid (and I am a ML fan) and the learning curve is way steep, but the thin, dependently typed layer over C aspect is actually quite nice.

Note: I'm not suggesting it for current production use, but rather as something that could be expanded further in the future.

An interesting new project written in Ada is a DNS server called Ironsides, written specifically to address all the vulnerabilities found in Bind and other DNS servers [1].

"IRONSIDES is an authoritative DNS server that is provably invulnerable to many of the problems that plague other servers. It achieves this property through the use of formal methods in its design, in particular the language Ada and the SPARK formal methods tool set. Code validated in this way is provably exception-free, contains no data flow errors, and terminates only in the ways that its programmers explicitly say that it can. These are very desirable properties from a computer security perspective."

Personally, I like Ada a lot, except for two issues:

1. IO can be very, very verbose and painful to write (in that way, it's a bit like Haskell, although not for the same reason). Otherwise, it's a thoroughly modern language that can be easily parallelized.

2. Compilers for modern Ada 2012 are only available for an extravagant amount of money (around $20,000 last I heard) or under the GPL. Older versions of the compiler are available under GPL with a linking exception, but they lack the modern features of Ada 2012 and are not available for most embedded targets. And the Ada community doesn't seem to think this is much of a problem (the attitude is often, "well, if you're a commercial project, just pony up the money"). The same goes for the most capable Ada libraries (web framework, IDE, etc.) -- they're all commercial and pretty costly. Not an ideal situation for a small company.

But yes, Ada is exceptionally fast and pretty safe. There's a lot of potential there, but to be honest my hopes are pinned on Rust at this point.

1. http://ironsides.martincarlisle.com/

Yikes, I didn't know the latest GPL compiler doesn't contain a linking exception. That is disappointing.

This is wrong: the compilers distributed by the FSF (as part of GCC) have the linking exception. The ones distributed by AdaCore don't: they exercise their right to transform the modified GPL (with exception) into the GPL before redistribution. But the one in your Linux distribution is likely the FSF one.

No, it's absolutely correct. He wrote that the latest compiler doesn't have a linking exception, which is correct. The FSF compiler generally lags at least a year or two behind the AdaCore one.

For example, it has only just gotten (partial) Ada 2012 features, a full 3-4 years after the AdaCore GPL compiler.

We might be stuck with C for quite a while, but then maybe the more interesting question is 'how does this sort of thing get past review?'. It's not hard to imagine how semantic bugs (say, the Debian random or even the Apple goto bug) can be missed. This one, on the other hand, hits things like 'are the parameters on memcpy sane' or 'is untrusted input sanitized', which you'd think would be on the checklist of a potential reviewer.
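To make the "are the parameters on memcpy sane" item concrete, here is a minimal sketch in C of the kind of check a reviewer would be looking for. It is not the OpenSSL code: the record layout (one type byte, a two-byte big-endian length, then the payload) is a simplified assumption, and the names `echo_payload` and `HB_HEADER_LEN` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed, simplified layout: [type:1][claimed length:2, big-endian][payload...] */
#define HB_HEADER_LEN 3u

/*
 * Copies the payload of a received record into `out`.
 * `rec_len` is the number of bytes that actually arrived.
 * Returns the number of bytes copied, or -1 if the record is malformed.
 */
long echo_payload(const uint8_t *rec, size_t rec_len,
                  uint8_t *out, size_t out_cap)
{
    /* NULL pointers are a caller bug, not attacker input. */
    assert(rec != NULL && out != NULL);

    if (rec_len < HB_HEADER_LEN)
        return -1;

    /* This length comes from the peer, so it is untrusted. */
    size_t claimed = ((size_t)rec[1] << 8) | rec[2];

    /* The claimed length must fit inside what was really received... */
    if (claimed > rec_len - HB_HEADER_LEN)
        return -1;
    /* ...and inside the destination buffer. */
    if (claimed > out_cap)
        return -1;

    memcpy(out, rec + HB_HEADER_LEN, claimed);
    return (long)claimed;
}
```

Without the first length comparison, a record that claims 65535 bytes of payload but carries only a few would make `memcpy` read far past the end of the received data, which is the general shape of a missing bounds check.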

I think you mentioned the right keyword: "checklist". If you scan the http://wiki.openssl.org website carefully, you will be scanning it carefully (i.e., not finding anything). It doesn't seem to be a good practice yet to use checklists for code reviews. Could this change? I hope so: http://www.infoq.com/presentations/agile-code-reviews

"Rust has a lot of promise, but even now it keeps changing every other week..."

A larger problem, in my opinion, is that things like OpenSSL are used (And should be!) from N other languages. As a result, calling into the library requires almost by definition lowest-common denominator interfaces. Which is C.

C code calling into Rust can certainly be done, but I believe it currently prohibits using much of the standard library, which also removes a lot of the benefits.

C++ doesn't, I think, have as much of a problem there, but I'm somewhat skeptical of C++ as a silver bullet in this case.

I don't know about Ada and any other options.

[ATS, anyone?]

Why not write the code in C# (for example) and extract it to $SYSTEM_PROGRAMMING_LANGUAGE? It wouldn't be much different than what Xamarin are doing now for creating iOS and Android apps with C#.

Using C as an output language, backed by guarantees at the higher level, could certainly work. I believe ATS [1] works this way, and can even avoid garbage collection altogether if desired. I understand it is not an easy language, though.

Nimrod [2] also generates C, but as I understand it garbage collection is unavoidable.

[1] http://www.ats-lang.org/

[2] http://nimrod-lang.org/

This is one reason I'd like to see the removed LLVM C backend brought back and modernized, with Rust as the source language. Rust is safe, has no mandatory garbage collector, and has a much lower impedance mismatch with C or C++ than most higher level languages, so it should work well for libraries that are expected to integrate with C code.

I'm not clear why you'd want to compile your rust down to C, only to then compile that C again? Surely you're better off with a single compiler invocation, and taking rust straight to object code?

I know that historically it's been easier to write a code generator than a compiler backend, but with LLVM you get the backend just as cheaply as the code gen.

Makes it easier to target embedded arches. For example, LLVM doesn't target the MSP430 from TI, but there is a gcc fork for it. Sure, you can write a new backend for LLVM, but that's a whole different ballgame.

This way it would be Rust -> LLVM IR -> C -> GCC for MSP430.

> removed LLVM C backend

What are you referring to here?

LLVM used to have a backend that could convert its low-level bitcode to portable C.

Because if you write that in C#, you have to bundle a 54MB common runtime and GC with it.

Did you actually read my entire comment, or did you just see "C#" and post a knee-jerk reaction? If you extract code from C# to C, you don't need a GC or any of the .NET class libraries -- you'd just have a standalone C file to use like any other.

> If you extract code from C# to C, you don't need a GC or any of the .NET class libraries

The problem is, you do need them. There's no 1:1 C# -> C extraction. You have to extract at least some part of the CLR along with it.

Perhaps another way of saying this is that you will always need a runtime. And if you reimplement the portions of the runtime that you need in C, you've essentially re-implemented .NET.

>Additionally, although the parsing portions of OpenSSL need not deal with the hardware directly, the crypto portions do. So your memory-safe language needs some first-class escape hatch to unsafe code. A few of them do have this, others not so much.

For the other points there is some debate, but don't most serious languages have a C FFI?

OpenSSL and similar libraries spend most of their time processing short packets. For example, encrypting a few hundred bytes using AES these days should take only a few hundred CPU cycles. This means that the overhead of calling the crypto code should be minimal, preferably 0. This is in part what I meant by "first-class". Perhaps I should have written "zero-overhead" instead.

I googled around just now for some benchmarks on the overhead of FFIs. I found this project [1] which measures the FFI overhead of a few popular languages. Java and Go do not look competitive there; Lua came surprisingly on top, probably by inlining the call.

Before you retort with an argument that a few cycles do not matter that much, remember that OpenSSL does not run only in laptops and servers; it runs everywhere. What might be a small speed bump on x86 can be a significant performance problem elsewhere, so this is something that cannot be simply ignored.

[1] https://github.com/dyu/ffi-overhead

Those linked tests are extremely disingenuous; they only show the fixed cost of FFIs.

Considering that in C the plusone call is 4 or so cycles, and the Java example is 5 times slower, that's only 20 or so cycles. If the function we're FFIing into is 400 cycles, that's only about a 5% decrease in speed. I'm willing to pay that price if it means not having to wake up to everything being vulnerable every couple of months.

This project attempts to measure the overhead of calling out from $LANGUAGE and into C, which is the reverse of what's necessary to solve the problem stated here — to write a low-level library in a high-level language.

There are other means of achieving a secure implementation, for example programming in a very high-level language such as Cryptol and compiling to a low-level language:


No, it's the exact problem we're faced with here: Calling OpenSSL from the outside is something you do a handful of times. The OP was concerned about parts of OpenSSL that require direct hardware access (thus, should be written in C). Because those parts of the code are extremely hot, having to cross FFI boundaries to reach them might be prohibitively expensive.

Fair enough. Here's one attempt at a fast high-level AES library, with hardware acceleration using AESNI, and some benchmarks:


Lua being on top shouldn't be surprising. Its entire purpose is to call into (and to be called from) C, and this case has been highly optimized.

I think that was part of his point, but yeah I don't see why you couldn't do the pure parts in a safer language. FFIs to C are fairly easy to do I think, probably partly because of how simple the calling convention is.

I believe Haskell could be up to the job, but I heard that there were some difficulties in guarding against timing attacks. However those could have just been noise. I know that a functional (I believe and haha) operating system was made in Haskell.

Aren't Operating Systems lower level than OpenSSL?

I look forward to reading the hilarious threads that will be spawned when you take to linux-kernel, freebsd-hackers, openbsd-misc, etc. and inform them they should be developing their kernels in Haskell.

Functional programming's unpopularity is not rooted in any real or imagined inability to write operating systems.

Well you also have to account for the fact that they've spent a lot of time working with their tools and are quite invested in them.

Perhaps they would balk at the idea of writing a kernel in Haskell, but it has been done before.

If you mean Metasepi, it's still under development:

http://metasepi.org (mostly in Japanese) http://www.ipa.go.jp/files/000036232.pdf (Slides in English)

Thanks, I had forgotten about Metasepi!

There is HalVM, but it's only for Xen.

Other than C there is also C++ and D if you don't want to stray too far from C. The problem with C++ is that even though it is possible to adopt a memory-safe programming style with C++, the concepts are not prevalent in the community.

Secure coding is very prevalent in C++ (outside the group of old C programmers who write C++ code as if it were C). C++ is far safer than C.

Sorry, let me be more clear: In the courses and books I've seen and read novices don't get taught secure coding techniques as C++ is often introduced as a superset of C and security in general is not of interest to the teacher. Then later on when they transition to the web as a primary source of information there is a lot of legacy C++ code lying around that does not use modern memory management concepts. Also, as nobody has ever told them the importance of strict coding styles for security they also don't start looking for them, even though it would be possible to find them with the right keywords.

I don't know. I was a C++ coder 15 years ago, left for managed languages, then came back to it a couple years ago. Rusty would be an understatement, so I had to come at it in what may be a worse state than a noob: a person with outdated knowledge of how things work.

If you looked at all for "best practices", things like RAII, STL/Boost, and other concepts became very clear, very quickly, and these are the types of things that limit these kinds of bugs (RAII, in particular). Now, to be fair, I was writing crypto-related software, so I was paying very close attention, but I didn't really have to hunt.

C++ does not have a reasonable memory safe subset. No, smart pointers do not provide real memory safety.

They are generally better than nothing, however. And they do generally offer better performance, better predictability, and a simpler implementation than other approaches to memory management and garbage collection.

Maybe a language like Rust will offer a safer alternative at some point, but that point surely isn't today, and probably not tomorrow, either. Maybe there are other languages that offer better safety, but they often bring along their own set of very serious drawbacks.

In terms of writing relatively safe code today, that performs relatively well, that can integrate easily with other libraries/frameworks/code, and can be readily maintained, the use of C++ with modern C++ techniques is often the only truly viable option.

>Maybe a language like Rust will offer a safer alternative at some point, but that point surely isn't today, and probably not tomorrow, either.

Rust absolutely does offer a safer alternative today. The only problem with Rust at this point is that the standard library is in a state of great flux, which makes it hard to use the language for serious projects. But the memory safety is there.

And even with all the changes, there are at least a couple of companies using the last tagged release of Rust in production.

That said, I fervently hope that Rust can hit 1.0 soon (as in this year). A lot of people are looking to move on from C and C++ at this point, but a lot are moving to Go, D, or Nimrod because Rust has been beta for so long (yes, I know Go is technically not in the same tier as Rust, D, and Nimrod). Once they put in the effort of learning these languages, they're unlikely to switch to Rust, thus missing out on all the safety guarantees that Rust offers.

There are three current known production deployments of Rust, yes. I don't know if they're using 0.10 yet, but they exist.

What would you advocate as memory safe programming styles? Strong guarantee? RAII?

RAII is a minimum. You also have to treat any direct and indirect (unchecked) pointer arithmetic as a potential security vulnerability though. For example, if you use the [] operator of a vector you can still access memory outside of the allocated space. Instead you would have to use the at() method, which actually checks the bounds. Even iterators are problematic, as the iterator on a vector also ignores the actual bounds iirc (though with most idioms the comparison against the end iterator is pretty error proof). This lends itself to constructs where you do not work with any indexes at all, in the way foreach loops abstract your position in a container away.

Ah I tend to not use [] operators on vectors etc. but as you say iterators can be problematic. The best way is to start at the end of the container and work backwards, particularly if you are removing items from the list (thereby wrecking the iterator's idea of the end position if you were moving forwards through it).

I should probably finish Bjarne's C++11 book - I am maintaining a codebase of old style C++ and seem to be stuck in the old methods of doing it, mainly because of using compilers that don't have C++11 support.

Is there any recommended reading on new style C++ other than Bjarne's book?

What about Ada? GNAT looked pretty good a few years back when I was trying to get into that sort of thing.

>This sort of argument is becoming something of a fashion statement amongst some security people.

Just the ones who don't understand how good API designs can work well to solve these problems, don't worry not all of us are like that :)

What you say can easily be disproved, and you are simply asking for too much if you ask for something to be a drop-in replacement for OpenSSL. Some re-architecting is required simply because of the insecurity of C.

For example, a shared library that implements SSL would have to be a shim for something living in a separate process space.


That is a Haskell implementation of TLS. It is written in a language that has very strong guarantees about mutation, and a very powerful type system which can express complex invariants.

Yes, crypto primitives must be written in a low level language. C is not low level enough to write crypto, neither securely nor fast, so that's not an argument in its favor.

There are several languages that do fill that gap, but security people never use it. For example, Cyclone is pretty good. (http://cyclone.thelanguage.org/).

How about D?


> C and other languages without memory checks are unsuitable for writing secure code

I vehemently disagree. Well-written C is very easy to audit. Much much moreso than languages like C# and Java, where something I could do with 200 lines in a single C source file requires 5 different classes in 5 different files. The problem with C is that a lot of people don't write it well.

Have you looked at the OpenSSL source? It's an ungodly f-cking disaster: it's very very difficult to understand and audit. THAT, I think, is the problem. BIND, the DNS server, used to have huge security issues all the time. They did a ground-up rewrite for version 9, and that by and large solved the problem: you don't read about BIND vulnerabilities that often anymore.

OpenSSL is the new BIND; and we desperately need it to be fixed.

(If I'm wrong about BIND, please correct me, but AFAICS the only non-DoS vulnerability they've had since version 9 is CVE-2008-0122)

> but we can plug this seemingly endless source of bugs which has been affecting the Internet since the Morris worm.

If we're playing the blame game, blame the x86 architecture, not the C language. If x86 stacks grew up in memory (that is, from lower to higher addresses), almost all "stack smashing" attacks would be impossible, and a whole lot of big security bugs over the last 20 years could never have happened.

(The SSL bug is not a stack-smashing attack, but several of the exploits leveraged by the Morris worm were)

> The problem with C is that a lot of people don't write it well.

Including people responsible for one of the most important security-related libraries in the world. No matter how good and careful a programmer is, they are still human and prone to errors. Why not put every chance on our side and use languages (e.g. Rust, Ada, ATS, etc.) that make entire classes of errors impossible? They won't fix all problems, and definitely not those associated with having a bad code base, but it'd still be many times better than hoping people don't screw up with pointer lifetimes.

> Why not put every chance on our side and use languages (e.g. Rust, Ada, ATS, etc.) that make entire classes of errors impossible?

I don't think intentionally preventing the programmer from doing certain things the computer is capable of doing on the theory it makes errors impossible makes sense.

As I've said several times in this thread, somebody has to deal with the pointers and raw memory because that's the way computers work. Using a language where the language runtime itself handles such things only serves to abstract away potential errors from the programmer, and prevents the programmer from doing things in more efficient ways when she wants to. It can also be less performant, since the runtime has to do things in more generic ways than the programmer would.

> Including people responsible for one of the most important security-related library in the world.

I think you've hit on a crucial part of the problem: practically every software company on Earth uses OpenSSL, but not many of them pay people to work on it.

  calvinow@Mozart ~/git/openssl $ git log --format='%aE' | grep -Po "@.*" | sort -u - 
I was very surprised how short that list is. There are a lot of big names that make heavy use of this software that are not on that list.

  > I don't think intentionally preventing the programmer 
  > from doing certain things the computer is capable of 
  > doing on the theory it makes errors impossible makes 
  > sense.
With arguments like this, we'd all be back in the days of non-structured programming languages (enjoy writing all your crypto in MUMPS). Every modern language, including C, restricts itself in some way in order to make programs more predictable and errors less likely. Some simply impose more restrictions than others, though these restrictions can actually make programs more efficient (see, for instance, alias analysis in Fortran vs. alias analysis in C).

  > somebody has to deal with the pointers and raw memory 
  > because that's the way computers work
All three of the languages listed previously (Rust, Ada, ATS) are systems programming languages with the capability of manipulating pointers and raw memory (though I don't personally have any experience with the latter two). What they have in common is that they provide compile-time guarantees that certain aspects of your code are correct: for example, the guarantee that you never attempt to access freed memory. These are static checks that require no runtime to perform, and impose no overhead on running code.

> With arguments like this, we'd all be back in the days of non-structured programming languages

You're confusing the difference between syntactical restrictions and actual restrictions on what one can make the computer do.

I define "things the computer is capable of" as "arbitrary valid object code executable on the CPU". (Valid here meaning "not an illegal instruction".) Any language that prevents me from producing any arbitrary valid object code is inherently restrictive. C allows me to do this. I can even write functionless programs in C, although it's often non-portable and requires mucking with the linker. If the CPU has some special instruction I want to use, I can use inline assembly.

Any language that prevents me from doing arbitrary pointer arithmetic and memory accesses prevents me from doing a lot of useful things I can do in C. See my other comment about linked lists with 16-bit pointers on a 64-bit CPU.

My understanding of Rust is that its pointers have similar semantics to the "safe_pointers" in C++. If that's the case, my understanding is that it would prevent me from doing things like the 16-bit linked list (please, correct me if I'm wrong).

Quoting your other post:

  > in C, I can make the computer do absolutely anything I 
  > want it to in exactly the way I want it to. Maybe I 
  > don't like 8-byte pointers on my 64-bit CPU, and I want 
  > to implement a linked list allocating nodes from a 
  > sparse memory-mapped pool with a known base address 
  > using 16-bit indexes which I add to the base to get the 
  > actual addresses when manipulating the list? That could 
  > be a big win depending on what you're doing, and 
  > (correct me if I'm wrong) there is no way to do that in 
  > Java or Haskell.
This is possible in Rust, you'll just need to drop into an "unsafe" block when you want to do the pointer arithmetic. In the meantime, everywhere that isn't in an "unsafe" block is guaranteed to be as safe as normal Rust code. Furthermore, even Rust's "unsafe" blocks are safer than normal C code. Rust is a systems programming language, so we know that you need to do this stuff. We have inline assembly too! Our goal is to isolate the unsafety and thereby make it easier to audit.

Interesting. Clearly I need to explore this more. :)

Rust forces you to draw safety boundaries between safe and unsafe code, but you can do almost strictly more than C in unsafe Rust. It has support for inline assembly, SIMD, packed structs, and well-defined signed integer overflow. None of these is part of standard C or C++; they are only available through compiler-specific dialects. There wasn't even a well-defined memory model with support for atomics before C11/C++11.

> Why not put every chance on our side and use languages (e.g. Rust, Ada, ATS, etc.) that make entire classes of errors impossible?

Bugs will still occur, just in a different way: Java is advocated as being a much "safer" language, but how many exploits have we seen in the JRE? Going to more restrictive, more complex languages in an attempt to fix these problems will only lead to a neverending cycle of increasing ignorance and negligence, combined with even more restrictive languages and complexity. I believe the solution is in better education and diligence, and not technological.

> Java is advocated as being a much "safer" language, but how many exploits have we seen in the JRE?

Very few. I don't think I can remember ever seeing an advisory for Java's SSL implementation.

Yes, bugs are possible in all languages, but that doesn't mean there's no difference between languages. I'm reminded of Asimov: "When people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong. But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together."

(There are a large number of bugs in the browser plugin used for java applets, but they have no relation to the JRE itself)

>The problem with C is that a lot of people don't write it well.

There are languages that make it very very hard to write bad code. Haskell is a good example of where if your program type-checks, there's a high chance it's probably correct.

C is a language that doesn't offer many advantages but offers very many disadvantages for its weak assurances. Things like the Haskell compiler show that you can get strong typing for free, and there's no longer many excuses to run around with raw pointers except for legacy code.

> Haskell is a good example of where if your program type-checks, there's a high chance it's probably correct.

I really wish people would stop saying this. It's not in the slightest bit true. Making this assertion makes Haskellers seem dangerously naive.

This is especially true in the field of crypto, where timing attacks are a major issue. Knowing that your program will produce the correct result isn't enough, you need to know that the amount of time taken to compute that result doesn't leak information, and I don't think Haskell provides any way to ensure this.
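For anyone unfamiliar with what "doesn't leak information through timing" looks like in practice, here is a minimal sketch in C (used only for illustration; the point above is about Haskell). The function name `ct_equal` is hypothetical. A plain comparison loop that returns at the first mismatch takes longer the more leading bytes match, which can tell an attacker how much of a guessed MAC or token was correct. This version always touches every byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compares two buffers of length n without an early exit.
 * Returns 1 if they are equal, 0 otherwise.
 */
int ct_equal(const uint8_t *a, const uint8_t *b, size_t n)
{
    /* NULL pointers are a caller bug, not attacker input. */
    assert(a != NULL && b != NULL);

    uint8_t diff = 0;
    for (size_t i = 0; i < n; i++) {
        /* Accumulate differences; no data-dependent branch in the loop. */
        diff |= (uint8_t)(a[i] ^ b[i]);
    }
    return diff == 0;
}
```

Even in C this is only a best effort, since the language makes no promises about timing and an optimizer is free to restructure the loop. That is the same class of problem being raised about higher-level languages, just with fewer layers in the way.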

Haskell in fact does the opposite. Changes in compiler version can drastically alter the performance of your code, even changing its O() class.

it's just probably correct ;)

How do I embed your Haskell library in my Lisp program? This is where C shines...it can be used everywhere, including embedded in other programs, no matter what language they're written in.

Like this: http://www.haskell.org/ghc/docs/latest/html/users_guide/ffi-.... Haskell's FFI works both ways.

(It's not free: you need to link in the entire Haskell runtime system, which is not small. But you can absolutely do it.)

Cool, thanks for the link. Didn't know this was possible.

> There are languages that make it very very hard to write bad code. Haskell [...]

Sure, but how much slower is Haskell than an equivalent implementation in C? Some quick searching suggests numbers like 1000% slower... and no amount of security is worth a 1000% performance hit, let alone a vague "security mistakes are less likely this way" sort of security. Being secure is useless if your code is so slow that you have to run so many servers you don't make a profit.

Could the Haskell compiler be improved to the point that this isn't a problem? Maybe. Ultimately I think the problem is that Haskell code is very unlike object code, and that makes writing a good compiler very difficult. C is essentially portable assembler; translating it to object code is much more trivial.

> C is a language that doesn't offer many advantages but offers very many disadvantages for its weak assurances.

C offers simplicity. Sure, there are some quirks that are complex, but by and large it is one of the simplest languages in existence. Once you understand the syntax, you've essentially learned the language. Contrast that to Java and C#: you are essentially forced by the language to use this gigantic and complicated library all the time. You are also forced to write your code in pre-determined ways, using classes and other OOP abstractions. In C, I don't have to do that: I can write my code in whatever way I feel makes it maximally readable and performant.

C also offers flexibility: in C, I can make the computer do absolutely anything I want it to in exactly the way I want it to. Maybe I don't like 8-byte pointers on my 64-bit CPU, and I want to implement a linked list allocating nodes from a sparse memory-mapped pool with a known base address using 16-bit indexes which I add to the base to get the actual addresses when manipulating the list? That could be a big win depending on what you're doing, and (correct me if I'm wrong) there is no way to do that in Java or Haskell.
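A rough sketch of the 16-bit-index linked list described above, for readers who want to see the shape of it. Everything here is an assumption made for illustration: the pool is a plain array rather than a sparse memory-mapped region, and the names (`struct pool`, `node_at`, `list_push`, `list_sum`, `NIL`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_NODES 1024u
#define NIL 0xFFFFu          /* sentinel index meaning "no node" */

struct node {
    uint16_t next;           /* 2-byte index into the pool, not an 8-byte pointer */
    int32_t  value;
};

struct pool {
    struct node nodes[POOL_NODES];   /* stands in for the memory-mapped region */
    uint16_t used;                   /* number of nodes handed out so far */
};

/* Turn an index into an address: base + idx, refusing out-of-range indexes. */
static struct node *node_at(struct pool *p, uint16_t idx)
{
    return (idx < POOL_NODES) ? &p->nodes[idx] : NULL;
}

/* Push a value on the front of the list; returns the new head, or NIL if full. */
uint16_t list_push(struct pool *p, uint16_t head, int32_t value)
{
    assert(p != NULL);
    if (p->used >= POOL_NODES)
        return NIL;
    uint16_t idx = p->used++;
    p->nodes[idx].next = head;
    p->nodes[idx].value = value;
    return idx;
}

/* Walk the list by following indexes and add up the values. */
long list_sum(struct pool *p, uint16_t head)
{
    assert(p != NULL);
    long sum = 0;
    for (struct node *n = node_at(p, head); n != NULL; n = node_at(p, n->next))
        sum += n->value;
    return sum;
}
```

Starting from `head = NIL` and a zero-initialized pool, each `list_push` returns the new head index. Note that the index-to-address step in `node_at` is itself a bounds check, so this trick does not depend on unchecked pointer arithmetic.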

> there's no longer many excuses to run around with raw pointers except for legacy code.

If by "raw pointer" you mean haphazardly casting different things to (void * ) or (char * ) and doing arithmetic on them to access members of structures or something, I agree, 99.9% of the time you shouldn't do that.

Or are you talking about that "auto_ptr" and so-called "smart pointer" stuff in C++? In that case, your definition of "raw pointer" is every pointer I've ever defined in all the C source code I've ever written.

Pointers exist because that's the way computer hardware works: they will never go away. I'd rather deal with them directly, since it allows me to be clever sometimes and make things faster.

>Sure, but how much slower is Haskell than an equivalent implementation in C? Some quick searching suggests numbers like 1000% slower...

With things like stream fusion (http://research.microsoft.com/en-us/um/people/simonpj/papers...) , which I imagine would capture a lot of crypto calls, GHC can generate some very performant code (paper contains examples of hand-written C code being beat by Haskell code, and the C code is far from naive).

There are a lot of tricks at your disposal when you know more about the state of the code. And compilers are usually better than humans in this regard.

That paper is extremely fascinating, thanks for sharing.

As an aside, I'm reasonably sure that GCC can use SIMD instructions for certain bulk memory operations in certain circumstances if you feed it the right -march= parameters... I don't think it's as clever as the techniques in this paper, however.

GHC (the main Haskell compiler at this point) does some extremely amazing stuff on the compiler front; it is probably at the forefront of static analysis. And the Haskell community is extremely motivated to make amazing machinery.

A lot of Haskell stuff is based around its lazy evaluation model though

>no amount of security is worth a 1000% performance hit

Yes it is. Most of the applications I use take roughly 0% of my processor's capacity. I can spare a multiple like that outside of a few hot loops.

Saying 10x slowdown isn't worth it is like saying no one would ever compute on a phone.

Also random BS benchmark http://benchmarksgame.alioth.debian.org/u32/benchmark.php?te... says haskell is at least half as fast as C.

> Most of the applications I use take roughly 0% of my processor's capacity.

We're talking about a server-side vulnerability in OpenSSL here, not applications running on your personal computer.

Roughly speaking, making your server code twice as slow means it will cost you twice as much money to run your servers. Of course, that depends a lot on what exactly you're doing and is obviously not always true... but OpenSSL is a very performance-critical piece of most server-side software in the wild.

If OpenSSL suddenly became twice as slow, it would cost a lot of people a lot of money.

> Also random BS benchmark says haskell is at least half as fast as C.

I never said my "quick searching" was exhaustive. I suspect the relative performance is heavily dependent on what exactly is being done in the code.

"If OpenSSL suddenly became twice as slow, it would cost a lot of people a lot of money."

Perspective check: We are talking about a situation in which OpenSSL had NO SECURITY, has had NO SECURITY for two years, and an unknown amount of existing caught traffic is now vulnerable. If the NSA did not already know about this bug (and given that it is not hard to imagine the static analysis tool that could have caught this years ago, it's plausible they've known for a while), they are certainly quite busy today collecting private keys before we fix things, so what security there may have been is now retroactively undone. (Unless you used PFS, which I gather is still rare. In other news, turn that on!)

Do not argue as if you're in a position in which OpenSSL experienced a minor bug, so let's all just calm down here and not make such radical arguments. We are in a position in which OpenSSL was ENTIRELY INSECURE and has been for years, because of a trivial bug that can pretty much ONLY happen in the exact language that OpenSSL was implemented in! Virtually no other language still in use could even have had this bug. This is not a minor thing. This is not something to wave away. This is a profound, massive failure. This is the sort of thing that ought to bury C once and for all, not be glossed over. (As for theoretical arguments that C could be programmed in ways that don't expose this, if the OpenSSL project is not using them, I'm not entirely convinced they really do exist in any practical way.)

If we're OK with bugs that are this critical, heck, I can speed up your encryption even more!

> We are in a position in which OpenSSL was ENTIRELY INSECURE and has been for years, because of a trivial bug that can pretty much ONLY happen in the exact language that OpenSSL was implemented in!

This is incredibly, incredibly false.

Pointers exist. Raw memory accesses exist. Even if you're writing code in a language that hides them from you, they still exist, and there is still potential for somebody to have done something stupid with them. I guarantee you that there are JVMs in the wild with vulnerabilities as severe as this one. Arguing for the use of languages that intentionally cripple the programmer on the theory they make vulnerabilities less likely is silly.

I'm not denying the severity of this issue. But bugs happen. All we can do is fix them, learn from them, and move on. The lesson to be learned here is that really messy code is a big problem that needs to be fixed, because it makes auditing the code prohibitively difficult.

The proper response IMHO is a ground-up rewrite of OpenSSL. A lot of big players use OpenSSL; financing such an endeavor would not be difficult.

But OpenSSL is embedded in many other languages and applications because it's written in C. Show me a low-level language with a stable syntax that fixes the problems caused by using C that can also be embedded in Java, Python, Lisp, Ruby, etc etc. I don't think you can.

Some things need to be in C because they need to be run everywhere, including embedded in applications. No other language does this, to my knowledge.

SSL itself is a tiny fraction of the CPU load on a normal webserver. The idea that it's expensive has done more harm than many other things.

It wouldn't cost anything except on accelerator boxes.

Is it really a tiny load? Have you ever looked at the throughput values quoted for VPN routers that do their encryption in software (not hardware like the expensive Cisco ASAs)? If you compare the non-VPN throughput with the VPN-throughput, the software encryption is massively massively slower, so I would argue that software encryption is not a tiny load on a normal webserver, unless the webserver was not getting any hits...?

yes encryption (and database) is the hot spot when I do load testing, the bit in the middle can be optimized to near nothing.

> If OpenSSL suddenly became twice as slow, it would cost a lot of people a lot of money.

All other things being equal, yes. But responding to security incidents also costs a lot of people a lot of money.

"Performance is a quality of a working system." And doubly so for a cryptographic system.

What good is a cryptographic library that's not secure? Worse than no good. If you're not encrypting your data, you (should) be aware of that, and act accordingly. On the other hand, if you think you're encrypting your data...

A 1000% performance hit means that you have to spend 10x more on hardware, and you have to spend more time engineering for scalability. That extra cost outright kills projects in the womb. If the choice is launching something valuable to users and that pulls in revenue but is flawed, even seriously so, and doing nothing because it's just not feasible to do what you want within any reasonable cost/performance metrics... well then, you have your own anthropic principle right there.

> Sure, but how much slower is Haskell than an equivalent implementation in C?

Writing Haskell to generate proven-correct C is an approach that is known to work.

C offers simplicity. Sure, there are some quirks that are complex, but by and large it is one of the simplest languages in existence.

People keep saying this, and it keeps not being true.

C has something like over 200 undefined and implementation defined behaviors. That alone makes it a minefield of complexity in practice.

> Haskell is a good example of where if your program type-checks, there's a high chance it's probably correct.

For a value of 'correct' that includes "uses all available memory and falls over".

Agreed. Simple code is easy to understand and just as easy to find any bugs in. After looking at the heartbeat spec and the code, I can already see a simplification that, had it been written this way, would've likely avoided introducing this bug. Instead of allocating memory of a new length, how about just validating the existing message fields as per the spec:

> The total length of a HeartbeatMessage MUST NOT exceed 2^14 or max_fragment_length when negotiated as defined in [RFC6066].

> The padding_length MUST be at least 16.

> The sender of a HeartbeatMessage MUST use a random padding of at least 16 bytes.

> If the payload_length of a received HeartbeatMessage is too large, the received HeartbeatMessage MUST be discarded silently.

Then if it's all good, modify the buffer to change its type to heartbeat_response, fill the padding with new random bytes, and send this response. No need to copy the payload (which is where the bug was), no need to allocate more memory.

(Now I'm sure someone will try to find a flaw in this approach...)
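A minimal sketch of that validate-then-reuse idea, in Python for brevity. It assumes the RFC 6520 layout (1-byte type, 2-byte payload_length, payload, then padding); the function name and constants are made up for illustration and are not taken from OpenSSL.

```python
import os
import struct

HEARTBEAT_REQUEST = 1
HEARTBEAT_RESPONSE = 2
HEADER_LEN = 3            # 1-byte type + 2-byte payload_length
MIN_PADDING = 16          # "The padding_length MUST be at least 16."
MAX_MESSAGE_LEN = 2 ** 14


def heartbeat_reply_in_place(record: bytearray) -> bool:
    """Turn a received heartbeat request into a response without copying.

    Returns False when the message should be silently discarded.
    (Hypothetical helper, not an OpenSSL API.)
    """
    if len(record) < HEADER_LEN + MIN_PADDING or len(record) > MAX_MESSAGE_LEN:
        return False
    msg_type, payload_length = struct.unpack_from("!BH", record, 0)
    if msg_type != HEARTBEAT_REQUEST:
        return False
    # The claimed payload plus the minimum padding must fit in what actually arrived.
    if HEADER_LEN + payload_length + MIN_PADDING > len(record):
        return False
    padding_start = HEADER_LEN + payload_length
    record[0] = HEARTBEAT_RESPONSE
    # Refill the padding with fresh random bytes; the payload is left untouched.
    record[padding_start:] = os.urandom(len(record) - padding_start)
    return True
```

A request whose payload_length claims more bytes than arrived is rejected before anything is read or written, so there is no second buffer to over-read.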

My favorite is that the Morris worm dates back to late 1988 when MS was starting the development of OS/2 2.0 and NT. Yea, I am talking about the decision to use a flat address space instead of segmented.

That's why I have high hopes for Rust. We really need to move away from C for critical infrastructure. Perhaps C++ as well, though the latter does have more ways to mitigate certain memory issues.

Incidentally, someone on the mailing list brought up the issue of having a compiler flag to disable bounds checking. However, the Rust authors were strictly against it.

I'm excited about Rust for this reason as well, but in practice I find myself thinking a lot about data moving into and out of various C libraries. The great but inevitably imperfect theory is that those call sites are called out explicitly and should be as limited as possible. It works well but isn't a silver bullet. I'm hopeful that as the language ecosystem matures there will be increasingly mature C library wrappers and (even better!) native, memory-safe, Rust replacements for things.

>I'm hopeful that as the language ecosystem matures there will be increasingly mature C library wrappers and (even better!) native, memory-safe, Rust replacements for things.

This is Rust's greatest promise. Not only is writing memory-safe code possible, but it's also possible for Rust to do anything C is currently doing -- from systems, to embedded, to hard real-time, and so on. The promise of Rust cannot be overstated. And having finally grasped the language's pointer semantics, I've started to really appreciate its elegance. It compares very favorably to OCaml and other mixed paradigm languages with strong functional capabilities.

Promises can be encouraging, but they really do us no good in practice. And it's practice that truly matters.

We really need at least a stable release (in terms of the language and its standard libraries) of Rust before it can even be considered as a viable option. Even then, we'll need to see it used seriously in industry for at least a few years by early adopters before it'll be more widely trusted.

We keep hearing about how Rust 1.0 will be available sometime this year, and how there are only a relatively small handful of compatibility-breaking issues to resolve. But until those issues are all resolved and until Rust 1.0 is actually released and usable, Rust just isn't something we can take seriously, I'm afraid to say.

I agree completely. In my opinion, Rust should be looking at reaching a stable language as soon as possible instead of searching for some hard-to-define perfection.

Perfect is the enemy of the good definitely applies here. Either of the last two releases of Rust (0.9 and 0.10) would have made a nice 1.0 release, particularly once managed pointers were moved out from the language core to the standard library.

I also worry about more complexity being added to the language, so the sooner it can reach 1.0, the better. Unfortunately, the Rust community seems to really enjoy bikeshedding, so my hopes for a 1.0 release this year are not very high.

Nonetheless, I've already been wrong about Rust once (re: complexity -- once you learn the admittedly tricky pointer semantics, it's really not that horribly complex). I would love to be proven wrong again.

Good enough can also be the enemy of great. It's a tricky balance. My feeling is that there are already plenty of languages that are mature and stable enough to be good choices for industry but few (if any) that are actively and inclusively defining themselves the way Rust is. It's true that it won't be viable for a good while yet, and that's ok. What's the rush?

A data point: I have a few little Rust projects that rely on some patches to some other libraries; whereas I used to spend upwards of half an hour compiling and sometimes an hour or two making things compile for the new version, I'm now typically down to about 10 seconds to install the newest nightly and 5 minutes or so to fix up some warnings and standard library changes. A stable 1.0 is starting to feel imminent and inevitable to me.

I'd disagree about C++. In my experience, the only things it adds is (1) a false sense of security (since the compiler will flag so many things which are not really big problems, but will happily ignore most overrun issues), (2) lots of complicated ways to screw up, such as not properly allocating/deleting things deep in some templated structure, and (3) interference with checking tools - I got way more false positives from Valgrind in C++ code than in C.

I wish godspeed to Rust and any other language which doesn't expose the raw underlying computer the way C/C++ does, which is IMO insane for application programming.

I'd disagree with your disagreement ;-) C++ has constructs that let you build safer and more secure systems while maintaining good performance. You can write high performance systems without ever touching a raw pointer or doing manual memory management which is something that you can't really do in any other language. Yes, you need to trust the underlying library code which is something you have to do for pretty much any other language.

In my experience well-written C++ has a lot fewer security issues than similar C code. We've had third party security companies audit our code base so my statement has some anecdotal support in real world systems.

I second this point. The keyword here is "modern" C++, which encourages people to write shared-memory semantics, and to create strategies that make it impossible to screw up.

"New" is a thing of the past, along with the need to even see T *.

These days good code in C++ is so high-level, one almost never sees a pointer, much less an arbitrarily-sized one. This is of course unless you're dealing with an ancient library.

Another thing of importance:

If you're working with collection iteration correctly (which tends to be the basis for a lot of these out of bounds errors), there is no contesting the beginning offset, or the end offset - much less evaluating whether that end is in sight. Even comparisons missing stronger types showing "this thing is null terminated vs that is not" can be eliminated if you just create enough definitions for the type-system to defend your own interests. If you're coding defensively, these shouldn't even be on the menu. One buffer either fits in the other for two compatible types, or you simply don't try it at all.

It'd be amazing what standard algorithms can leave to the pages of history, if they were actually put to good use.

Python has some nice high-level concepts with their "views" on algorithmic data streams that show where modern C++ is headed with respect to bounds checking, minus the exceptions of course :)

But the problem is that there are people who have been coding C++ for 20 years, who never quite got all of these newfangled smart pointer things and just want the "C with classes" parts of C++.

Or you have the cargo cult programmers, who don't know the language very well, so just pick up idioms from random internet postings or from some old part of the codebase that's been working for a while, so it must be a good source of design tips.

Remember, any time you talk about the safety of a language, you have to think about its safety in the hands of a mediocre programmer who's in a hurry to meet a deadline. Because that's where the bulk of the problems slip in; not when people are actively and deliberately coding defensively.

Does anyone have experience with (auditing) systems built using pascal/delphi? I realize that Ada might be a better choice if headed in that direction -- but it always felt to me like pascal actually had a pretty sane trade-off between high and low level, and that C generally gave a lot of the bad parts of assembly while still managing to obfuscate the code without really giving that much in terms of higher level structure.

Pascal gets a bad rap, but that's due to the limitations of the original "pure" language. Something like Turbo Pascal, which took features from Modula-2, is actually a very good alternative to C for systems programming.

- Stricter type checking than C, (e.g. an error to add a centigrade value to a Fahrenheit value without casting)

- Bounds checked arrays and strings

Turbo Pascal added:

- Selectively turn off bounds checking for performance

- Inline assembler

- Pointers and arbitrary memory allocation

I don't think there's anything you can do in C that you can't do in TP. For example, it was easy to write code that hung off the timer or keyboard interrupts in MSDOS, which is pretty low level stuff.

The important thing is that the safe behaviour should be the default, you have to mark unsafe areas of code explicitly. This is the opposite to how it works with C.

To be clear, Rust _does_ expose the raw underlying computer the way C/C++ does, it's just off by default rather than on.

>(2) lots of complicated ways to screw up, such as not properly allocating/deleting things deep in some templated structure

Wow, that sounds scary. Do you have any references or further reading about this?

I don't have a reference, but I've seen this myself. Problems can happen whenever pointers are combined with the STL or other modern C++ stuff. In the thread from several years ago, I gave as an example pushing a pointer to a local variable into a vector which is then returned somewhere outside of scope. Compilers don't warn about this, or at least didn't then, although Valgrind catches it. And of course this can be a more complicated case, like a vector of maps from strings to structures containing pointers which is passed by reference somewhere - which will make it harder to catch. And Valgrind won't help if it doesn't see it happening in the execution path that you ran.

Now, combining pointers and STL is not a good idea. In fact, using raw pointers in C++ is not a good idea, at least IMO (but you've seen I am a bit concerned about memory safety). However, this is perfectly supported by compilers, and not even seriously discouraged (some guides tell you to be careful out there). I've seen difficult bugs produced by this, in my case, in a complicated singleton object.

> In fact, using raw pointers in C++ is not a good idea, at least IMO

Not just your opinion; it's become the "standard of practice" in the C++ development community.

The aim is that, between the STL, make_shared, C++14's make_unique, etc., you won't actually be using "naked new"s in any but the rarest cases. For the rest of memory management you'd use types to describe the ownership semantics and let the compiler handle the rest.

"The fact is that no programmer is good enough to write code which is free from such vulnerabilities."

"...you are kidding yourself if you think you can handle this better than the OpenSSL team."

Well, I can think of at least one example that counters this supposition. As someone points out elsewhere in this thread, BIND is like OpenSSL. And others wrote better alternatives, one of which offered a cash reward for any security holes and has afaik never had a major security flaw.

What baffles me is that no matter how bad OpenSSL is shown to be, it will not shake some programmers' faith in it.

I wonder if the commercial CA's will see a rise in the sale of certificates because of this.

Sloppy programmer blames language for his mistakes. News at 11.

Nothing in the standard prevents a C compiler + tightly coupled malloc implementation from implementing bounds checks. Out-of-bounds operations result in undefined behavior, and crashing the program is a valid response to undefined behavior. If your malloc implementation cooperates, you can even bounds-check pointer arithmetic without violating calling conventions.

It's quite a shame that there isn't a compiler that does this, and it's a project I've considered spending some time on if I can find a big enough block of time to get a solid start.

Unrestricted pointer arithmetic is indeed incompatible with memory safety. You set a pointer to point to one structure, then you change it and it now points to another structure or array. The compiler doesn't know the semantics of your code, so how can it tell if you meant to do that? And malloc/memcpy is way too low to check this stuff. It only sees memory addresses; it has no idea what variables are in them. Tightly coupled would mean passing information like "variable secret_key occupies address such-and-such" into the libc, which does violate POSIX standards, and will result in lots of code breaking. I don't see why we wouldn't just write in C# or Java or Rust, instead of a memory-safe subset of C (and it would have to be a subset).

Edit: here's one project for making a memory-safe C: http://www.seclab.cs.sunysb.edu/mscc/ . Interesting, but (a) it is a subset of C, (b) it doesn't remove all vulnerabilities, and (c) I still don't grok the advantage of using this over a language actually designed for modern, secure application programming.

I'll assume the case you're concerned about is the one legitimately tricky case (where you have an array of structs that include arrays, and you perform arithmetic on a pointer into the inner array), because the other readings I'm coming up with necessarily invoke undefined behavior, either by running off the end of an allocation (what we're checking) or breaking strict aliasing (in which case false positives are OK). Depending on what you do with this pointer (e.g. passing it into a custom memcpy), the compiler may not be able to enforce runtime checks by itself.

This is where we do need some extra help, in the form of a library that holds state for the compiler so that we don't need to instrument our pointers. Nothing in the C standard prevents the compiler from doing this. The library you pass the pointer into may ignore this information, if it doesn't have the necessary instrumentation, but we at least get the capability.

Re: other languages, Rust I will grant. It's the only one of those that's compelling for C's use-cases (Java and C# are both entirely unusable for the best uses of C and C++).

>You set a pointer to point to one structure, then you change it and it now points to another structure or array. The compiler doesn't know the semantics of your code, so how can it tell if you meant to do that?

If you changed it via arithmetic or anything other than direct assignment you have violated the standard. Assuming of course that they are part of separate allocations, pointers from one may not interact with pointers from another except through equality testing and assignment.

You can do it, although at a considerable performance hit. The usual approach is "fat" pointers that include bounds information. Memory safe pointer arithmetic is achieved by checking that any constructed pointer lies within those bounds, and dying noisily if it does not (alternatively, you can test on dereference).
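To make the "fat pointer" idea concrete, here is a rough sketch in Python rather than C. The class and method names are hypothetical; the assumption is simply that every pointer carries a reference to its allocation plus an offset, and that constructing an out-of-range pointer dies noisily.

```python
class FatPointer:
    """A pointer that remembers which allocation it belongs to (illustrative only)."""

    def __init__(self, allocation: bytes, offset: int = 0):
        # Like C, allow a pointer one past the end, but nothing further.
        if not 0 <= offset <= len(allocation):
            raise IndexError(
                f"pointer offset {offset} outside allocation of {len(allocation)} bytes"
            )
        self._allocation = allocation
        self._offset = offset

    def __add__(self, delta: int) -> "FatPointer":
        # Pointer arithmetic re-checks the bounds on every constructed pointer.
        return FatPointer(self._allocation, self._offset + delta)

    def load(self) -> int:
        # The one-past-the-end pointer may exist but must not be dereferenced.
        if self._offset >= len(self._allocation):
            raise IndexError("dereference of one-past-the-end pointer")
        return self._allocation[self._offset]


p = FatPointer(b"hello")
second = (p + 1).load()   # ord("e")
```

The extra word or two per pointer and the check on every arithmetic operation are where the performance hit comes from.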

C language environments that worked like this have been commercially available in the past: Saber-C in the '90s, and perhaps earlier, was one example.

One problem is that the obvious implementation technique is to change the representation of pointers (to include base and bounds information, or a pointer to that), which means that you need to redo a lot of the library as well. (Or convert representations when entering into a stock library routine, and accept that whatever it does with the pointer won't get bounds-checked.)

But it's certainly doable.

I implemented this once in my C interpreter picoc. Users hated it because it also prevented them from doing some crazy C memory access tricks, so I ended up taking it out.

If you have a char *buf block you got from the network stack and you have to copy buf[3] bytes starting at position buf+15, then the compiler doesn't know what to check for, as long as you don't cross the boundary of that buffer.

Upcoming Intel memory protection extensions: http://software.intel.com/en-us/articles/introduction-to-int...

"Intel MPX is a set of processor features which, with compiler, runtime library and OS support, brings increased robustness to software by checking pointer references whose compile time normal intentions are usurped at runtime due to buffer overflow."

I think clang's AddressSanitizer gets pretty close to what you want. It misses some tricky cases on use-after-return, but other than that it offers a pretty robust memory safety model for bounds checks, double free, and so on.

> This vulnerability is the result of yet another missing bound check. It wasn't discovered by Valgrind or some such tool, since it is not normally triggered - it needs to be triggered maliciously or by a testing protocol which is smart enough to look for it (a very difficult thing to do, as I explained on the original thread).

You could also look at this bug as an input sanitization failure. The author didn't consider what to do when the length field in the header is longer than what comes over the wire (even when writing the code in a secure language, this case should be handled somehow, maybe by logging or dropping the packet).

The defined behaviour would be to discard the packet. In a secure language, the buffer would have had a "length" property, and the code would have crashed when a read beyond the buffer's end was attempted. But in C, buffers are just pointers, so there is fundamentally nothing wrong with reading beyond the end of the buffer. So instead of a crash, we get silent memory exposure.
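As a small illustration of a buffer that knows its own length, here is a Python sketch. The `read_exact` helper and the sample message are made up for the example. Note that behaviour varies by operation even within one language: Python slicing quietly clamps to the end of the buffer, while indexing raises.

```python
def read_exact(buffer: bytes, start: int, length: int) -> bytes:
    """Return exactly `length` bytes from `buffer`, or raise instead of over-reading."""
    end = start + length
    if start < 0 or length < 0 or end > len(buffer):
        raise IndexError(
            f"read of {length} bytes at {start} exceeds buffer of {len(buffer)} bytes"
        )
    return buffer[start:end]


# Header claims 65535 payload bytes, but only 4 actually arrived.
received = b"\x01\xff\xffping"
claimed = int.from_bytes(received[1:3], "big")

# Slicing clamps to the real end of the buffer: no crash, but nothing leaks either.
clamped = received[3:3 + claimed]

# Indexing past the end raises instead of returning adjacent memory.
try:
    received[3 + claimed - 1]
    index_raised = False
except IndexError:
    index_raised = True

# An explicit exact-length read refuses the request outright.
try:
    read_exact(received, 3, claimed)
    exact_raised = False
except IndexError:
    exact_raised = True
```

Either way the failure mode is a short read or an exception, never a silent read of whatever happens to sit after the buffer.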

Isn't this basically the whole point of QuickCheck-like testing frameworks? They're basically a specification that is attempted to be falsified in some way by a fuzzer. I don't see why most C projects couldn't be doing this.

I think they don't do this because it's not a widely known testing method, and it's kind of tricky to implement these tests correctly.

But then again, with some dedication, quickcheck-like testing can do a huge amount of work. At work I've implemented these tests for the entire low-level IO framework for our servers and these few tests are a pure meat grinder. It triggered one severe bug that would have downed production in the middle of the night and then some more.
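A hand-rolled sketch of that style of test, using only the Python standard library rather than a real QuickCheck port. The two `echo_*` functions are hypothetical stand-ins for code under test; the property is "the reply is exactly the payload that was sent, or the message is dropped".

```python
import random
import struct


def echo_checked(record: bytes):
    """Echo the payload only if the claimed length fits in what arrived."""
    if len(record) < 3:
        return None
    _, claimed = struct.unpack_from("!BH", record, 0)
    if 3 + claimed > len(record):
        return None
    return record[3:3 + claimed]


def echo_trusting(record: bytes):
    """Buggy stand-in: trusts the length field and pads with bytes nobody sent."""
    _, claimed = struct.unpack_from("!BH", record, 0)
    return (record[3:] + b"\x00" * claimed)[:claimed]


def falsify(implementation, trials: int = 2000, seed: int = 0):
    """Return a record that violates the property, or None if none was found."""
    rng = random.Random(seed)
    for _ in range(trials):
        body = bytes(rng.randrange(256) for _ in range(rng.randrange(64)))
        # Mix honest length fields with arbitrary (usually oversized) ones.
        claimed = rng.choice([rng.randrange(len(body) + 1), rng.randrange(2 ** 16)])
        record = struct.pack("!BH", 1, claimed) + body
        expected = body[:claimed] if claimed <= len(body) else None
        if implementation(record) != expected:
            return record
    return None
```

Here `falsify(echo_checked)` finds nothing, while `falsify(echo_trusting)` returns a counterexample as soon as it generates a length field larger than the body.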

Speaking of proofs, how about we write security critical code in haskell? You need a very simple runtime, but beyond that it would work pretty much wherever.

Most memory-related bugs are automatically eliminated, and security proofs are easier.

If you haven't seen it already, check out Cryptol from Galois: http://corp.galois.com/cryptol/

It's a crypto DSL that I believe is implemented in Haskell (it compiles to Haskell, C, C++ and a few others).

Particularly relevant example: TLS/SSL implementation in Haskell.


Virtually all code exposed to the Internet is security critical, however.

Agree a bazillion times.

Go or Java on top. Coding in C is like juggling chainsaws to say you can juggle them. C is certainly better than old school Fortran where memory management wasn't developed until later, but platforms like Erlang, Go and JRuby are really hard to beat.

The only problem is convincing people to migrate to different tools and transition codebases to another language. It would take a large project like FreeBSD, LLVM or the Linux kernel to move the needle.

Fortran was not meant to be a systems programming language. The fact that it did not have memory management does actually make sense in scientific applications, where you typically know your problem size in advance or can just recompile before a day long computation.

Is anyone working on an OpenSSL port in rust, which lacks the memory vulnerabilities of C?

Why port all the security vulns over to Rust? There are already a handful of SSL implementations, it isn't horribly hard to do. Maybe start with http://hackage.haskell.org/package/tls

I think over the last few months we've seen some pretty concrete evidence that implementing SSL securely is horribly hard to do.

Be that as it may, porting OpenSSL to any other language is Not Recommended. The code is hideous and the documentation is practically non-existent.

The only reason anyone can recommend using OpenSSL is that it's so widely used and battle worn that vulnerabilities are more likely to be patched than in some arbitrary obscure SSL library without all the warts. If it had been published as-is for the first time in 2014 then no one would touch it.

In addition to that, if you're going to create an SSL implementation in a new language, it would be much preferable to do it without the BSD advertising clause, which you're stuck with if you start with OpenSSL.

I am saying that the benefits of porting a codebase that has had so many security vulnerabilities don't outweigh the cost of reimplementation.

Reimplementing a secure SSL implementation in a secure language is cheaper than porting the broken code.

Also the library he pointed to has only had 33k downloads total. Can you really suggest that as a replacement for one of the most heavily used and read crypto libraries on earth? I wouldn't be surprised if OpenSSL had more than 33k programs that use it as a dependency.

I am recommending it as an existence proof that one can construct a usable crypto library in a performant safe language.

They are still making breaking changes to the language so I really doubt it.

> we can plug this seemingly endless source of bugs which has been affecting the Internet since the Morris worm. It has now cost us a two-year window in which 70% of our internet traffic was potentially exposed. It will cost us more before we manage to end it.

Could one make a new kind of OS where C programs are compiled to some intermediate representation then when run this is JIT compiled within a managed hypervisor sandbox? Could Chrome OS become something like this? Does this already exist? MS had a managed code OS called Singularity.

> My opinion, then and now, is that C and other languages without memory checks are unsuitable for writing secure code.

I think they can be used to write secure code, but it has to be done carefully, with really thorough checks and unit tests, and a constant awareness of the vulnerabilities.

Everything I've heard about OpenSSL so far, suggests it was done by a bunch of cowboys who don't care about code quality. Those people shouldn't be writing C, but a safer language.

I don't think C should be blamed for the HeartBleed bug. Please see http://www.pixelstech.net/article/1397465547-HeartBleed%3A-S...

You make good points.

However, qmail is written in C and has a very good record. So I would disagree with "The fact is that no programmer is good enough to write code which is free from such vulnerabilities."

There seem to be at least two programmers who are capable of that.

Java, yes, hmmm.. Oh wait, but Java VM is written in C and is a host to some of the worst web browser zero days we know of.

Fundamentally, I think we're going to have to give up on security and start handing out drivers licenses to anyone who wants to use the internet.

If that would work Virtual Machines and runtimes wouldn't have vulnerabilities.

So uhm. Yeah, that doesn't work either.

Edit: Btw since HN has this obsession with Tarsnap, it's written in C. So you should stop obsessing about it and downvote me some more.

This argument came up in the thread from a few years ago. It is quite wrong-headed. I would like to give a clear answer to it:

Virtual machines and runtimes may be vulnerable to malicious CODE. That's bad. Programs written in unmanaged languages are vulnerable to malicious DATA. That's horrible and unmitigatable.

Vulns to malicious code are bad, but they may be mitigated by not running untrusted code (hard, but doable in contexts of high security). They are also mitigated by the fact that the runtime or VM is a small piece of code which may even be amenable to formal verification.

Vulns to malicious data, or malicious connection patterns, are impossible to avoid. You can't accept only trusted data in anything user-facing. Also, these vulnerabilities are spread through billions of lines of application and OS code, as opposed to core runtime/VM.

  Virtual machines and runtimes may be vulnerable to malicious CODE. That's bad.
  Programs written in unmanaged languages are vulnerable to malicious DATA.

Not exactly true. You can still write code vulnerable to input (data) in a "secure" language by accident. C is just especially vulnerable to buffer stuff.

Steps for better security

    1. use a managed language
    2. use a provable language (Haskell, Idris, etc)

Effective Psyops Against Standards and Open Software https://www.youtube.com/watch?v=fwcl17Q0bpk

What? Sorry, that doesn't make the least amount of sense.

If you have a VM your code is DATA! You can have say format string bugs in managed languages too, see for example: http://blog.stalkr.net/2013/06/golang-heap-corruption-during...

I'm sorry, but your argument here is plain bullshit.

Using your logic the whole kernel/browsers etc should be written in a managed language.

A browser is basically a VM for webpages though. I'd bet that chrome/firefox/IE have more severe vulnerabilities than openssl per time though.

I just... You can't argue with bullshit sorry.

You might get fewer buffer overruns, okay, but is it more secure in the end without any doubt?

I doubt it.

I am afraid you are the one who is not showing signs of having thought about this deeply. What is the ratio of the number of application programs, libraries, and services to the number of VMs and runtimes? Thousands, tens of thousands, millions? Depends on how you count, but it's huge. Reducing the attack surface like this is a big win.

And it is indeed a bad idea to install a browser on a critical server, and to load untrusted sites in it. You can mitigate the problem by not doing that. You can't stop the server from dealing with user data, though, since for many servers, that's what they are for. (If you are not going to deal with untrusted data, it is preferable to disable untrusted connections at as low a level as you can manage).

> You might get fewer buffer overruns, okay, but is it more secure in the end without any doubt?

If you hold everything else constant, yes, less buffer overruns == more secure.

So reducing the attack surface isn't a laudable goal in your book, because hey the VM itself can have vulnerabilities so there isn't a point? I think the point is that programmers will always make these mistakes and we should limit as much as possible the type of unsafe code that is written to as small an attack vector as possible. You're never going to eliminate vulnerabilities, but we sure can try and reduce the likelihood of them occurring. If there is some objective measurement to be made that says this isn't the case, i.e. the number of JVM vulnerabilities like this outstrips or is on par with client side vulnerabilities that occur in purely C/C++ applications, I would love to see it.

Ultimately, I think the better answer will ultimately be a language that inherently provides the primitives for safe memory management but that's low-level and highly peformant, i.e. Rust or something like it.

There's a problem with your logic here.

You're not necessarily reducing the attack surface. You're adding complexity. While you might reduce the attack surface for low-level bugs, you might open yourself up to new bug classes.

Downvote me however you guys want. It's just not that easy.

If you could eliminate the common cold by killing the guy with the runny nose, don't you think someone would have done it?

In keeping with the tradition of bad car analogies, that's like saying "Driving cars with automatic traction control won't make accidents go away, so automatic traction control is pointless".

Languages with bounds checks on array accesses don't solve everything, but that doesn't mean that they don't work. They do remove entire classes of silent failures that can potentially slip through the cracks in C-like languages. VMs aren't needed for this -- most of the strongly typed functional languages, D, Go, Rust, and others all compile down to native machine code.
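To make the "silent failure" point concrete, here is a minimal sketch of a Heartbleed-style over-read, simulated in Python. It is not OpenSSL code: the `MEMORY` layout, the fake secrets, and both function names are hypothetical and exist only for illustration. The unchecked version trusts the length the peer claims; the checked version compares it to what was actually received.

```python
# Simulated process memory: the 4-byte request payload sits right next to
# unrelated secrets. Layout and contents are hypothetical.
MEMORY = bytearray(b"bird" + b"|secret-key=hunter2|session=abc123")
PAYLOAD_LEN = 4  # the peer really sent only b"bird"


def echo_unchecked(claimed_len: int) -> bytes:
    """Trust the peer's claimed length, like an unchecked memcpy would."""
    return bytes(MEMORY[:claimed_len])


def echo_checked(claimed_len: int) -> bytes:
    """Refuse any claimed length larger than the payload actually received."""
    if claimed_len > PAYLOAD_LEN:
        raise ValueError("claimed length exceeds received payload")
    return bytes(MEMORY[:claimed_len])
```

Asking `echo_unchecked` for 20 bytes quietly hands back part of the neighbouring secret, while `echo_checked` fails loudly. A bounds-checked language gives you the second behaviour by default whenever the read would leave the buffer.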

Careful API design, discipline, and good coding in C can also mitigate this sort of problem manually, although (like most things in C), it's extra work, and needs careful thought to ensure correctness.

Do you know of any controlled experiments to test the safety claims for automatic traction control? People used to say similar things about ABS. Then the experiments were done, it turned out to be pointless or possibly dangerous, and people started talking about traction control instead.

Automatic bounds checking could well fail the same way that ABS did: programmers won't bother defining a packet data type, because the compiler will catch any mistakes they make fiddling with arrays. So, like drivers with ABS, programmers with ABC would go faster, but they wouldn't be any safer.

Maybe a better analogy would be roll bars or seat belts: If they help prevent something from breaking, you've already screwed up.

Nothing can prevent bad drivers from driving poorly, and nothing can prevent apathetic programmers from writing insecure code. However, even though I tend to program in C, I can still appreciate environments that will catch dumb mistakes for me and prevent them from turning into security issues.

ABS pointless? The Wikipedia article disagrees, so I'd certainly like to know more: http://en.wikipedia.org/wiki/Anti-lock_braking_system#Effect...

ESC is certainly shown to be effective, although I don't know about traction control: http://en.wikipedia.org/wiki/Electronic_stability_control#Ef...

VMs generally do not have this type of vulnerability (buffer overrun).

Also, most vulnerabilities in (e.g.) the JVM can only be exploited by running malicious code inside the VM. Here, the attacker is supplying data used by OpenSSL, but is not able to supply arbitrary code.

Given the severity of this bug, the UX of the site is failing anyone who isn't a fulltime sysadmin.

Suggestion: big, bold TLDR ("The sky is falling. Check your OpenSSL version right now") with a link on what to do sorted by OS vendor.

Step 1: Here's a command to spit out your OpenSSL version. If it is the following string, go to step 2.

Step 2: Here's how to update your OpenSSL. Here are links to guides on reissuing keys.

It's probably OK if the whole remediation bit links to a wiki that gets updated as the various vendors push their patches.

Agree. This needs a big fat "the world is coming to an end" style of warning.

I've just shut down the webservers running SSL that I can control. If you are vulnerable, don't want to build OpenSSL from source, and can afford the outage, I'd recommend doing the same.


Let's hope CAs don't get swamped by all the CSRs. Or rather, let's hope they do, so we see people are doing something...

For me right now these are just my hobby projects. So I don't care if they're down. But I imagine it will be fun tomorrow.

And when it's fixed, get new keys.

Btw: I'm a dev. Not a sysadmin though :P

Edit: Debian is patched. I'm online again \o/

OK, could anyone help me update OpenSSL without breaking anything? I've fetched the newest sources from openssl.org and compiled them, but "make install" doesn't seem to actually install them: issuing "openssl version" still gives me the old version.

What I want to do is to patch it so our webserver uses new version.

I would tread lightly here if you aren't comfortable with compiling. Rather than break your website, it might be better to take it down until your distro's packages are available.

You should probably spend your time investigating a good method of reissuing keys for when you get to a stable OpenSSL version.

Some apps have OpenSSL statically compiled into the binaries. Beware that what you think is fixed may not be.

Well, I'm not really in a position to take the whole service down at this moment. I would really like to have a way to patch it instead.

Depending on the distro on which you're based, you may find that making a new package from a source package (e.g. srpm) would be the safest route even if you're in a hurry.

If you're on Ubuntu, it would appear at least the updated base (OpenSSL itself) packages are now in the repos.


Not to sound like a commercial for Cloudflare or anything, but putting your infrastructure behind their services can protect users while you perform your patching, according to their latest blog post: http://blog.cloudflare.com/staying-ahead-of-openssl-vulnerab...

On a linux box: [For each set of certs used for each of your public facing sites...]

1. Open a terminal and cd into /etc/path_to_ssl_certs_folder [per site].

Ex. /etc/ssl/nginx

2. Regen the certs [example nginx mail server]

openssl req -x509 -sha256 -nodes -days 3650 -newkey rsa:4096 -keyout mailkey.pem -out mailcert.pem

[This command generates a private key and a server cert and writes them to PEM files.] [Note also that the key size is 4096; you may want 2048. And I use -sha256, as sha1 is considered too weak nowadays. These certs are valid for 3650 days, i.e. 10 years.]

Since the command overwrites certs/keys in the current directory of the same name as the outfiles...that's it...you're done. Just restart nginx.

If you change a self-signed cert, like above, expect a new warning from the client on the next connection...this is just your new cert being encountered. Click permanently accept...blah blah.


On a Windows box:

1. open an admin cmd window and run 'mmc'.

2. Add a new snap-in for Certificates as local machine.

3. Find the old cert and 'Disable all purposes for this cert'.

4. Import your new certs from your 3rd party or that you rolled yourself from your enterprise CA.

5. Test new cert.

6. Delete old cert.

[If you run your own CA, you should already know what to do...]

Agreed. They should reorder their headings: first should be "What is it?" and second should be "How to stop it?"

On my CentOS boxes I ran 'yum list | grep openssl'

This is the standard command:

  $ openssl version

  > OpenSSL 1.0.1f 6 Jan 2014

@stormbrew is correct about ubuntu, use -a or -v -b

    openssl version -v -b

    OpenSSL 1.0.1 14 Mar 2012
    built on: Wed Jan  8 20:45:51 UTC 2014

I'm totally confused by this. I'm running ubuntu LTS 12.04 and did

    sudo aptitude update
    sudo aptitude upgrade openssl
and then ran

    openssl version -a
and got the same results as you. How can it be built on January 8th if the patch was just made today?

[EDIT] running

    sudo aptitude upgrade
upgraded properly and now I'm getting a version that was compiled earlier today. I'm guessing I needed to update another package as well. Probably `libssl`?

upgrade will work because it updates libssl1.0.0, which is the package you want upgraded :) openssl is the command-line package and libssl1.0.0 is the library. I was able to upgrade openssl without upgrading libssl1.0.0:

  ben@ip-10-0-0-76:~$ dpkg -s libssl1.0.0 |grep Version
  Version: 1.0.1e-3ubuntu1

  ben@ip-10-0-0-76:~$ dpkg -s openssl |grep Version
  Version: 1.0.1e-3ubuntu1

  ben@ip-10-0-0-76:~$ sudo apt-get install openssl
  ben@ip-10-0-0-76:~$ dpkg -s libssl1.0.0 |grep Version
  Version: 1.0.1e-3ubuntu1

  ben@ip-10-0-0-76:~$ dpkg -s openssl |grep Version
  Version: 1.0.1e-3ubuntu1.2

  ben@ip-10-0-0-76:~$ openssl version -a
  OpenSSL 1.0.1e 11 Feb 2013
  built on: Mon Jul 15 12:44:45 UTC 2013
  platform: debian-amd64
  options:  bn(64,64) rc4(16x,int) des(idx,cisc,16,int) blowfish(idx)
  OPENSSLDIR: "/usr/lib/ssl"

  ben@ip-10-0-0-76:~$ sudo apt-get install libssl1.0.0

  ben@ip-10-0-0-76:~$ dpkg -s libssl1.0.0 |grep Version
  Version: 1.0.1e-3ubuntu1.2

  ben@ip-10-0-0-76:~$ openssl version -a
  OpenSSL 1.0.1e 11 Feb 2013
  built on: Mon Apr  7 20:33:19 UTC 2014
  platform: debian-amd64
  options:  bn(64,64) rc4(16x,int) des(idx,cisc,16,int) blowfish(idx)
  OPENSSLDIR: "/usr/lib/ssl"

I wonder how many people will run apt-get install openssl and assume they have fixed it.

Thank you. That makes more sense now.

I'm guessing that tons of people will run into this. I bet a blog post would get you some traffic... :)

The package is called libssl1.0.0 -- it holds the shared libraries, while the openssl package contains utilities.

As far as I can tell, on ubuntu this reports "OpenSSL 1.0.1 14 Mar 2012" for all ubuntu versions, including the fixed one.

With "openssl version -a" you can see the build time.

  root@x ~ # openssl version -a
  OpenSSL 1.0.1 14 Mar 2012
  built on: Mon Apr  7 20:33:29 UTC 2014
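
Since the version string alone is not conclusive on distros that backport fixes, the build-date check described above can be sketched roughly as follows. This is a heuristic only, assuming the `openssl version -a` output format shown in this thread. The cutoff date, the function names, and the "built before disclosure means suspect" rule are my assumptions, and the 1.0.2 betas are ignored.

```python
import re
from datetime import datetime, timezone

# Assumption: a build older than the disclosure date is treated as suspect.
DISCLOSURE = datetime(2014, 4, 7, tzinfo=timezone.utc)

# Upstream releases affected by Heartbleed: 1.0.1 through 1.0.1f.
AFFECTED = re.compile(r"^OpenSSL 1\.0\.1([a-f])?\s")


def parse_built_on(output: str):
    """Return the 'built on' timestamp from `openssl version -a`, or None."""
    match = re.search(r"^\s*built on:\s*(.+?)\s*$", output, re.MULTILINE)
    if not match:
        return None
    # Collapse runs of spaces ("Apr  7") before parsing.
    text = " ".join(match.group(1).split())
    parsed = datetime.strptime(text, "%a %b %d %H:%M:%S UTC %Y")
    return parsed.replace(tzinfo=timezone.utc)


def looks_unpatched(output: str) -> bool:
    """Heuristic: affected upstream version and no post-disclosure build date."""
    first_line = output.strip().splitlines()[0]
    if not AFFECTED.match(first_line + " "):
        return False
    built = parse_built_on(output)
    return built is None or built < DISCLOSURE
```

You would feed it the text captured from `openssl version -a`. Treat a `False` result as "probably rebuilt", not as proof. It says nothing about statically linked copies of OpenSSL, or about services that still need a restart.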

Same here.

I got a "security warning" update when I logged in to the server (good), ran apt-get and installed, did openssl version, got the string as noted above (which seemed just a tad out of date).

So... I built and installed from source, and got... the same string.


My Linux Mint machine (based on 13.10) went from 1.0.1e Feb 2014 to 1.0.1 Mar 2012 in the last 2 hours, so that's definitely new.

I think someone screwed up on the version string big time.

try: dpkg -s openssl

I've built a web tester for this bug, find it at


It actually exploits the bug, since it was quite trivial, and echoes back some memory.

It's written in Go, no more than 100 lines. I'll release code in some time.

Interestingly, your tool claims our website (SSL-terminated at our ELB instance) is still vulnerable; while this other tool (http://possible.lv/tools/hb) claims we are unaffected.

Another, known unpatched, app is reported to be affected by both tools.

Is it possible that FiloSottile/Heartbleed may report false positives?

From what I've learned, it reports back if it gets something, when it should get nothing.

How vulnerable a specific site is depends on luck. Yahoo must have broken a whole bunch of mirrors because total amateurs can send mail.yahoo.com a certain blob of code and it has a good chance of returning a stranger's password.

My upgraded debian and ubuntu boxes are still reported as vulnerable.... Who's wrong, who's right?

Have you restarted the services linked against openssl?

lsof | grep ssl | grep DEL
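
The same check can be done without lsof by reading /proc directly. This is a rough sketch, assuming a Linux /proc layout where an unlinked but still-mapped file shows up in /proc/<pid>/maps with a trailing " (deleted)". The function names are made up for illustration, and you need root to see other users' processes.

```python
import glob
import re

# A maps line whose backing file is a deleted libssl or libcrypto.
STALE = re.compile(r"(\S*lib(?:ssl|crypto)\S*\.so\S*) \(deleted\)$")


def stale_ssl_libraries(maps_text: str) -> set:
    """Return paths of deleted-but-still-mapped OpenSSL libraries."""
    found = set()
    for line in maps_text.splitlines():
        match = STALE.search(line.rstrip())
        if match:
            found.add(match.group(1))
    return found


def scan_proc() -> dict:
    """Map each PID to the stale OpenSSL libraries it still has loaded."""
    results = {}
    for path in glob.glob("/proc/[0-9]*/maps"):
        try:
            with open(path) as handle:
                stale = stale_ssl_libraries(handle.read())
        except OSError:
            continue  # process exited, or we lack permission
        if stale:
            results[path.split("/")[2]] = stale
    return results
```

Any PID that `scan_proc()` returns is still running the old library and needs a restart before the upgrade takes effect for it.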

It was indeed the restart step that was missing

Would love to see the code and test it against a rebuilt, patched nginx.

Filippo has hosted it on GitHub.


Just run it against it?

Well, I was interested in actually testing it out in code. I got it working with the pyOpenSSL bindings (I had to expose struct ssl_method_st, SSL_get_ssl_method, ssl_write_bytes and rebuild cryptography for pyOpenSSL.) Fun times.

It says that the heartbleed.com site itself is vulnerable.

Looks like it's fixed

Exactly what I was looking for, thanks! This should be part of the official heartbleed site not hidden away in comments here.

Nice work
