There was a discussion here a few years ago (https://news.ycombinator.com/item?id=2686580) about memory vulnerabilities in C. Some people tried to argue back then that the various protections offered by modern OSs and runtimes, such as address space randomization, together with the availability of tools like Valgrind for finding memory access bugs, mitigate the problem. I really recommend re-reading that discussion.
My opinion, then and now, is that C and other languages without memory checks are unsuitable for writing secure code. Plainly unsuitable. They need to be restricted to writing a small core system, preferably small enough that it can be checked using formal (proof-based) methods, and all the rest, including all application logic, should be written using managed code (such as C#, Java, or whatever - I have no preference).
This vulnerability is the result of yet another missing bounds check. It wasn't discovered by Valgrind or some such tool, since it is not normally triggered - it needs to be triggered maliciously or by a testing protocol which is smart enough to look for it (a very difficult thing to do, as I explained on the original thread).
The fact is that no programmer is good enough to write code which is free from such vulnerabilities. Programmers are, after all, trained and skilled in following the logic of their program. But in languages without bounds checks, that logic can fall away as the computer starts reading or executing raw memory, which is no longer connected to specific variables or lines of code in your program. All non-bounds-checked languages expose multiple levels of the computer to the program, and you are kidding yourself if you think you can handle this better than the OpenSSL team.
We can't end all bugs in software, but we can plug this seemingly endless source of bugs which has been affecting the Internet since the Morris worm. It has now cost us a two-year window in which 70% of our internet traffic was potentially exposed. It will cost us more before we manage to end it.
From a quick reading of the TLS heartbeat RFC and the patched code, here's my understanding of the cause of the bug.
TLS heartbeat consists of a request packet including a payload; the other side reads and sends a response containing the same payload (plus some other padding).
In the code that handles TLS heartbeat requests, the payload size is read from the packet controlled by the attacker:
n2s(p, payload);
pl = p;
Here, p is a pointer to the request packet, and payload is the expected length of the payload (read as a 16-bit short integer: this is the origin of the 64K limit per request).
pl is the pointer to the actual payload in the request packet.
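For readers put off by the terse macro names (a complaint that comes up later in the thread), n2s and s2n are roughly equivalent to the following (paraphrased from memory, not the verbatim OpenSSL definitions):

/* "network to short": read a 16-bit big-endian value and advance the pointer */
#define n2s(c, s)  ((s) = (((unsigned int)(c)[0]) << 8) | (c)[1], (c) += 2)
/* "short to network": write a 16-bit big-endian value and advance the pointer */
#define s2n(s, c)  ((c)[0] = (unsigned char)((s) >> 8), (c)[1] = (unsigned char)(s), (c) += 2)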
Then the response packet is constructed:
/* Enter response type, length and copy payload */
*bp++ = TLS1_HB_RESPONSE;
s2n(payload, bp);
memcpy(bp, pl, payload);
The payload length is stored into the destination packet, and then the payload is copied from the source packet pl to the destination packet bp.
The bug is that the payload length is never actually checked against the size of the request packet. Therefore, the memcpy() can read arbitrary data beyond the storage location of the request by sending an arbitrary payload length (up to 64K) and an undersized payload.
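For what it's worth, the fix is essentially to add the missing check before the payload is touched. A minimal sketch, with record_length standing in for the actual size of the received record (in the real code this comes from the SSL record structure):

/* Discard the request if the claimed payload length does not fit inside the
 * record actually received: 1 byte type + 2 bytes length field +
 * payload + at least 16 bytes of padding. */
if (1 + 2 + payload + 16 > record_length)
    return 0; /* RFC 6520: silently discard */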
I find it hard to believe that the OpenSSL code does not have any better abstraction for handling streams of bytes; if the packets were represented as a (pointer, length) pair with simple wrapper functions to copy from one stream to another, this bug could have been avoided. C makes this sort of bug easy to write, but careful API design would make it much harder to do by accident.
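A minimal sketch of what such an abstraction might look like (names are illustrative, not from OpenSSL):

#include <stddef.h>
#include <string.h>

/* A byte stream is just a cursor plus the number of bytes remaining. */
typedef struct {
    unsigned char *p;
    size_t len;
} stream_t;

/* Copy n bytes from src to dst, refusing to overrun either stream. */
static int stream_copy(stream_t *dst, stream_t *src, size_t n)
{
    if (n > src->len || n > dst->len)
        return -1; /* claimed length exceeds what is actually there */
    memcpy(dst->p, src->p, n);
    src->p += n; src->len -= n;
    dst->p += n; dst->len -= n;
    return 0;
}

With something like this in place, the heartbeat handler could not copy more bytes than the request record actually contained without the call failing loudly.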
It is indeed astonishing how simple-minded this bug is. But these bugs come in all levels of complexity, from simple overstuffed buffers to logical ping-pong that hurts your brain when you try to follow it. We need to get rid of them once and for all. If the whole world can't use a certain tool effectively, then the whole world isn't broken; the tool is bad.
Machine level languages like C and C++ aren't necessarily bad tools, even in their current states. However, I agree that they might be bad tools for the purpose of writing security libraries.
They are not bad tools, but they are not the best either. If you spend mental stamina on trivial things, you have less for the important ones, the ones a compiler cannot check.
This kind of tool (SSL) should be written in Ada or Haskell.
Why not Go, or JavaScript? I'm sorry, but specifying which language should be used is petty.
C and C++ are just fine, the fact that the OpenSSL guys cocked it up is not the language's fault, it is theirs. There are efficient ways to prevent this type of bug.
What are the efficient ways of preventing this kind of bug, if not type systems?
The parent had a good point and you should really try to look at Haskell before you say that kind of nonsense.
All the tools that are available for static analysis are basically extra type systems bolted on top of existing languages.
If you try to detect buffer overflows using static analysis of the Linux kernel, what you need to do is go through the source code and define invariants. Those invariants are TYPES in languages powerful enough to express them.
For example, the invariant that any allocated memory or resource must be freed can be expressed in Haskell.
In C++ it cannot be expressed. There are workarounds like RAII, but that does not give any guarantees.
If you do not think type systems and thus languages make any differences, you also cannot believe that formal verification makes any difference, because type systems are a weak form of formal verification. How "weak" depends on the language.
You should also read up on the Curry-Howard correspondence to learn something about the deep connections between types, programs, and proofs.
JavaScript would be terrible, because it's easy to hide unwanted behaviour in counter-intuitive corners of the language.
Besides the language peculiarities, a garbage collected or interpreted language is very vulnerable to side channel attacks because of the large amount of complicated behaviour that is being glossed over by the language runtime. (One example would be garbage collection rounds and timing attacks, but I'm sure smarter people would find tons of features that leak secret information. Another example is on-demand JIT'ing when code becomes hot in certain runtimes. The timing of such a JIT stall could publish information you thought secure.)
Guarding against side-channel attacks in any language is hard. Guarding against them in Javascript is probably impossible. Whether you would take Javascript over C is irrelevant. It would still be a terrible choice for a security framework. Perhaps modern system languages, such as D or Go might be suitable.
I've felt that C makes this kind of bug easy to write because it makes doing the right thing hard. What you are describing is just a lot of work in C, compared to a language with something akin to Java's generics (themselves an afterthought next to what the ML family of languages has had all along). What we're asking for is not that complicated from a PL standpoint. A generic streams library?
Economics plays an invisible part here. Someone writing a library has a limited amount of time to implement some set of features, and has to balance that against other needs, like making the code "clean"/pretty and secure. In this case, pretty code and secure code go hand in hand. Consumers likewise have to balance feature needs against how likely the code is to explode. What it comes down to is that you aren't likely to get secure, stable code in a language that doesn't inherently encourage it.
It becomes clearer, then, that the more modern, "prettier" languages offer material benefits through their efforts to be more elegant.
Even in C, Go or Python, I column align any text that is remotely similar, so differences are obvious.
Clean code might be extra work up front, but the total work (including maintenance) should amortize out to less. The importance of reducing cognitive load in a large, supportable production codebase cannot be overstated.
There's no "just use X" type of answer in security.
From September 2013:
"All versions of the open source Ruby on Rails Web application framework released in the past six years have a critical vulnerability that an attacker could exploit to execute arbitrary code, steal information from databases and crash servers."
C != OpenSSL. Some [1] would argue OpenSSL is not representative at all of what C can do. Maybe you should check out Redis for beauty [2] and joy [3].
On the same note, C != C++ either, and you can write large systems in C++ without ever doing manual memory allocation, using only bounds-checked functions.
And you can have large security holes if you're not careful, no matter which language you pick.
Sure it can and I'm not surprised it has. But if you're trying to point out flaws in ruby, at least use examples for flaws in ruby - not flaws in something written in ruby. It's not like web frameworks in other languages magically don't suffer from XSS injection attacks.
The issue at hand is a flaw in something written in C, though. I agree the point wasn't well made (is there any reason to think those errors would not have been made had the project been written in C?) but your objection isn't quite right.
The issue at hand is an error that is typical for C (unchecked out-of-bounds memory access). It's a class of error that does not usually occur in other languages. The vulnerabilities in Rails were XSS vulnerabilities and an information leak - both classes of errors typically found in web application frameworks.
The first is an example of an error made more common by the language design, the other an example of errors typical for a class of applications. There's a fundamental difference here. There's a ton of reasons to criticize ruby and it brings its own set of flaws and problems, some rooted in the language and some rooted in its ecosystem - but the given examples just show that web applications are hard to get right. That's why this is not "a point not well made" but rather "sorry, you're attacking a strawman here".
There is a "just use X". If you code in a language where you can express the invariants in your code, and make the compiler check those invariants, then your code is immune to all of the vulnerabilities that we have seen in OpenSSL.
The fact that these languages don't automatically do all my system administration tasks for me is not an argument against using them.
Rails does plenty of "make life easier for the programmer" things that I would expect to increase the risk of security issues. Do you have those kind of problems for e.g. Haskell?
Haskell probably has/would have the same kind of problems, but finding examples will be a lot harder in the absence of a large, well-used web platform à la RoR.
If you worked at it, you could create this problem in Haskell. However, it is in fact the case that Haskell would be, in its own way, screaming at you; your configuration (or whatever) parser takes in some text and then returns something of type "IO Configuration"... what is that IO doing there? You don't have to be very skilled in Haskell to stop right there and have a serious think about what's going on. And in the absence of IO, or some other really obviously wrong type signature, there isn't much malicious stuff you can do in the parser layer. You could still have a vulnerability by doing something wrong when given certain configurations, but there's not much we can do about straight-up bugs. Even a proof language will let you make straight-up errors, they'll just force you to deeply, profoundly make the error instead of superficially make it... but we humans are up to the task!
Thus, the argument that Haskell probably has the same, is simply false.
There are large web platforms in Haskell. Yesod is probably the largest eco-system. It is clearly not as well used as RoR, but anyone can dig through large amounts of code to try to find these bugs.
What Haskell has that everyone else has are bugs/misunderstandings in how protocols are implemented. Sometimes there can be fundamental bugs in the run-time-system. However, large classes of bugs are fundamentally less likely to appear than in less safe languages.
Once you are doing functional programming a bunch of classes of problems including a bunch of classes of security problems go away.
For example, here, if the guarantee of functional programming is that a given input leads to a given output and has no memory side effects, then your attack surface area is a lot, lot smaller.
You get it, but the opinion that "C and C++ are fine for SSL, the OpenSSL guys just screwed up" is plain wrong.
This is a question of priorities. We have speed and security. If you choose C/C++ (non-existent automated checking of memory access), you are choosing speed first, security second.
If security is critical then you need to choose a language that makes array out-of-bounds access well nigh impossible. This is an easy problem -- we have languages that will give this to us.
What percentage of exploits in the wild come from array (and pointer) access out of bounds? I'd venture to say it is above 50%.
Rather than have programmers everywhere "try hard to be careful" writing this code, let them use a safer language and have a few really smart folk work on optimizing the compiler for said language to make the safety checks faster (e.g. removing provably unnecessary/redundant checks).
People think that choosing C/C++ has a better business case (i.e. better performance / scaling) because "being really careful" works most of the time. The problem is that when Heartbleed (or the next array out-of-bounds access bug) hits, the business case's ROI no longer looks so much better than the safer path.
A better language won't eliminate all security holes but it can eliminate a huge class of them and allow engineers to focus the energy they used to spend on "being really careful about array access and pointers" on other tasks (be they security, performance or feature related).
EDIT: stating the obvious .. there are good uses for C style languages but writing large bodies of software that needs to be resistant to malicious user attacks is not one of them.
Thanks for this. How is this reading arbitrary memory locations though? Isn't this always reading what is near the pl? As in, can you really scan the entire process's memory range this way or just a small subset where malloc (or the stack, whichever this is) places pl?
The latter, and AFAIK the buffer doesn't get reallocated on every connection, so it should be unlikely that any private keys actually get dumped. However, I could be missing a way to exploit it.
Reading between the lines of the announcement, it sounds like dropping and reconnecting may cause it to read memory freed up from a prior connection. It may "just" be a matter of trying repeatedly, or it may be a matter of opening lots of connections to consume resources, dropping them all, then connecting and seeing what was left on the beach after the tide went out.
BTW Amazon AWS/ELB is vulnerable, confirmed publicly by their support.
I gave this some thought earlier today, and expect that address space randomisation can make this bug eventually expose the server keys. You need to hit an address that has been just vacated from a (crashed) httpd worker.
Most implementations clear encryption key material on exit, but a crashed process never got to run that code.
In most systems, this will only work within the same process. Contemporary Unix kernels always allocate zeroed pages to processes, so it's impossible for a process to recover data from another unless there's a kernel bug.
When for instance an AES-key is being used by OpenSSL, it is put into a 'struct aes_key_st' which is not random at all but quite easily recognizable when scanning memory.
The cold boot attack paper by Halderman, Schoen et al. is relevant here.
Well, one way is to brute-force iterate through every potential 256-bit string you dredge out of the canal, checking each one against the known public key.
If you can dredge up 64kB of fresh data every time, that's 511,744 tests per shovelful which is quite a bit to sift through from a performance perspective but it's also a trivially parallel task.
Additionally, folk might know of even better ways to narrow that down. For example, the data representation in memory might have easy to grep for delimiters.
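Not from the thread, but to make the "check against the known public key" step concrete for RSA: a commonly described approach is to test whether a window of dredged-up bytes divides the public modulus, i.e. whether it happens to be one of the two secret primes. A sketch using GMP (function and variable names are illustrative):

#include <gmp.h>
#include <stddef.h>

/* Return non-zero if the big-endian byte window [win, win+len) is a
 * non-trivial divisor of the public modulus, i.e. a leaked RSA prime. */
static int window_divides_modulus(const unsigned char *win, size_t len,
                                  const mpz_t modulus)
{
    mpz_t cand;
    int hit;

    mpz_init(cand);
    mpz_import(cand, len, 1, 1, 1, 0, win); /* interpret bytes as an integer */
    hit = mpz_cmp_ui(cand, 1) > 0 &&
          mpz_cmp(cand, modulus) < 0 &&
          mpz_divisible_p(modulus, cand);
    mpz_clear(cand);
    return hit;
}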
This reminds me of what another programmer told me a long time ago when we were discussing C: "The problem with C is that people make terrible memory managers." So true.
I agree that it seems like an abstraction is missing here, but I always have the feeling that what you're doing is covering holes in a leaking dam: you might get good at it, but you'll always have leaks.
I have always detested C (also C++) because it's so unreadable... the snippets of code you cite are just so dense; a function like n2s(), for instance, gives pretty much no indication of what it does to a casual reader. Just reading the RFC (it is pretty much written in a C style) gives me the creeps.
The RFC doesn't mention why there has to be a payload, why the payload has to be of arbitrary size, why they are doing an echo of this payload, or why there has to be padding after the payload. And it presents this data as if it were a regular C struct (I didn't know you could have a struct with a variable size, but apparently the fields are really pointers, or it's just a mental model and not a real struct).
Apparently the purpose of the payload is path MTU discovery. Something that is supposed to happen at the IP layer, but I don't know enough about datagram packets. I guess an application may want to know about the MTU as well...
I'm not here to point fingers, I'm just saying C is a nightmare to me and a reason for me to never be involved with system programming or something like drafting RFC's ;-).
But if one can argue that C is a bad choice for writing this stuff, then that is not an isolated thing. "C" is also the language of the RFCs. "C" is also the mindset of the people doing that writing. After all, the language you speak determines how you think. It introduces concepts that become part of your mental models. I could give many examples, but that's not really the point.
And it's about style and what you give attention to. To me, that RFC is a really bad document. It starts to explain requirements for exceptional scenarios (like when the payload is too big) before even having introduced and explained the main concepts and the hows and whys.
So while you may argue that this is a C problem and not a protocol problem, it is really all related.
And you may also say, in response to someone blaming these coders, that blame is inappropriate (and it is) because these are volunteers donating their free time to something they find valuable. But the whole distribution and burden of responsibility is, naturally, also part of the culture and of how people self-organize, and so on.
As someone else explained (https://news.ycombinator.com/item?id=7558394) the protocol is real bad but it is the result of more or less political limitations around submitting RFCs for approval. There is no reason for the payload in TLS (but apparently there is in DTLS) but my point is simply this:
If you are doing inelegant design this will spill over into inelegant implementation. And you're bound to end up with flaws.
Rather than trying to isolate the fault here or there, I would say this is a much larger cultural thing to become aware of.
This sort of argument is becoming something of a fashion statement amongst some security people. It's not a strictly wrong argument: writing code in languages that make screwing up easy will invariably result in screwups.
But it's a disingenuous one. It ignores the realities of systems. The reality is that there is currently no widely available memory-safe language that is usable for something like OpenSSL. .NET and Java (and all the languages running on top of them) are not an option, as they are not everywhere and/or are not callable from other languages. Go could be a good candidate, but without proper dynamic linking it cannot serve as a library callable from other languages either. Rust has a lot of promise, but even now it keeps changing every other week, so it will be years before it can even be considered for something like this.
Additionally, although the parsing portions of OpenSSL need not deal with the hardware directly, the crypto portions do. So your memory-safe language needs some first-class escape hatch to unsafe code. A few of them do have this, others not so much.
It's fun to say C is inadequate, but the space it occupies does not have many competitors. That needs to change first.
First, I do realize that rewriting the software stack from the ground up to have only managed code is a huge task. I do think that as an industry, we should set a goal of having at least one server implementation along these lines (where 'set a goal' may mean, say, grants or calls for proposals). Microsoft Research implemented an experimental OS like that, although it probably didn't have all the features a modern OS would need. I don't know if we need a new language, but we do need a huge rethink of the server architecture, and not just a piece-by-piece rewrite, which I think will founder on the interface issues that you mentioned.
Anyway, I am quite realistic about the prospect of my comment having that kind of effect on the industry - I don't suffer from delusions of grandeur. I was aiming the comment more at people who choose C/C++ for no good reason to write a user-level app; that app is nearly certain to have memory use errors, and if it has any network or remote interface, chances are they can be easily exploited. I'd like as many people as possible to understand that they can't expect to avoid such errors, any more than one of the most heavily audited pieces of software avoided them. We have had decades of exploits of this vulnerability, and yet most programmers are oblivious to it, or think only bad programmers are at risk. So just as tptacek goes around telling people not to write their own crypto, I go around telling people - with less authority and effectiveness, unfortunately - not to write C/C++ code unless they really need to.
As for the performance issues forcing OpenSSL to use C, well, we apparently exposed all our secrets in the pursuit of shaving off those cycles. I hope we are happy.
That was a very reasonable response, I can roll with that.
Just one thing: when I brought up talking directly to the hardware, I did not mean just for performance's sake. Avoiding side-channel attacks often requires having tight control over the generated machine code, and that is the primary reason not to do it in higher-level languages (unless they also permit that level of control).
In Java 8 the JVM knows how to compile AES straight through to AES-NI invocations, so the CPU itself is doing the hardware accelerated crypto (in constant time). It's not necessarily the case that higher level languages have to be unsafer: especially not on something like a server where the overhead of JIT compilers get amortized out.
Okay, but what about other crypto algorithms not implemented in hardware? Eventually someone has to figure out how to generate machine code that operates in constant time across code paths, and they need a language that will let them do that.
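For illustration, here is the kind of constant-time pattern being discussed, at the C level (a minimal sketch; whether the compiler preserves the constant-time property is exactly the control problem mentioned above):

#include <stddef.h>

/* Compare two buffers without data-dependent branches or early exits,
 * so the running time does not reveal where the first mismatch is. */
static int ct_memeq(const unsigned char *a, const unsigned char *b, size_t n)
{
    unsigned char diff = 0;
    size_t i;

    for (i = 0; i < n; i++)
        diff |= a[i] ^ b[i]; /* accumulate differences branch-free */
    return diff == 0;
}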
I do not like the term "C/C++", especially in this context. Modern C++ makes avoiding this sort of bug as easy as doing so in the "managed" languages already discussed.
This is as much a cultural as a technical problem; C really is in the last chance saloon for this sort of problem, we have the solution to hand, but a strong cadre of developers will still only consider C for this sort of work.
Thank you, this really had to be said. You can do C in C++ (and if you do that, chances are that you're doing it wrong), but you can't do C++ in C. The ugliest and most unsafe parts of C++ are invariably those coming almost untouched from C (raw pointers, casts from/to void pointers, etc.). C is close to the machine, but C++ is largely a vertical language where you can do things low-level or high-level (and yes, you can do a lot of interesting things with templates, no matter the bad rap they've got); and most of the current C++ community vastly prefers high-level-like code, for good reasons (for starters, it may be even more performant). A LOT of the sources of unsafe code go away with two simple techniques: use RAII-managed smart pointers instead of raw ones (some of them are as lightweight as a raw pointer), and prefer vector (or another container) to raw arrays.
I love C and C++, and each one has its place, but really, they're very different. Almost as much as C++ is to Java, for example.
The problem, and I think this may have been touched on somewhere else in the thread, is that C++ can be really complex to wrap. So embedding a C++ library in another, higher-level language can be very tricky. It often requires wrapping the parts of the API you want to use in C.
I'm fine with low-level libraries being written in C++, but would hope that developers expose a C API around everything.
I don't think this is true. When it comes to copying bytes from a buffer supplied from the network, there isn't a wrapper/manager class that can do this for you. Somewhere down the pipeline some piece of code has to copy the unstructured, variable length byte stream into a manageable data structure. C++ does not give any way to do this beyond the mechanisms available in C.
What you can do is lower the surface area of vulnerability. Low level byte wrangling is kept in a small subset of generic classes and functions. Application logic then only uses the safe interfaces.
Thank you for pointing out that this is not an issue in C++, only in C. There are times when we need C and one can write C code and compile it with a C++ compiler. It's an option to help ease people to modern, idiomatic C++, but please don't lump the two together. C and C++ are two entirely different languages and C++ is much safer.
My first thought too was Ada, it's easily callable from anything that can call C afaik, and has infinitely better support for catching these sorts of bugs than C or C++ do. It's basically made for this kind of project, and yet no one in the civilian population really cares; it's a real shame. Not only does Ada make it easy to do things safely, it makes it very hard to do things unsafely.
I've been advocating Ada's use on HN for a few years now, but it always falls on deaf ears. People seem to think it's old and dead like COBOL or old FORTRAN, but it's really a quite modern language that's extremely well thought out. Its other drawback is that it's pretty ugly and uses strange names for things (access is the name given to pointer-like things, but Ada specifies that if you have, say, a record with a 1-bit Boolean in it, you must be able to create an access to it, so a plain pointer is not sufficient).
Tony Hoare (Mr. Quicksort, CSP, etc...) has softened his stance since "The Emperor's Old Clothes", but his concern was that Ada is too complicated to be understandable and safe. I hated Pascal because the array length was part of its type... but maybe that kind of thinking is what it takes to avoid bugs like Heartbleed.
Can I suggest you take a quick look at ATS? The language itself is kind of horrid (and I am a ML fan) and the learning curve is way steep, but the thin, dependently typed layer over C aspect is actually quite nice.
Note: I'm not suggesting it for current production use, but rather as something that could be expanded further in the future.
An interesting new project written in Ada is a DNS server called Ironsides, written specifically to address all the vulnerabilities found in Bind and other DNS servers [1].
"IRONSIDES is an authoritative DNS server that is provably invulnerable to many of the problems that plague other servers. It achieves this property through the use of formal methods in its design, in particular the language Ada and the SPARK formal methods tool set. Code validated in this way is provably exception-free, contains no data flow errors, and terminates only in the ways that its programmers explicitly say that it can. These are very desirable properties from a computer security perspective."
Personally, I like Ada a lot, except for two issues:
1. IO can be very, very verbose and painful to write (in that way, it's a bit like Haskell, although not for the same reason). Otherwise, it's a thoroughly modern language that can be easily parallelized.
2. Compilers for modern Ada 2012 are only available for an extravagant amount of money (around $20,000 last I heard) or under the GPL. Older versions of the compiler are available under GPL with a linking exception, but they lack the modern features of Ada 2012 and are not available for most embedded targets. And the Ada community doesn't seem to think this is much of a problem (the attitude is often, "well, if you're a commercial project, just pony up the money"). The same goes for the most capable Ada libraries (web framework, IDE, etc.) -- they're all commercial and pretty costly. Not an ideal situation for a small company.
But yes, Ada is exceptionally fast and pretty safe. There's a lot of potential there, but to be honest my hopes are pinned on Rust at this point.
This is wrong: the compilers distributed by the FSF (as part of GCC) have the linking exception. The ones distributed by AdaCore don't: they exercise their right to transform the modified GPL (with exception) into the GPL before redistribution. But the one in your Linux distribution is likely the FSF one.
No, it's absolutely correct. He wrote that the latest compiler doesn't have a linking exception, which is correct. The FSF compiler generally lags at least a year or two behind the AdaCore one.
For example, it has only just gotten (partial) Ada 2012 features, a full 3-4 years after the AdaCore GPL compiler.
We might be stuck with C for quite a while but then maybe the more interesting question is 'how does this sort of thing get past review?'. It's not hard to imagine how semantic bugs (say, the debian random or even the apple goto bug) can be missed. This one, on the other hand, hits things like 'are the parameters on memcpy sane' or 'is untrusted input sanitized' which you'd think would be on the checklist of a potential reviewer.
I think you mentioned the right keyword: "checklist". If you scan the http://wiki.openssl.org website carefully for one, you will be scanning in vain (i.e. not finding anything). It doesn't seem to be a common practice yet to use checklists for code reviews. Could this change? I hope so: http://www.infoq.com/presentations/agile-code-reviews
"Rust has a lot of promise, but even now it keeps changing every other week..."
A larger problem, in my opinion, is that things like OpenSSL are used (and should be!) from N other languages. As a result, calling into the library almost by definition requires a lowest-common-denominator interface, which is C.
C code calling into Rust can certainly be done, but I believe it currently prohibits using much of the standard library, which also removes a lot of the benefits.
C++ doesn't, I think, have as much of a problem there, but I'm somewhat skeptical of C++ as a silver bullet in this case.
Why not write the code in C# (for example) and extract it to $SYSTEM_PROGRAMMING_LANGUAGE? It wouldn't be much different than what Xamarin are doing now for creating iOS and Android apps with C#.
Using C as an output language, backed by guarantees at the higher level, could certainly work. I believe ATS [1] works this way, and can even avoid garbage collection altogether if desired. I understand it is not an easy language, though.
Nimrod [2] also generates C, but as I understand it garbage collection is unavoidable.
This is one reason I'd like to see the removed LLVM C backend brought back and modernized, with Rust as the source language. Rust is safe, has no mandatory garbage collector, and has a much lower impedance mismatch with C or C++ than most higher level languages, so it should work well for libraries that are expected to integrate with C code.
I'm not clear why you'd want to compile your rust down to C, only to then compile that C again? Surely you're better off with a single compiler invocation, and taking rust straight to object code?
I know that historically it's been easier to write a code generator than a compiler backend, but with LLVM you get the backend just as cheaply as the code gen.
Makes it easier to target embedded arches. For example, LLVM doesn't target the MSP430 from TI, but there is a gcc fork for it. Sure, you can write a new backend for LLVM, but that's a whole different ballgame.
This way it would be Rust -> LLVM IR -> C -> GCC for MSP430.
Did you actually read my entire comment, or did you just see "C#" and post a knee-jerk reaction? If you extract code from C# to C, you don't need a GC or any of the .NET class libraries -- you'd just have a standalone C file to use like any other.
Perhaps another way of saying this is that you will always need a runtime. And if you reimplement the portions of the runtime that you need in C, you've essentially re-implemented .NET.
>Additionally, although the parsing portions of OpenSSL need not deal with the hardware directly, the crypto portions do. So your memory-safe language needs some first-class escape hatch to unsafe code. A few of them do have this, others not so much.
For the other points there is some debate, but don't most serious languages have a C FFI?
OpenSSL and similar libraries spend most of their time processing short packets. For example, encrypting a few hundred bytes using AES these days should take only a few hundred CPU cycles. This means that the overhead of calling the crypto code should be minimal, preferably 0. This is in part what I meant by "first-class". Perhaps I should have written "zero-overhead" instead.
I googled around just now for some benchmarks on the overhead of FFIs. I found this project [1] which measures the FFI overhead of a few popular languages. Java and Go do not look competitive there; Lua came surprisingly on top, probably by inlining the call.
Before you retort with an argument that a few cycles do not matter that much, remember that OpenSSL does not run only in laptops and servers; it runs everywhere. What might be a small speed bump on x86 can be a significant performance problem elsewhere, so this is something that cannot be simply ignored.
Those linked tests are extremely disingenuous; they only show the fixed cost of FFIs.
Considering that in C the plusone call is 4 or so cycles, and the Java example is 5 times slower, that's only 20 or so cycles of overhead. If the function we're FFIing into is 400 cycles, that's only a few percent decrease in speed. I'm willing to pay that price if it means not having to wake up to everything being vulnerable every couple of months.
This project attempts to measure the overhead of calling out from $LANGUAGE and into C, which is the reverse of what's necessary to solve the problem stated here — to write a low-level library in a high-level language.
There are other means of achieving a secure implementation, such as programming in a very high-level language like Cryptol and compiling down to a low-level language.
No, it's the exact problem we're faced with here: Calling OpenSSL from the outside is something you do a handful of times. The OP was concerned about parts of OpenSSL that require direct hardware access (thus, should be written in C). Because those parts of the code are extremely hot, having to cross FFI boundaries to reach them might be prohibitively expensive.
I think that was part of his point, but yeah I don't see why you couldn't do the pure parts in a safer language. FFIs to C are fairly easy to do I think, probably partly because of how simple the calling convention is.
I believe Haskell could be up to the job, but I heard that there were some difficulties in guarding against timing attacks. However, those reports could have just been noise. I know that a functional (in both senses, haha) operating system was made in Haskell.
Aren't Operating Systems lower level than OpenSSL?
I look forward to reading the hilarious threads that will be spawned when you take to linux-kernel, freebsd-hackers, openbsd-misc, etc. and inform them they should be developing their kernels in Haskell.
Functional programming's unpopularity is not rooted in any real or imagined inability to write operating systems.
Other than C there are also C++ and D, if you don't want to stray too far from C. The problem with C++ is that even though it is possible to adopt a memory-safe programming style in C++, the concepts are not prevalent in the community.
Sorry, let me be more clear: In the courses and books I've seen and read novices don't get taught secure coding techniques as C++ is often introduced as a superset of C and security in general is not of interest to the teacher. Then later on when they transition to the web as a primary source of information there is a lot of legacy C++ code lying around that does not use modern memory management concepts. Also, as nobody has ever told them the importance of strict coding styles for security they also don't start looking for them, even though it would be possible to find them with the right keywords.
I don't know. I was a C++ coder 15 years ago, left for managed languages, then came back to it a couple years ago. Rusty would be an understatement, so I had to come at it in what may be a worse state than a noob: a person with outdated knowledge of how things work.
If you looked at all for "best practices", things like RAII, STL/Boost, and other concepts became very clear, very quickly, and these are the types of thing that limit these kinds of bugs (RAII, in particular). Now, to be fair, I was writing crypto-related software, so I was paying very close attention, but I didn't really have to hunt.
They are generally better than nothing, however. And they do generally offer better performance, better predictability, and a simpler implementation than other approaches to memory management and garbage collection.
Maybe a language like Rust will offer a safer alternative at some point, but that point surely isn't today, and probably not tomorrow, either. Maybe there are other languages that offer better safety, but they often bring along their own set of very serious drawbacks.
In terms of writing relatively safe code today, that performs relatively well, that can integrate easily with other libraries/frameworks/code, and can be readily maintained, the use of C++ with modern C++ techniques is often the only truly viable option.
>Maybe a language like Rust will offer a safer alternative at some point, but that point surely isn't today, and probably not tomorrow, either.
Rust absolutely does offer a safer alternative today. The only problem with Rust at this point is that the standard library is in a state of great flux, which makes it hard to use the language for serious projects. But the memory safety is there.
And even with all the changes, there are at least a couple of companies using the last tagged release of Rust in production.
That said, I fervently hope that Rust can hit 1.0 soon (as in this year). A lot of people are looking to move on from C and C++ at this point, but a lot are moving to Go, D, or Nimrod because Rust has been beta for so long (yes, I know Go is technically not in the same tier as Rust, D, and Nimrod). Once they put in the effort of learning these languages, they're unlikely to switch to Rust, thus missing out on all the safety guarantees that Rust offers.
RAII is a minimum. You also have to treat any direct and indirect (unchecked) pointer arithmetic as a potential security vulnerability. For example, if you use the [] operator of a vector you can still access memory outside of the allocated space. Instead you would have to use the at() method, which actually checks the bounds. Even iterators are problematic, as the iterator on a vector also ignores the actual bounds IIRC (though with most idioms the comparison against the end iterator is pretty error-proof). This lends itself to constructs where you do not work with any indexes at all, in the way foreach loops abstract your position in a container away.
Ah I tend to not use [] operators on vectors etc. but as you say iterators can be problematic. The best way is to start at the end of the container and work backwards, particularly if you are removing items from the list (thereby wrecking the iterator's idea of the end position if you were moving forwards through it).
I should probably finish Bjarne's C++11 book - I am maintaining a codebase of old style C++ and seem to be stuck in the old methods of doing it, mainly because of using compilers that don't have C++11 support.
Is there any recommended reading on new style C++ other than Bjarne's book?
What you say can easily be disproved, and you are simply asking for too much if you ask for something to be a drop-in replacement for OpenSSL. Some re-architecting is required simply because of the insecurity of C.
For example, a shared library that implements SSL would have to be a shim for something living in a separate process space.
There is a Haskell implementation of TLS. It is written in a language that has very strong guarantees about mutation and a very powerful type system which can express complex invariants.
Yes, crypto primitives must be written in a low level language. C is not low level enough to write crypto, neither securely nor fast, so that's not an argument in its favor.
There are several languages that do fill that gap, but security people never use them. For example, Cyclone is pretty good. (http://cyclone.thelanguage.org/)
> C and other languages without memory checks are unsuitable for writing secure code
I vehemently disagree. Well-written C is very easy to audit. Much much moreso than languages like C# and Java, where something I could do with 200 lines in a single C source file requires 5 different classes in 5 different files. The problem with C is that a lot of people don't write it well.
Have you looked at the OpenSSL source? It's an ungodly f-cking disaster: it's very very difficult to understand and audit. THAT, I think, is the problem. BIND, the DNS server, used to have huge security issues all the time. They did a ground-up rewrite for version 9, and that by and large solved the problem: you don't read about BIND vulnerabilities that often anymore.
OpenSSL is the new BIND; and we desperately need it to be fixed.
(If I'm wrong about BIND, please correct me, but AFAICS the only non-DoS vulnerability they've had since version 9 is CVE-2008-0122)
> but we can plug this seemingly endless source of bugs which has been affecting the Internet since the Morris worm.
If we're playing the blame game, blame the x86 architecture, not the C language. If x86 stacks grew up in memory (that is, from lower to higher addresses), almost all "stack smashing" attacks would be impossible, and a whole lot of big security bugs over the last 20 years could never have happened.
(The SSL bug is not a stack-smashing attack, but several of the exploits leveraged by the Morris worm were)
> The problem with C is that a lot of people don't write it well.
Including the people responsible for one of the most important security-related libraries in the world. No matter how good and careful a programmer is, they are still human and prone to errors. Why not put every chance on our side and use languages (e.g. Rust, Ada, ATS, etc.) that make entire classes of errors impossible? They won't fix all problems, and definitely not those associated with having a bad code base, but it'd still be many times better than hoping people don't screw up with pointer lifetimes.
> Why not put every chance on our side and use languages (e.g. Rust, Ada, ATS, etc.) that make entire classes of errors impossible?
I don't think intentionally preventing the programmer from doing certain things the computer is capable of doing on the theory it makes errors impossible makes sense.
As I've said several times in this thread, somebody has to deal with the pointers and raw memory because that's the way computers work. Using a language where the language runtime itself handles such things only serves to abstract away potential errors from the programmer, and prevents the programmer from doing things in more efficient ways when she wants to. It can also be less performant, since the runtime has to do things in more generic ways than the programmer would.
> Including people responsible for one of the most important security-related library in the world.
I think you've hit on a crucial part of the problem: practically every software company on Earth uses OpenSSL, but not many of them pay people to work on it.
> I don't think intentionally preventing the programmer
> from doing certain things the computer is capable of
> doing on the theory it makes errors impossible makes
> sense.
With arguments like this, we'd all be back in the days of non-structured programming languages (enjoy writing all your crypto in MUMPS). Every modern language, including C, restricts itself in some way in order to make programs more predictable and errors less likely. Some simply impose more restrictions than others, though these restrictions can actually make programs more efficient (see, for instance, alias analysis in Fortran vs. alias analysis in C).
> somebody has to deal with the pointers and raw memory
> because that's the way computers work
All three of the languages listed previously (Rust, Ada, ATS) are systems programming languages with the capability of manipulating pointers and raw memory (though I don't personally have any experience with the latter two). What they have in common is that they provide compile-time guarantees that certain aspects of your code are correct: for example, the guarantee that you never attempt to access freed memory. These are static checks that require no runtime to perform, and impose no overhead on running code.
> With arguments like this, we'd all be back in the days of non-structured programming languages
You're conflating syntactic restrictions with actual restrictions on what one can make the computer do.
I define "things the computer is capable of" as "arbitrary valid object code executable on the CPU". (Valid here meaning "not an illegal instruction".) Any language that prevents me from producing any arbitrary valid object code is inherently restrictive. C allows me to do this. I can even write functionless programs in C, although it's often non-portable and requires mucking with the linker. If the CPU has some special instruction I want to use, I can use inline assembly.
Any language that prevents me from doing arbitrary pointer arithmetic and memory accesses prevents me from doing a lot of useful things I can do in C. See my other comment about linked lists with 16-bit pointers on a 64-bit CPU.
My understanding of Rust is that its pointers have semantics similar to C++'s "smart pointers". If that's the case, it would prevent me from doing things like the 16-bit linked list (please, correct me if I'm wrong).
> in C, I can make the computer do absolutely anything I
> want it to in exactly the way I want it to. Maybe I
> don't like 8-byte pointers on my 64-bit CPU, and I want
> to implement a linked list allocating nodes from a
> sparse memory-mapped pool with a known base address
> using 16-bit indexes which I add to the base to get the
> actual addresses when manipulating the list? That could
> be a big win depending on what you're doing, and
> (correct me if I'm wrong) there is no way to do that in
> Java or Haskell.
This is possible in Rust, you'll just need to drop into an "unsafe" block when you want to do the pointer arithmetic. In the meantime, everywhere that isn't in an "unsafe" block is guaranteed to be as safe as normal Rust code. Furthermore, even Rust's "unsafe" blocks are safer than normal C code. Rust is a systems programming language, so we know that you need to do this stuff. We have inline assembly too! Our goal is to isolate the unsafety and thereby make it easier to audit.
Rust forces you to draw safety boundaries between safe and unsafe code, but you can do almost strictly more than C in unsafe Rust. It has support for inline assembly, SIMD, packed structs, and well-defined signed integer overflow. None of these is part of standard C or C++; they are available only through compiler-specific dialects. There wasn't even a well-defined memory model with support for atomics before C11/C++11.
> Why not put every chance on our side and use languages (e.g. Rust, Ada, ATS, etc.) that make entire classes of errors impossible?
Bugs will still occur, just in a different way: Java is advocated as being a much "safer" language, but how many exploits have we seen in the JRE? Going to more restrictive, more complex languages in an attempt to fix these problems will only lead to a neverending cycle of increasing ignorance and negligence, combined with even more restrictive languages and complexity. I believe the solution is in better education and diligence, and not technological.
> Java is advocated as being a much "safer" language, but how many exploits have we seen in the JRE?
Very few. I don't think I can remember ever seeing an advisory for Java's SSL implementation.
Yes, bugs are possible in all languages, but that doesn't mean there's no difference between languages. I'm reminded of Asimov: "When people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong. But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together."
(There are a large number of bugs in the browser plugin used for java applets, but they have no relation to the JRE itself)
>The problem with C is that a lot of people don't write it well.
There are languages that make it very, very hard to write bad code. Haskell is a good example: if your program type-checks, there's a good chance it's correct.
C is a language that doesn't offer many advantages but offers very many disadvantages for its weak assurances. Things like the Haskell compiler show that you can get strong typing for free, and there's no longer many excuses to run around with raw pointers except for legacy code.
This is especially true in the field of crypto, where timing attacks are a major issue. Knowing that your program will produce the correct result isn't enough, you need to know that the amount of time taken to compute that result doesn't leak information, and I don't think Haskell provides any way to ensure this.
How do I embed your Haskell library in my Lisp program? This is where C shines... it can be used everywhere, including embedded in other programs, no matter what language they're in.
> There are languages that make it very very hard to write bad code. Haskell [...]
Sure, but how much slower is Haskell than an equivalent implementation in C? Some quick searching suggests numbers like 1000% slower... and no amount of security is worth a 1000% performance hit, let alone a vague "security mistakes are less likely this way" sort of security. Being secure is useless if your code is so slow that you have to run so many servers you don't make a profit.
Could the Haskell compiler be improved to the point that this isn't a problem? Maybe. Ultimately I think the problem is that Haskell code is very unlike object code, and that makes writing a good compiler very difficult. C is essentially portable assembler; translating it to object code is far more straightforward.
> C is a language that doesn't offer many advantages but offers very many disadvantages for its weak assurances.
C offers simplicity. Sure, there are some quirks that are complex, but by and large it is one of the simplest languages in existence. Once you understand the syntax, you've essentially learned the language. Contrast that to Java and C#: you are essentially forced by the language to use this gigantic and complicated library all the time. You are also forced to write your code in pre-determined ways, using classes and other OOP abstractions. In C, I don't have to do that: I can write my code in whatever way I feel makes it maximally readable and performant.
C also offers flexibility: in C, I can make the computer do absolutely anything I want it to in exactly the way I want it to. Maybe I don't like 8-byte pointers on my 64-bit CPU, and I want to implement a linked list allocating nodes from a sparse memory-mapped pool with a known base address using 16-bit indexes which I add to the base to get the actual addresses when manipulating the list? That could be a big win depending on what you're doing, and (correct me if I'm wrong) there is no way to do that in Java or Haskell.
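For what it's worth, a minimal sketch of that 16-bit-index trick in C (sizes and names are illustrative, not from any real codebase):

#include <stdint.h>
#include <stddef.h>

#define NIL 0xFFFFu

struct node {
    uint16_t next;  /* index of the next node in the pool, or NIL */
    uint32_t value;
};

/* The "known base address": all links are 2-byte indexes into this pool
 * instead of 8-byte pointers. */
static struct node pool[65535];

static struct node *node_at(uint16_t idx)
{
    return idx == NIL ? NULL : &pool[idx];
}

/* Push the node at index n onto the list whose head index is *head. */
static void list_push(uint16_t *head, uint16_t n)
{
    pool[n].next = *head;
    *head = n;
}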
> there's no longer many excuses to run around with raw pointers except for legacy code.
If by "raw pointer" you mean haphazardly casting different things to (void * ) or (char * ) and doing arithmetic on them to access members of structures or something, I agree, 99.9% of the time you shouldn't do that.
Or are you talking about that "auto_ptr" and so-called "smart pointer" stuff in C++? In that case, your definition of "raw pointer" is every pointer I've ever defined in all the C source code I've ever written.
Pointers exist because that's the way computer hardware works: they will never go away. I'd rather deal with them directly, since it allows me to be clever sometimes and make things faster.
>Sure, but how much slower is Haskell than an equivalent implementation in C? Some quick searching suggests numbers like 1000% slower...
With things like stream fusion (http://research.microsoft.com/en-us/um/people/simonpj/papers...) , which I imagine would capture a lot of crypto calls, GHC can generate some very performant code (paper contains examples of hand-written C code being beat by Haskell code, and the C code is far from naive).
There are a lot of tricks at your disposal when you know more about the state of the code. And compilers are usually better than humans in this regard.
That paper is extremely fascinating, thanks for sharing.
As an aside, I'm reasonably sure that GCC can use SIMD instructions for certain bulk memory operations in certain circumstances if you feed it the right -march= parameters... I don't think it's as clever as the techniques in this paper, however.
GHC (the main Haskell compiler at this point) does some extremely amazing stuff on the compiler front, it is probably at the forefront of static analysis. And the Haskell community is extremely motivated to making amazing machinery.
A lot of Haskell stuff is based around its lazy evaluation model, though.
> Most of the applications I use take roughly 0% of my processor's capacity.
We're talking about a server-side vulnerability in OpenSSL here, not applications running on your personal computer.
Roughly speaking, making your server code twice as slow means it will cost you twice as much money to run your servers. Of course, that depends a lot on what exactly you're doing and is obviously not always true... but OpenSSL is a very performance-critical piece of most server-side software in the wild.
If OpenSSL suddenly became twice as slow, it would cost a lot of people a lot of money.
> Also random BS benchmark says haskell is at least half as fast as C.
I never said my "quick searching" was exhaustive. I suspect the relative performance is heavily dependent on what exactly is being done in the code.
"If OpenSSL suddenly became twice as slow, it would cost a lot of people a lot of money."
Perspective check: We are talking about a situation in which OpenSSL had NO SECURITY, has had NO SECURITY for two years, and an unknown amount of previously captured traffic is now vulnerable. If the NSA did not already know about this bug (and given that it is not hard to imagine the static analysis tool that could have caught this years ago, it's plausible they've known for a while), they are certainly quite busy today collecting private keys before we fix things, so what security there may have been is now retroactively undone. (Unless you used PFS, which I gather is still rare. In other news, turn that on!)
Do not argue as if you're in a position in which OpenSSL experienced a minor bug, so let's all just calm down here and not make such radical arguments. We are in a position in which OpenSSL was ENTIRELY INSECURE and has been for years, because of a trivial bug that can pretty much ONLY happen in the exact language that OpenSSL was implemented in! Virtually no other language still in use could even have had this bug. This is not a minor thing. This is not something to wave away. This is a profound, massive failure. This is the sort of thing that ought to bury C once and for all, not be glossed over. (As for theoretical arguments that C could be programmed in ways that don't expose this, if the OpenSSL project is not using them, I'm not entirely convinced they really do exist in any practical way.)
If we're OK with bugs that are this critical, heck, I can speed up your encryption even more!
> We are in a position in which OpenSSL was ENTIRELY INSECURE and has been for years, because of a trivial bug that can pretty much ONLY happen in the exact language that OpenSSL was implemented in!
This is incredibly, incredibly false.
Pointers exist. Raw memory accesses exist. Even if you're writing code in a language that hides them from you, they still exist, and there is still potential for somebody to have done something stupid with them. I guarantee you that there are JVM's in the wild with vulnerabilities as severe as this one. Arguing for the use of languages that intentionally cripple the programmer on the theory they make vulnerabilities less likely is silly.
I'm not denying the severity of this issue. But bugs happen. All we can do is fix them, learn from them, and move on. The lesson to be learned here is that really messy code is a big problem that needs to be fixed, because it makes auditing the code prohibitively difficult.
The proper response IMHO is a ground-up rewrite of OpenSSL. A lot of big players use OpenSSL; financing such an endeavor would not be difficult.
But OpenSSL is embedded in many other languages and applications because it's written in C. Show me a low-level language with a stable syntax that fixes the problems caused by using C that can also be embedded in Java, Python, Lisp, Ruby, etc etc. I don't think you can.
Some things need to be in C because they need to be run everywhere, including embedded in applications. No other language does this, to my knowledge.
Is it really a tiny load? Have you ever looked at the throughput values quoted for VPN routers that do their encryption in software (not hardware like the expensive Cisco ASAs)? If you compare the non-VPN throughput with the VPN-throughput, the software encryption is massively massively slower, so I would argue that software encryption is not a tiny load on a normal webserver, unless the webserver was not getting any hits...?
"Performance is a quality of a working system." And doubly so for a cryptographic system.
What good is a cryptographic library that's not secure? Worse than no good. If you're not encrypting your data, you (should) be aware of that, and act accordingly. On the other hand, if you think you're encrypting your data...
A 1000% performance hit means that you have to spend 10x more on hardware, and you have to spend more time engineering for scalability. That extra cost outright kills projects in the womb. If the choice is launching something valuable to users and that pulls in revenue but is flawed, even seriously so, and doing nothing because it's just not feasible to do what you want within any reasonable cost/performance metrics... well then, you have your own anthropic principle right there.
Agreed. Simple code is easy to understand and just as easy to find any bugs in. After looking at the heartbeat spec and the code, I can already see a simplification that, had it been written this way, would've likely avoided introducing this bug. Instead of allocating memory of a new length, how about just validating the existing message fields as per the spec:
> The total length of a HeartbeatMessage MUST NOT exceed 2^14 or max_fragment_length when negotiated as defined in [RFC6066].
> The padding_length MUST be at least 16.
> The sender of a HeartbeatMessage MUST use a random padding of at least 16 bytes.
> If the payload_length of a received HeartbeatMessage is too large, the received HeartbeatMessage MUST be discarded silently.
Then if it's all good, modify the buffer to change its type to heartbeat_response, fill the padding with new random bytes, and send this response. No need to copy the payload (which is where the bug was), no need to allocate more memory.
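Here's a rough sketch of that approach in C, with stand-in helpers so it reads on its own -- the names (handle_heartbeat, fill_random, send_record) are made up for illustration, not OpenSSL's actual API:

    #include <stdlib.h>

    #define TLS1_HB_RESPONSE 2

    /* stand-ins for the RNG and the record layer, just to keep the sketch self-contained */
    static void fill_random(unsigned char *p, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            p[i] = (unsigned char)rand();
    }

    static int send_record(const unsigned char *p, size_t n)
    {
        (void)p; (void)n;
        return 1;
    }

    /* Validate the request against the rules quoted above, then turn the
       request buffer into the response in place: no allocation, and no
       memcpy() driven by an attacker-controlled length. */
    int handle_heartbeat(unsigned char *msg, size_t msg_len)
    {
        if (msg_len < 1 + 2 + 16 || msg_len > (1u << 14))
            return 0;                               /* discard silently */

        size_t payload = ((size_t)msg[1] << 8) | msg[2];
        if (1 + 2 + payload + 16 > msg_len)         /* payload_length too large */
            return 0;                               /* discard silently */

        msg[0] = TLS1_HB_RESPONSE;                  /* flip the message type in place */
        fill_random(msg + 3 + payload, msg_len - 3 - payload);   /* fresh padding */
        return send_record(msg, msg_len);
    }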
(Now I'm sure someone will try to find a flaw in this approach...)
My favorite is that the Morris worm dates back to late 1988 when MS was starting the development of OS/2 2.0 and NT. Yea, I am talking about the decision to use a flat address space instead of segmented.
That's why I have high hopes for Rust. We really need to move away from C for critical infrastructure. Perhaps C++ as well, though the latter does have more ways to mitigate certain memory issues.
Incidentally, someone on the mailing list brought up the issue of having a compiler flag to disable bounds checking. However, the Rust authors were strictly against it.
I'm excited about Rust for this reason as well, but in practice I find myself thinking a lot about data moving into and out of various C libraries. The great but inevitably imperfect theory is that those call sites are called out explicitly and should be as limited as possible. It works well but isn't a silver bullet. I'm hopeful that as the language ecosystem matures there will be increasingly mature C library wrappers and (even better!) native, memory-safe, Rust replacements for things.
>I'm hopeful that as the language ecosystem matures there will be increasingly mature C library wrappers and (even better!) native, memory-safe, Rust replacements for things.
This is Rust's greatest promise. Not only is writing memory-safe code possible, but it's also possible for Rust to do anything C is currently doing -- from systems, to embedded, to hard real-time, and so on. The promise of Rust cannot be overstated. And having finally grasped the language's pointer semantics, I've started to really appreciate its elegance. It compares very favorably to OCaml and other mixed-paradigm languages with strong functional capabilities.
Promises can be encouraging, but they really do us no good in practice. And it's practice that truly matters.
We really need at least a stable release of Rust (in terms of the language and its standard libraries) before it can even be considered as a viable option. Even then, we'll need to see it used seriously in industry for at least a few years by early adopters before it'll be more widely trusted.
We keep hearing about how Rust 1.0 will be available sometime this year, and how there are only a relatively small handful of compatibility-breaking issues to resolve. But until those issues are all resolved and until Rust 1.0 is actually released and usable, Rust just isn't something we can take seriously, I'm afraid to say.
I agree completely. In my opinion, Rust should be looking at reaching a stable language as soon as possible instead of searching for some hard-to-define perfection.
"Perfect is the enemy of the good" definitely applies here. Either of the last two releases of Rust (0.9 and 0.10) would have made a nice 1.0 release, particularly once managed pointers were moved out from the language core to the standard library.
I also worry about more complexity being added to the language, so the sooner it can reach 1.0, the better. Unfortunately, the Rust community seems to really enjoy bikeshedding, so my hopes for a 1.0 release this year are not very high.
Nonetheless, I've already been wrong about Rust once (re: complexity -- once you learn the admittedly tricky pointer semantics, it's really not that horribly complex). I would love to be proven wrong again.
Good enough can also be the enemy of great. It's a tricky balance. My feeling is that there are already plenty of languages that are mature and stable enough to be good choices for industry but few (if any) that are actively and inclusively defining themselves the way Rust is. It's true that it won't be viable for a good while yet, and that's ok. What's the rush?
A data point: I have a few little Rust projects that rely on some patches to some other libraries; whereas I used to spend over half an hour compiling, and sometimes an hour or two making things compile for the new version, I'm now typically down to about 10 seconds to install the newest nightly and 5 minutes or so to fix up some warnings and standard library changes. A stable 1.0 is starting to feel imminent and inevitable to me.
I'd disagree about C++. In my experience, the only things it adds are (1) a false sense of security (since the compiler will flag so many things which are not really big problems, but will happily ignore most overrun issues), (2) lots of complicated ways to screw up, such as not properly allocating/deleting things deep in some templated structure, and (3) interference with checking tools - I got way more false positives from Valgrind in C++ code than in C.
I wish godspeed to Rust and any other language which doesn't expose the raw underlying computer the way C/C++ does, which is IMO insane for application programming.
I'd disagree with your disagreement ;-) C++ has constructs that let you build safer and more secure systems while maintaining good performance. You can write high performance systems without ever touching a raw pointer or doing manual memory management which is something that you can't really do in any other language. Yes, you need to trust the underlying library code which is something you have to do for pretty much any other language.
In my experience, well-written C++ has far fewer security issues than similar C code. We've had third-party security companies audit our code base, so my statement has some anecdotal support in real-world systems.
I second this point. The keyword here is "modern" C++, which encourages people to write shared-memory semantics, and to create strategies that make it impossible to screw up.
"New" is a thing of the past, along with the need to even see T *.
These days good code in C++ is so high-level, one almost never sees a pointer, much less an arbitrarily-sized one. This is of course unless you're dealing with an ancient library.
Another thing of importance:
If you're using collection iteration correctly (getting that wrong tends to be the basis for a lot of these out-of-bounds errors), there is no quibbling over the beginning offset or the end offset - much less manually checking whether the end is in sight. Even confusion of the "this thing is null-terminated vs. that one is not" kind can be eliminated if you just create enough definitions for the type system to defend your own interests. If you're coding defensively, these mistakes shouldn't even be on the menu. One buffer either fits in the other for two compatible types, or you simply don't try it at all.
It'd be amazing what standard algorithms can leave to the pages of history, if they were actually put to good use.
Python has some nice high-level concepts with their "views" on algorithmic data streams that show where modern C++ is headed with respect to bounds checking, minus the exceptions of course :)
But the problem is that there are people who have been coding C++ for 20 years, who never quite got all of these newfangled smart pointer things and just want the "C with classes" parts of C++.
Or you have the cargo cult programmers, who don't know the language very well, so just pick up idioms from random internet postings or from some old part of the codebase that's been working for a while, so it must be a good source of design tips.
Remember, any time you talk about the safety of a language, you have to think about its safety in the hands of a mediocre programmer who's in a hurry to meet a deadline. Because that's where the bulk of the problems slip in; not when people are actively and deliberately coding defensively.
Does anyone have experience with (auditing) systems built using pascal/delphi? I realize that Ada might be a better choice if headed in that direction -- but it always felt to me like pascal actually had a pretty sane trade-off between high and low level, and that C generally gave a lot of the bad parts of assembly while still managing to obfuscate the code without really giving that much in terms of higher level structure.
Pascal gets a bad rap, but that's due to the limitations of the original "pure" language. Something like Turbo Pascal, which took features from Modula-2, is actually a very good alternative to C for systems programming.
- Stricter type checking than C (e.g. it's an error to add a centigrade value to a Fahrenheit value without casting)
- Bounds checked arrays and strings
Turbo Pascal added:
- Selectively turn off bounds checking for performance
- Inline assembler
- Pointers and arbitrary memory allocation
I don't think there's anything you can do in C that you can't do in TP. For example, it was easy to write code that hung off the timer or keyboard interrupts in MSDOS, which is pretty low level stuff.
The important thing is that the safe behaviour should be the default; you have to mark unsafe areas of code explicitly. This is the opposite of how it works in C.
I don't have a reference, but I've seen this myself. Problems can happen whenever pointers are combined with the STL or other modern C++ stuff. In the thread from several years ago, I gave as an example pushing a pointer to a local variable into a vector which is then returned somewhere outside of scope. Compilers don't warn about this, or at least didn't then, although Valgrind catches it. And of course this can be a more complicated case, like a vector of maps from strings to structures containing pointers which is passed by reference somewhere - which will make it harder to catch. And Valgrind won't help if it doesn't see it happening in the execution path that you ran.
Now, combining pointers and STL is not a good idea. In fact, using raw pointers in C++ is not a good idea, at least IMO (but you've seen I am a bit concerned about memory safety). However, this is perfectly supported by compilers, and not even seriously discouraged (some guides tell you to be careful out there). I've seen difficult bugs produced by this, in my case, in a complicated singleton object.
> In fact, using raw pointers in C++ is not a good idea, at least IMO
Not just your opinion; it's become the "standard of practice" in the C++ development community.
They're trying to get it so that between the STL, make_shared, C++14's make_unique, etc., you won't actually be using "naked new"s in any but the rarest cases. For the rest of memory management you'd use types to describe the ownership semantics and let the compiler handle the rest.
"The fact is that no programmer is good enough to write code whic is free from such vulnerabilities."
"...you are kidding yourself if you think you can handle this better than the OpenSSL team."
Well, I can think of at least one example that counters this supposition. As someone points out elsewhere in this thread, BIND is like OpenSSL. And others wrote better alternatives, one of which offered a cash reward for any security holes and has afaik never had a major security flaw.
What baffles me is that no matter how bad OpenSSL is shown to be, it will not shake some programmers' faith in it.
I wonder if the commercial CA's will see a rise in the sale of certificates because of this.
Sloppy programmer blames language for his mistakes. News at 11.
Nothing in the standard prevents a C compiler + tightly coupled malloc implementation from implementing bounds checks. Out-of-bounds operations result in undefined behavior, and crashing the program is a valid response to undefined behavior. If your malloc implementation cooperates, you can even bounds-check pointer arithmetic without violating calling conventions.
It's quite a shame that there isn't a compiler that does this, and it's a project I've considered spending some time on if I can find a big enough block of time to get a solid start.
Unrestricted pointer arithmetic is indeed incompatible with memory safety. You set a pointer to point to one structure, then you change it and it now points to another structure or array. The compiler doesn't know the semantics of your code, so how can it tell if you meant to do that? And malloc/memcpy is way too low-level to check this stuff. It only sees memory addresses; it has no idea what variables are in them. Tightly coupled would mean passing information like "variable secret_key occupies address such-and-such" into the libc, which does violate POSIX standards, and will result in lots of code breaking. I don't see why we wouldn't just write in C# or Java or Rust, instead of a memory-safe subset of C (and it would have to be a subset).
Edit: here's one project for making a memory-safe C: http://www.seclab.cs.sunysb.edu/mscc/ . Interesting, but (a) it is a subset of C, (b) it doesn't remove all vulnerabilities, and (c) I still don't grok the advantage of using this over a language actually designed for modern, secure application programming.
I'll assume the case you're concerned about is the one legitimately tricky case (where you have an array of structs that include arrays, and you perform arithmetic on a pointer into the inner array), because the other readings I'm coming up with necessarily invoke undefined behavior, either by running off the end of an allocation (what we're checking) or breaking strict aliasing (in which case false positives are OK). Depending on what you do with this pointer (e.g. passing it into a custom memcpy), the compiler may not be able to enforce runtime checks by itself.
This is where we do need some extra help, in the form of a library that holds state for the compiler so that we don't need to instrument our pointers. Nothing in the C standard prevents the compiler from doing this. The library you pass the pointer into may ignore this information, if it doesn't have the necessary instrumentation, but we at least get the capability.
Re: other languages, Rust I will grant. It's the only one of those that's compelling for C's use-cases (Java and C# are both entirely unusable for the best uses of C and C++).
>You set a pointer to point to one structure, then you change it and it now points to another structure or array. The compiler doesn't know the semantics of your code, so how can it tell if you meant to do that?
If you changed it via arithmetic or anything other than direct assignment you have violated the standard. Assuming of course that they are part of separate allocations, pointers from one may not interact with pointers from another except through equality testing and assignment.
You can do it, although at a considerable performance hit. The usual approach is "fat" pointers that include bounds information. Memory safe pointer arithmetic is achieved by checking that any constructed pointer lies within those bounds, and dying noisily if it does not (alternatively, you can test on dereference).
C language environments that worked like this have been commercially available in the past: Saber-C in the '90s, and perhaps earlier, was one example.
One problem is that the obvious implementation technique is to change the representation of pointers (to include base and bounds information, or a pointer to that), which means that you need to redo a lot of the library as well. (Or convert representations when entering into a stock library routine, and accept that whatever it does with the pointer won't get bounds-checked.)
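For illustration, here's a toy version of the fat-pointer idea in plain C. Real implementations live inside the compiler and cover every pointer operation; this only shows the shape of the check, with made-up names:

    #include <stdio.h>
    #include <stdlib.h>

    /* a pointer carried together with the bounds of its allocation */
    struct fatptr {
        unsigned char *base;    /* start of the allocation */
        unsigned char *limit;   /* one past the end */
        unsigned char *cur;     /* the pointer value itself */
    };

    static struct fatptr fat_malloc(size_t n)
    {
        unsigned char *p = malloc(n);
        struct fatptr fp = { p, p ? p + n : p, p };
        return fp;
    }

    /* every dereference is checked against the bounds; die noisily on a violation */
    static unsigned char fat_read(struct fatptr fp, size_t off)
    {
        if (fp.cur < fp.base || off >= (size_t)(fp.limit - fp.cur)) {
            fprintf(stderr, "out-of-bounds read\n");
            abort();
        }
        return fp.cur[off];
    }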
I implemented this once in my C interpreter picoc. Users hated it because it also prevented them from doing some crazy C memory access tricks, so I ended up taking it out.
If you have a char *buf block you got from the network stack and you have to copy buf[3] bytes starting at position buf+15, the compiler has nothing to check against, as long as you don't cross the boundary of that allocation.
"Intel MPX is a set of processor features which, with compiler, runtime library and OS support, brings increased robustness to software by checking pointer references whose compile time normal intentions are usurped at runtime due to buffer overflow."
I think clang's AddressSanitizer gets pretty close to what you want. It misses some tricky cases on use-after-return, but other than that it offers pretty robust memory safety model for bounds checks, double free, and so on.
> This vulnerability is the result of yet another missing bound check. It wasn't discovered by Valgrind or some such tool, since it is not normally triggered - it needs to be triggered maliciously or by a testing protocol which is smart enough to look for it (a very difficult thing to do, as I explained on the original thread).
You could also look at this bug as an input sanitization failure. The author didn't consider what to do when the length field in the header is longer than what comes over the wire (even when writing the code in a secure language, this case should be handled somehow, maybe by logging or dropping the packet).
The defined behaviour would be to discard the packet. In a secure language, the buffer would have had a "length" property, and the code would have crashed when a read beyond the buffer's end was attempted. But in C, buffers are just pointers, so there is fundamentally nothing wrong with reading beyond the end of the buffer. So instead of a crash, we get silent memory exposure.
Isn't this basically the whole point of QuickCheck-like testing frameworks? They're essentially a specification that a fuzzer then tries to falsify. I don't see why most C projects couldn't be doing this.
I think they don't do this because it's not a widely known testing method, and it's kind of tricky to implement these tests correctly.
But then again, with some dedication, QuickCheck-like testing can do a huge amount of work. At work I've implemented these tests for the entire low-level IO framework for our servers, and these few tests are a pure meat grinder. They caught one severe bug that would have taken down production in the middle of the night, and then some more.
Speaking of proofs, how about we write security-critical code in Haskell? You need a very simple runtime, but beyond that it would work pretty much anywhere.
Most memory-related bugs are automatically eliminated, and security proofs are easier.
Go or Java on top. Coding in C is like juggling chainsaws to say you can juggle them. C is certainly better than old school Fortran where memory management wasn't developed until later, but platforms like Erlang, Go and JRuby are really hard to beat.
The only problem is convincing people to migrate to different tools and transition codebases to another language. It would take a large project like FreeBSD, LLVM or the Linux kernel to move the needle.
Fortran was not meant to be a systems programming language. The fact that it did not have memory management does actually make sense in scientific applications, where you typically know your problem size in advance or can just recompile before a day long computation.
Why port all the security vulns over to Rust? There are already a handful of SSL implementations, it isn't horribly hard to do. Maybe start with http://hackage.haskell.org/package/tls
Be that as it may, porting OpenSSL to any other language is Not Recommended. The code is hideous and the documentation is practically non-existent.
The only reason anyone can recommend using OpenSSL is that it's so widely used and battle worn that vulnerabilities are more likely to be patched than in some arbitrary obscure SSL library without all the warts. If it had been published as-is for the first time in 2014 then no one would touch it.
In addition to that, if you're going to create an SSL implementation in a new language, it would be much preferable to do it without the BSD advertising clause, which you're stuck with if you start with OpenSSL.
Also, the library he pointed to has only had 33k downloads total. Can you really suggest that as a replacement for one of the most heavily used and read crypto libraries on earth? I wouldn't be surprised if OpenSSL had more than 33k programs that use it as a dependency.
> we can plug this seemingly endless source of bugs which has been affecting the Internet since the Morris worm. It has now cost us a two-year window in which 70% of our internet traffic was potentially exposed. It will cost us more before we manage to end it.
Could one make a new kind of OS where C programs are compiled to some intermediate representation then when run this is JIT compiled within a managed hypervisor sandbox? Could Chrome OS become something like this? Does this already exist? MS had a managed code OS called Singularity.
> My opinion, then and now, is that C and other languages without memory checks are unsuitable for writing secure code.
I think they can be used to write secure code, but it has to be done carefully, with really thorough checks and unit tests, and a constant awareness of the vulnerabilities.
Everything I've heard about OpenSSL so far, suggests it was done by a bunch of cowboys who don't care about code quality. Those people shouldn't be writing C, but a safer language.
However, qmail is written in C and has a very good record. So I would disagree with "The fact is that no programmer is good enough to write code which is free from such vulnerabilities."
There seem to be at least two programmers who are capable of that.
This argument came up in the thread from a few years ago. It is quite wrong-headed. I would like to give a clear answer to it:
Virtual machines and runtimes may be vulnerable to malicious CODE. That's bad. Programs written in unmanaged languages are vulnerable to malicious DATA. That's horrible and unmitigatable.
Vulns to malicious code are bad, but they may be mitigated by not running untrusted code (hard, but doable in contexts of high security). They are also mitigated by the fact that the runtime or VM is a small piece of code which may even be amenable to formal verification.
Vulns to malicious data, or malicious connection patterns, are impossible to avoid. You can't accept only trusted data in anything user-facing. Also, these vulnerabilities are spread through billions of lines of application and OS code, as opposed to core runtime/VM.
> Virtual machines and runtimes may be vulnerable to malicious CODE. That's bad.
> Programs written in unmanaged languages are vulnerable to malicious DATA.
Not exactly true. You can still write code vulnerable to input (data) in a "secure" language by accident. C is just especially vulnerable to buffer stuff.
I am afraid you are the one who is not showing signs of having thought about this deeply. What is the ratio of the number of application programs, libraries, and services to the number of VMs and runtimes? Thousands, tens of thousands, millions? Depends on how you count, but it's huge. Reducing the attack surface like this is a big win.
And it is indeed a bad idea to install a browser on a critical server, and to load untrusted sites in it. You can mitigate the problem by not doing that. You can't stop the server from dealing with user data, though, since for many servers, that's what they are for. (If you are not going to deal with untrusted data, it is preferable to disable untrusted connections at as low a level as you can manage).
So reducing the attack surface isn't a laudable goal in your book, because hey, the VM itself can have vulnerabilities, so there isn't a point? I think the point is that programmers will always make these mistakes and we should limit as much as possible the amount of unsafe code that is written, to as small an attack vector as possible. You're never going to eliminate vulnerabilities, but we sure can try and reduce the likelihood of them occurring. If there is some objective measurement to be made that says this isn't the case, i.e. that the number of JVM vulnerabilities like this outstrips or is on par with the client-side vulnerabilities that occur in purely C/C++ applications, I would love to see it.
Ultimately, I think the better answer will be a language that inherently provides the primitives for safe memory management but that's low-level and highly performant, i.e. Rust or something like it.
You're not necessarily reducing the attack surface. You're adding complexity. While you might reduce the attack surface for low-level bugs, you might open yourself to new bug classes.
Downvote me however you guys want. It's just not that easy.
If you could eliminate the common cold by killing the guy with the runny nose, don't you think someone would have done it?
In keeping with the tradition of bad car analogies, that's like saying "Driving cars with automatic traction control won't make accidents go away, so automatic traction control is pointless".
Languages with bounds checks on array accesses don't solve everything, but that doesn't mean that they don't work. They do remove entire classes of silent failures that can potentially slip through the cracks in C-like languages. VMs aren't needed for this -- most of the strongly typed functional languages, D, Go, Rust, and others all compile down to native machine code.
Careful API design, discipline, and good coding in C can also mitigate this sort of problem manually, although (like most things in C), it's extra work, and needs careful thought to ensure correctness.
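As a concrete (if hypothetical) sketch of that kind of API in C: carry the length with the pointer and make the copy helper refuse to overread, so the caller can't forget the check. The names here are invented for illustration:

    #include <string.h>

    /* a buffer is a pointer plus the number of valid bytes */
    struct slice {
        const unsigned char *p;
        size_t len;
    };

    /* copy n bytes from src into dst; fail instead of reading past either end */
    static int slice_copy(unsigned char *dst, size_t dst_len,
                          struct slice src, size_t n)
    {
        if (n > src.len || n > dst_len)
            return -1;              /* caller must handle the error */
        memcpy(dst, src.p, n);
        return 0;
    }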
Do you know of any controlled experiments to test the safety claims for automatic traction control? People used to say similar things about ABS. Then the experiments were done, it turned out to be pointless or possibly dangerous, and people started talking about traction control instead.
Automatic bounds checking could well fail the same way that ABS did: programmers won't bother defining a packet data type, because the compiler will catch any mistakes they make fiddling with arrays. So, like drivers with ABS, programmers with ABC would go faster, but they wouldn't be any safer.
Maybe a better analogy would be roll bars or seat belts: If they help prevent something from breaking, you've already screwed up.
Nothing can prevent bad drivers from driving poorly, and nothing can prevent apathetic programmers from writing insecure code. However, even though I tend to program in C, I can still appreciate environments that will catch dumb mistakes for me and prevent them from turning into security issues.
VMs generally do not have this type of vulnerability (buffer overrun).
Also, most vulnerabilities in (e.g.) the JVM can only be exploited by running malicious code inside the VM. Here, the attacker is supplying data used by OpenSSL, but is not able to supply arbitrary code.
Agree. This needs a big fat "the world is coming to an end" style of warning.
I've just shut down the webservers running SSL that I can control.
If you are vulnerable, don't want to build OpenSSL from source, and can afford the outage, I'd recommend doing the same.
OTHERWISE BUILD FROM SOURCE IMMEDIATELY, PATCH, AND GET NEW KEYS!
Let's hope CAs don't get swamped by all the CSRs. Or rather, let's hope they do, so we see people are doing something...
For me right now these are just my hobby projects. So I don't care if they're down. But I imagine it will be fun tomorrow.
OK, could anyone assist me with how to update OpenSSL without breaking anything? I've fetched the newest sources from openssl.org and compiled them, but "make install" doesn't actually install it; it only got compiled, and issuing "openssl version" still gives me the old version.
What I want to do is to patch it so our webserver uses new version.
I would tread lightly here if you aren't comfortable with compiling. Rather than break your website, it might be better to take it down until your distro's packages are available.
You should probably spend your time investigating a good method of reissuing keys for when you get to a stable OpenSSL version.
Some apps have OpenSSL statically compiled into the binaries. Beware that what you think is fixed may not be.
Depending on the distro on which you're based, you may find that making a new package from a source package (e.g. srpm) would be the safest route even if you're in a hurry.
If you're on Ubuntu, it would appear at least the updated base (OpenSSL itself) packages are now in the repos.
Not to sound like a commercial for Cloudflare or anything. But putting your infrastructure behind their services can protect users while they perform their patching. According to their latest blog post http://blog.cloudflare.com/staying-ahead-of-openssl-vulnerab...
[this command generates a private key and server cert and outputs to pem's]
[Note also the key sizes are 4096, you may want 2048. AND I use -sha256, as sha1 is considered too weak nowadays. These certs are valid for 3650 days...10 years]
Since the command overwrites certs/keys in the current directory of the same name as the outfiles...that's it...you're done. Just restart nginx.
If you change a self-signed cert, like above, expect a new warning from the client on the next connection...this is just your new cert being encountered. Click permanently accept...blah blah.
and got the same results as you. How can it be built on January 8th if the patch was just made today?
[EDIT] running
sudo aptitude upgrade
upgraded properly and now I'm getting a version that was compiled earlier today. I'm guessing I needed to update another package as well. Probably `libssl`?
upgrade will work because it updates libssl1.0.0 which is the package you want upgraded :)
openssl is the command-line package and libssl1.0.0 is the library. I was able to upgrade openssl without upgrading libssl1.0.0.
ben@ip-10-0-0-76:~$ dpkg -s libssl1.0.0 |grep Version
Version: 1.0.1e-3ubuntu1
ben@ip-10-0-0-76:~$ dpkg -s openssl |grep Version
Version: 1.0.1e-3ubuntu1
ben@ip-10-0-0-76:~$ sudo apt-get install openssl
...
ben@ip-10-0-0-76:~$ dpkg -s libssl1.0.0 |grep Version
Version: 1.0.1e-3ubuntu1
ben@ip-10-0-0-76:~$ dpkg -s openssl |grep Version
Version: 1.0.1e-3ubuntu1.2
ben@ip-10-0-0-76:~$ openssl version -a
OpenSSL 1.0.1e 11 Feb 2013
built on: Mon Jul 15 12:44:45 UTC 2013
platform: debian-amd64
options: bn(64,64) rc4(16x,int) des(idx,cisc,16,int) blowfish(idx)
compiler: cc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_NO_TLS1_2_CLIENT -DOPENSSL_MAX_TLS1_2_CIPHER_LENGTH=50 -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
OPENSSLDIR: "/usr/lib/ssl"
ben@ip-10-0-0-76:~$ sudo apt-get install libssl1.0.0
ben@ip-10-0-0-76:~$ dpkg -s libssl1.0.0 |grep Version
Version: 1.0.1e-3ubuntu1.2
ben@ip-10-0-0-76:~$ openssl version -a
OpenSSL 1.0.1e 11 Feb 2013
built on: Mon Apr 7 20:33:19 UTC 2014
platform: debian-amd64
options: bn(64,64) rc4(16x,int) des(idx,cisc,16,int) blowfish(idx)
compiler: cc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_NO_TLS1_2_CLIENT -DOPENSSL_MAX_TLS1_2_CIPHER_LENGTH=50 -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
OPENSSLDIR: "/usr/lib/ssl"
I wonder how many people will do apt-get update openssl and assume they have fixed it.
I got a "security warning" update when I logged in to the server (good), ran apt-get and installed, did openssl version, got the string as noted above (which seemed just a tad out of date).
So... I built and installed from source, and got... the same string.
Interestingly, your tool claims our website (SSL-terminated at our ELB instance) is still vulnerable; while this other tool (http://possible.lv/tools/hb) claims we are unaffected.
Another, known unpatched, app is reported to be affected by both tools.
Is it possible that FiloSottile/Heartbleed may report false positives?
From what I've learned, it reports back if it gets something, when it should get nothing.
How vulnerable a specific site is depends on luck. Yahoo must have broken a whole bunch of mirrors because total amateurs can send mail.yahoo.com a certain blob of code and it has a good chance of returning a stranger's password.
Well, I was interested in actually testing it out in code. I got it working with the pyOpenSSL bindings (I had to expose struct ssl_method_st, SSL_get_ssl_method, ssl_write_bytes and rebuild cryptography for pyOpenSSL.) Fun times.
This thing has been in the wild for two years. What are the odds it hasn't been systematically abused? And what does this imply?
To me it sounds kind of like finding out the fence in your backyard was cut open two years ago. Except in this case the backyard is two thirds of the internet.
Worse, it's retroactively unfixable: Even doing all this [revoking certs, new secret keys, new certificates] will still leave any traffic intercepted by the attacker in the past still vulnerable to decryption.
So it would be a good idea to change all your passwords to critical services like email and banks, once they have issued new certs and updated their openssl.
That's slightly misleading. Every private key disclosure leads to decryption of past traffic unless forward secrecy is used.
However, if you switch to a fixed version of OpenSSL now, then an attacker cannot retroactively exploit this bug even if they have recorded all your past traffic, because exploiting the bug requires a live connection.
(Of course, this only applies to attackers who did not know about the bug before it was publicly released, so some worry is still justified. I only wanted to point out that the "retroactively unfixable" is a misleading exaggeration.)
I think what was meant is that since exploiting this bug leaves no trace, you should automatically consider every master key ever loaded to a vulnerable OpenSSL application to be already compromised. As nothing says this is the first discovery of the bug, one should consider that the black hats have already been exploiting this for long before the first public disclosure.
Yes, but there are other ways to compromise TLS sessions. For example, if you're using session tickets, the ticket key could be in RAM. Or, the session master keys themselves could be leaked. Still, you're _much_ better off with Forward Secrecy -- in most cases ticket keys are rotated with server restarts; so are session master keys.
I was thinking of the scenario of old traffic being recorded by someone. Unless they also extracted the session key at that time, that traffic should be secure if PFS was enabled, even if someone were to extract the server key now.
It seems like you are somewhat new to the Debian utopia. Here is another great package that a lot of people are not aware of: `apt-listbugs`. After you say "yes" to apt-get upgrade, apt-listbugs queries the BTS for bugs in the package versions you are about to install. If any bugs are found, you have the chance to review the report to see if it applies to you, and if it does you can have apt-listbugs pin the package so that the new buggy version is not installed. Every night at midnight (I think) apt-listbugs queries the BTS to see if the bugs are still relevant and unpins the package if the bug is no longer relevant. It is especially handy for testing/unstable/experimental.
By default it only prompts you for grave-serious bugs. I have been bitten a couple of times by "important" bugs so set listbugs up so that it also checks for "important" bugs. This makes it a tiny bit noisier but not enough to make me switch to the defaults. Changing the severities is easy:
I just wanted to point out that you really do not need the `apt-get clean`. Obviously your work flow is your business, but I wanted to speak up in case you thought it was needed before upgrading packages.
Just received an upgrade on Ubuntu 12.04 LTS as well, apt-get clean issued before updating.
EDIT: If you are using DigitalOcean, the update is not yet on their mirrors. Issue 'sudo sed -i "s/mirrors\.digitalocean/archive.ubuntu/g" /etc/apt/sources.list;sudo apt-get clean;sudo apt-get update;sudo apt-get upgrade' to get the patch. Check the comment by 0x0 above ( https://news.ycombinator.com/item?id=7549842 ) to find any services which need restarting.
Basically yes. However, from my experience, package update urgencies are not a good indicator of an update's actual priority.
It's in the "*-security" channels and you're supposed to apply all updates from there.
Node.js sort-of dodged a bullet here. It includes a version of openssl that it links against when building the crypto module (and, I would think, the tls module). Node.js v0.10.26 uses OpenSSL 1.0.1e 11 Feb 2013.
What worries me about this is that the commit that fixes it [0] doesn't include any tests. Is that normal in crypto? If I committed a fix to a show-stopper bug without any tests at my day job I'd feel very amateur.
What a great writeup. Comprehensive without being overly verbose, answers to "what does this mean?" and "does this affect me?", and clear calls to action.
While I'm not happy at having to spend my Monday patching a kajillion machines, I welcome more vulnerability writeups in this vein.
What you probably want is to re-key your cert, do not revoke it. Revoking with some CA's (such as GoDaddy) means to essentially cancel the remainder of the valid date forever and requires purchasing a new cert to secure the same domain. You are forfeiting the rest of its value.
When you re-key, it will automatically deactivate the previous cert and is free. It also gives you the opportunity to update to SHA-2 or increase the key to 2048 bit, which you should do unless you have unusual and extreme legacy support needs (and must keep SHA-1 a while longer).
I disagree. Revoking the certificate is a requirement. If you re-key without revoking, someone who has stolen your key could impersonate you until the validity period expires. So revoking is needed if you want to inoculate yourself against a potential active man-in-the-middle attack.
If you want to be secure, make sure the certificate based on your old key is showing up in the certificate revocation list (CRL), and/or any online certificate status protocol (OCSP) servers it specifies.
Well, I don't think it's anything in memory, but whatever was up to 64k from wherever the downloaded packet was put in userspace (Edit: Er, 64k at a time, but the attacker can try again over and over). Since the kernel should be handing only zeroed pages to userspace to use as a buffer, it should only be memory used by the process using OpenSSL that is at risk.
The big problem is that this is still a gigantic range of processes (and possible memory buffer contents). But SSH at least would appear to be fine, unless you've ever transferred an SSH key over TLS using OpenSSL.
(What I want now is an exploit.c, PoC.py, pwnSSL.rb, etc... but I guess it would be irresponsible to provide that to the script-kiddies of the interwebz right now)
The part that's caused me to read this page several times over without a clear answer is that they mention that private keys may be leaked, but their calls to action do not recommend generating new private keys. How does that make any sense?
>this leak requires patching the vulnerability, revocation of the compromised keys and reissuing and redistributing new keys. Even doing all this will still leave any traffic intercepted by the attacker in the past still vulnerable to decryption. All this has to be done by the owners of the services.
Yes, down in the Q&A of details of what's leaked, not in the "here's what you need to do" section. It makes you think, "wait, the details say reissue keys...why does the 'what you need to do' section not say that? Did I misunderstand?" It's not very clearly written. Not to mention "revocation of the compromised keys" is itself vague: which keys are compromised? "The crown jewels" of course. We must infer that we're talking about the SSL private keys. And again, because revoking those keys is not mentioned in the call to action, you're forced to question whether your inference is correct.
As an actionable bulletin, this page leaves a lot to be desired. Nice logo and domain name, though.
I believe the reason they got access was that one of their customers found it and reported it to them; they reported it to OpenSSL, and then it somehow leaked (either with the OpenSSL release, or through someone else), and then they posted their now-public writeups of it.
That's not correct. One of the individuals who discovered the bug contacted us as a large provider of SSL termination services. We were asked not to further disclose the details until it was officially patched and announced by OpenSSL. The official announcement occurred today after which we put up a post to let our customers know that they were protected.
I wonder who else was notified early? I noticed Apple's ocspd was downloading an unusual amount of data back on March 31. Could be unrelated, but Apple and other big software vendors would make sense for early notification.
Oh it's even worse, basically every secret you had in your server processes' RAM was potentially read in real-time by an attacker for the last 2 years.
It can only access memory of the process running OpenSSL. So if you have nginx in front of your web processes, they are protected. However, anything in the nginx process is accessible (e.g. certificates).
Yes, to be clear (esp. for others reading this thread) this is really bad, but shouldn't be able to compromise your ssh server keys.
However -- ssl certs and session keys are a likely target, and combined with passively logging traffic that is enough to compromise all data going over ssl, such as login/passwords and data.
Problem servers include not only web servers, but also imap/pop and smtp servers supporting tls (via openssl -- afaik gnutls isn't vulnerable to this bug).
Honestly, why aren't the formal verification people jumping on this? I keep hearing about automatic code generation from proof systems like Coq and Agda but it's always some toy example like iterative version of fibonacci from the recursive version or something else just as mundane. Wouldn't cryptography be a perfect playground for making new discoveries? At the end of the day all crypto is just number theory and number theory is as formal a system as it gets. Why don't we have formal proofs for correct functionality of OpenSSL? Instead of a thousand eyes looking at pointers and making sure they all point to the right places why don't we formally prove it? I don't mean me but maybe some grad student.
Yes, why doesn't the same thing exist for SSL? The fact that quark was funded by the NSF means that there is interest in actually doing stuff like this.
I think the summary is a bit too sensationalistic in terms of what the actual security implications are:
The Heartbleed bug allows anyone on the Internet to read the memory of the systems protected by the vulnerable versions of the OpenSSL software.
Yes, while that's true, it's not a "read the whole process' memory" vulnerability which would definitely be cause for panic. The details are subtle:
Can attacker access only 64k of the memory? There is no total of 64 kilobytes limitation to the attack, that limit applies only to a single heartbeat. Attacker can either keep reconnecting or during an active TLS connection keep requesting arbitrary number of 64 kilobyte chunks of memory content until enough secrets are revealed.
The address space of a process is normally far bigger than 64KB, and while the bug does allow an arbitrary number of 64KB reads, it is important to note that the attacker cannot directly control where that 64KB will come from. If you're lucky, you'll get a whole bunch of keys. If you're unlucky, you might get unencrypted data you sent/received, which you would have anyway. If you're really unlucky, you get 64KB of zero bytes every time.
Then there's also the question of knowing exactly what/where the actual secrets are. Encryption keys (should) look like random data, and there's a lot of other random-looking stuff in crypto libraries' state. Even supposing you know that there is a key, of some type, somewhere in a 64KB block of random-looking data, you still need to find where inside that data the key is, what type of key it is, and more importantly, whose traffic it protects before you can do anything malicious.
Without using any privileged information or credentials we were able to steal from ourselves the secret keys
It really helps when looking for keys, if you already know what the keys are.
In other words, while this is a cause for concern, it's not anywhere near "everything is wide open", and that is probably the reason why it has remained undiscovered for so long.
It's not hard to screen what's returned for chunks that look like they could be keys (you know the private key's size by looking at the target's certificate, you know it's not all zeros, etc.) and then simply exhaustively check chunks against their public key.
I just looked at one of my running apache processes, it only has 3MB of heap mapped (looked at /proc/12345/maps). That's not a whole lot of space to hide the keys in.
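A rough sketch of that screening step, using OpenSSL's own bignum API. This assumes you already hold the leaked bytes and the server's public modulus n from its certificate; the function name is made up, error handling is omitted, and in practice you'd also want to try different byte orders and alignments:

    #include <stddef.h>
    #include <openssl/bn.h>

    /* Return the offset of a window of `prime_bytes` leaked bytes that divides
       the public modulus n (i.e. a candidate RSA prime), or -1 if none found. */
    int find_candidate_prime(const unsigned char *leak, size_t leak_len,
                             const BIGNUM *n, size_t prime_bytes)
    {
        BN_CTX *ctx = BN_CTX_new();
        BIGNUM *p = BN_new();
        BIGNUM *rem = BN_new();
        int found = -1;

        for (size_t off = 0; off + prime_bytes <= leak_len; off++) {
            BN_bin2bn(leak + off, (int)prime_bytes, p);   /* interpret window as a big-endian integer */
            if (BN_is_zero(p) || BN_is_one(p))
                continue;
            BN_mod(rem, n, p, ctx);                       /* does this window divide n? */
            if (BN_is_zero(rem)) {
                found = (int)off;
                break;
            }
        }

        BN_free(p);
        BN_free(rem);
        BN_CTX_free(ctx);
        return found;
    }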
I agree entirely with your post, and I can't quite understand the hysteria in this thread. The odds of getting a key using this technique are incredibly low to begin with, let alone being able to recognize you have one, and how to correlate it with any useful encrypted data.
Supposing you do hit the lottery and get a key somewhere in your packet, you now have to find the starting byte for it, which means having data to attempt to decrypt it with. However, now you get bit by the fact that you don't have any privileged information or credentials, so you have no idea where decryptable information lives.
Assuming you are even able to intercept some traffic that's encrypted, you now have to try every word-aligned 256B(?) string of data you collected from the server, and hope you can decrypt the data. The amount of storage and processing time for this is already ridiculous, since you have to manually check if the data looks "good" or not.
The odds of all of these things lining up are infinitesimal for anything worth being worried about (banks, credit cards, etc.), so the effort involved far outweighs the payoffs (you only get 1 person's information after all of that). This is especially true when compared with traditional means of collecting this data through more generic viruses and social engineering.
So, while I'll be updating my personal systems, I'm not going to jump on to the "the sky is falling" train just yet, until someone can give a good example of how this could be practically exploited.
I have successfully extracted a key and decrypted traffic in a lab. I'm refining my automatic process. You're forgetting analysis of the runtime layout of OpenSSL in RAM which is quite predictable on machines without defensive measures. I have a 100% success rate extracting memory and about a 20% success rate programmatically extracting the secret key of the server. I'm nearly 100% against a certain version of Apache with standard distribution configuration.
I did this with no formal CS education and about 400 lines of code. I'm an operations engineer, not a security expert. Once I get it 100% and review my situation legally, I'll probably publish what I have.
Now is not the time to be conservative. Efforts to downplay this vulnerability are directly damaging to the Internet's security and, given that you are a single-issue poster, suspicious.
runtime layout of OpenSSL in RAM which is quite predictable on machines without defensive measures
I think this is also part of the problem, if it's just storing the keys in plaintext. I've analysed some protection systems which have gone to great lengths to make sure that this isn't the case - where keys are permuted, broken up into randomly-sized chunks, and scattered amongst other randomly generated data, all of which gets moved around in memory in a random fashion periodically. Some of the state required to obtain the key is outside the process.
Obviously this makes encryption/decryption operations a lot slower. "Security costs speed. How safe do you want to be?"
I'm not sure you even need to do any analysis of the runtime layout of OpenSSL, to be honest, though no doubt it increases the efficiency of the attack.
Given the power of this code as a weapon and certain circumstances in my personal life, I will be consulting legal advice before doing anything with it. There's a pretty good chance I will be advised against releasing it.
All you need is a debugger against your target service and an ability to recognize patterns. And no, ASLR doesn't help much.
>Supposing you do hit the lottery and get a key somewhere in your packet, you now have to find the starting byte for it, which means having data to attempt to decrypt it with. However, now you get bit by the fact that you don't have any privileged information or credentials, so you have no idea where decryptable information lives.
Login page of any SaaS will be transmitted over SSL and you'll know what it looks like a priori.
I'm very curious to see the change that introduced the bug in the first place. According to the announcement it was introduced in 1.0.1. That's the version that added Heartbeat support, so maybe it was a bug from the beginning.
Probably to make it more clear what you're referring to, and double-check yourself. There are probably components that are 1 byte, 2 bytes, and 16 bytes long. Writing it out makes it clear and eliminates a chance for human error in the sum, more than a magic 19 does. (I guess 16 is pretty magical too, though. At least it's a "round" number, and in context may be a well-known field size of something in the protocol.)
Yeah, this. It is fairly common to see things like this in C-like languages when it comes to times; for example, when computing the number of milliseconds in a day you might see:
int timeout = 1000 * 60 * 60 * 24;
(milliseconds in a second * 60 for minute, * 60 for hour, * 24 for hours in a day).
Much more obvious than just putting in 86400000; the compiler will optimize away the math, and putting the math in there explicitly is arguably better than a comment that could easily become unanchored from the real value (if someone changes the value and forgets to update the comment).
When it comes to byte-sizes of things, though, most code will use sizeof() to both make it more clear where these numbers are coming from and to make them automatically adjust if the structure sizes change (granted this is unlikely to happen on a mature protocol).
At the very least having them be preprocessor defines would certainly make things a lot more clear here, so even for C I'd consider this a bit of a "code smell" (though the people who work on this code regularly are probably versed enough in the ssl3 record structure enough that they immediately grok this when they see it).
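Something like this, say (names invented for illustration, not OpenSSL's actual macros):

    #include <stddef.h>

    #define HB_TYPE_LEN     1    /* message type byte */
    #define HB_LENGTH_LEN   2    /* 16-bit payload_length field */
    #define HB_MIN_PADDING 16    /* padding MUST be at least 16 bytes */
    #define HB_OVERHEAD    (HB_TYPE_LEN + HB_LENGTH_LEN + HB_MIN_PADDING)

    /* the kind of "1 + 2 + payload + 16" check discussed above, spelled out */
    static int heartbeat_fits(size_t payload, size_t record_len)
    {
        return HB_OVERHEAD + payload <= record_len;
    }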
After reading your comment, I started looking back at the packets I got using the script on a site I knew was not patched. Damn.. there are plaintext passwords in there for paypal.
Does SSH (specifically sshd) on major OSes use affected versions of OpenSSL? [answer pulled up from replies below: since sshd doesn't use TLS protocol, it isn't affected by this bug, even if it does use affected OpenSSL versions]
What's the quickest check to see if sshd, or any other listening process, is vulnerable?
(For example, if "lsof | grep ssl" only shows 0.9.8-ish version numbers, is that a good sign?)
The bug is in the handling of the TLS protocol itself (actually, in a little-used extension of TLS, the TLS Record Layer Heartbeat Protocol), and isn't exposed in applications that just use TLS for crypto primitives.
First off, TLS is crypto bread-and-butter that's used for a lot more than HTTPS. You're not out of the woods because you're not running a webserver.
Second, SSH itself doesn't use TLS; it has its own protocol, so sshd isn't vulnerable.
But third, read overflows like this can be escalated in countless ways to total compromise if some credential, key, canary, or such gets leaked. So just because sshd isn't vulnerable doesn't mean you're not screwed.
Not only HTTPS. Many other protocols are TLS-based: modern email, some VPNs, etc. Really almost everything secret on the Internet is protected by TLS; SSH is a rare exception.
I am having a lot of trouble figuring out what you were attempting to convey in the first four sentences in your comment.
The one thing that I can discern is that you printed a list of every package that depends on libssl1.0.0 for your configured repositories. But you have no idea if those programs make use of heartbeat. ssh (and everything related like libpam-ssh) is on that list and does not use TLS. The same can be said for many others such as tpm-tools, ntp/ntpdate/openntpd, xca and so on.
When OpenSSH uses certificates it still uses its own protocol (even with the X.509 patch; without that patch, the certificates used by OpenSSH are not even the same kind of certificate as those used by OpenSSL).
The problem with the OpenSSL library is in its implementation of the TLS protocol, which OpenSSH does not use. So OpenSSH is not affected by this problem, even when certificates are used, and even when X.509 certificates are used (which requires a separate patch).
This doesn't sound like "responsible disclosure" to me - how can Codenomicon dump this news before the major Linux vendors have patches ready to go?
Well someone was able to give Cloudflare a heads up last week [1].
It would have been nice if the package maintainers could have had time to build ready-to-roll solutions with Heartbeat compiled out prior to the official OpenSSL fix.
> Recovery from this bug could benefit if the new version of the OpenSSL would both fix the bug and disable heartbeat temporarily until some future version... If only vulnerable versions of OpenSSL would continue to respond to the heartbeat for next few months then large scale coordinated response to reach owners of vulnerable services would become more feasible.
This sounds risky to me. I'm afraid attackers would benefit more from this decision than coordinated do-gooders.
That is my concern as well. We are still running CentOS 6.4, which does not have the affected version of OpenSSL, but we terminate SSL at the ELB, so if they are affected then our keys are not safe.
The forum thread has just been updated with this reply:
"We can confirm that load balancers using Elastic Load Balancing SSL termination are vulnerable to the Heartbleed Bug (CVE-2014-0160) reported earlier today. We are currently working to mitigate the impact of this issue and will provide further updates."
Rackspace guy here. We have been digging in and it appears that we did have the impacted version of openssl installed but the heartbeat extension was disabled. Regardless, we have updated everything on the Cloud Load Balancer side to 1.0.1g. I will update here if we find anything different.
What are the chances that the NSA is having a field day with this in the 24-48 hours that it will take everyone to respond? Also, is it possible that CA's have been compromised to the point where root certs should not be trusted?
What are the odds that the NSA didn't already know about it? Even if you don't think they would have deliberately monkeywrenched OpenSSL (as they are widely believed to have done with RSA's BSAFE), they certainly have qualified people poring over widely used crypto libraries, looking for missing bounds checks and all manner of other faults --- quite likely with automated tooling.
As to CAs, there have been enough compromises already from other causes that serious crypto geeks like Moxie Marlinspike are trying to change the trust model to minimize the consequences --- see http://tack.io
What's interesting is that RFC 1122 from 1989 warned about problems like these, and gave a very good approach to prevent them from occurring:
At every layer of the protocols, there is a general rule whose application can lead to enormous benefits in robustness and interoperability
[IP:1]: "Be liberal in what you accept, and conservative in what you send"
Software should be written to deal with every conceivable error, no matter how unlikely; sooner or later a packet will come in with that particular combination of errors and attributes, and unless the software is prepared, chaos can ensue. In general, it is best to assume that the network is filled with malevolent entities that will send in packets designed to have the worst possible effect. This assumption will lead to suitable protective design, although the most serious problems in the Internet have been caused by unenvisaged mechanisms triggered by low-probability events; [...]
This is too much by at least one order of magnitude.
What's the going price for a crypto-level code review (I'm not even saying audit) these days? Is all this code necessary for state-of-the-art encryption, or is it rather backwards-compatibility baggage? If the latter: how much could be gained by splitting the project into '-current' and '-not'?
Thanks! So how does this work: Say I have this project and I want it audited -- would you (or the company/person that you had in mind) give me an estimate like "I'd need 3 weeks for 25%, 5 weeks for 50%, or 10 weeks for 95% coverage", or do you simply analyse away for a week (or whatever time I'm willing to pay you) and try to find something?
I don't have personal experience with it, but apparently these things are booked months in advance, on a contract basis. The engineer doing the audit spends an agreed number of weeks finding as many problems as they can, and hand you a report at the end.
That cheap? A freelance web/mobile developer can charge over $5K per week; I find it hard to believe that you could get a quality security code review for that price.
Great writeup but I guess I'm still a bit confused. As someone responsible for rails servers I can see that I need to update nginx and openssl as soon as packages become available or compile myself. What about keys though? Do I need to get our SSL certs re-issued? regenerate SSH keys? Anything else that I should be doing?
If you're running a vulnerable version of OpenSSL and want to be truly careful, assume your private keys (not just certs) are already compromised. Once new packages are available, you need to update and then re-roll your crypto.
Also, if you're using those keys to protect other secrets like passwords - say, DB credentials or AWS keys stored in a Git repo served behind HTTPS - you can't really assume those are safe either.
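As a rough sketch of what "re-roll" means in practice (standard openssl commands; the filenames are placeholders): generate a fresh keypair and CSR, have your CA reissue the certificate against the new key, deploy it, then revoke the old certificate.

openssl genrsa -out example.com.new.key 2048
openssl req -new -key example.com.new.key -out example.com.new.csr

Send the new CSR to the CA for reissue, and don't forget to also rotate any passwords, API keys, and session secrets that may have been sitting in the affected process's memory.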
I don't quite understand how this bug works. I would appreciate any input from someone knowledgeable.
It sounds like the heartbeat code is sending some data in the handshake. That data should be harmless (padding? zeroes?) but the bug results in reading off the end of an array and from whatever other data happens to be there. Someone sniffing the connection can then see those bytes fly by. If they happened to contain private info, game over.
Is that a correct read on the situation? If so, my followup questions are: 1) Why is there any extra data being sent at all beyond a simple command to "heartbeat"? 2) How much data is being leaked here and at what rate? Is it a byte every couple of hours, is it kilobytes per minute, or what?
I am particularly interested in #1, since that's the part I really don't get at the moment. I suspect the answer to #2 will be implied by the answer to #1.
>>> TLS heartbeat consists of a request packet including a payload; the other side reads and sends a response containing the same payload (plus some other padding).
So, what happens is that the payload comes in as a pointer and a claimed size (up to 64 KB). The server then prepares a response and copies the memory block [pointer, pointer+payloadSize] into the response.
The attack happens when the actual payload is smaller than the payload size claimed in the request. This results in the response preparation dumping the memory block [pointer+realPayloadSize, pointer+payloadSize] into the response.
Any data in this block is now exposed to the requester, and could be anything that happens to sit in the process's memory.
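For what it's worth, the shape of the fix is a single extra comparison before anything is copied. Roughly (variable names simplified here; not the exact identifiers from the patch):

/* payload    = length claimed by the requester (read from the 2-byte field)
   record_len = number of bytes actually received in this TLS record */
if (1 + 2 + payload + 16 > record_len)
    return 0; /* claimed payload doesn't fit in the record: silently discard */
memcpy(bp, pl, payload); /* only reached when the copy stays inside the request */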
Thanks. That lines up with what I've seen elsewhere too. I think the main thing I was missing was that this is not a sniffing attack, but rather an active attack where you talk to a peer over SSL and basically trick it into sending you some content from its memory.
Can an attacker access only 64k of memory? There is no total 64-kilobyte limit on the attack; that limit applies only to a single heartbeat. An attacker can either keep reconnecting or, during an active TLS connection, keep requesting an arbitrary number of 64-kilobyte chunks of memory content until enough secrets are revealed.
...so I guess the answer to 2 is only limited by how frequently you can change the heartbeat settings, and how frequently OpenSSL will send a heartbeat packet.
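For a rough sense of scale (purely illustrative numbers): if a peer answers, say, 100 heartbeat requests per second, that's on the order of 100 × 64 KB ≈ 6.4 MB of process memory leaked per second.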
One obvious - if slightly paranoid - answer is that this was a deliberate backdoor. There appears to be a length field specific to the heartbeat packet that's used to determine how much data from the original packet is included in the response, isn't checked against the actual packet length, and allows lengths up to 64k which is unnecessarily generous for the intended purpose but very useful for this attack.
It does take time for these things to be tested and deployed. Regardless of severity of bug, distributions must test packages before sending them out to all their users.
It would be unfortunate if a new package were to be released immediately only to be soon masked/recalled due to unforeseen consequences.
Of note, the Gentoo package was bumped approximately 2 hours after the advisory was published.
Yeah, I haven't seen any new RPMs for RHEL/CentOS/Fedora yet. Kinda concerning, since I'd expect vendors to be given advance notice and the chance to prep updates to coincide with the announcement.
All my RHEL5 boxes are running 0.9.8, though, at least.
Homebrew has been updated to 1.0.1g as of 6:00 PM GMT. It's important to note that this isn't an issue unless you have an outward-facing service that uses TLS and links against the brew/MacPorts copy of OpenSSL.
One (selfish) question I have is whether this can affect primary key material stored in an HSM. I'm assuming not, but that the session key generated by the HSM would still be susceptible.
Note that this bug affects way more programs than just Tor — expect everybody who runs an https webserver to be scrambling today.
"If you need strong anonymity or privacy on the Internet, you might want to stay away from the Internet entirely for the next few days while things settle." - torProject
Any chance this bug originated with the NSA? It seems like it would fall under their goal of subverting the infrastructure that keeps secrets on the internet. Of course this is exactly why such a goal is a bad idea - an unprotected internet causes widespread damage.
I don't know -- why don't you try reasoning it out since you're the one lobbing the accusation. Upon a very simple review of the code change/patch, one can see this is a relatively new feature, agreed upon and passed by the publicly available IETF, implemented naively.
"Never attribute to malice that which can be adequately explained by incompetence" -- slightly-butchered quote, from someone smarter than me.
It's not an accusation, it's a speculation. I don't have the ability to judge it for myself, i.e. "a simple review of the code change/patch". That's why I put it out there. I don't mind being refuted, but I wish it would be refuted rather than just downvoted blindly.
P.S. I think your quote doesn't capture the situation properly when someone is known to have malicious intent.
I don't think so - while the NSA would dearly like to have the access that this vulnerability would allow, they would dislike even more if anyone could have it. If they're going to insert a backdoor they're going to be damn sure only they have the key.
They did not try to "weaken RSA", as in the RSA algorithm; they paid off and/or infiltrated RSA the corporation. You were not attacked; your posts simply contained wrong information and useless speculation.
Screaming about the NSA every time a security bug comes up is not interesting, productive, insightful, or useful, please stop.
Asking "did they do this?" is not an accusation, seriously.
And even then, the NSA has had their fingers in enough places and lied about it enough times (infiltrating FOSS projects was explicitly one of their goals, IIRC) that the sane default position would be to assume shenanigans on their part unless proven otherwise.
The vetting process does absolutely nothing to prevent something like this from happening, especially since some very sneaky and pernicious bugs can be introduced in the guise of simple mistakes. It would be foolish to assume this isn't part of the standard playbook, and just as foolish to discount the possibility of maliciously introduced bugs just because the evidence doesn't immediately point to malicious intent - that is the nature of the attacker.
The alternative is remaining ignorant and vulnerable to the single most well funded and experienced adversary a crypto user will ever likely face.
We really need to see some of the big companies take down their services until they've fixed this and call out for every company out there to audit themselves and confirm to users that this is serious and should be checked and that no service should stay online until they've patched their systems. This should get attention beyond just techies. Business as usual is not acceptable since every day that goes by is the opportunity for someone to take advantage of this and get the keys to your service and all past traffic.
I would not be surprised if people at the NSA, GCHQ and most state security services are going into overdrive right now to get access to anything and everything that is vulnerable to this bug.
> I would not be surprised if people at the NSA, GCHQ and most state security services are going into overdrive right now to get access to anything and everything that is vulnerable to this bug.
I assume the NSA has known about this bug for a long time and has been actively exploiting it.
Note: if you use mint.com, it's likely hitting your banks with your login on your behalf today. You'll still want to change those passwords even if you didn't use banking sites during the known vulnerability window.
So, Google and Codenomicon independently found this two-year-old vulnerability at approximately the same time? How does that happen? Are they both looking at the same publicly-shared fuzzing data, or was there a patch that suddenly made it more obvious?
The obvious concern would be that one found it a good while ago, and just didn't bother announcing it until the other team was anyway. I don't believe that's what happened here, but I'm curious what the mechanism actually was.
Is there a way to tell if a third-party site has patched the bug? (Upgraded to 1.0.1g) Not much point in changing your password on that site before the vulnerability is fixed.
All references I see recommend (for 1.0.1-series) to move to 1.0.1g - but the OpenSSL homepage[0] says that 1.0.1g is a Work in Progress. There is a download[1] link for it though. Anybody have definitive answer for what's going on here? It's a little confusing.
I used the OpenSSL library for building a SAML token parser in JBoss (Java). All the front-end stuff was Java, and OpenSSL was used for public/private key decryption and validation of SAML tokens and signatures. I'm not sure exactly what an OpenSSL "server" is -- it sounds like there is a feature which you can implement (or not) in your webserver to test the SSL/TLS listener.
However, you could -- as I did -- use anything else as your interface for the web. Why you would specifically include a heartbeat just for SSL is beyond me. If a website is up and running, you'll know it by the usual methods, such as the HTTPS status codes. You don't need a separate "heartbeat" telling you that an internal mechanism for processing a protocol is running... do you?
Testing my externally-accessible OpenVPN server revealed that it is indeed vulnerable. I just powered the box off, going to be a long day at work before I can get home and fix it :/
How to build OpenSSL statically into a source build of nginx: I just finished running this with nginx-1.4.7 and OpenSSL 1.0.1g and it compiled just fine. You'll have to tweak it to your environment, of course.
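For reference, the core of such a build is roughly the following (paths are placeholders for wherever you unpacked the two source trees):

cd nginx-1.4.7
./configure --with-http_ssl_module --with-openssl=../openssl-1.0.1g
make && make install

The --with-openssl=DIR option tells nginx's configure script to build that OpenSSL source tree and link it statically instead of using the system library.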
What popular SSL client software uses the vulnerable OpenSSL? (Any web browsers, for example on popular linuxes? How about 'curl' when connecting to HTTPS sites?)
How would a client be compromised? I mean I guess a malicious server could send these bad heartbeat packets and sniff the keys, but if the server is pwned then your secrets are already revealed, right?
Imagine you've got a script that, among other things, does a 'wget' against some innocent plain HTTP URL. But an attacker intercepts your request, and redirects you to an HTTPS URL of their choosing.
Yes, wget uses OpenSSL, and follows redirects silently by default.
Now that server uses Heartbleed to x-ray your client process memory, collecting all sorts of confidential information, including perhaps credentials to other services.
This bug has a lot of nasty, unintuitive permutations and repercussions that will take time to fully grasp.
What I find strange is that I have a VPS setup on Digital Ocean, with Ubuntu LTS + OpenSSL 1.0.1 + a manually compiled Nginx. This combination should have been vulnerable, yet my website is not reported as vulnerable by the tools I tried for detecting the vulnerability.
Maybe DigitalOcean issued a fix without me noticing? I also updated my Ubuntu packages, yet OpenSSL is still at 1.0.1.
Tinfoil-hat time: is it interesting that within hours (?) of public disclosure of the bug, there's a domain, a logo, a full writeup, everything? The paranoid part of me says the nefarious powers-that-be want us to use the latest version, as though that would further their goals somehow.
Common sense says I'm just being silly. I just wonder.
How feasible would it be to write things like nginx, Apache, web browsers etc. so that they can use both OpenSSL and NSS, where you could choose which to use via a config switch? Then it would be easy to "fix" such a bug when it occurs. The probability that both libraries have a vulnerability at the same time is probably very low.
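It has been done elsewhere (curl, for instance, can be built against several TLS libraries), but it is a lot of work because the libraries' APIs differ substantially. A hypothetical sketch of the kind of shim an application would need (all names invented for illustration):

#include <stddef.h>

/* A backend-agnostic TLS interface: each backend (OpenSSL, NSS, ...)
   fills in one of these, and the server is written only against it. */
struct tls_backend {
    void *(*ctx_new)(const char *cert_file, const char *key_file);
    void *(*conn_new)(void *ctx, int socket_fd);
    int   (*handshake)(void *conn);
    int   (*read)(void *conn, void *buf, size_t len);
    int   (*write)(void *conn, const void *buf, size_t len);
    void  (*conn_free)(void *conn);
    void  (*ctx_free)(void *ctx);
};

The hard part is everything a struct like this doesn't capture - session resumption, client certificates, renegotiation, error reporting - which is exactly where the libraries differ, and why few servers bother.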
OK well I just updated about 40 servers. Has anyone started working with CAs to reissue SSL certificates signed with a new key? Are they willing to do the reissue for free? In particular I use RapidSSL for most things and Verisign for a few bigger clients who prefer it.
I don’t; but I do not know how I could ever be sure. I’m a generalist sys admin and my knowledge of crypto is limited to the basics. That being said my understanding is that this vulnerability is in the code that creates the sessions not in the certificates themselves. The risk is that my key already was compromised when I was using the vulnerable version. For me this means two things:
1) There is no easy way for me to confirm or deny the CA is fixed short of attempting to exploit them.
2) Even if the CA is not fixed, the vulnerability appears to be in the routines used for session management, not in the SSL certificate itself. While there is credit card information and other stuff I would not like to be leaked, the CSR itself only contains my public key, not my private key. As long as my servers are patched and I have an SSL cert using a new keypair that I know has not been compromised, I am not sure whether the CA's version of OpenSSL matters or not.
I am in no way trying to pretend I am an expert. I am sure there are problems with my analysis, but it still feels like it's time to be pragmatic and get a fix in place before asking all the what-ifs. Not that those questions should not be asked, but it's a matter of prioritizing.
Would you be somewhat better protected (i.e., not losing private keys, etc.) if your machine sat behind a load balancer? The memory exposed would be that of the load balancer, correct?
How is it that Google and Hotmail were not vulnerable? Were they using their own implementations of SSL? I would have figured Google would make use of OpenSSL.
It means that if you're running a bad version of OpenSSL, someone can repeatedly dump 64 KB chunks of your process memory, including public/private keys and anything else in memory, such as passwords and even DB connection credentials.
As far as I can tell, openvpn with TLS authentication is vulnerable as it just uses the usual TLS suite. If you use PSKs or the (mis-named?) --tls-auth PSK additional MAC, then you are only owned if one of your own legitimate nodes revealed the PSK (or was coopted into performing this attack) in which case you're already owned.
Ideally, no, but with SSL, and a very-strong password, it might not be worse than other options for automating payouts... until a major bug like this comes along.
For comparison, the Bitcoin RPC password timing bug - https://github.com/bitcoin/bitcoin/issues/2838 - would have been a slower and more blatant/detectable way to compromise the same sorts of Bitcoin RPC daemons.
I'm fairly confident that you're wrong about "nobody"... but you'll have to find the examples yourself.
Yes, it requires extra settings in bitcoin.conf: to enable RPC, accept connections from non-local addresses, and use SSL. But it's only 5 lines if the host is not otherwise firewalled from the net:
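Something along these lines, assuming the bitcoind option names of that era (a sketch of what such an exposed configuration looked like, not a recommendation):

server=1                # enable the JSON-RPC server
rpcssl=1                # serve RPC over SSL (the OpenSSL code path)
rpcallowip=*            # accept RPC connections from any address
rpcuser=someuser
rpcpassword=longrandompassword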
But couldn't it be the case that with this bug you could sweep private keys from server's memory if they happen to be in there, because bitcoind is using them at the moment?
Yes, I would put that under the "implications for Bitcoin web services that use HTTPS" category. For context, the deleted question asked about the "Bitcoin blockchain" so I was responding to that specifically, with the added caveat for Bitcoin services that use HTTPS.
So, basically, it is the consequence of "quickly adding an implementation" of an extension of the TLS protocol to an otherwise mature, more-or-less solid, and "slightly" audited (at least by the OpenBSD and FreeBSD teams) code base. OK. It happens.
btw, is OpenBSD affected, or did they do the job well by not blindly adding unnecessary stuff (extensions) and not bumping versions without auditing the changes?
"goto fail;" doesn't seem that bad now huh.
Lovely how these GNU/Linux freedom fighters were LOLling their asses off earlier, but when it happens to them they sweat themselves and cry for spoon-fed instructions to compile a software package from its sources.