My opinion, then and now, is that C and other languages without memory checks are unsuitable for writing secure code. Plainly unsuitable. They need to be restricted to writing a small core system, preferably small enough that it can be checked using formal (proof-based) methods, and all the rest, including all application logic, should be written using managed code (such as C#, Java, or whatever - I have no preference).
This vulnerability is the result of yet another missing bound check. It wasn't discovered by Valgrind or some such tool, since it is not normally triggered - it needs to be triggered maliciously or by a testing protocol which is smart enough to look for it (a very difficult thing to do, as I explained on the original thread).
The fact is that no programmer is good enough to write code which is free from such vulnerabilities. Programmers are, after all, trained and skilled in following the logic of their program. But in languages without bounds checks, that logic can fall away as the computer starts reading or executing raw memory, which is no longer connected to specific variables or lines of code in your program. All non-bounds-checked languages expose multiple levels of the computer to the program, and you are kidding yourself if you think you can handle this better than the OpenSSL team.
We can't end all bugs in software, but we can plug this seemingly endless source of bugs which has been affecting the Internet since the Morris worm. It has now cost us a two-year window in which 70% of our internet traffic was potentially exposed. It will cost us more before we manage to end it.
TLS heartbeat consists of a request packet including a payload; the other side reads and sends a response containing the same payload (plus some other padding).
In the code that handles TLS heartbeat requests, the payload size is read from the packet controlled by the attacker:
    n2s(p, payload);   /* read the 16-bit payload length from the packet */
    pl = p;
pl is the pointer to the actual payload in the request packet.
Then the response packet is constructed:
    /* Enter response type, length and copy payload */
    *bp++ = TLS1_HB_RESPONSE;
    s2n(payload, bp);
    memcpy(bp, pl, payload);
The bug is that the payload length is never actually checked against the size of the request packet. Therefore, the memcpy() can read arbitrary data beyond the storage location of the request by sending an arbitrary payload length (up to 64K) and an undersized payload.
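For concreteness, here is a sketch of the kind of check that was missing, assuming (as in OpenSSL's record layer) that s->s3->rrec.length holds the number of bytes that actually arrived:

    /* 1 type byte + 2 length bytes + payload + minimum 16 bytes of
     * padding must all fit within the record that was actually
     * received; otherwise discard the message silently (RFC 6520). */
    if (1 + 2 + payload + 16 > s->s3->rrec.length)
        return 0;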
I find it hard to believe that the OpenSSL code does not have any better abstraction for handling streams of bytes; if the packets were represented as a (pointer, length) pair with simple wrapper functions to copy from one stream to another, this bug could have been avoided. C makes this sort of bug easy to write, but careful API design would make it much harder to do by accident.
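As a sketch of what such an abstraction could look like (bstream and bs_copy are names made up for illustration, not anything in OpenSSL):

    #include <stddef.h>
    #include <string.h>

    /* A byte stream that knows how many bytes it actually holds. */
    struct bstream {
        unsigned char *buf;
        size_t len;   /* bytes actually present in buf */
        size_t pos;   /* read/write cursor, pos <= len */
    };

    /* Copy n bytes from src to dst; fail instead of overreading. */
    static int bs_copy(struct bstream *dst, struct bstream *src, size_t n)
    {
        if (n > src->len - src->pos)   /* request exceeds real data */
            return -1;
        if (n > dst->len - dst->pos)   /* no room in the destination */
            return -1;
        memcpy(dst->buf + dst->pos, src->buf + src->pos, n);
        src->pos += n;
        dst->pos += n;
        return 0;
    }

With something like this, the heartbeat handler's copy becomes bs_copy(&response, &request, payload), which simply fails on a Heartbleed-style request instead of echoing back whatever happens to sit past it in memory.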
This kind of tool (SSL) should be written in Ada or Haskell.
C and C++ are just fine, the fact that the OpenSSL guys cocked it up is not the language's fault, it is theirs. There are efficient ways to prevent this type of bug.
The parent had a good point and you should really try to look at Haskell before you say that kind of nonsense.
All the tools that are available for static analysis are basically extra type systems bolted on top of existing languages.
If you try to detect buffer overflows in the Linux kernel using static analysis, what you need to do is go through the source code and define invariants. Those invariants are TYPES in languages powerful enough to express them.
For example, the invariant that memory, or any other allocated resource, must eventually be freed can be expressed in Haskell.
In C++ it cannot be expressed. There are workarounds like RAII, but that does not give any guarantees.
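As a minimal sketch of how such an invariant becomes a type, in the style of Control.Monad.ST's runST; Region, RHandle, and runRegion are illustrative names, not a real library:

    {-# LANGUAGE RankNTypes #-}

    newtype Region s a = Region (IO a)     -- a computation inside a region
    newtype RHandle s  = RHandle Int       -- a resource tagged with its region

    acquire :: Region s (RHandle s)        -- obtain a resource in this region
    acquire = Region (pure (RHandle 0))

    -- The rank-2 'forall s' is the invariant: the result type 'a' cannot
    -- mention 's', so no RHandle can outlive its region. The compiler
    -- rejects 'runRegion acquire' as a type error.
    runRegion :: (forall s. Region s a) -> IO a
    runRegion r = case r of Region io -> io   -- acquire/release bracketing goes here

RAII frees the resource, but nothing in C++'s type system stops you from stashing a reference that outlives the scope; here, that program simply doesn't compile.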
If you do not think type systems and thus languages make any differences, you also cannot believe that formal verification makes any difference, because type systems are a weak form of formal verification. How "weak" depends on the language.
You should also read up on the Curry-Howard correspondence to learn something about the deep connections between types, programs, and proofs.
Besides the language peculiarities, a garbage collected or interpreted language is very vulnerable to side channel attacks because of the large amount of complicated behaviour that is being glossed over by the language runtime. (One example would be garbage collection rounds and timing attacks, but I'm sure smarter people would find tons of features that leak secret information. Another example is on-demand JIT'ing when code becomes hot in certain runtimes. The timing of such a JIT stall could publish information you thought secure.)
Economics plays an invisible part here. Someone writing a library has a limited amount of time to implement some set of features, and to balance that against other needs, like making the code "clean"/pretty and secure. In this case, pretty code and secure code are akin. Consumers would likewise have to balance out feature needs with how likely the code is going to explode. What it comes down to is that you aren't likely to have secure, stable code in a language that doesn't inherently encourage it.
It starts to be clearer then, that the more modern, "prettier" languages offer material benefits in their efforts to be more elegant.
Even in C, Go or Python, I column align any text that is remotely similar, so differences are obvious.
Clean code might be extra work up front, but the total work (including maintenance) should amortize to less. The value of reducing cognitive load in a large production codebase cannot be overstated.
"All versions of the open source Ruby on Rails Web application framework released in the past six years have a critical vulnerability that an attacker could exploit to execute arbitrary code, steal information from databases and crash servers."
"A lingering security issue in Ruby on Rails..."
"Ruby on Rails security updates patch XSS, DoS vulnerabilities"
On the same note, C != C++ either, and you can write large systems in C++ without ever manually managing memory. You can use only bounds-checked functions.
And you can have large security holes if you're not careful, no matter which language you pick.
The first is an example of an error made more common by the language design, the other an example of errors typical for a class of applications. There's a fundamental difference here. There's a ton of reasons to criticize ruby and it brings its own set of flaws and problems, some rooted in the language and some rooted in its ecosystem - but the given examples just show that web applications are hard to get right. That's why this is not "a point not well made" but rather "sorry, you're attacking a strawman here".
The fact that these languages don't automatically do all my system administration tasks for me is not an argument against using them.
There are very fundamental connections between strong typing, program verification, and proofs.
Thus, the argument that Haskell probably has the same problems is simply false.
There are large web platforms in Haskell. Yesod is probably the largest ecosystem. It is clearly not as well used as RoR, but anyone can dig through large amounts of code to try to find these bugs.
What Haskell has that everyone else has are bugs/misunderstandings in how protocols are implemented. Sometimes there can be fundamental bugs in the run-time-system. However, large classes of bugs are fundamentally less likely to appear than in less safe languages.
For example, here, if the guarantee of functional programming is that a given input leads to a given output and has no memory side effects, then your attack surface area is a lot, lot smaller.
This is a question of priorities. We have speed and security. If you choose C/C++ (non-existent automated checking of memory access) you are choosing speed first, security second.
If security is critical then you need to choose a language that makes array out-of-bounds access well nigh impossible. This is an easy problem -- we have languages that will give this to us.
What percentage of exploits in the wild come from array (and pointer) access out of bounds? I'd venture to say it is above 50%.
Rather than have programmers everywhere "try hard to be careful" writing this code, let them use a safer language and have a few really smart folk work on optimizing the compiler for said language to make the safety checks faster (e.g. removing provably unnecessary/redundant checks).
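As a quick Rust sketch of that idea (illustrative, not from any real codebase): with the indexed loop the compiler must prove the check redundant, while the iterator form never has an index to check at all.

    // Indexed loop: each xs[i] implies a bounds check, but the compiler
    // can usually prove i < xs.len() and drop it.
    fn sum_indexed(xs: &[u64]) -> u64 {
        let mut total = 0;
        for i in 0..xs.len() {
            total += xs[i];
        }
        total
    }

    // Iterator form: no index, so no bounds check in the first place.
    fn sum_iter(xs: &[u64]) -> u64 {
        xs.iter().sum()
    }

    fn main() {
        let v = vec![1, 2, 3, 4];
        assert_eq!(sum_indexed(&v), sum_iter(&v));
    }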
People think that choosing C/C++ has a better business case (i.e. better performance / scaling) because "being really careful" works most of the time. The problem is that when Heartbleed (or the next array out-of-bounds bug) hits, the business case's ROI no longer looks so much better than the safer path.
A better language won't eliminate all security holes but it can eliminate a huge class of them and allow engineers to focus the energy they used to spend on "being really careful about array access and pointers" on other tasks (be they security, performance or feature related).
EDIT: stating the obvious .. there are good uses for C style languages but writing large bodies of software that needs to be resistant to malicious user attacks is not one of them.
BTW, Amazon AWS ELB is vulnerable, confirmed publicly by their support.
I gave this some thought earlier today, and expect that address space randomisation can make this bug eventually expose the server keys. You need to hit an address that has been just vacated from a (crashed) httpd worker.
Most implementations clear encryption key material on exit, but a crashed process never got to run that code.
Of course, servers helpfully just start themselves back up again.
As for scanning for key material, I wonder how to tell that 256-bit random data is the 256-bit random data you want.
The Cold Boot attack paper by Halderman, Schoen et al. here
...discusses this in detail in chapter 6, Identifying Keys in Memory.
EDIT: fixed the reference
If you can dredge up 64kB of fresh data every time, that's 511,744 tests per shovelful, which is quite a bit to sift through from a performance perspective, but it's also a trivially parallel task.
Additionally, folk might know of even better ways to narrow that down. For example, the data representation in memory might have easy to grep for delimiters.
256 bits is 32 bytes
If you get 64kB of payload data back each time then it can only contain 65536-31=65505 different consecutive strings of 32 bytes.
The RFC doesn't mention why there has to be a payload, why the payload has to be a random size, why they echo the payload back, or why there has to be padding after the payload. And the RFC presents this data as if it were a regular C struct (I didn't know you could have a struct with a variable size; apparently the fields are really pointers, or it's just a mental model and not a real struct).
Apparently the purpose of the payload is path MTU discovery. Something that is supposed to happen at the IP layer, but I don't know enough about datagram packets. I guess an application may want to know about the MTU as well...
I'm not here to point fingers, I'm just saying C is a nightmare to me and a reason for me to never be involved with system programming or something like drafting RFC's ;-).
But if one can argue that C is a bad choice for writing this stuff, then that is not an isolated thing. "C" is also the language of the RFCs. "C" is also the mindset of the people doing that writing. After all, the language you speak determines how you think. It introduces concepts that become part of your mental models. I could give many examples, but that's not really the point.
And it's about style and what you give attention to. To me, that RFC is a really bad document. It starts explaining requirements for exceptional scenarios (like when the payload is too big) before even introducing and explaining the main concepts and the hows and whys.
So while you may argue that this is a C problem and not a protocol problem, it is really all related.
And you may also say, in response to someone blaming these coders, that blame is inappropriate (and it is) because these are volunteers donating their free time to something they find valuable. But the whole distribution and burden of responsibility is, naturally, also part of the culture and how people self-organize, and so on.
As someone else explained (https://news.ycombinator.com/item?id=7558394) the protocol is real bad but it is the result of more or less political limitations around submitting RFCs for approval. There is no reason for the payload in TLS (but apparently there is in DTLS) but my point is simply this:
If you are doing inelegant design this will spill over into inelegant implementation. And you're bound to end up with flaws.
Rather than trying to isolate the fault here or there, I would say this is a much larger cultural thing to become aware of.
But it's a disingenuous one. It ignores the realities of systems. The reality is that there is currently no widely available memory-safe language that is usable for something like OpenSSL. .NET and Java (and all the languages running on top of them) are not an option, as they are not everywhere and/or are not callable from other languages. Go could be a good candidate, but without proper dynamic linking it cannot serve as a library callable from other languages either. Rust has a lot of promise, but even now it keeps changing every other week, so it will be years before it can even be considered for something like this.
Additionally, although the parsing portions of OpenSSL need not deal with the hardware directly, the crypto portions do. So your memory-safe language needs some first-class escape hatch to unsafe code. A few of them do have this, others not so much.
It's fun to say C is inadequate, but the space it occupies does not have many competitors. That needs to change first.
Anyway, I am quite realistic about the prospect of my comment having that kind of effect on the industry - I don't suffer from delusions of grandeur. I was aiming the comment more at people who choose C/C++ for no good reason to write a user-level app; that app is nearly certain to have memory use errors, and if it has any network or remote interface, chances are they can be easily exploited. I'd like as many people as possible to understand that they can't expect to avoid such errors, any more than one of the most heavily audited pieces of software avoided them. We have had decades of exploits of this vulnerability, and yet most programmers are oblivious to it, or think only bad programmers are at risk. So just as tptacek goes around telling people not to write their own crypto, I go around telling people - with less authority and effectiveness, unfortunately - not to write C/C++ code unless they really need to.
As for the performance issues forcing OpenSSL to use C, well, we apparently exposed all our secrets in the pursuit of shaving off those cycles. I hope we are happy.
Just one thing: when I brought up talking directly to the hardware, I did not mean just for performance's sake. Avoiding side-channel attacks often requires a high degree of control over the generated machine code, and that is the primary reason not to do it in higher-level languages (unless they also permit that level of control).
This is as much a cultural as a technical problem; C really is in the last chance saloon for this sort of problem, we have the solution to hand, but a strong cadre of developers will still only consider C for this sort of work.
I love C and C++, and each one has its place, but really, they're very different. Almost as much as C++ is to Java, for example.
I'm fine with low-level libraries being written in C++, but would hope that developers expose a C API around everything.
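For example, something along these lines, with the C++ kept behind an opaque handle; Session and the session_* names are made up for illustration:

    #include <new>

    namespace impl {
    class Session {
    public:
        int handshake() { return 0; }   // real logic elided
    };
    }

    extern "C" {

    typedef struct session session_t;   // opaque handle for C callers

    session_t *session_new(void) {
        // A real wrapper must also keep exceptions from crossing this boundary.
        return reinterpret_cast<session_t *>(new (std::nothrow) impl::Session);
    }

    int session_handshake(session_t *s) {
        return reinterpret_cast<impl::Session *>(s)->handshake();
    }

    void session_free(session_t *s) {
        delete reinterpret_cast<impl::Session *>(s);
    }

    }  // extern "C"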
Edit: I say this having used VHDL quite a bit. I appreciate its type strictness and ranges.
I've been advocating Ada's use on HN for a few years now, but it always falls on deaf ears. People seem to think it's old and dead like COBOL or old FORTRAN, but it's really a quite modern language that's extremely well thought out. Its other drawback is that it's pretty ugly and uses strange names for things (access is the name given to pointer-like things, but Ada specifies that if you have, say, a record with a 1-bit Bool in it, you must be able to create an access to it, so a plain pointer is not sufficient).
Note: I'm not suggesting it for current production use, but rather as something that could be expanded further in the future.
"IRONSIDES is an authoritative DNS server that is provably invulnerable to many of the problems that plague other servers. It achieves this property through the use of formal methods in its design, in particular the language Ada and the SPARK formal methods tool set. Code validated in this way is provably exception-free, contains no data flow errors, and terminates only in the ways that its programmers explicitly say that it can. These are very desirable properties from a computer security perspective."
Personally, I like Ada a lot, except for two issues:
1. IO can be very, very verbose and painful to write (in that way, it's a bit like Haskell, although not for the same reason). Otherwise, it's a thoroughly modern language that can be easily parallelized.
2. Compilers for modern Ada 2012 are only available for an extravagant amount of money (around $20,000 last I heard) or under the GPL. Older versions of the compiler are available under GPL with a linking exception, but they lack the modern features of Ada 2012 and are not available for most embedded targets. And the Ada community doesn't seem to think this is much of a problem (the attitude is often, "well, if you're a commercial project, just pony up the money"). The same goes for the most capable Ada libraries (web framework, IDE, etc.) -- they're all commercial and pretty costly. Not an ideal situation for a small company.
But yes, Ada is exceptionally fast and pretty safe. There's a lot of potential there, but to be honest my hopes are pinned on Rust at this point.
For example, it has only just gotten (partial) Ada 2012 features, a full 3-4 years after the AdaCore GPL compiler.
A larger problem, in my opinion, is that things like OpenSSL are used (And should be!) from N other languages. As a result, calling into the library requires almost by definition lowest-common denominator interfaces. Which is C.
C code calling into Rust can certainly be done, but I believe it currently prohibits using much of the standard library, which also removes a lot of the benefits.
C++ doesn't, I think, have as much of a problem there, but I'm somewhat skeptical of C++ as a silver bullet in this case.
I don't know about Ada and any other options.
Nimrod  also generates C, but as I understand it garbage collection is unavoidable.
I know that historically it's been easier to write a code generator than a compiler backend, but with LLVM you get the backend just as cheaply as the code gen.
This way it would be Rust -> LLVM IR -> C -> GCC for MSP430.
What are you referring to here?
The problem is, you do need them. There's no 1:1 C# -> C extraction. You have to extract at least some part of the CLR along with it.
For the other points there is some debate, but don't most serious languages have a C FFI?
I googled around just now for some benchmarks on the overhead of FFIs. I found this project, which measures the FFI overhead of a few popular languages. Java and Go do not look competitive there; Lua surprisingly came out on top, probably by inlining the call.
Before you retort with an argument that a few cycles do not matter that much, remember that OpenSSL does not run only in laptops and servers; it runs everywhere. What might be a small speed bump on x86 can be a significant performance problem elsewhere, so this is something that cannot be simply ignored.
Considering that in C the plusone call is 4 or so cycles, and the Java example is 5 times slower, that's only 20 or so cycles, i.e. roughly 16 cycles of overhead. If the function we're FFIing into is 400 cycles, that's only a ~4% slowdown. I'm willing to pay that price if it means not having to wake up to everything being vulnerable every couple of months.
There are other means of achieving a secure implementation, such as programming in a very high-level language, such as Cryptol, and compiling to a low-level language:
Aren't Operating Systems lower level than OpenSSL?
Functional programming's unpopularity is not rooted in any real or imagined inability to write operating systems.
Perhaps they would balk at the idea of writing a kernel in Haskell, but it has been done before.
http://metasepi.org (mostly in Japanese)
http://www.ipa.go.jp/files/000036232.pdf (Slides in English)
If you looked at all for "best practices", things like RAII, STL/Boost, and other concepts became very clear, very quickly, and these are the types of things that limit these kinds of bugs (RAII, in particular). Now, to be fair, I was writing crypto-related software, so I was paying very close attention, but I didn't really have to hunt.
Maybe a language like Rust will offer a safer alternative at some point, but that point surely isn't today, and probably not tomorrow, either. Maybe there are other languages that offer better safety, but they often bring along their own set of very serious drawbacks.
In terms of writing relatively safe code today, that performs relatively well, that can integrate easily with other libraries/frameworks/code, and can be readily maintained, the use of C++ with modern C++ techniques is often the only truly viable option.
Rust absolutely does offer a safer alternative today. The only problem with Rust at this point is that the standard library is in a state of great flux, which makes it hard to use the language for serious projects. But the memory safety is there.
And even with all the changes, there are at least a couple of companies using the last tagged release of Rust in production.
That said, I fervently hope that Rust can hit 1.0 soon (as in this year). A lot of people are looking to move on from C and C++ at this point, but a lot are moving to Go, D, or Nimrod because Rust has been beta for so long (yes, I know Go is technically not in the same tier as Rust, D, and Nimrod). Once they put in the effort of learning these languages, they're unlikely to switch to Rust, thus missing out on all the safety guarantees that Rust offers.
I should probably finish Bjarne's C++11 book - I am maintaining a codebase of old style C++ and seem to be stuck in the old methods of doing it, mainly because of using compilers that don't have C++11 support.
Is there any recommended reading on new style C++ other than Bjarne's book?
Just the ones who don't understand how good API designs can work well to solve these problems, don't worry not all of us are like that :)
For example, a shared library that implements SSL would have to be a shim for something living in a separate process space.
That is a Haskell implementation of TLS. It is written in a language that has very strong guarantees about mutation, and a very powerful type system which can express complex invariants.
Yes, crypto primitives must be written in a low level language. C is not low level enough to write crypto, neither securely nor fast, so that's not an argument in its favor.
I vehemently disagree. Well-written C is very easy to audit. Much much moreso than languages like C# and Java, where something I could do with 200 lines in a single C source file requires 5 different classes in 5 different files. The problem with C is that a lot of people don't write it well.
Have you looked at the OpenSSL source? It's an ungodly f-cking disaster: it's very very difficult to understand and audit. THAT, I think, is the problem. BIND, the DNS server, used to have huge security issues all the time. They did a ground-up rewrite for version 9, and that by and large solved the problem: you don't read about BIND vulnerabilities that often anymore.
OpenSSL is the new BIND; and we desperately need it to be fixed.
(If I'm wrong about BIND, please correct me, but AFAICS the only non-DoS vulnerability they've had since version 9 is CVE-2008-0122)
> but we can plug this seemingly endless source of bugs which has been affecting the Internet since the Morris worm.
If we're playing the blame game, blame the x86 architecture, not the C language. If x86 stacks grew up in memory (that is, from lower to higher addresses), almost all "stack smashing" attacks would be impossible, and a whole lot of big security bugs over the last 20 years could never have happened.
(The SSL bug is not a stack-smashing attack, but several of the exploits leveraged by the Morris worm were)
Including people responsible for one of the most important security-related libraries in the world. No matter how good and careful a programmer is, they are still human and prone to errors. Why not put every chance on our side and use languages (e.g. Rust, Ada, ATS, etc.) that make entire classes of errors impossible? They won't fix all problems, and definitely not those associated with having a bad code base, but it'd still be many times better than hoping people don't screw up with pointer lifetimes.
I don't think intentionally preventing the programmer from doing certain things the computer is capable of doing on the theory it makes errors impossible makes sense.
As I've said several times in this thread, somebody has to deal with the pointers and raw memory because that's the way computers work. Using a language where the language runtime itself handles such things only serves to abstract away potential errors from the programmer, and prevents the programmer from doing things in more efficient ways when she wants to. It can also be less performant, since the runtime has to do things in more generic ways than the programmer would.
> Including people responsible for one of the most important security-related libraries in the world.
I think you've hit on a crucial part of the problem: practically every software company on Earth uses OpenSSL, but not many of them pay people to work on it.
calvinow@Mozart ~/git/openssl $ git log --format='%aE' | grep -Po "@.*" | sort -u -
> I don't think intentionally preventing the programmer from doing certain things the computer is capable of doing on the theory it makes errors impossible makes sense

> somebody has to deal with the pointers and raw memory because that's the way computers work
You're confusing the difference between syntactical restrictions and actual restrictions on what one can make the computer do.
I define "things the computer is capable of" as "arbitrary valid object code executable on the CPU". (Valid here meaning "not an illegal instruction".) Any language that prevents me from producing any arbitrary valid object code is inherently restrictive. C allows me to do this. I can even write functionless programs in C, although it's often non-portable and requires mucking with the linker. If the CPU has some special instruction I want to use, I can use inline assembly.
Any language that prevents me from doing arbitrary pointer arithmetic and memory accesses prevents me from doing a lot of useful things I can do in C. See my other comment about linked lists with 16-bit pointers on a 64-bit CPU.
My understanding of Rust is that its pointers have similar semantics to the "smart pointers" in C++. If that's the case, my understanding is that it would prevent me from doing things like the 16-bit linked list (please, correct me if I'm wrong).
> in C, I can make the computer do absolutely anything I want it to in exactly the way I want it to. Maybe I don't like 8-byte pointers on my 64-bit CPU, and I want to implement a linked list allocating nodes from a sparse memory-mapped pool with a known base address using 16-bit indexes which I add to the base to get the actual addresses when manipulating the list? That could be a big win depending on what you're doing, and (correct me if I'm wrong) there is no way to do that in Java or Haskell.
Bugs will still occur, just in a different way: Java is advocated as being a much "safer" language, but how many exploits have we seen in the JRE? Going to more restrictive, more complex languages in an attempt to fix these problems will only lead to a neverending cycle of increasing ignorance and negligence, combined with even more restrictive languages and complexity. I believe the solution is in better education and diligence, and not technological.
Very few. I don't think I can remember ever seeing an advisory for Java's SSL implementation.
Yes, bugs are possible in all languages, but that doesn't mean there's no difference between languages. I'm reminded of Asimov: "When people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong. But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together."
(There are a large number of bugs in the browser plugin used for java applets, but they have no relation to the JRE itself)
There are languages that make it very very hard to write bad code. Haskell is a good example of where if your program type-checks, there's a high chance it's probably correct.
C is a language that doesn't offer many advantages but offers very many disadvantages for its weak assurances. Things like the Haskell compiler show that you can get strong typing for free, and there's no longer many excuses to run around with raw pointers except for legacy code.
I really wish people would stop saying this. It's not in the slightest bit true. Making this assertion makes Haskellers seem dangerously naive.
(It's not free: you need to link in the entire Haskell runtime system, which is not small. But you can absolutely do it.)
Sure, but how much slower is Haskell than an equivalent implementation in C? Some quick searching suggests numbers like 1000% slower... and no amount of security is worth a 1000% performance hit, let alone a vague "security mistakes are less likely this way" sort of security. Being secure is useless if your code is so slow that you have to run so many servers you don't make a profit.
Could the Haskell compiler be improved to the point that this isn't a problem? Maybe. Ultimately I think the problem is that Haskell code is very unlike object code, and that makes writing a good compiler very difficult. C is essentially portable assembler; translating it to object code is much more straightforward.
> C is a language that doesn't offer many advantages but offers very many disadvantages for its weak assurances.
C offers simplicity. Sure, there are some quirks that are complex, but by and large it is one of the simplest languages in existence. Once you understand the syntax, you've essentially learned the language. Contrast that to Java and C#: you are essentially forced by the language to use this gigantic and complicated library all the time. You are also forced to write your code in pre-determined ways, using classes and other OOP abstractions. In C, I don't have to do that: I can write my code in whatever way I feel makes it maximally readable and performant.
C also offers flexibility: in C, I can make the computer do absolutely anything I want it to in exactly the way I want it to. Maybe I don't like 8-byte pointers on my 64-bit CPU, and I want to implement a linked list allocating nodes from a sparse memory-mapped pool with a known base address using 16-bit indexes which I add to the base to get the actual addresses when manipulating the list? That could be a big win depending on what you're doing, and (correct me if I'm wrong) there is no way to do that in Java or Haskell.
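For the curious, here is a bare-bones sketch of that idea in C; a real version would allocate the pool from the sparse memory-mapped region and handle freeing, which I'm omitting for brevity:

    #include <stdint.h>
    #include <stdio.h>

    #define POOL_CAP 65535      /* index 0xFFFF is reserved as NIL */
    #define NIL      0xFFFF

    struct node {
        uint32_t value;
        uint16_t next;          /* 2-byte index into pool[], not an 8-byte pointer */
    };

    static struct node pool[POOL_CAP];
    static uint16_t pool_used;

    static uint16_t node_alloc(uint32_t value, uint16_t next)
    {
        uint16_t i = pool_used++;   /* no overflow/free handling, for brevity */
        pool[i].value = value;
        pool[i].next = next;
        return i;
    }

    int main(void)
    {
        uint16_t head = NIL;
        for (uint32_t v = 0; v < 5; v++)
            head = node_alloc(v, head);           /* push front */
        for (uint16_t i = head; i != NIL; i = pool[i].next)
            printf("%u\n", pool[i].value);        /* prints 4 3 2 1 0 */
        return 0;
    }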
> there's no longer many excuses to run around with raw pointers except for legacy code.
If by "raw pointer" you mean haphazardly casting different things to (void * ) or (char * ) and doing arithmetic on them to access members of structures or something, I agree, 99.9% of the time you shouldn't do that.
Or are you talking about that "auto_ptr" and so-called "smart pointer" stuff in C++? In that case, your definition of "raw pointer" is every pointer I've ever defined in all the C source code I've ever written.
Pointers exist because that's the way computer hardware works: they will never go away. I'd rather deal with them directly, since it allows me to be clever sometimes and make things faster.
With things like stream fusion (http://research.microsoft.com/en-us/um/people/simonpj/papers...) , which I imagine would capture a lot of crypto calls, GHC can generate some very performant code (paper contains examples of hand-written C code being beat by Haskell code, and the C code is far from naive).
There are a lot of tricks at your disposal when you know more about the state of the code. And compilers are usually better than humans in this regard.
As an aside, I'm reasonably sure that GCC can use SIMD instructions for certain bulk memory operations in certain circumstances if you feed it the right -march= parameters... I don't think it's as clever as the techniques in this paper, however.
A lot of Haskell stuff is based around its lazy evaluation model though
Yes it is. Most of the applications I use take roughly 0% of my processor's capacity. I can spare a multiple like that outside of a few hot loops.
Saying 10x slowdown isn't worth it is like saying no one would ever compute on a phone.
Also random BS benchmark http://benchmarksgame.alioth.debian.org/u32/benchmark.php?te... says haskell is at least half as fast as C.
We're talking about a server-side vulnerability in OpenSSL here, not applications running on your personal computer.
Roughly speaking, making your server code twice as slow means it will cost you twice as much money to run your servers. Of course, that depends a lot on what exactly you're doing and is obviously not always true... but OpenSSL is a very performance-critical piece of most server-side software in the wild.
If OpenSSL suddenly became twice as slow, it would cost a lot of people a lot of money.
> Also random BS benchmark says haskell is at least half as fast as C.
I never said my "quick searching" was exhaustive. I suspect the relative performance is heavily dependent on what exactly is being done in the code.
Perspective check: We are talking about a situation in which OpenSSL had NO SECURITY, has had NO SECURITY for two years, and an unknown amount of existing caught traffic is now vulnerable. If the NSA did not already know about this bug (and given that it is not hard to imagine the static analysis tool that could have caught this years ago, it's plausible they've known for a while), they are certainly quite busy today collecting private keys before we fix things, so what security there may have been is now retroactively undone. (Unless you used PFS, which I gather is still rare. In other news, turn that on!)
Do not argue as if you're in a position in which OpenSSL experienced a minor bug, so let's all just calm down here and not make such radical arguments. We are in a position in which OpenSSL was ENTIRELY INSECURE and has been for years, because of a trivial bug that can pretty much ONLY happen in the exact language that OpenSSL was implemented in! Virtually no other language still in use could even have had this bug. This is not a minor thing. This is not something to wave away. This is a profound, massive failure. This is the sort of thing that ought to bury C once and for all, not be glossed over. (As for theoretical arguments that C could be programmed in ways that don't expose this, if the OpenSSL project is not using them, I'm not entirely convinced they really do exist in any practical way.)
If we're OK with bugs that are this critical, heck, I can speed up your encryption even more!
This is incredibly, incredibly false.
Pointers exist. Raw memory accesses exist. Even if you're writing code in a language that hides them from you, they still exist, and there is still potential for somebody to have done something stupid with them. I guarantee you that there are JVM's in the wild with vulnerabilities as severe as this one. Arguing for the use of languages that intentionally cripple the programmer on the theory they make vulnerabilities less likely is silly.
I'm not denying the severity of this issue. But bugs happen. All we can do is fix them, learn from them, and move on. The lesson to be learned here is that really messy code is a big problem that needs to be fixed, because it makes auditing the code prohibitively difficult.
The proper response IMHO is a ground-up rewrite of OpenSSL. A lot of big players use OpenSSL; financing such an endeavor would not be difficult.
Some things need to be in C because they need to be run everywhere, including embedded in applications. No other language does this, to my knowledge.
It wouldn't cost anything except on accelerator boxes.
All other things being equal, yes. But responding to security incidents also costs a lot of people a lot of money.
What good is a cryptographic library that's not secure? Worse than no good. If you're not encrypting your data, you (should) be aware of that, and act accordingly. On the other hand, if you think you're encrypting your data...
Writing Haskell to generate proven-correct C is an approach that is known to work.
People keep saying this, and it keeps not being true.
C has something like over 200 undefined and implementation defined behaviors. That alone makes it a minefield of complexity in practice.
For a value of 'correct' that includes "uses all available memory and falls over".
> The total length of a HeartbeatMessage MUST NOT exceed 2^14 or max_fragment_length when negotiated as defined in [RFC6066].
> The padding_length MUST be at least 16.
> The sender of a HeartbeatMessage MUST use a random padding of at least 16 bytes.
> If the payload_length of a received HeartbeatMessage is too large, the received HeartbeatMessage MUST be discarded silently.
Then if it's all good, modify the buffer to change its type to heartbeat_response, fill the padding with new random bytes, and send this response. No need to copy the payload (which is where the bug was), no need to allocate more memory.
(Now I'm sure someone will try to find a flaw in this approach...)
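To make that concrete, here's a rough sketch of the in-place approach; buf/len describe the message as it actually arrived, and rand_fill/send_record are stand-ins for the real RNG and record-layer calls:

    #include <stddef.h>

    #define HB_REQUEST  1
    #define HB_RESPONSE 2
    #define HB_MIN_PAD  16

    extern void rand_fill(unsigned char *p, size_t n);        /* stand-in CSPRNG */
    extern int send_record(const unsigned char *p, size_t n);

    int hb_respond_in_place(unsigned char *buf, size_t len)
    {
        if (len < 1 + 2 + HB_MIN_PAD || buf[0] != HB_REQUEST)
            return -1;                               /* discard silently */

        size_t payload = ((size_t)buf[1] << 8) | buf[2];

        /* The check Heartbleed lacked: the claimed payload must fit in
         * what actually arrived, leaving room for the minimum padding. */
        if (payload > len - (1 + 2 + HB_MIN_PAD))
            return -1;

        buf[0] = HB_RESPONSE;                        /* flip the type in place */
        rand_fill(buf + 3 + payload, len - 3 - payload);  /* fresh padding */
        return send_record(buf, len);                /* payload echoed untouched */
    }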
Incidentally, someone on the mailing list brought up the issue of having a compiler flag to disable bounds checking. However, the Rust authors were strictly against it.
This is Rust's greatest promise. Not only is writing memory-safe code possible, but it's also possible for Rust to do anything C is currently doing -- from systems, to embedded, to hard real-time, and so on. The promise of Rust cannot be overstated. And having finally grasped the language's pointer semantics, I've started to really appreciate its elegance. It compares very favorably to OCaml and other mixed paradigm languages with strong functional capabilities.
We really need at least a stable release of Rust (in terms of the language and its standard libraries) before it can even be considered as a viable option. Even then, we'll need to see it used seriously in industry for at least a few years by early adopters before it'll be more widely trusted.
We keep hearing about how Rust 1.0 will be available sometime this year, and how there are only a relatively small handful of compatibility-breaking issues to resolve. But until those issues are all resolved and until Rust 1.0 is actually released and usable, Rust just isn't something we can take seriously, I'm afraid to say.
Perfect is the enemy of the good definitely applies here. Either of the last two releases of Rust (0.9 and 0.10) would have made a nice 1.0 release, particularly once managed pointers were moved out of the language core to the standard library.
I also worry about more complexity being added to the language, so the sooner it can reach 1.0, the better. Unfortunately, the Rust community seems to really enjoy bikeshedding, so my hopes for a 1.0 release this year are not very high.
Nonetheless, I've already been wrong about Rust once (re: complexity -- once you learn the admittedly tricky pointer semantics, it's really not that horribly complex). I would love to be proven wrong again.
A data point: I have a few little Rust projects that rely on some patches to some other libraries; whereas I used to spend over half an hour compiling, and sometimes an hour or two making things compile again for a new version, I'm now typically down to about 10 seconds to install the newest nightly and 5 minutes or so to fix up some warnings and standard library changes. A stable 1.0 is starting to feel imminent and inevitable to me.
I wish godspeed to Rust and any other language which doesn't expose the raw underlying computer the way C/C++ does, which is IMO insane for application programming.
In my experience, well-written C++ has far fewer security issues than similar C code. We've had third-party security companies audit our code base, so my statement has some anecdotal support from real-world systems.
"New" is a thing of the past, along with the need to even see T *.
These days good code in C++ is so high-level, one almost never sees a pointer, much less an arbitrarily-sized one. This is of course unless you're dealing with an ancient library.
Another thing of importance:
If you're working with collection iteration correctly (which tends to be the basis for a lot of these out of bounds errors), there is no contesting the beginning offset, or the end offset - much less evaluating whether that end is in sight. Even comparisons missing stronger types showing "this thing is null terminated vs that is not" can be eliminated if you just create enough definitions for the type-system to defend your own interests. If you're coding defensively, these shouldn't even be on the menu. One buffer either fits in the other for two compatible types, or you simply don't try it at all.
It's amazing what the standard algorithms could leave to the pages of history, if they were actually put to good use.
Python has some nice high-level concepts with their "views" on algorithmic data streams that show where modern C++ is headed with respect to bounds checking, minus the exceptions of course :)
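To illustrate the style (a minimal sketch, nothing project-specific): the containers carry their own bounds, so there are no begin/end offsets to get wrong, and checked access fails loudly instead of reading past the end.

    #include <algorithm>
    #include <array>
    #include <iostream>
    #include <iterator>
    #include <stdexcept>
    #include <vector>

    int main() {
        std::array<int, 4> src{1, 2, 3, 4};
        std::vector<int> dst;

        // The destination grows to fit; no size mismatch is possible.
        std::copy(src.begin(), src.end(), std::back_inserter(dst));

        for (int v : dst)                // range-for: no index, no off-by-one
            std::cout << v << '\n';

        try {
            dst.at(10) = 0;              // checked access: throws, never overreads
        } catch (const std::out_of_range &) {
            std::cout << "caught out-of-range access\n";
        }
    }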
Or you have the cargo cult programmers, who don't know the language very well, so just pick up idioms from random internet postings or from some old part of the codebase that's been working for a while, so it must be a good source of design tips.
Remember, any time you talk about the safety of a language, you have to think about its safety in the hands of a mediocre programmer who's in a hurry to meet a deadline. Because that's where the bulk of the problems slip in; not when people are actively and deliberately coding defensively.
Pascal gave you:
- Stricter type checking than C (e.g. an error to add a centigrade value to a Fahrenheit value without casting)
- Bounds checked arrays and strings
Turbo Pascal added:
- Selectively turn off bounds checking for performance
- Inline assembler
- Pointers and arbitrary memory allocation
I don't think there's anything you can do in C that you can't do in TP. For example, it was easy to write code that hung off the timer or keyboard interrupts in MSDOS, which is pretty low level stuff.
The important thing is that the safe behaviour should be the default; you have to mark unsafe areas of code explicitly. This is the opposite of how it works in C.
Wow, that sounds scary. Do you have any references or further reading about this?
Now, combining pointers and STL is not a good idea. In fact, using raw pointers in C++ is not a good idea, at least IMO (but you've seen I am a bit concerned about memory safety). However, this is perfectly supported by compilers, and not even seriously discouraged (some guides tell you to be careful out there). I've seen difficult bugs produced by this, in my case, in a complicated singleton object.
Not just your opinion; it's become the "standard of practice" in the C++ development community.
They're trying to get it so that between the STL, make_shared, C++14's make_unique, etc., you won't actually be using "naked new"s in any but the rarest cases. For the rest of memory management you'd use types to describe the ownership semantics and let the compiler handle the rest.
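A minimal sketch of what that looks like; Conn and open_conn are placeholder names:

    #include <memory>
    #include <utility>

    struct Conn { /* ... */ };

    // Sole ownership, expressed in the return type; no naked new anywhere.
    std::unique_ptr<Conn> open_conn() {
        return std::make_unique<Conn>();    // C++14
    }

    int main() {
        auto c = open_conn();                        // freed on scope exit
        std::shared_ptr<Conn> shared = std::move(c); // hand off to shared ownership
    }   // destructors and reference counts cover every exit path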
"...you are kidding yourself if you think you can handle this better than the OpenSSL team."
Well, I can think of at least one example that counters this supposition. As someone points out elsewhere in this thread, BIND is like OpenSSL. And others wrote better alternatives, one of which offered a cash reward for any security holes and has afaik never had a major security flaw.
What baffles me is that no matter how bad OpenSSL is shown to be, it will not shake some programmmers' faith in it.
I wonder if the commercial CA's will see a rise in the sale of certificates because of this.
Sloppy programmer blames language for his mistakes. News at 11.
It's quite a shame that there isn't a compiler that does this, and it's a project I've considered spending some time on if I can find a big enough block of that to get a solid start.
Edit: here's one project for making a memory-safe C: http://www.seclab.cs.sunysb.edu/mscc/ . Interesting, but (a) it is a subset of C, (b) it doesn't remove all vulnerabilities, and (c) I still don't grok the advantage of using this over a language actually designed for modern, secure application programming.
This is where we do need some extra help, in the form of a library that holds state for the compiler so that we don't need to instrument our pointers. Nothing in the C standard prevents the compiler from doing this. The library you pass the pointer into may ignore this information, if it doesn't have the necessary instrumentation, but we at least get the capability.
Re: other languages, Rust I will grant. It's the only one of those that's compelling for C's use-cases (Java and C# are both entirely unusable for the best uses of C and C++).
If you changed it via arithmetic or anything other than direct assignment you have violated the standard. Assuming of course that they are part of separate allocations, pointers from one may not interact with pointers from another except through equality testing and assignment.
One problem is that the obvious implementation technique is to change the representation of pointers (to include base and bounds information, or a pointer to that), which means that you need to redo a lot of the library as well. (Or convert representations when entering into a stock library routine, and accept that whatever it does with the pointer won't get bounds-checked.)
But it's certainly doable.
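Here's a toy sketch of that representation, which also shows why the library boundary is the sticking point: every pointer becomes a (base, length, offset) triple, and code compiled without this convention can't consume one.

    #include <stdio.h>
    #include <stdlib.h>

    struct fatptr {
        unsigned char *base;   /* start of the allocation */
        size_t len;            /* size of the allocation  */
        size_t off;            /* current offset          */
    };

    static unsigned char fat_read(struct fatptr p, size_t i)
    {
        if (p.off + i >= p.len) {      /* the check plain C never makes */
            fprintf(stderr, "bounds violation\n");
            abort();
        }
        return p.base[p.off + i];
    }

    int main(void)
    {
        unsigned char buf[4] = {1, 2, 3, 4};
        struct fatptr p = { buf, sizeof buf, 0 };
        printf("%u\n", fat_read(p, 3));   /* fine */
        printf("%u\n", fat_read(p, 4));   /* aborts instead of overreading */
        return 0;
    }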
"Intel MPX is a set of processor features which, with compiler, runtime library and OS support, brings increased robustness to software by checking pointer references whose compile time normal intentions are usurped at runtime due to buffer overflow."
You could also look at this bug as an input sanitization failure. The author didn't consider what to do when the length field in the header is longer than what comes over the wire (even when writing the code in a secure language, this case should be handled somehow, maybe by logging or dropping the packet).
But then again, with some dedication, QuickCheck-like testing can do a huge amount of work. At work I've implemented these tests for the entire low-level IO framework for our servers, and these few tests are a pure meat grinder. They triggered one severe bug that would have taken down production in the middle of the night, and then some more.
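To illustrate the kind of property I mean (parsePacket is a made-up length-prefixed parser, not our real code): feed the parser arbitrary junk and assert that it can never claim more payload than actually arrived.

    import Test.QuickCheck
    import Data.Word (Word8)
    import qualified Data.ByteString as BS

    -- Toy wire format: 1-byte length prefix, then payload, then the rest.
    parsePacket :: BS.ByteString -> Maybe (BS.ByteString, BS.ByteString)
    parsePacket bs = do
        (n, rest) <- BS.uncons bs
        let len = fromIntegral n
        if BS.length rest < len
            then Nothing                       -- reject short packets
            else Just (BS.splitAt len rest)

    -- The parser must never report more payload than was received.
    prop_noOverread :: [Word8] -> Bool
    prop_noOverread ws = case parsePacket (BS.pack ws) of
        Nothing           -> True
        Just (payload, _) -> BS.length payload <= length ws - 1

    main :: IO ()
    main = quickCheck prop_noOverread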
Most memory-related bugs are automatically eliminated, and security proofs are easier.
It's a crypto DSL that I believe is implemented in Haskell (it compiles to Haskell, C, C++ and a few others).
Go or Java on top. Coding in C is like juggling chainsaws to say you can juggle them. C is certainly better than old school Fortran where memory management wasn't developed until later, but platforms like Erlang, Go and JRuby are really hard to beat.
The only problem is convincing people to migrate to different tools and transition codebases to another language. It would take a large project like FreeBSD, LLVM or the Linux kernel to move the needle.
The only reason anyone can recommend using OpenSSL is that it's so widely used and battle worn that vulnerabilities are more likely to be patched than in some arbitrary obscure SSL library without all the warts. If it had been published as-is for the first time in 2014 then no one would touch it.
In addition to that, if you're going to create an SSL implementation in a new language, it would be much preferable to do it without the BSD advertising clause, which you're stuck with if you start with OpenSSL.
Reimplementing a secure SSL implementation in a secure language is cheaper than porting the broken code.
Could one make a new kind of OS where C programs are compiled to some intermediate representation then when run this is JIT compiled within a managed hypervisor sandbox? Could Chrome OS become something like this? Does this already exist? MS had a managed code OS called Singularity.
I think they can be used to write secure code, but it has to be done carefully, with really thorough checks and unit tests, and a constant awareness of the vulnerabilities.
Everything I've heard about OpenSSL so far, suggests it was done by a bunch of cowboys who don't care about code quality. Those people shouldn't be writing C, but a safer language.
However, qmail is written in C and has a very good record. So I would disagree with "The fact is that no programmer is good enough to write code which is free from such vulnerabilities."
There seem to be at least two programmers who are capable of that.
Fundamentally, I think we're going to have to give up on security and start handing out drivers licenses to anyone who wants to use the internet.
So uhm. Yeah, that doesn't work either.
Edit: Btw, since HN has this obsession with Tarsnap: it's written in C. So you should stop obsessing about it and downvote me some more.
Virtual machines and runtimes may be vulnerable to malicious CODE. That's bad. Programs written in unmanaged languages are vulnerable to malicious DATA. That's horrible and unmitigatable.
Vulns to malicious code are bad, but they may be mitigated by not running untrusted code (hard, but doable in contexts of high security). They are also mitigated by the fact that the runtime or VM is a small piece of code which may even be amenable to formal verification.
Vulns to malicious data, or malicious connection patterns, are impossible to avoid. You can't accept only trusted data in anything user-facing. Also, these vulnerabilities are spread through billions of lines of application and OS code, as opposed to core runtime/VM.
> Virtual machines and runtimes may be vulnerable to malicious CODE. That's bad.

> Programs written in unmanaged languages are vulnerable to malicious DATA.
1. use a managed language
2. use a provable language (Haskell, Idris, etc)
If you have a VM your code is DATA! You can have say format string bugs in managed languages too, see for example: http://blog.stalkr.net/2013/06/golang-heap-corruption-during...
I'm sorry, but your argument here is plain bullshit.
Using your logic the whole kernel/browsers etc should be written in a managed language.
A browser is basically a VM for webpages, though. I'd bet that Chrome/Firefox/IE have more severe vulnerabilities per unit time than OpenSSL.
I just... You can't argue with bullshit sorry.
You might get fewer buffer overruns, okay, but is it more secure in the end, without any doubt?
I doubt it.
And it is indeed a bad idea to install a browser on a critical server, and to load untrusted sites in it. You can mitigate the problem by not doing that. You can't stop the server from dealing with user data, though, since for many servers, that's what they are for. (If you are not going to deal with untrusted data, it is preferable to disable untrusted connections at as low a level as you can manage).
If you hold everything else constant, yes, fewer buffer overruns == more secure.
Ultimately, I think the better answer will be a language that inherently provides the primitives for safe memory management but that's low-level and highly performant, i.e. Rust or something like it.
You're not necessarily reducing the attack surface; you're adding complexity. While you might reduce the attack surface for low-level bugs, you might open yourself up to new bug classes.
Downvote me however you guys want. It's just not that easy.
If you could eliminate the common cold by killing the guy with the runny nose, don't you think someone would have done it?
Languages with bounds checks on array accesses don't solve everything, but that doesn't mean that they don't work. They do remove entire classes of silent failures that can potentially slip through the cracks in C-like languages. VMs aren't needed for this -- most of the strongly typed functional languages, D, Go, Rust, and others all compile down to native machine code.
Careful API design, discipline, and good coding in C can also mitigate this sort of problem manually, although (like most things in C), it's extra work, and needs careful thought to ensure correctness.
Automatic bounds checking could well fail the same way that ABS did: programmers won't bother defining a packet data type, because the compiler will catch any mistakes they make fiddling with arrays. So, like drivers with ABS, programmers with ABC would go faster, but they wouldn't be any safer.
Nothing can prevent bad drivers from driving poorly, and nothing can prevent apathetic programmers from writing insecure code. However, even though I tend to program in C, I can still appreciate environments that will catch dumb mistakes for me and prevent them from turning into security issues.
ESC is certainly shown to be effective, although I don't know about traction control: http://en.wikipedia.org/wiki/Electronic_stability_control#Ef...
Also, most vulnerabilities in (e.g.) the JVM can only be exploited by running malicious code inside the VM. Here, the attacker is supplying data used by OpenSSL, but is not able to supply arbitrary code.