TLS heartbeat consists of a request packet including a payload; the other side reads and sends a response containing the same payload (plus some other padding).
In the code that handles TLS heartbeat requests, the payload size is read from the packet controlled by the attacker:
n2s(p, payload);
pl = p;
n2s() reads the 16-bit payload length from the packet into payload, and pl is the pointer to the actual payload in the request packet.
Then the response packet is constructed:
/* Enter response type, length and copy payload */
*bp++ = TLS1_HB_RESPONSE;
s2n(payload, bp);
memcpy(bp, pl, payload);
The bug is that the payload length is never actually checked against the size of the request packet. Therefore, an attacker can make the memcpy() read arbitrary data beyond the storage location of the request by sending an arbitrary payload length (up to 64K) together with an undersized payload.
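A minimal sketch of the kind of check that is missing, assuming the whole heartbeat message sits in a buffer msg of msg_len received bytes. The function name and constants are made up for illustration and this is not OpenSSL's actual fix; the layout (1-byte type, 2-byte length, payload, at least 16 bytes of padding) follows RFC 6520.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for this sketch; layout follows RFC 6520. */
#define HB_HEADER_LEN  3u   /* 1-byte type + 2-byte payload length */
#define HB_MIN_PADDING 16u  /* minimum padding after the payload */

/*
 * Returns 1 if the payload length claimed inside the message fits within
 * the msg_len bytes that were actually received, 0 otherwise.
 */
int heartbeat_payload_ok(const uint8_t *msg, size_t msg_len)
{
    if (msg_len < HB_HEADER_LEN + HB_MIN_PADDING)
        return 0;  /* too short to hold even an empty payload */

    /* The 16-bit length is big-endian on the wire and attacker-controlled. */
    size_t claimed = ((size_t)msg[1] << 8) | msg[2];

    /* Compare against what was received, not against what was claimed. */
    return claimed <= msg_len - HB_HEADER_LEN - HB_MIN_PADDING;
}
```

A handler would call this before the memcpy() and silently drop the message when it returns 0.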
I find it hard to believe that the OpenSSL code does not have any better abstraction for handling streams of bytes; if the packets were represented as a (pointer, length) pair with simple wrapper functions to copy from one stream to another, this bug could have been avoided. C makes this sort of bug easy to write, but careful API design would make it much harder to do by accident.
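To make the (pointer, length) idea concrete, here is a rough sketch of what such wrappers could look like. The byte_span type and both functions are hypothetical names invented for this example, not anything that exists in OpenSSL.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical (pointer, length) view over bytes; not an OpenSSL type. */
typedef struct {
    uint8_t *data;
    size_t   len;
} byte_span;

/* Read a big-endian 16-bit value and advance the span.
 * Returns 0 if fewer than 2 bytes remain. */
int span_read_u16(byte_span *in, uint16_t *out)
{
    if (in->len < 2)
        return 0;
    *out = (uint16_t)((in->data[0] << 8) | in->data[1]);
    in->data += 2;
    in->len  -= 2;
    return 1;
}

/* Copy n bytes from in to out, advancing both spans.
 * Fails without copying anything if either span is too short. */
int span_copy(byte_span *out, byte_span *in, size_t n)
{
    if (n > in->len || n > out->len)
        return 0;
    memcpy(out->data, in->data, n);
    in->data  += n;  in->len  -= n;
    out->data += n;  out->len -= n;
    return 1;
}
```

With this shape, a request that claims 0xFFFF payload bytes but carries only two makes span_copy() return 0 instead of reading past the end of the request.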
This kind of tool (SSL) should be written in Ada or Haskell.
C and C++ are just fine. The fact that the OpenSSL guys cocked it up is not the language's fault, it is theirs. There are efficient ways to prevent this type of bug.
The parent had a good point and you should really try to look at Haskell before you say that kind of nonsense.
All the tools that are available for static analysis are basically extra type systems bolted on top of existing languages.
If you try to detect buffer overflows using static analysis of the Linux kernel, what you need to do is go through the source code and define invariants. Those invariants are TYPES in languages powerful enough to express them.
For example, the invariant that memory, or any other allocated resource, must be freed can be expressed in Haskell.
In C++ it cannot be expressed. There are workarounds like RAII, but that does not give any guarantees.
If you do not think type systems and thus languages make any differences, you also cannot believe that formal verification makes any difference, because type systems are a weak form of formal verification. How "weak" depends on the language.
You should also read up on the Curry-Howard correspondence to learn something about the deep connections between types, programs, and proofs.
Besides the language peculiarities, a garbage collected or interpreted language is very vulnerable to side channel attacks because of the large amount of complicated behaviour that is being glossed over by the language runtime. (One example would be garbage collection rounds and timing attacks, but I'm sure smarter people would find tons of features that leak secret information. Another example is on-demand JIT'ing when code becomes hot in certain runtimes. The timing of such a JIT stall could publish information you thought secure.)
Economics plays an invisible part here. Someone writing a library has a limited amount of time to implement some set of features, and to balance that against other needs, like making the code "clean"/pretty and secure. In this case, pretty code and secure code are akin. Consumers would likewise have to balance out feature needs with how likely the code is going to explode. What it comes down to is that you aren't likely to have secure, stable code in a language that doesn't inherently encourage it.
It starts to be clearer then, that the more modern, "prettier" languages offer material benefits in their efforts to be more elegant.
Even in C, Go or Python, I column align any text that is remotely similar, so differences are obvious.
Clean code might be extra work up front, but the net work (including maintenance) should amortize to less. The value of reducing cognitive load in a large, supportable production codebase cannot be overstated.
"All versions of the open source Ruby on Rails Web application framework released in the past six years have a critical vulnerability that an attacker could exploit to execute arbitrary code, steal information from databases and crash servers."
"A lingering security issue in Ruby on Rails..."
"Ruby on Rails security updates patch XSS, DoS vulnerabilities"
On the same note C != C++ either and you can write large systems in C++ without ever using memory allocation. You can use only bounds checked functions.
And you can have large security holes if you're not careful, no matter which language you pick.
The first is an example of an error made more common by the language design, the other an example of errors typical for a class of applications. There's a fundamental difference here. There's a ton of reasons to criticize Ruby and it brings its own set of flaws and problems, some rooted in the language and some rooted in its ecosystem - but the given examples just show that web applications are hard to get right. That's why this is not "a point not well made" but rather "sorry, you're attacking a strawman here".
The fact that these languages don't automatically do all my system administration tasks for me is not an argument against using them.
There are very fundamental connections between strong typing, program verification, and proofs.
Thus the argument that Haskell probably has the same problems is simply false.
There are large web platforms in Haskell. Yesod is probably the largest ecosystem. It is clearly not as widely used as RoR, but anyone can dig through large amounts of code to try to find these bugs.
What Haskell shares with everyone else is bugs/misunderstandings in how protocols are implemented. Sometimes there can be fundamental bugs in the run-time system. However, large classes of bugs are fundamentally less likely to appear than in less safe languages.
For example, here, if the guarantee of functional programming is that a given input leads to a given output and has no memory side effects, then your attack surface area is a lot, lot smaller.
This is a question of priorities. We have speed and security. If you choose C/C++ (non-existent automated checking of memory access) you are choosing speed first, security second.
If security is critical then you need to choose a language that makes out-of-bounds array access well nigh impossible. This is an easy problem -- we have languages that will give this to us.
What percentage of exploits in the wild come from array (and pointer) access out of bounds? I'd venture to say it is above 50%.
Rather than have programmers everywhere "try hard to be careful" writing this code, let them use a safer language and have a few really smart folk work on optimizing the compiler for said language to make the safety checks faster (e.g. removing provably unnecessary/redundant checks).
People think that choosing C/C++ has a better business case (i.e. better performance / scaling) because "being really careful" works most of the time. The problem is that when Heartbleed (or the next out-of-bounds array access bug) hits, the business case's ROI no longer looks so much better than the safer path.
A better language won't eliminate all security holes but it can eliminate a huge class of them and allow engineers to focus the energy they used to spend on "being really careful about array access and pointers" on other tasks (be they security, performance or feature related).
EDIT: stating the obvious... there are good uses for C-style languages, but writing large bodies of software that need to be resistant to malicious user attacks is not one of them.
BTW Amazon AWS/ELB is vulnerable, confirmed publicly by their support.
I gave this some thought earlier today, and expect that address space randomisation can make this bug eventually expose the server keys. You need to hit an address that has been just vacated from a (crashed) httpd worker.
Most implementations clear encryption key material on exit, but a crashed process never got to run that code.
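For reference, the clearing step usually amounts to something like the sketch below. The name wipe_secret is made up for this example. Writing through a volatile pointer is a common idiom for keeping the compiler from optimizing the stores away, and platforms that offer explicit_bzero() or memset_s() provide a ready-made equivalent.

```c
#include <stddef.h>

/*
 * Overwrite len bytes at buf with zeros. Writing through a volatile
 * pointer keeps the compiler from dropping the stores as dead code.
 */
void wipe_secret(void *buf, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *)buf;
    while (len--)
        *p++ = 0;
}
```

Code like this only helps on an orderly shutdown path, for example when registered with atexit(); a process killed by a crash never reaches it.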
Of course, servers helpfully just start themselves back up again.
As for scanning for key material, I wonder how to tell that 256-bit random data is the 256-bit random data you want.
The Cold Boot attack paper by Halderman, Schoen et al. discusses this in detail in chapter 6, Identifying Keys in Memory.
EDIT: fixed the reference
If you can dredge up 64kB of fresh data every time, that's 511,744 tests per shovelful which is quite a bit to sift through from a performance perspective but it's also a trivially parallel task.
Additionally, folk might know of even better ways to narrow that down. For example, the data representation in memory might have easy to grep for delimiters.
256 bits is 32 bytes
If you get 64kB of payload data back each time then it can only contain 65536-31=65505 different consecutive strings of 32 bytes.
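The count above is just the sliding-window formula buf_len - key_len + 1. Here is a small sketch, assuming the key is stored at a byte-aligned offset and that 64kB means 65536 bytes; window_count is a name invented for this example.

```c
#include <stddef.h>

/*
 * Number of distinct byte-aligned windows of key_len bytes
 * in a buffer of buf_len bytes.
 */
size_t window_count(size_t buf_len, size_t key_len)
{
    if (key_len == 0 || key_len > buf_len)
        return 0;
    return buf_len - key_len + 1;
}
```

window_count(65536, 32) gives 65505, so each leaked chunk yields about 65 thousand candidate keys to test.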
The RFC doesn't mention why there has to be a payload, why the payload has to be random size, why they are doing an echo of this payload, or why there has to be padding after the payload. Nor is it clear whether this data is just a regular C struct like the RFC makes it out to be (I didn't know you could have a struct with a variable size, but apparently the fields are really pointers or it's just a mental model and not a real struct).
Apparently the purpose of the payload is path MTU discovery. Something that is supposed to happen at the IP layer, but I don't know enough about datagram packets. I guess an application may want to know about the MTU as well...
I'm not here to point fingers, I'm just saying C is a nightmare to me and a reason for me to never be involved with system programming or something like drafting RFCs ;-).
But if one can argue that C is a bad choice for writing this stuff, then that is not an isolated thing. "C" is also the language of the RFCs. "C" is also the mindset of the people doing that writing. After all, the language you speak determines how you think. It introduces concepts that become part of your mental models. I could give many examples, but that's not really the point.
And it's about style and what you give attention to. To me, that RFC is a real bad document. It starts to explain requirements for exceptional scenarios (like when the payload is too big) before even having introduced and explained the main concepts and the hows and whys.
So while you may argue that this is a C problem and not a protocol problem, it is really all related.
And you may also say, in response to someone blaming these coders, that blame is inappropriate (and it is) because these are volunteers donating their free time to something they find valuable. But the whole distribution and burden of responsibility is, naturally, also part of the culture and how people self-organize and so on.
As someone else explained (https://news.ycombinator.com/item?id=7558394) the protocol is real bad but it is the result of more or less political limitations around submitting RFCs for approval. There is no reason for the payload in TLS (but apparently there is in DTLS) but my point is simply this:
If you are doing inelegant design this will spill over into inelegant implementation. And you're bound to end up with flaws.
Rather than trying to isolate the fault here or there, I would say this is a much larger cultural thing to become aware of.