
It’s time for a memory safety intervention - okket
https://tonyarcieri.com/it-s-time-for-a-memory-safety-intervention
======
hannob
There's a lot right in this, but I feel he's chosen the wrong enemy by harshly
criticizing Daniel Stenberg (the curl dev). He's not a C zealot, he's just
accepting realities.

Curl should be replaced by something memory safe. But it cannot happen today,
because the infrastructure isn't there. We don't have memory safe languages
suitable for systems programming that are widely available on all kinds of
architectures. Rust may become that language, but today it is not. It's not as
portable, not as easy to integrate, and so on.

~~~
zeveb
> We don't have memory safe languages suitable for system programming that's
> widely available on all kinds of architectures.

Yes, yes we do. Go is available on a ton of architectures; Lisp is available
on a ton of architectures; ML is available on a ton of architectures.

Yes, C is available on more. Yes, for some architectures C is the only
reasonable choice. But on x86 & ARM, there's no good reason to write new code
in C.

~~~
DowsingSpoon
What about applications with soft or hard real time requirements? (So, for
example, a video game or an audio player.) I am not aware of any GC language
which is suitable for use in this space. Go certainly is not.

What about any application that cares in the slightest about performance? In
this case, memory layout is incredibly important, and many (most?) of those
higher level languages are immediately disqualified.

~~~
fauigerzigerk
Correct me if I'm wrong as I'm not an expert in this field at all, but aren't
hard real time systems written so that no memory allocation or release happens
during normal operation?

I'm sure there are other reasons why Go would rarely be considered for a hard
real time system, but purely from a memory management perspective you could
simply disable the GC.

And then of course there is also Rust.
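
A minimal Rust sketch of that discipline -- all allocation at startup, none in
the steady-state path (names are hypothetical, purely for illustration):

```rust
// Sketch of the hard-real-time pattern: allocate every buffer up front,
// then never allocate (or free) on the hot path. In a GC language the
// same discipline keeps the collector idle.
struct AudioEngine {
    scratch: Vec<f32>, // allocated once, at startup
}

impl AudioEngine {
    fn new(frames: usize) -> Self {
        AudioEngine { scratch: vec![1.0; frames] }
    }

    // Steady-state processing: mutates in place, performs no allocation.
    fn process(&mut self, gain: f32) {
        for sample in self.scratch.iter_mut() {
            *sample *= gain;
        }
    }
}

fn main() {
    let mut engine = AudioEngine::new(4);
    engine.process(0.5); // real-time-safe: no allocator calls
    assert_eq!(engine.scratch, vec![0.5; 4]);
}
```

(In Go specifically, debug.SetGCPercent(-1) is the knob that disables the
collector.)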

------
devy
> "We need to collectively admit memory safety is a problem."

Yes, we need it badly, because I still see a lot of folks either deny that
memory safety is a big problem in C programming, or simply accept it as a fact
of life.[1]

Without admitting that and starting to take action to improve infrastructure
software, I don't see how we can build a more secure future. That's the
intervention, and it should start now.

[1]:
[https://news.ycombinator.com/item?id=13979206](https://news.ycombinator.com/item?id=13979206)

~~~
dboreham
Memory safety is certainly a problem. But so is the ability to execute data.
I'm scared by the use of any language that has that capability for security-
critical applications.

~~~
asveikau
In the last 15 or so years, as awareness of that issue has increased, the
default for stack and heap allocations has become marking pages as
non-executable. It is therefore harder than it used to be to execute data -
you need to jump through some hoops.

Not to mention that if you could not execute data at all, your programs would
not load.

~~~
makomk
The moment you have any kind of interpreter or JIT compiler, you're
effectively bypassing non-executable protection on memory by providing a way
of executing code that looks just like accessing data from the CPU's
perspective.

------
corysama
The C++ Core Guidelines Checker, in combination with the Guidelines Support
Library, is supposed to bring Rust-like memory safety guarantees to C++. Has
anyone here had a chance to try it out?

[https://github.com/isocpp/CppCoreGuidelines/blob/master/CppC...](https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md)

[https://github.com/Microsoft/GSL](https://github.com/Microsoft/GSL)

[https://msdn.microsoft.com/en-us/library/mt762841.aspx](https://msdn.microsoft.com/en-us/library/mt762841.aspx)

[https://blogs.msdn.microsoft.com/vcblog/2016/10/12/cppcorech...](https://blogs.msdn.microsoft.com/vcblog/2016/10/12/cppcorecheck/)

Microsoft has been embracing the idea. But, the GSL and Checker are both
compiler-independent.

~~~
hsivonen
I very much doubt that Core Guidelines Checker can do the job of the Rust
borrow checker without the lifetime info the borrow checker needs.

Is there some doc that explains how the Core Guidelines Checker could achieve
Rust-like safety given what it has to work with?

~~~
Manishearth
IIRC it's simultaneously not 100% safe because it can't stop R-W bugs
(iterator invalidation, etc), and also more restrictive because it allows a
smaller set of lifetime relations. This may have changed since I last saw it,
which was admittedly quite a while ago.

Still, major improvement. Could very well be "enough", practically speaking.
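
For reference, the iterator-invalidation flavor of read-write bug in
miniature: in C++, pushing to a vector while iterating over it can invalidate
the iterator; Rust rejects the equivalent code at compile time, so you write a
two-phase version instead (a sketch, not actual checker output):

```rust
// The mutating-while-iterating version does not compile in Rust:
//
//     for x in &v { v.push(*x); }  // error: cannot borrow `v` as mutable
//
// The borrow checker forces the read phase and the write phase apart:
fn double_extend(v: &mut Vec<i32>) {
    let doubled: Vec<i32> = v.iter().map(|x| x * 2).collect();
    v.extend(doubled);
}

fn main() {
    let mut v = vec![1, 2, 3];
    double_extend(&mut v);
    assert_eq!(v, vec![1, 2, 3, 2, 4, 6]);
}
```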

------
vvanders
Amen.

I just spent ~1.5 days tracking down a dangling pointer bug in interop code
from Rust to C. It's so easy to do when you don't have lifetimes on pointers.
I'd gotten so comfortable with lifetimes that I'd started taking it for
granted.

Yes Rust can be hard to write, but the time and pain we spend of the other
side of that equation can be just as important if not more so.

~~~
Manishearth
For Stylo I've taught bindgen to insert borrows in the binding functions in
the presence of certain types, so on the C++ side you define it as
FooBorrowed bindingFunc(BarBorrowed b), and on the Rust side you get fn
bindingFunc(&Bar) -> &Foo, which elides correctly. As a result, a lot of the
Rust-side FFI code is actually 100% safe, and the burden of safety has been
pushed over completely to the C++ side. It's pretty neat.
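
A self-contained sketch of that shape (names hypothetical; the C++ side is
faked with a local extern "C" function so the example runs on its own):

```rust
#[repr(C)]
pub struct Bar {
    value: u32,
}

// Stand-in for the C++ implementation. The point is the signature:
// references instead of raw pointers, so the elided lifetimes
// (for<'a> fn(&'a Bar) -> &'a u32) are enforced by the borrow checker.
#[no_mangle]
pub extern "C" fn binding_func(b: &Bar) -> &u32 {
    &b.value
}

fn main() {
    let bar = Bar { value: 7 };
    // No unsafe block needed on the Rust side: the signature carries
    // the lifetime relationship for us.
    let v: &u32 = binding_func(&bar);
    assert_eq!(*v, 7);
}
```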

~~~
vvanders
My problem was that I was taking a:

    Some([0u8; 32]).map(|v| v.as_ptr()).unwrap_or(::std::ptr::null());

map(..) takes the array by value: it's moved into the closure and goes out of
scope as soon as the closure returns, so you're left with a dangling pointer
into the stack (fun!). If it were a reference I'd be totally covered, but
since raw pointers drop lifetime semantics, that's where I got bit.

Love bindgen though; just whitelisting the functions I need generates awesome
interop APIs. At the end of the day, though, unsafe is unsafe.
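
A sketch of one safe shape for that snippet (variable names are mine): keep
the owning array in a binding that outlives the pointer, and borrow it instead
of moving it:

```rust
// Borrow the array inside the Option rather than moving it into the
// closure, so the returned pointer lives as long as `buf` does.
fn make_ptr(buf: &Option<[u8; 32]>) -> *const u8 {
    buf.as_ref().map(|v| v.as_ptr()).unwrap_or(std::ptr::null())
}

fn main() {
    // Buggy shape: the temporary Option (and its array) dies at the end
    // of the statement, leaving a pointer into dead stack:
    //   let p = Some([0u8; 32]).map(|v| v.as_ptr())
    //       .unwrap_or(std::ptr::null());

    // Fixed shape: the owner outlives every use of the pointer.
    let buf = Some([0u8; 32]);
    let ptr = make_ptr(&buf);
    assert!(!ptr.is_null());
    assert_eq!(unsafe { *ptr }, 0); // sound: `buf` is still alive here
}
```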

~~~
Manishearth
ah, yeah, bad idea :)

Yep, unsafe is unsafe and needs more scrutiny.

------
pron
Yeah, but the fact is that there are billions of lines of C out there, and
tens of millions more being added every year, and that's not going to change
for a while. Under no scenario will more than, say, 50% of new systems code be
written in a memory safe language within the next 15 years. It doesn't matter
what we want or what we think the industry should do; given the rate of change
in those sections of the industry that use C/C++, realistically that's just
not going to happen any time soon, while the errors already affect us.

That doesn't mean people shouldn't switch to safe languages for new projects
if they can, but it does mean that they should use the tooling available for C
much more. Unlike what the article claims, they don't -- at least not those
who write programs with lots of memory safety issues. There are static
analyzers that can _guarantee_ -- 100% -- no memory safety issues. Airtight
guarantees may take some work, but it's still far, far cheaper to achieve
using state-of-the-art tools than switching to a new language for many
projects (even new ones in organizations where C is established, let alone old
ones). Unlike what the article claims, programmers utilizing state-of-the-art
C tooling and best practices _do not_ "constantly produce programs riddled
with severe memory safety vulnerabilities".

Unfortunately, those state-of-the-art tools and practices are used almost
exclusively by authors of safety-critical software, and are not commonly used
in more mundane kinds, let alone in open-source projects. An intervention more
likely to work in the short term should push for that, regardless of longer-
term solutions in the form of safe languages.

~~~
lawnchair_larry
_" There are static analyzers that can guarantee -- 100% -- no memory safety
issues."_

This is untrue, at least for any practical interpretation of such a guarantee.
Which product do you think does this?

Note that for it to count, it has to return the results before the programmer
is dead.

~~~
pron
[https://trust-in-soft.com/products/](https://trust-in-soft.com/products/) (a
commercial product from the people who've built Frama-C)

Sound static analyzers are not 100% automatic. They require user help in the
form of annotations in cases where they can't prove everything automatically.
But the work required is similar -- probably less -- than writing tests.

Also, note that it's very easy to find all memory safety errors in a sound way
-- you just report every pointer and array access you can't prove safe. The
problem isn't identifying all potential errors, but proving the memory safety
of the accesses that _aren't_ errors (and that's where the tools may ask for
help in the form of annotations -- to convince them of the _absence_ of an
error).

~~~
lawnchair_larry
_" They require user help in the form of annotations in cases where they can't
prove everything automatically."_

So, like I said then. Doesn't exist in any practical form :) Annotations are
clearly a non-starter.

~~~
pron
Right, because clearly adding some annotations (which is similar in effort to
writing test cases) is a non-starter while rewriting hundreds of millions of
lines of code in new, immature languages is going to solve the problem faster.
And that nonexistent-in-practice approach is commonly used in safety critical
software.

~~~
lawnchair_larry
Mark my words. In 10 years, there will be no more adoption of annotations than
there is today.

If you can't get anybody to use something, it's a non-starter. It doesn't
matter why. It just hasn't happened and it wont happen.

~~~
pron
You could have said the same about unit tests 20 years ago. The annotations
themselves are immaterial; the point is the use of formal methods. Formal
methods requires some assistance. Whether you write those as annotations in
the code or in a separate file is an implementation detail.

------
digikata
So has any person or organization done a statistical survey of a body of bug
reports (like CERT) and characterized how many bugs are memory safety related?

I see this SEI blog entry [1] that cites a report, with a bitrotted link (I
think it's [2]). The blog summarizes the report associating buffer overflow as
a source of "14 percent of software security vulnerabilities and 35 percent of
critical vulnerabilities making it the leading cause of software security
vulnerabilities overall." Though it seems to be the easiest to track and a
leading error, buffer overflow is only one category of memory safety problem.

[1] [https://insights.sei.cmu.edu/sei_blog/2014/08/performance-of-compiler-assisted-memory-safety-checking.html](https://insights.sei.cmu.edu/sei_blog/2014/08/performance-of-compiler-assisted-memory-safety-checking.html)

[2] [https://courses.cs.washington.edu/courses/cse484/14au/reading/25-years-vulnerabilities.pdf](https://courses.cs.washington.edu/courses/cse484/14au/reading/25-years-vulnerabilities.pdf)

~~~
dikaiosune
At one point I made an attempt to do this from the CVE database, but it's
difficult to do since CVEs aren't really machine readable from a "memory
safety vs. other security flaws" perspective. I recall finding a lot of
reported vulns with buffer, overflow, null pointer dereference, double free,
etc., but that's far from scientific.

~~~
digikata
It would be nice to also track the maturity of the errant code bases those
types of bugs are reported in. One theory might be that very mature,
well-tended code bases should show decreasing numbers of memory safety issues,
and data in this regard would give some basis to say: this code should be left
alone, while this other code would likely benefit from a rewrite into a new
language.

------
Perseids
As others have pointed out, a huge amount of C code is going to stay with us
for quite some time. While fighting for the adoption of other languages is
right and noble, what I would consider practical is eliminating (the worst of)
memory management related exploitation for any C/C++ code at compile time.
Quite a bit of work shows this can be done efficiently,
e.g. SoftBound+CETS [1]. Heck, even WebAssembly shows that you can get
complete control flow protection for a comparatively low price. I would gladly
use a Linux distribution that compiled all packages with such protections,
even if it ran about 20% slower.
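
Roughly, the fat-pointer idea behind SoftBound-style instrumentation, sketched
in Rust (only an illustration; the real tool instruments compiled C):

```rust
// Every pointer carries its base and bound; every load is checked.
// (Rust slices are essentially this check built into the language.)
struct CheckedPtr<'a> {
    data: &'a [u8], // base + bound
    offset: usize,
}

impl<'a> CheckedPtr<'a> {
    // The check a SoftBound-instrumented load would perform before
    // dereferencing.
    fn read(&self) -> Option<u8> {
        self.data.get(self.offset).copied()
    }
}

fn main() {
    let buf = [10u8, 20, 30];
    assert_eq!(CheckedPtr { data: &buf, offset: 1 }.read(), Some(20));
    // An out-of-bounds access is refused instead of reading whatever
    // happens to sit next to the buffer.
    assert_eq!(CheckedPtr { data: &buf, offset: 99 }.read(), None);
}
```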

[1] [https://www.cs.rutgers.edu/~santosh.nagarakatte/softbound/](https://www.cs.rutgers.edu/~santosh.nagarakatte/softbound/);
presentation: [https://media.ccc.de/v/30C3_-_5412_-_en_-_saal_1_-_201312271830_-_bug_class_genocide_-_andreas_bogk](https://media.ccc.de/v/30C3_-_5412_-_en_-_saal_1_-_201312271830_-_bug_class_genocide_-_andreas_bogk)

------
BadassFractal
Don't think anybody cares enough to go through so much change to deal with
vulnerabilities. I think nowadays people treat buffer overruns the same way
they treat bus delays: it's a fact of life, just live with it. Not saying
that's right, but turning that ship around is no easy feat.

------
spion
Is the problem portability though? Or is it stable ABI / easy
interoperability? What good is a fancy language when at the end you would
still expose low level API and headers specifying functions that take unsafe
pointers to raw memory?

------
andrewfromx
The part about Swift, "it's neat too": I read that as kind of mocking the
Swift language compared to Rust or Go. Or do we think the author meant it
sincerely, i.e. that Swift is in the same league as the others?

~~~
kenferry
I don't think he's mocking it, but he might not know much about it.

Swift's probably on par with Go. You have to diverge a bit from normal
practice or involve concurrency to be unsafe. This is in contrast to C, where
"normal practice" often produces buffer overflows.

------
monochromatic
I'm reminded of an old article from Steve Yegge[0]. Relevant section:

> If I were going to write the Ten Golden Rules of Software, the top of the
> list would be:

> Error Prone == Evil

> Although this concept is obvious to 99.999% of the general population, it's
> only accepted by 2% of computer programmers. The remaining 98% subscribe to
> a competing philosophy, namely: "Error Prone = Manly". Never mind that they
> just assigned "Manly" to the variable "Error Prone", and the expression
> always returns true; that happens to be the correct value in this case, so
> it's an acceptable hack.

[0] [https://sites.google.com/site/steveyegge2/ten-
predictions](https://sites.google.com/site/steveyegge2/ten-predictions)

------
BuuQu9hu
As somebody who would like to move to capability-safe languages, I think that
moving to memory-safe languages is a necessary first step.

You don't have to move to Rust, but you do have to move away from C.

~~~
abecedarius
I want to see that move too, but I don't agree it's a prerequisite: you can
sandbox code in unsafe languages using something like Sandstorm.

~~~
Manishearth
That still means you can leak application memory; this is what happened with
(cloud|heart)bleed. Sandstorm mitigates RCEs escaping the sandbox, but not the
leaking of secrets.

~~~
gpderetta
You are not wrong in general, but people have pointed out that Heartbleed
technically was not caused by a memory safety issue.

~~~
Manishearth
Well, it's debatable. It was memory safety in the sense that they were using a
buffer to manually manage memory and then messing up there. Languages like
Rust won't save you from that, yes, but it is still at a basic level the same
kind of bug. The question is if someone writing the program in Rust or Go or
w/e would have used such a custom-allocator approach. It's unlikely, but such
an approach is unlikely in C too, so it's not very clear-cut :)

(but my general point was about sandstorm not fixing all the problems, just a
subset -- not about whether other languages fix all those problems)
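
The over-read at the heart of that bug class, in miniature (function name
hypothetical): a handler echoes back an attacker-claimed number of bytes. In C
the read runs past the buffer into adjacent memory; with bounds-checked
slicing the over-read is simply refused:

```rust
// Echo back `claimed_len` bytes of the payload. `get` returns None when
// the claimed length exceeds what was actually sent.
fn heartbeat_response(payload: &[u8], claimed_len: usize) -> Option<&[u8]> {
    payload.get(..claimed_len)
}

fn main() {
    // Honest request: claim 4 bytes, send 4 bytes.
    assert_eq!(heartbeat_response(b"ping", 4), Some(&b"ping"[..]));
    // Heartbleed-shaped request: claim 65535 bytes, send 4.
    assert_eq!(heartbeat_response(b"ping", 65535), None);
}
```

(Though as said above, a custom buffer-reuse allocator can reintroduce the
leak even in a safe language.)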

------
grabcocque
Hear hear.

The curl devs were totally irresponsible, and worse, outright intellectually
dishonest to pretend memory safety wasn't a serious problem for them or anyone
else.

------
dmitrygr
> Rust is the new kid on the block.

> It supports a wide variety of platforms,

> and might even run on that microcontroller

> you think can’t run anything but C.

Hahaha.

Oh, you're serious...let me laugh harder...

When the Rust compiler can produce code that runs on my micro with a Harvard
architecture, 14-bit instruction words, and 64 bytes (not megabytes or
kilobytes) of RAM, we can talk.

EDIT: loving the downvotes. Keep them coming! Because nothing says "civilized
debate" like quiet underhanded unexplained disagreement :)

~~~
AstralStorm
Port LLVM to it, you're halfway there.

Such a restricted machine unfortunately is not amenable to any abstraction and
I have no idea what such bad hardware would be used for. Nor why.

~~~
mikeash
Tiny microcontrollers are used all over the place. They're useful in things
like appliances and simple peripherals.

I'm guessing that the other comment is referring to PIC microcontrollers. Over
a billion such microcontrollers are sold every year. Not all of them are
_that_ limited, but many are.

And they are amenable to the abstraction of C. That's kind of the whole point
of that comment.

The world of computing is far larger than PCs and smartphones. One could make
a convincing argument that those are a small minority, in fact.

~~~
kbenson
> And they are amenable to the abstraction of C. That's kind of the whole
> point of that comment.

Well, that's sort of the problem, right? C was made _so_ portable (admittedly,
hardware was far less uniform when it was developed) that a lot of assumptions
we take for granted are actually undefined behavior, because you _can't_ make
those assumptions about the underlying hardware. What makes C _usable_ and
_useful_ on these architectures is the same thing that causes problems for the
other 99.9% of _people_ programming in it.

Should a language with obvious shortcomings in the name of portability be one
of the de facto standard languages for _all types_ of library development? Do
I even trust a crypto or XML parser library written in C on a platform with
non-"standard" word or byte sizes, if it wasn't developed specifically for
that architecture or those constraints?

> The world of computing is far larger than PCs and smartphones. One could
> make a convincing argument that those are a small minority, in fact.

As a question of where to optimize, I would say it pays to get a much safer
language onto these devices, both because it would prevent crashes, bugs, and
security problems, and because (I think?) far fewer people are responsible for
writing the code, so a little extra work on their part pays off more overall.

I think C is as prominent as it is today because of how history played out,
not necessarily because of its inherent merits.

------
forgottenpass
Does anybody _not_ roll their eyes at headlines of the form "It's time for an
X intervention"?

I expect I'd agree with this article's substance, but I gave up in the first
paragraph. I find myself unable to maintain giving a shit what this person has
to say.

~~~
eridius
Why? This is a perfectly justified usage of the word "intervention".

