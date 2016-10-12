Hacker News
It’s time for a memory safety intervention (tonyarcieri.com)
65 points by okket 3 hours ago





There's a lot right in this, but I feel he's chosen the wrong enemy by harshly criticizing Daniel Stenberg (the curl dev). He's not a C zealot, he's just accepting realities.

Curl should be replaced by something memory safe. But it cannot happen today, because the infrastructure isn't there. We don't have memory safe languages suitable for system programming that are widely available on all kinds of architectures. Rust may become that language, but today it is not. It's not as portable, not as easy to integrate, and so on.

I recommend looking at the top comment in https://news.ycombinator.com/item?id=13966241.

Stenberg said, and may believe, that C is not the source of most of curl's security problems. But examining counts of security vulnerabilities, it definitely is.

Not sure he's hitting on him but who knows :-) In any event, he is criticising Daniel Stenberg for downplaying the memory safety issues in curl. It looks like that criticism is justified to some degree.

Edited the post, thanks for the note :-)

My English is usually decent, but I still sometimes make stupid mistakes.

> We don't have memory safe languages suitable for system programming that are widely available on all kinds of architectures.

Yes, yes we do. Go is available on a ton of architectures; Lisp is available on a ton of architectures; ML is available on a ton of architectures.

Yes, C is available on more. Yes, for some architectures C is the only reasonable choice. But on x86 & ARM, there's no good reason to write new code in C.

> But on x86 & ARM, there's no good reason to write new code in C

Sorry, but this is pure ignorance. Please let me know when you last tried to implement a CFD or FEA code in Go or Lisp.

What about kernel drivers?

> "We need to collectively admit memory safety is a problem."

Yes, we need it badly. I still see a lot of folks either deny that memory safety is a big problem when programming in C, or simply accept it as a fact of life.[1]

Without admitting that and starting to take action to improve infrastructure software, I don't see how we can build a more secure future. That's an intervention, and it should start now.

[1]: https://news.ycombinator.com/item?id=13979206

Memory safety is certainly a problem. But so is the ability to execute data. I'm scared by the use of any language that has that capability for security-critical applications.

The amount of problem denial along the lines of "experienced C programmers make no mistakes" or "what? Memory safe languages can avoid only maybe half or so of security bugs, totally not worth it" is astonishing. Granted, it's not easy right now (sometimes close to impossible) for some types of projects to not use C, but defending the obvious [1] flaws of the language is mind-blowing.

[1] Pretty obvious from the beginning. The creators of C made a pragmatic choice and created a language which basically does not guarantee anything. This allowed them to write a compiler fast enough for the limited computing power they had at hand.

The C++ Core Guidelines Checker, in combination with the Guidelines Support Library, is supposed to bring Rust-like memory safety guarantees to C++. Has anyone here had a chance to try it out?

https://github.com/isocpp/CppCoreGuidelines/blob/master/CppC...

https://github.com/Microsoft/GSL

https://msdn.microsoft.com/en-us/library/mt762841.aspx

https://blogs.msdn.microsoft.com/vcblog/2016/10/12/cppcorech...

Microsoft has been embracing the idea. But, the GSL and Checker are both compiler-independent.

Amen.

I just spent ~1.5 days tracking down a dangling pointer bug in interop code from Rust to C. It's so easy to do when you don't have lifetimes on pointers. I'd gotten so comfortable with lifetimes that I'd started taking it for granted.

Yes, Rust can be hard to write, but the time and pain we spend on the other side of that equation can be just as important, if not more so.

For stylo I've taught bindgen to be able to insert borrows in the binding functions in the presence of certain types, so on the C++ side you define it as FooBorrowed bindingFunc(BarBorrowed b), and on the Rust side you get fn bindingFunc(&Bar) -> &Foo, which elides correctly. As a result a lot of the Rust-side FFI code is actually 100% safe and the burden of safety has been pushed over completely to the C++ side. It's pretty neat.
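A minimal sketch of what that lifetime elision buys on the Rust side. Foo, Bar, and binding_func are hypothetical stand-ins, and the function body is written in Rust here only so the example runs on its own; in the setup described above it would live on the C++ side.

```rust
// Hypothetical types standing in for the FFI-visible structs.
#[repr(C)]
pub struct Foo {
    value: u32,
}

#[repr(C)]
pub struct Bar {
    foo: Foo,
}

// Stand-in for the foreign function. With one reference argument,
// lifetime elision ties the returned &Foo to the borrow of `b`,
// so the result cannot outlive the Bar it came from.
pub extern "C" fn binding_func(b: &Bar) -> &Foo {
    &b.foo
}

fn main() {
    let bar = Bar { foo: Foo { value: 42 } };
    let foo = binding_func(&bar);
    println!("{}", foo.value);
    // Dropping `bar` while `foo` is still in use would be a compile error.
}
```

The safety of the Rust callers then rests on the foreign side actually honoring that borrow, which is the burden shift the comment describes.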

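My problem was that I was taking a pointer like this:

  // BUG: map(..) receives the array by value, so `v` is dropped when the closure returns.
  let p: *const u8 = Some([0u8; 32]).map(|v| v.as_ptr()).unwrap_or(::std::ptr::null());
Here map(..) takes the array by value, and that copy goes out of scope as soon as the closure returns, so now you have a dangling pointer to the stack (fun!). If it had been a reference I'd have been totally covered, but since raw pointers drop lifetime semantics, that's where I got bitten.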
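One way to avoid that, as a sketch: keep the array in a binding that outlives the pointer and borrow it with as_ref() before mapping. The helper name key_ptr is made up for illustration.

```rust
use std::ptr;

/// Returns a pointer into `key` if it is present, or null otherwise.
/// The pointer is only valid while the caller keeps `key` alive and unmoved.
fn key_ptr(key: &Option<[u8; 32]>) -> *const u8 {
    // as_ref() turns &Option<[u8; 32]> into Option<&[u8; 32]>, so the
    // closure sees a borrow of the caller's array, not a temporary copy.
    key.as_ref().map(|v| v.as_ptr()).unwrap_or(ptr::null())
}

fn main() {
    let key: Option<[u8; 32]> = Some([7u8; 32]);
    let p = key_ptr(&key);
    assert!(!p.is_null());
    // Reading is fine here because `key` is still in scope.
    let first = unsafe { *p };
    println!("{}", first);
}
```

The compiler still cannot check that the pointer is used only while the array is alive; that part remains the caller's job.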

Love bindgen though, just whitelisting the functions I need generates awesome interop APIs. At the end of the day though, unsafe is unsafe.

So has any person or organization done a statistical survey of a body of bug reports (like CERT) and characterized how many bugs are memory safety related?

I see this SEI blog entry [1] that cites a report, with a bitrotted link (I think it's [2]). The blog summarizes the report as identifying buffer overflows as the source of "14 percent of software security vulnerabilities and 35 percent of critical vulnerabilities making it the leading cause of software security vulnerabilities overall." Though they seem to be the easiest to track and a leading error, buffer overflows are only one category of memory safety problem.

[1] https://insights.sei.cmu.edu/sei_blog/2014/08/performance-of...

[2] https://courses.cs.washington.edu/courses/cse484/14au/readin...

Hear hear.

The curl devs were totally irresponsible, and worse, outright intellectually dishonest to pretend memory safety wasn't a serious problem for them or anyone else.

The part about Swift, "it's neat too": I read that as kind of mocking the Swift language compared to Rust or Go. Or do we think the author meant it sincerely, as in Swift is in the same league as the others?

I don't think he's mocking it, but he might not know much about it.

Swift's probably on par with Go. You have to diverge a bit from normal practice or involve concurrency to be unsafe. This is in contrast to C, where "normal practice" often produces buffer overflows.
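To make the contrast concrete, here is a small sketch in Rust (used only because it is the language of the other code in this thread; Go and Swift also check bounds at runtime). The helper read_at is hypothetical.

```rust
/// Returns the byte at `index`, or None if the index is past the end.
/// The equivalent unchecked read in C would be undefined behavior.
fn read_at(buf: &[u8], index: usize) -> Option<u8> {
    buf.get(index).copied()
}

fn main() {
    let buf = [1u8, 2, 3];
    println!("{:?}", read_at(&buf, 1)); // in bounds
    println!("{:?}", read_at(&buf, 3)); // one past the end
    // Plain indexing with an out-of-range index (buf[i] with i == 3)
    // would panic instead of reading adjacent memory.
}
```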

Don't think anybody cares enough to go through so much change to deal with vulnerabilities. I think nowadays people treat buffer overruns the same way they treat bus delays: it's a fact of life, just live with it. Not saying that's right, but turning that ship around is no easy feat.

As somebody who would like to move to capability-safe languages, I think that moving to memory-safe languages is a necessary first step.

You don't have to move to Rust, but you do have to move away from C.

I want to see that move too, but I don't agree it's a prerequisite: you can sandbox code in unsafe languages using something like Sandstorm.

Does anybody not roll their eyes at headlines of the form "It's time for an X intervention"?

I expect I'd agree with this article's substance, but I gave up in the first paragraph. I find myself unable to maintain giving a shit what this person has to say.

> Rust is the new kid on the block.

> It supports a wide variety of platforms,

> and might even run on that microcontroller

> you think can’t run anything but C.

Hahaha.

Oh, you're serious...let me laugh harder...

When the Rust compiler can produce code that runs on my micro with Harvard architecture, 14-bit instruction words, and 64 bytes (not megabytes or kilobytes) of RAM, we can talk.

.

EDIT: loving the downvotes. Keep them coming! Because nothing says "civilized debate" like quiet underhanded unexplained disagreement :)

You will also never have a memory safety issue on that processor because in that environment using malloc() is already considered a sin of the highest degree.

I wrote a chess program that ran on a PIC 16 series with 176 bytes of RAM and I got 5 levels of look-ahead with a compiler that didn't support recursion. No dynamic allocation anywhere.
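For readers wondering how look-ahead works without recursion or a heap, here is a rough sketch of the general technique: walk the game tree with an explicit, fixed-size array of "next move" counters. It is not the chess program described above; DEPTH, BRANCHING, and count_leaves are illustrative assumptions, and a uniform tree stands in for real move generation.

```rust
const DEPTH: usize = 5; // levels of look-ahead (must be at least 1)
const BRANCHING: u8 = 3; // assumed number of moves per position

/// Visits every leaf of a uniform game tree iteratively.
/// All state lives in a fixed-size array: no recursion, no allocation.
fn count_leaves() -> u32 {
    // next_child[d] is the index of the next move to try at depth d.
    let mut next_child = [0u8; DEPTH];
    let mut depth = 0usize;
    let mut leaves = 0u32;
    loop {
        if depth == DEPTH {
            // Reached a leaf: a real engine would evaluate the position here.
            leaves += 1;
            depth -= 1;
            continue;
        }
        if next_child[depth] < BRANCHING {
            // Descend into the next untried move at this depth.
            next_child[depth] += 1;
            depth += 1;
            if depth < DEPTH {
                next_child[depth] = 0;
            }
        } else if depth == 0 {
            break; // every move at the root has been explored
        } else {
            depth -= 1; // backtrack to the parent position
        }
    }
    leaves
}

fn main() {
    println!("{}", count_leaves());
}
```

The memory cost is one counter per level, known at compile time, which is the property that matters on a part with a few hundred bytes of RAM.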

I laugh when people claim a Raspberry Pi 3 is an "embedded system". Sure you can embed it in something, but it's got an ocean of resources compared to most.

Keep in mind that there's a middle ground here, though. 32-bit ARM micros still have basically no support from Rust et al., but they often have enough resources (Cortex-M7, anyone?) to do things where memory-safe dynamic allocation would be nice. Programmers often have to resort to convoluted static data structures for tasks where dynamic allocation would be useful. And don't get me started on the NXP LPC-like chips that have two cores with shared regions.
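As an illustration of the kind of static data structure meant here, a minimal fixed-capacity queue with no heap allocation. FixedQueue is a made-up name, and this says nothing about which targets a given compiler supports.

```rust
/// A ring buffer whose capacity N is fixed at compile time.
struct FixedQueue<const N: usize> {
    buf: [u8; N],
    head: usize, // index of the oldest element
    len: usize,  // number of stored elements
}

impl<const N: usize> FixedQueue<N> {
    const fn new() -> Self {
        FixedQueue { buf: [0; N], head: 0, len: 0 }
    }

    /// Appends a byte, or hands it back if the queue is full.
    fn push(&mut self, value: u8) -> Result<(), u8> {
        if self.len == N {
            return Err(value);
        }
        let tail = (self.head + self.len) % N;
        self.buf[tail] = value;
        self.len += 1;
        Ok(())
    }

    /// Removes and returns the oldest byte, if any.
    fn pop(&mut self) -> Option<u8> {
        if self.len == 0 {
            return None;
        }
        let value = self.buf[self.head];
        self.head = (self.head + 1) % N;
        self.len -= 1;
        Some(value)
    }
}

fn main() {
    let mut q: FixedQueue<2> = FixedQueue::new();
    q.push(10).unwrap();
    q.push(20).unwrap();
    println!("{:?}", q.push(30)); // full: the value is returned in Err
    println!("{:?}", q.pop());
}
```

Running out of space becomes an explicit error the caller must handle, rather than a failed malloc() at an arbitrary point.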

If you'd like to start a substantive discussion, you have to offer the community a comment in the same spirit.

Additionally:

> Please resist commenting about being downvoted. It never does any good, and it makes boring reading.

https://news.ycombinator.com/newsguidelines.html

I doubt you'd be running Curl on that microcontroller.

Port LLVM to it, you're halfway there.

Such a restricted machine unfortunately is not amenable to any abstraction and I have no idea what such bad hardware would be used for. Nor why.

Tiny microcontrollers are used all over the place. They're useful in things like appliances and simple peripherals.

I'm guessing that the other comment is referring to PIC microcontrollers. Over a billion such microcontrollers are sold every year. Not all of them are that limited, but many are.

And they are amenable to the abstraction of C. That's kind of the whole point of that comment.

The world of computing is far larger than PCs and smartphones. One could make a convincing argument that those are a small minority, in fact.

They're not that amenable to the abstraction of C. Companies have managed to create C compilers for PICs anyway because there's a market for them, but they have - amongst other problematic features - extremely limited support for indirect addressing, no hardware support for function pointers, a hardware stack with limited depth that cannot be accessed directly, only implicitly via call and return instructions, and bank switching if you want to access more than 128 bytes of memory. Someone elsewhere in the thread mentioned using a compiler that didn't support recursion; that's probably because there's no easy way of supporting stack variables on most PICs.

But, to be fair, dynamic memory management is rarely used in microcontroller programming. Additionally, it's much easier in such a restricted environment to test the machine against every possible input (or a very representative sample). This explains the relative paucity of memory safety bugs in these environments.

> And they are amenable to the abstraction of C. That's kind of the whole point of that comment.

Well, that's sort of the problem, right? C was made so portable (because, admittedly, hardware was less uniform at the time it was developed) that a lot of behavior we take for granted is actually undefined, because you can't make those assumptions about the underlying hardware. What makes C usable and useful on these architectures is the same thing that causes problems for the other 99.9% of people programming in it.
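Signed integer overflow is a standard example: in C it is undefined behavior. For comparison, a small sketch of how a language can make the choice explicit instead. This uses Rust's standard integer methods; checked_sum is a hypothetical helper.

```rust
/// Sums the values, returning None if the total does not fit in an i32.
fn checked_sum(values: &[i32]) -> Option<i32> {
    values.iter().try_fold(0i32, |acc, &v| acc.checked_add(v))
}

fn main() {
    println!("{:?}", checked_sum(&[1, 2, 3]));
    println!("{:?}", checked_sum(&[i32::MAX, 1]));
    // When wrap-around is what you want, say so explicitly:
    println!("{}", i32::MAX.wrapping_add(1) == i32::MIN);
}
```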

Should a language with obvious shortcomings in the name of portability be one of the de facto standard languages for all types of library development? Do I even trust a crypto or XML parser library written in C on a platform with a non-"standard" word or byte size, if it wasn't developed specifically for that architecture or those constraints?

> The world of computing is far larger than PCs and smartphones. One could make a convincing argument that those are a small minority, in fact.

As a question of where to optimize, I would say it pays to get a much safer language on these devices, both because it would prevent crashes, bugs and security problems, and because (I think?) far fewer people are responsible for writing the code, so a little extra work on their part pays off more overall.

I think C is as prominent as it is today because of how history played out, not necessarily because of its inherent merits.

>I have no idea what such bad hardware would be used for. Nor why.

You would use it for places where the uC is cheaper than a few transistors or flip-flops, for example to flash lights in a pattern. Also when you need extremely low power consumption, like a digital watch or BTLE beacon.

There are uCs in everything these days.

https://en.wikipedia.org/wiki/PIC_microcontroller

Incredibly common chips, less so today, but historically they were all over the place. Looking at Wikipedia, it seems the 14-bit instruction set PIC had a whopping 128 bytes of memory.

For managing a bunch of toggle switches or dials with 8 positions, I imagine that was more than enough.

> Such a restricted machine unfortunately is not amenable to any abstraction

I can't tell if you're using hyperbole, or if you mean "any abstraction" literally.

If you mean it literally, we should probably discuss that.

Yeah, but the fact is that there are billions of lines of C out there, and tens of millions more being added every year, and that's not going to change for a while. Under no scenario will more than, say, 50% of new systems code be written in a memory safe language within the next 15 years. It doesn't matter what we want or what we think the industry should do; given the rate of change in those sections of the industry that use C/C++, realistically that's just not going to happen any time soon, while the errors already affect us.

That doesn't mean people shouldn't switch to safe languages for new projects if they can, but it does mean that they should use the tooling available for C much more. Unlike what the article claims, they don't -- at least not those who write programs with lots of memory safety issues. There are static analyzers that can guarantee -- 100% -- no memory safety issues. Airtight guarantees may take some work, but it's still far, far cheaper to achieve using state-of-the-art tools than switching to a new language for many projects (even new ones in organizations where C is established, let alone old ones). Unlike what the article claims, programmers utilizing state-of-the-art C tooling and best practices do not "constantly produce programs riddled with severe memory safety vulnerabilities".

Unfortunately, those state-of-the-art tools and practices are used almost exclusively by authors of safety-critical software, and are not commonly used in more mundane kinds, let alone in open-source projects. An intervention more likely to work in the short term should push for that, regardless of longer-term solutions in the form of safe languages.

Tooling requires a lot of effort from every programmer too, so it cannot work on the scale you are talking about. On that scale, compatible memory-safe C compilers or transpilers are pretty much the only options, since gcc and llvm don't care about that.

I don't think we can guarantee the absence of that kind of error from all the existing billions of lines, but the work is still far cheaper than switching languages. New projects can make use of sound static analysis from the get-go, and old projects can apply it incrementally.

Anything we can't guarantee statically we can check at runtime, which is what memory-safe C compilers do. But this is something OS distributions can use on all of the C packages. What you are suggesting is an effort that each and every project has to take and this simply doesn't scale much better than switching languages.
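A rough sketch of the runtime-check idea, under the assumption that each pointer carries the bounds of the object it came from and every access is checked against them. CheckedPtr is an illustrative name, not the API of any particular compiler, and it is written in Rust only to match the other code in this thread.

```rust
/// A raw pointer paired with the size of the allocation it points into.
struct CheckedPtr {
    base: *const u8,
    len: usize,
}

impl CheckedPtr {
    /// Records the base and bound when the pointer is created.
    fn from_slice(data: &[u8]) -> Self {
        CheckedPtr { base: data.as_ptr(), len: data.len() }
    }

    /// Reads the byte at `offset`, or returns None if it is out of bounds.
    fn read(&self, offset: usize) -> Option<u8> {
        if offset < self.len {
            // SAFETY: offset is within the recorded bounds; the caller must
            // keep the backing buffer alive (this sketch does not track that).
            Some(unsafe { *self.base.add(offset) })
        } else {
            None
        }
    }
}

fn main() {
    let buffer = [1u8, 2, 3, 4];
    let p = CheckedPtr::from_slice(&buffer);
    println!("{:?}", p.read(3)); // last valid byte
    println!("{:?}", p.read(4)); // would be an overflow in unchecked code
}
```

This covers spatial errors only; use-after-free needs separate tracking, and the checks cost something on every access.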

I do think that sound static analysis tools are far cheaper than adopting a new language, and they can be used incrementally (i.e., lesser guarantees are cheaper), but sure, memory-safe C compilers, too! (I'm just not familiar with the approach; links?) My point is that there are solutions that we can start applying today short of rewriting in new languages, or at least until we rewrite.

https://www.cs.rutgers.edu/~santosh.nagarakatte/softbound/

http://sva.cs.illinois.edu/index.html

http://chrisseaton.com/plas15/safec.pdf

>I do think that sound static analysis tools are far cheaper than adopting a new language

For new projects I don't really see why that should always be the case. It probably depends on the type of project, on the organization, on any integration requirements and on the willingness of developers to make the change.

Let me put it this way: adopting such analysis tools is about as costly as adopting a new unit testing framework. But you're right, the cost of adopting a new language depends on many factors.

