
Fearless Security: Memory Safety - feross
https://hacks.mozilla.org/2019/01/fearless-security-memory-safety/
======
rwmj
When talking about Garbage Collection they claim that _" Even languages with
highly optimized garbage collectors can’t match the performance of non-GC’d
languages"_ and link to this paper: [http://greenlab.di.uminho.pt/wp-
content/uploads/2017/09/pape...](http://greenlab.di.uminho.pt/wp-
content/uploads/2017/09/paperSLE.pdf) However I cannot see where this paper
supports the assertion (although Rust comes out well).

The particular problem is that malloc/free is not free of charge. Different
allocators have complex internal implementations, plus you lose easy sharing
of complex structures and compaction. So if you're using a GC you probably
program in a different way, e.g. using more shared immutable structures,
exploiting the advantages of GC.

Edit: A bit later they brush off this old favourite for GC-haters:
[https://people.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf](https://people.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf)
claiming that GC requires 5x as much memory. The problem with this paper is it
assumes that malloc/free is cost-free and instantaneous, and that your
programmer always calls free at the exact moment that the memory is no longer
needed, both of which are extremely unrealistic.

~~~
jasode
_> The particular problem is that malloc/free is not free of charge. [...] The
problem with this paper is it assumes that malloc/free is cost-free and
instantaneous,_

I think I get that you're trying to explain how GC overhead is overstated and
malloc()/free() overhead is understated, but your angle about "malloc not
being free of charge" isn't really the evidence you want to use.

As an analogy, we can see that a Lamborghini is faster than a Hyundai. Let's
say we state that the performance comparison is flawed because it still takes
the Lamborghini a _non-zero amount of time_ to accelerate to 60 mph (or 100
km/h) and the Lamborghini is _still consuming gasoline_. While the cost
datapoints are _true_, it still doesn't change the macro observation that the
Hyundai is slower.

Also, you're misrepresenting the 2nd paper you cited. The authors do mention
"overhead" of malloc. They do _not_ assume that malloc is "cost-free" on page
4:

 _> 3.2 malloc Overhead - When using allocators implemented in C, the oracular
memory manager invokes allocation and deallocation functions through the Jikes
VM SysCall foreign function call interface. While not free, these calls do not
incur as much overhead as JNI invocations. Their total cost is just 11
instructions: six loads and stores, three register-to-register moves, one
load-immediate, and one jump. This cost is similar to that of invoking memory
operations in C and C++, where malloc and free are functions defined in an
external library (e.g., libc.so)._

In other words, even if malloc/free is _not instantaneous and not cost-free_ ,
it can still be faster than GC. Neither paper's conclusions depend on
malloc/free taking zero amounts of time or zero cpu instructions. If the
papers are flawed, you have to explain it using different logic.

~~~
vardump
> >3.2 malloc Overhead - When using allocators implemented in C...

That's not even about malloc overhead, it's about the overhead of just
_calling_ C malloc. So actual malloc overhead is on top of that.

Malloc and free have pretty unpredictable runtime over time, once a lot of
allocations and deallocations have been performed. That's why you don't use
either in latency sensitive code, like with realtime requirements.

> In other words, even if malloc/free is not instantaneous and not cost-free,
> it can still be faster than GC.

Whoah, faster than GC in _what_ regard? GC will probably win the throughput
race, or _average_ latency. Manual allocation will likely win the jitter race,
lower latency standard deviation.

(I write low level code in C/C++, including kernel drivers and bare metal
firmware with hard realtime requirements. Hopefully in the future in Rust or
some other memory _and_ concurrency safe language.)

~~~
jasode
_> Malloc and free have pretty unpredictable runtime over time, _

Yes, that's another _true_ statement about malloc, but it also doesn't matter
to the particular point I'm making. To continue your _correct & true_
statements about malloc, we can add:

\- malloc has to search the freelist; GC can be just a bump allocation which
is faster

\- malloc leads to fragmented memory; GC can reorganize and reconsolidate

\- malloc doesn't have extra intelligence to assign pointers to shared memory
structures (e.g. Java string pool stores identical strings only once based on
hashes)

\- ... a dozen other true statements about malloc
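
The bump-allocation point can be sketched in a few lines of C (a hypothetical
toy arena, not any particular collector's implementation):

```c
#include <stddef.h>

/* Toy bump allocator: the allocation scheme a copying/compacting GC can
 * use. Allocating is just aligning and incrementing a pointer, which is
 * why the per-allocation cost can beat malloc's freelist search. */
enum { ARENA_SIZE = 1 << 16 };

static unsigned char arena[ARENA_SIZE];
static size_t bump = 0;

static void *bump_alloc(size_t n) {
    /* round the request up to pointer alignment */
    size_t aligned = (n + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
    if (bump + aligned > ARENA_SIZE)
        return NULL; /* a real GC would trigger a collection here */
    void *p = &arena[bump];
    bump += aligned;
    return p;
}
```

A collector that compacts live objects can keep allocating this way
indefinitely; malloc, which cannot move objects, has to hunt for a suitable
free block instead.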

All those true statements (which most can agree on) aren't the
misunderstanding. The issue is _misusing_ those true statements as some type
of convincing evidence to explain the papers' flaws. For example:

 _> Whoah, faster than GC in what regard?_

Well, we can just use the _total runtime_ of the 2 papers' benchmarks where
there were lots of memory operations. (In other words, we can acknowledge that
performance has multiple dimensions/axes but we can also look at the simple
measurement of _total wall clock time_ of benchmark code that doesn't do
database access or floating point calculations.)

 _The C/C++ programs ran faster and took up less memory._

Ok, were there flaws in the benchmarks? Then let's explain _the specific
flaws_.

Yes, I can say _"malloc runtime is unpredictable"_ but that true statement
doesn't actually explain anything about GC running slower than malloc/free in
the papers. We can also say that _"malloc is not cost-free"_ as another
_true_ statement -- but that also doesn't actually explain the GC's longer
elapsed time.

See the problem with those attempted explanations? They're all non-sequiturs.

~~~
vardump
> Well, we can just use the total runtime of the 2 papers' benchmarks where
> there were lots of memory operations. (In other words, we can acknowledge
> that performance has multiple dimensions/axes but we can also look at the
> simple measurement of total wall clock time of benchmark code that doesn't
> do database access or floating point calculations.)

...

> See the problem with those attempted explanations? They're all non-
> sequiturs.

I'm comparing GC vs manual memory management.

You (or the papers) are comparing different implementations of programs in
different languages. That might be great for practical considerations for
choosing implementation language, but is pointless when comparing those two
different memory management strategies. Apples and oranges.

EDIT: I feel "Quantifying the Performance of Garbage Collection vs. Explicit
Memory Management" paper is a bit dishonest. From the paper:

> The culprit here is garbage collection activity, which visits far more pages
> than the application itself [61]. As allocation intensity increases, the
> number of major garbage collections also increases. Since each garbage
> collection is likely to visit _pages that have been evicted_ , the
> performance gap between the garbage collectors and explicit memory managers
> grows as the number of major collections increases.

 _Pages got evicted_ – so their heap ran out of physical RAM and started
swapping to disk. Wow.

Yeah, GC uses much more RAM, that's a well known downside. Setting the
benchmark up in such a way that causes the system to start swapping is not a
fair way to compare GC and manual allocation throughput.

~~~
jasode
_> You (or the papers) are comparing different implementations of programs in
different languages._

Fyi... the 2nd paper is using the same language, Java. It just compares
different allocation strategies: explicit vs GC. (I think that paper is
written in a confusing way.)

My original point back to op (rwmj) was that the computer scientists were
quite aware that malloc had a non-zero cost. And pointing that out really
doesn't challenge the paper's findings.

~~~
vardump
Yeah, and the second paper said their GC scenario system was _swapping to
disk_. Please read my edit to the previous comment.

------
nine_k
For completeness' sake, there are C programs that never allocate memory
dynamically, using pre-allocated variables only. Your microwave likely runs
one.

There are also programs that allocate but never deallocate memory. They are
fast-terminating programs, ranging from a CLI utility to onboard software that
controls a surface-to-air rocket.

But many programs still need more complex memory management done safely.

~~~
hannob
Buffer overflow without any dynamic allocation:

    
    
        #include <string.h>
        #include <stdio.h>
        
        int main(void) {
            char hello[5];               /* room for 5 bytes */
            strcpy(hello, "hello");      /* writes 6 bytes: "hello" + NUL */
            printf("%s\n", hello);
        }
    

(gcc warns about this; interestingly, clang does not)
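
For contrast, a bounded version of the same copy (a sketch using snprintf,
which truncates rather than overflows):

```c
#include <stdio.h>
#include <string.h>

/* Bounded alternative to the strcpy above: snprintf writes at most
 * dstsize bytes, including the terminating NUL, so it truncates instead
 * of running past the destination buffer. */
static void bounded_copy(char *dst, size_t dstsize, const char *src) {
    snprintf(dst, dstsize, "%s", src);
}
```

With a 5-byte buffer the result is "hell" plus the NUL terminator; silent
truncation has its own pitfalls, but it is not memory corruption.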

~~~
wahern
Rust recently had an integer overflow in str::repeat permitting a buffer
overflow. It was a classic bug, the kind that gave C a bad reputation. And the
error(s) in arithmetic occurred _outside_ any unsafe block.
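
The bug class translates directly to C (an illustrative sketch, not the
actual str::repeat code): a size computation wraps modulo SIZE_MAX, the
allocation comes out tiny, and the later writes run past it. A checked
multiply catches the wrap before any allocation happens:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 and stores a * b in *out if the product fits in size_t;
 * returns 0 if the multiplication would wrap. Using the wrapped value
 * as an allocation size is the classic path to a heap overflow. */
static int checked_mul(size_t a, size_t b, size_t *out) {
    if (b != 0 && a > SIZE_MAX / b)
        return 0; /* would wrap: refuse to produce a size */
    *out = a * b;
    return 1;
}
```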

If we want to be pedantic, all this talk of "fearless" wrt Rust is dangerous
hyperbole. Once you move past juggling scalar values things get dicey,
especially in C but also in so-called "safe" languages like Rust. And that, I
think, was the previous poster's point--as you move away from dynamic
allocation you tend to move toward using scalar (fixed-sized) objects.

One of the original points of emphasis of Rust was to favor scalar values
rather than pointers and even references. To a limited but useful extent you
can mimic this in C. Rust examples that simply use Vec miss the point--1) the
str::repeat bug overflowed a Vec, and 2) just because C doesn't come with a
built-in Vec doesn't mean you can't write one or use one.

~~~
lixtra
> Rust recently had an integer overflow in str::repeat permitting a buffer
> overflow.

In case someone is interested in the details, like I was:
[http://cve.mitre.org/cgi-
bin/cvename.cgi?name=CVE-2018-10008...](http://cve.mitre.org/cgi-
bin/cvename.cgi?name=CVE-2018-1000810)

------
nickpsecurity
"Another type of problem that can appear is memory leakage... This is a
memory-related problem, but one that can’t be addressed by programming
languages."

[https://www.fos.kuis.kyoto-u.ac.jp/~tanki/papers/memoryleak....](https://www.fos.kuis.kyoto-u.ac.jp/~tanki/papers/memoryleak.pdf)

Well, that took about a minute in DuckDuckGo. :) Key words were "memory leaks"
and "type system." For those wanting to find CompSci, adding type system in
quotes to a property is a reliable way to find language work on that property.
The word language can help, too, but the wording in PDFs varies on that more.

------
insertcredit
Not bad but a little bit of a publicity stunt.

A major source of vulnerabilities is (still) the Javascript engine and that's
(still) written in C++.

Even worse, as far as I know, Mozilla has no plans to rewrite even parts of
Spidermonkey in Rust.

For some recent examples:

[https://usn.ubuntu.com/3688-1/](https://usn.ubuntu.com/3688-1/)

[https://usn.ubuntu.com/3749-1/](https://usn.ubuntu.com/3749-1/)

~~~
ridiculous_fish
A JS engine is a high-risk, high-reward problem for Rust. High-reward because
JS engines are, to your point, a major source of vulnerabilities; high-risk
because JS-engine theory is rather outside of Rust's wheelhouse.

One class of vulnerabilities in JS engines is use-after-move. A raw pointer is
extracted, an allocating function is called (triggering a GC), then the raw
pointer is used, pointing into nowhere. It's awkward to express in Rust that a
function may modify state inaccessible from its parameters.
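
The hazard can be modeled with an entirely hypothetical toy in C: a
"collector" that relocates objects, a raw pointer that goes stale, and a
handle table (the indirection real engines hand out instead of raw pointers)
that the collector keeps up to date:

```c
#include <string.h>

/* Toy model of a moving GC. Objects live in a heap the collector may
 * compact; raw pointers taken before a collection go stale, while
 * handles (indices the GC fixes up) remain valid. */
typedef struct { int value; } Obj;

static Obj heap[8];
static int handle_slot[4];              /* handle -> current heap index */

static Obj *deref(int handle) { return &heap[handle_slot[handle]]; }

static void gc_move(void) {
    /* simulate a compacting collection relocating the object in slot 0 */
    heap[1] = heap[0];
    memset(&heap[0], 0, sizeof heap[0]); /* old slot gets reused/cleared */
    handle_slot[0] = 1;                  /* the GC fixes up handles only */
}
```

After gc_move(), the handle still dereferences to the live object, while a
raw pointer saved beforehand sees whatever the old slot now holds.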

A second class of vulnerabilities is type-confusion. A value is resolved to (a
pointer to) some concrete type, but some later code mutates the value. Now the
concrete type is wrong. Again this possibility is awkward to express in Rust.

The problem is complicated by the NaN-boxing and JIT aspects of JS engines,
which interfere with Rust's tree-ownership dreams.

People smarter and way better at Rust than myself are working on it; I'm
excited by the prospect of novel solutions that can defeat entire classes of
problems.

~~~
johncolanduoni
I'm curious what proportion of vulnerabilities in JS engines are due to mis-
generated JIT code vs direct errors in their compiled code. Rust allows you to
express some nice properties not always directly related to memory safety
(e.g. checked consumption, convenient and safe ADTs), but unless there is a
novel application of these facilities to the structure of a JIT engine it
won't help a ton with the former kind of vulnerabilities.

I'm excited to see a practical programming language that implements full
dependent typing; languages like Idris are actually really good at dealing
with precisely the kinds of situations you mention.

~~~
ridiculous_fish
JS engines have many parts implemented natively, which may be called from JS,
and in turn call back into JS. An example is CVE-2015-6764: this grabs an
array length, which quickly becomes stale, because accessing one of the
array's elements invokes a custom toJSON which in turn modifies the array's
length.

This feels like a hopeless problem; can any of Rust's powers be brought to
bear here? Could Idris?

~~~
johncolanduoni
F* is probably the best equipped at the moment to deal with situations like
that CVE, since its library has a concept of heaps. Basically, any function
that can access or modify the "heap" (which in F* is just a set of pointers
that are guaranteed to point to a value and not alias any others outside of
the same heap) must specify what properties of the state of the heap must be
true at entry, and what properties are true afterwards. So in pseudo-types,
the functions for accessing a JavaScript array would be something along the
lines of

    
    
        fn arrayLength(x: JSArray*) -> n: uint (requires nothing) (ensures length of x = n, changes nothing)
        fn callToJson(x: JSValue*) -> JSValue* (requires nothing) (ensures nothing)
        fn arrayAccess(x: JSArray*, m: uint) -> JSValue* (requires length of x > m) (changes nothing)
    

(NB: F* syntax doesn't look much like this, but I'm guessing this will be
readable to more people on HN)

The stuff in parentheses after each function type are the preconditions and
post-conditions respectively. So if you do something like:

    
    
        let x = arrayLength(someArray)
        for i in range(x) {
          let element = arrayAccess(someArray, i)
        }
    

It will typecheck just fine. But if you add the call to toJSON:

    
    
        let x = arrayLength(someArray)
        for i in range(x) {
          let element = arrayAccess(someArray, i)
          let transformed = callToJson(element)
          // ERROR: (requires length of someArray > i) not satisfied for all runs of loop body
        }
    

Since callToJson cannot ensure any property of the heap after it runs. In this
way you can elide range checks when needed for performance without worrying
that you've sacrificed safety.

Covering all the cases a JS engine would need without adding 10 million lines
of proofs to the size of SpiderMonkey is still an open problem, but this
general approach (known as Hoare Logic[1]) is very enticing, and the type
systems that languages like Idris and F* have are definitely the closest to
realizing it in more places. There are real software engineering efforts using
descendants of Hoare logic like TLA+ (notably Amazon IIRC), but it's rare to
see it even in huge projects like browsers.

It's also critical to note that the heap concept of F* is not a totally fixed
part of the language; most of the specification of how heaps work are actually
in the standard library. That level of flexibility is what I think makes these
languages likely to become capable of tackling these problems: something like
a JS engine or any optimizing compiler is exactly the kind of place where
being able to come up with your own type-level verification model is worth the
effort.

[1]:
[https://en.wikipedia.org/wiki/Hoare_logic](https://en.wikipedia.org/wiki/Hoare_logic)

------
pmoriarty
_" Some languages (like C) require programmers to manually manage memory by
specifying when to allocate resources, how much to allocate, and when to free
the resources."_

This is not required of programmers in C, because the programmer could choose
to delegate memory management to a memory management library, such as the
Boehm-Demers-Weiser conservative garbage collector. [1]

[1] - [http://www.hboehm.info/gc/](http://www.hboehm.info/gc/)

~~~
monocasa
Conservative garbage collection like Boehm inevitably leads to memory leaks in
long running applications. It's awesome for some use cases but isn't a
complete solution by any means.

~~~
rurban
Wrong. Only if the system failed to identify a root, which is a grave
developer's mistake, but in the general case boehm-gc does not generate any
leaks.

Unlike manual memory management, which inevitably leads to memory leaks.

~~~
yorwba
Failing to identify a root would cause the GC to free too much memory. Memory
leaks happen when too little memory is freed. Boehm GC is called conservative
because it can't distinguish pointers from other memory content, so it will
determine some allocations to be reachable even if there's no pointer to them,
because some random integer looks like a pointer into that allocation.
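
That ambiguity can be sketched as a one-line predicate (hypothetical, not
Boehm's actual scan): any word whose value happens to land inside the heap's
address range is conservatively treated as a live reference:

```c
#include <stdint.h>

/* Why a conservative scan retains garbage: the collector cannot tell
 * pointers from integers, so any word whose value falls inside the heap
 * range counts as a reference, pinning whatever it "points" into. */
static int looks_like_pointer(uintptr_t word,
                              uintptr_t heap_lo, uintptr_t heap_hi) {
    return word >= heap_lo && word < heap_hi;
}
```

An unlucky integer like 0x1008 on a heap spanning [0x1000, 0x2000) would keep
an otherwise-dead allocation alive forever.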

~~~
rurban
I see, that's what you meant. You are right. You need to use boehmgc in
precise mode, which is not the default. You need to tag your pointers. Also
use incremental mode for shorter pause times.

------
zozbot123
In related news, the C/C++ development community is finally catching up to
what Java has been providing for over 20 years.

~~~
jchw
What is it that the C/C++ development community is catching up to?

(And also, I'm not convinced that programming languages should be treated as
having their own isolated developer communities, considering there is often a
lot of overlap. Are we talking individual users? Companies? Language
designers? Etc.)

~~~
pjmlp
Good support for multi-core programming comparable to what
java.util.concurrent, language threads and network async IO offer.

C++11 introduced std::thread, with a couple of issues rectified in later
revisions, and apparently executors just missed C++20, delaying the
introduction of a major part of async networking.

Language safety as well.

For me, in spite of the safety improvements in C++, I see the language being
tailored for specific niches and no longer a full stack language, similar to
how it is handled on modern desktop and mobile OSes.

And C will never catch up in security.

