
Safe-Linking – Eliminating a 20 year-old malloc() exploit primitive - eyalitki
https://research.checkpoint.com/2020/safe-linking-eliminating-a-20-year-old-malloc-exploit-primitive/
======
saurik
Blocking 15 out of 16 exploit attempts sounds great, but you have to ask
yourself how many shots you get, and since most people don't think "this
program crashed: I should never run it ever again", this isn't anywhere near
as helpful as it seems. If you crash my webserver, I spawn a new one. You
might even already have 16 webservers running right now! If the user clicks a
link and their tab crashes, they might hit reload a few times before giving
up. If you receive a text message or an email that crashes a parser on your
phone, the retry might seriously happen in a tight loop. Like, I hear about
this sort of thing being a good idea constantly, but I have never once heard
of it actually blocking an attack against any "normal" target (I say "normal",
as some super hardened military target might be configured to go into complete
lockdown immediately upon a single unexpected event, with a protocol to call
in a forensics expert to analyze logs to see if it is safe to restart the
service before ending the outage).

~~~
eyalitki
We can detect corruption in 15/16 times. In our tests, when we didn't detect a
corruption, the program still crashed, but without a proper error message. One
should remember that the pointer is masked, so an attacker without the secret
mask won't be able to control the reveal()ed pointer. And using a garbled
pointer won't give the attacker anything useful, and most probably it will
crash the target program.

~~~
mywittyname
Are you saying the technique is effective at preventing a successful attack
nearly every time, but that pointer modification are detected (thus logged) 15
out of 16 times?

~~~
eyalitki
Yes. We have 12 bits masking just the lower page, in addition to more than 10
bit above that. Chances to guess that correctly are VERY slim.

------
daanx
Interesting article -- thanks! I see they used the `key ^ P` encoding where
the `key` is `L >> PAGESHIFT` to use the ASLR randomized bits from the free
list position `L`.

In
[mimalloc]([https://github.com/microsoft/mimalloc](https://github.com/microsoft/mimalloc))
we use a similar strategy to protect the free list in secure mode. However,
there are some weaknesses associated with using a plain _xor_ -- if the
attacker can guess `P` that reveals the key immediately (`P^key^P == key`) or
even if there is a read overflow, and the attacker can read multiple
encodings, they can xor with each other, say `(P1^key)^(P2^key)` and then we
have`(P1^P2)` which may reveal information about the pointers (like
alignment).

What we use in _mimalloc_ instead is two keys `key1` and `key2` and encode as
`((P^key2)<<<key1)+key1`. Since these operations are not associative, the
above approaches do not work so well any more even if the `P` can be
guesstimated. For example, for the read case we can subtract two entries to
discard the `+key1` term, but that leads to `((P1^key2)<<<key1) -
((P2^key2)<<<key1)` at best. (We include the left-rotation since xor and
addition are otherwise linear in the lowest bit).

Just some thoughts. Of course, this may be too much for the use-case. However,
we found just little extra overhead for the extra operations (as most programs
are dominated by memory access) so it may be of benefit.

------
phkamp
FreeBSD solved that problem 22 years ago:

[https://papers.freebsd.org/1998/phk-
malloc/](https://papers.freebsd.org/1998/phk-malloc/)

(And got better malloc(3) performance at the same time.)

~~~
kazinator
Nobody has solved "that problem".

No malloc has absolute detection of all corruption and even tools like
Valgrind don't catch everything.

Let's not pretend that a few pointer hacks and checks have solved a problem.

~~~
phkamp
The particular problem they boast about having solved,does simply not exist in
the malloc(3) I wrote 23 years ago, because it does not have the silly linked
list to begin with.

In addition to all the security problems inherent in mingling metadata
directly next to user-data, the linked list gives O(Nalloc) performance on
free(3) and realloc(3) whereas my malloc had O(1) performance.

That may not seem like a lot, but when the 'O' is a page-in from disk, and
Nalloc is from C++ code, it is nothing to sneeze at.

Not only did that make my malloc faster, but it would, not "could", but
"would" detect several classes of malloc-usage errors, double-free etc,
unconditionally.

Over the next 10-ish years, that practically eliminated entire classes of
malloc-mistakes, making a lot of FOSS software safer, no matter which malloc
you used it with.

Given the definition of the malloc(3) family API, there is no way you can
detect all corruption without hardware support, but there are people working
on that too, notably Robert Watsons CHERI project at Cambridge.

So yeah, nice pointer arithmetic, but how about people solved the real problem
instead ?

(On big memory and multi-core systems use jemalloc, on small memory and
single-core systems use phkmalloc.)

~~~
armitron
That there are entire guides (see below) and even games written on phkmalloc
exploitation proves kazinator's point rather than yours.

[1] [http://www.ouah.org/BSD-heap-smashing.txt](http://www.ouah.org/BSD-heap-
smashing.txt)

[2]
[https://events.ccc.de/congress/2005/fahrplan/attachments/636...](https://events.ccc.de/congress/2005/fahrplan/attachments/636-CCCmallocyy.pdf)

[3]
[https://overthewire.org/wargames/vortex/vortex11.html](https://overthewire.org/wargames/vortex/vortex11.html)

[4]
[https://www.usenix.org/legacy/events/woot11/tech/final_files...](https://www.usenix.org/legacy/events/woot11/tech/final_files/Novark7-7-11.pdf)

~~~
phkamp
No ?

The only problem OP has solved is detecting overwriting of metadata
inadvisably comingled with the user data.

That problem does simply not exist in phkmalloc, or jemalloc for that matter,
writing past the end or before the beginning of an allocation does not
_automatically_ give you the ability to violate the internal integrity of the
malloc implementation.

If you actually read the first link you pasted carefully, you will see what an
amazing tour-de-force it was to exploit CVE-2003-0015, which was a run-of-the-
mill double-free mistake in CVS.

As I remember it, there were proof-of-concept exploits against list-based
mallocs in a matter of days, it took months before bbp's article appeared.

When I wrote phkmalloc, most of the exploitable malloc-related problems were
usage problems, double frees, freeing modified pointers, writing past start or
end of allocation etc.

If you had seen my presentation in New Orleans 1998, you would undoubtedly
remember my handwritten slide with around 20 setuid programs phkmalloc had
found issues in within the first year.

That is the primary reason why there are so few malloc(3) related exploits
these days: phkmalloc and later jemalloc complained about them.

Those are mistakes which list-based mallocs do not find, because it is too
expensive to walk the list all the time, and on top of that, they run slower,
because the list is O(n)!

There is neither a reason, nor an excuse for using a list-based malloc in this
day and age.

Neither OP nor I claim to have solved the "big" malloc problem, because, as I
already explained: That takes hardware support.

------
jfindley
I was initially surprised at their benchmark results, until I read more
closely and noticed that they were benchmarking on a cloud VM. I wish people
wouldn't do this for this form of test - it adds so many variables it's hard
to take their benchmark too seriously.

~~~
eyalitki
We executed the benchmark suite multiple times, and results were consistent.
It was way more stable than the same execution on a PC with multiple programs.
Anyway, the maintainers of glibc performed the same checks and confirmed our
measurements.

~~~
kees99
Thumbs up with reaching out to upstream to mainline it.

Did you talk to musl team as well?

~~~
eyalitki
Not yet. Working with each open-source demanded a lot of effort, so we were
hoping that integrating it into a few dominant libraries will attract the
attention of the rest. We will happily assist them if they need/want our help.

~~~
boomer_joe
The musl mailing list is very active, and communication with the maintainers
is generally effortless. It would be interesting to see how this plays out.
They are also in the process of rewriting their malloc implementation.

Thank you for your work!

~~~
rurban
And musl is just working in merging its new mallocng.

------
floatboth
Does this apply to jemalloc?

~~~
eyalitki
While ptmalloc/dlmalloc/tcmalloc use a single-linked list meta-data that is
stored adjacent to the user buffers, in jemalloc there is no such meta-data.
jemalloc's design stores the sensitive meta-data separately, thus removing the
need to add a dedicated protection layer for it.

------
kazinator
This is a crude, unreliable debugging aid being trotted out as a security
enhancement.

~~~
ghostpepper
So you'd be opposed to merging this change upstream?

~~~
kazinator
Yes; it wastes machine cycles.

