

How OpenBSD Leverages Malloc to Find Bugs - adbge
http://os-blog.com/how-openbsd-leverages-malloc-to-find-bugs/

======
eis
Without having looked at OpenBSD's implementation: just unmapping free pages
does not mean you catch all writes to free()d memory. If a block is smaller
than the page size, which is often the case, its page will likely still hold
other live objects, so it won't be unmapped and therefore won't trigger a
segfault.

Valgrind is my tool of choice to find invalid reads or writes. It catches all
of them plus it gives you nice information like a stacktrace.

What OpenBSD's implementation of malloc() does get you though is improved
coping with memory fragmentation. Allocating memory in a contiguous heap
brings lots of pain - jemalloc (which is the default allocator in FreeBSD and
NetBSD) solves this nicely by also only using mmap() instead of sbrk() on
Linux.

Note: I should have said "resource reclamation" instead of "fragmentation".
See discussion with ajross below for details.

~~~
ajross
Or simply if the address in question has been remapped since the original
free. For big 32-bit processes that's very likely. There's nothing wrong with
the implementation really, it's just oversold.

Obviously the valgrind advice is great. But I don't follow your point about
mmap vs. sbrk regarding fragmentation. Fragmentation is a property of memory
addresses, not the syscall used to allocate them. A big linear heap allocated
with sbrk will fragment just as badly as the same heap allocated in a single
block, even if it's allowed to have "holes" in it.

Maybe you're claiming that jemalloc deliberately leaves gaps between
allocations? That can help a little in practice, though the real effect is
just to change the size of the fragmented blocks.

~~~
eis
Actually, even with a 64-bit address space you can easily get the same address
back if the allocator returns a cached block.

The problem with sbrk() is that if you allocate a few megabytes of memory + a
tiny chunk, even after freeing everything apart from the tiny chunk near the
"program break", _nothing_ will be given back to the OS. If you use mmap()
instead, though, all pages can be given back apart from the one in which the
tiny chunk resides. This sometimes makes a tremendous difference.

~~~
ajross
OK, I understand. The symptom you're describing isn't really "fragmentation".
Fragmentation is the inability to use smaller blocks of memory because the
larger allocations won't fit. That behavior isn't changed by this.

You're talking about a resource reclamation issue. Unmapping a page is a clear
signal to the kernel that the memory is unused and can be repurposed
immediately. Otherwise, it needs to find and eject a page from memory using
the VM system, which is more expensive (though I'd guess not a lot more
expensive except in pathologically allocation-bound systems).

~~~
eis
You are right, it's not fragmentation but resource reclamation. I've added a
note to the initial post.

I'm not sure though that the VM system can reclaim memory in the heap
allocated with sbrk(). At least I've never seen that before. Or do you mean
"eject" as in swap out?

------
munin
the windows heap manager can be configured to do this on a per-application
basis using the 'gflags' utility.

it's a useful debugging tool. you're not assured that use-after-free is going
to be tightly temporally coupled with the free (a lot of the time it is
though) and as more time passes after free() the odds increase that the
dangling virtual address is re-allocated.

it is also the enemy of performance. you should probably not run production
things with the heap configured in this way.

------
bartwe
I have a number of custom allocators that use the same techniques, with some
improvements.

When space is cheap: All allocations are two 4k pages. The returned pointer is
aligned with the end of the buffer to detect overruns (with a 16-byte
alignment). The page following the alloc is always denied. The space around
the allocation is filled with flag values, and these are checked on free.
After free, the pages are held in storage for a few thousand following
allocations.

Variants of this allocator do things like applying this only to specific
ranges of allocation sizes, or only after a certain number of allocations.

With this, a good number of overruns and use-after-free bugs have been found.

Mostly used this technique on Windows with Delphi; on Linux I prefer valgrind.

~~~
eridius
Why 2 4k pages? If I'm allocating 1k of memory, what does having an entire
extra page get me?

~~~
jsnell
The second 4k is a guard page, to catch buffer overflows. So the heap looks
like:

3k empty, 1k data, 4k empty and read/write-protected.

The extra page is needed since memory protection flags can only be set per
page, not at a finer granularity.

~~~
eridius
Ah. It sounded like you were mapping 2 4k pages for the allocation itself,
aligning the allocation at the end of that 2-page span, and then mapping the
following page and marking it as denied.

~~~
jsnell
I wasn't the original poster, that was just my impression of the scheme. I
could have misunderstood, in which case I don't have any alternate theories
for what that second page is for :-)

------
Spider
Is this really different from the glibc malloc() implementation with
MALLOC_CHECK_ set to 3?

> MALLOC_CHECK_ is designed to be tolerant against simple errors, such as
> double calls of free() with the same argument, or overruns of a single byte
> (off-by-one bugs). Not all such errors can be protected against, however,
> and memory leaks can result.

> If MALLOC_CHECK_ is set to 0, any detected heap corruption is silently
> ignored;

> if set to 1, a diagnostic message is printed on stderr;

> if set to 2, abort(3) is called immediately;

> if set to 3, a diagnostic message is printed on stderr and the program is
> aborted.

~~~
marshray
Yes, the OpenBSD technique of completely unmapping the memory rather than
reusing it will be able to catch even some read accesses, not just write
accesses that happen to corrupt malloc's guards.

------
nodata
Can this be configured at runtime using an environment variable? It would be
nice to test for bugs like this in dev.

~~~
pmjordan
You could easily do this on Linux, where the executable format allows you to
override library-defined symbols, including malloc(). Implement the new
malloc/calloc/realloc/free as a static library and link it to your dev builds.
Don't link it for releases.

Or just use valgrind, if you can.

~~~
piotrSikora
You seem to miss the difference between "developer testing his/her app" and
"everyone testing every app".

To the point, I'm using "/etc/malloc.conf -> AFGJPRX" on most of my systems,
which means that virtually everything that runs there is checked. Can you
_easily_ do that on Linux? How many apps are you running via valgrind on a
daily basis?

------
smutticus
How is this different from efence on Linux?

~~~
lflux
It comes with the operating system, no need to re-link against a different
library or a different LD_PRELOAD path.

------
jeffreymcmanus
Stop saying "leverages" unless you're describing lifting a heavy rock using a
stick. The word you're looking for is "uses".

~~~
jemfinch
Your comment contributes nothing to this thread or this website. Stop wasting
your time and others' posting about matters of stylistic opinion.

~~~
jeffreymcmanus
Pot, meet kettle.

------
Tharkun
Why is this on Hacker _News_? OpenBSD has been doing this for _years_. About
as newsworthy as Windows 95's BSOD.

~~~
chollida1
Not everyone spends all day working in OpenBSD; I learned something from the
article.

