

A/b test mallocs against your memory footprint - ice799
http://timetobleed.com/ab-test-mallocs-against-your-memory-footprint/

======
jwilliams
I'm a bit lost - what does this provide above existing tools - e.g.
MallocDebug+Shark on OSX or Valgrind/heap/leaks/etc on Linux?

~~~
ice799
the shim outputs a log of the memory allocation functions called
(malloc, realloc, calloc, free) that can be replayed by the replayer program.

you can then replay the exact same allocation pattern against different
allocators to determine which is best for you.

to my knowledge, the tools you mention do not provide similar functionality.
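
To make the log-and-replay idea concrete, here is a minimal sketch of what a replayer loop could look like. The log format is an assumption made up for this example (one operation per line: `m <id> <size>`, `c <id> <nmemb> <size>`, `r <id> <size>`, `f <id>`, where `<id>` is a slot number standing in for the logged pointer); it is not malloc_wrap's actual format, and `replay_line`, `replay_stream` and `MAX_SLOTS` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_SLOTS 1024          /* hypothetical cap on distinct slot ids */

static void *slots[MAX_SLOTS];  /* slot id -> live pointer (NULL if none) */

/* Replay one line of the assumed log format against whatever allocator
   the process is linked with. Returns 0 on success, -1 on a malformed
   line or a failed allocation. Assumes the log is well formed, i.e. a
   slot is freed before it is reused by 'm' or 'c'. */
int replay_line(const char *line)
{
    char op;
    unsigned long id, a = 0, b = 0;

    assert(line != NULL);
    int n = sscanf(line, " %c %lu %lu %lu", &op, &id, &a, &b);
    if (n < 2 || id >= MAX_SLOTS)
        return -1;

    switch (op) {
    case 'm':                               /* m <id> <size> */
        if (n < 3) return -1;
        slots[id] = malloc(a);
        return slots[id] ? 0 : -1;
    case 'c':                               /* c <id> <nmemb> <size> */
        if (n < 4) return -1;
        slots[id] = calloc(a, b);
        return slots[id] ? 0 : -1;
    case 'r': {                             /* r <id> <size> */
        /* size 0 is rejected: realloc(p, 0) is implementation-defined */
        if (n < 3 || a == 0) return -1;
        void *p = realloc(slots[id], a);
        if (!p) return -1;
        slots[id] = p;
        return 0;
    }
    case 'f':                               /* f <id> */
        free(slots[id]);
        slots[id] = NULL;
        return 0;
    default:
        return -1;
    }
}

/* Replay every line read from fp. Returns the number of operations
   replayed, or -1 at the first line that fails. */
long replay_stream(FILE *fp)
{
    char line[128];
    long replayed = 0;

    while (fgets(line, sizeof line, fp)) {
        if (replay_line(line) != 0)
            return -1;
        replayed++;
    }
    return replayed;
}
```

A small driver that calls `replay_stream(stdin)` and is run once per allocator (for example by preloading a different allocator's shared library with LD_PRELOAD on Linux) would then push the identical allocation pattern through each one.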

~~~
jwilliams
Right ok.... What I don't get is that it mentions profiling the advantage of
TCMalloc, which is thread-cached malloc. Is this going to be realistic when
the replayer is a single thread? (I could be missing something)

~~~
ice799
Short answer: depends what you care about.

Long answer: if the allocator is poorly designed, a lot of time will be spent
traversing its free list/tree/whatever looking for a block to fit your size
requirements. this lookup time can be exacerbated if your heap is badly
fragmented, or if the allocator does a poor job coalescing freed blocks. you
could end up spending lots of time in malloc looking for a nicely sized block.

also, with regard to heap fragmentation - long running processes which do lots
of allocations/frees can cause fragmentation, again depending on the design of
the allocator. if there is a lot of heap frag, you could see some substantial
bloating.

so profiling your process for those two items can be valuable.
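
As a rough illustration of the fragmentation point, here is a toy first-fit allocator over a tiny fixed arena. It is a deliberately simplified model, not how libc, TCMalloc or any real allocator works; `ARENA_UNITS`, `toy_alloc`, `toy_free` and `toy_free_units` are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARENA_UNITS 32                    /* toy arena size, in units */

static unsigned char used[ARENA_UNITS];   /* 1 = unit is allocated */

/* First-fit: scan from the start of the arena and claim the first free
   run of n contiguous units. Returns the start index, or -1 if no run
   is long enough. */
int toy_alloc(size_t n)
{
    size_t run = 0;

    if (n == 0 || n > ARENA_UNITS)
        return -1;
    for (size_t i = 0; i < ARENA_UNITS; i++) {
        run = used[i] ? 0 : run + 1;      /* length of current free run */
        if (run == n) {
            size_t start = i + 1 - n;
            memset(used + start, 1, n);
            return (int)start;
        }
    }
    return -1;
}

/* Release n units starting at start (the caller tracks block sizes). */
void toy_free(int start, size_t n)
{
    assert(start >= 0 && (size_t)start + n <= ARENA_UNITS);
    memset(used + start, 0, n);
}

/* Total number of free units, contiguous or not. */
size_t toy_free_units(void)
{
    size_t free_units = 0;

    for (size_t i = 0; i < ARENA_UNITS; i++)
        free_units += !used[i];
    return free_units;
}
```

If you fill the arena with eight 4-unit blocks and then free every other one, `toy_free_units()` reports 16 free units, yet `toy_alloc(8)` still fails because no free run is longer than 4. A real process in that state has to grow its heap to satisfy the request, which is the bloat described above.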

what you say is true; the major gain for TCMalloc is in
multi-(native)-threaded apps.

perhaps the next version of malloc_wrap will support multiple threads.

in either case, we have not yet finished collecting data about the different
allocators, so I am not currently in a position to say which is better for our
use case.

i just wanted a tool to let me replay a constant set of allocation patterns
against different allocators to find out if swapping out libc's malloc made a
difference for us and that is precisely what malloc_wrap is.

~~~
jwilliams
_what you say is true; the major gain for TCMalloc is in
multi-(native)-threaded apps._

I think this is pretty key, because otherwise TCMalloc is somewhat of an
overhead. Depending on your platform, a standard malloc will pull ahead (it
depends on how favourable locking is, but it is the case for OS X anyway).

A multi-threaded instance sounds interesting, but - I'm guessing it would be a
challenge to get a representative sample.

~~~
ice799
You might be reading the article too literally -- you can test more than just
tcmalloc, of course (ned, ptmalloc*, libumem, etc). It is -very- possible that
one of these allocators will handle our memory footprint more gracefully than
say, libc. There is only one way to find out: via A/B testing.

I think the important thing to keep in mind is that assertions like:

"I think this is pretty key, because otherwise TCMalloc is somewhat of an
overhead."

are a bit subjective, IMHO. Allocators are different from one another, and of
course they react to a series of allocations/deallocations differently. We're
trying to find out if the way we use our heap is better suited to another
allocator like tcmalloc, or nedmalloc, or whatever.

And RE: multi-threaded - I don't believe it will be particularly difficult to
get a representative sample, but working on that isn't very high on my list
right now.

