
Memory – Part 6: Optimizing the FIFO and Stack allocators - fruneau
https://techtalk.intersec.com/2014/09/memory-part-6-optimizing-the-fifo-and-stack-allocators/
======
userbinator
It's great to be able to make memory allocators faster, but it's also wise to
consider the possibility that it's the applications that use them which are
where the inefficiency is - and ironically it's the applications that do the
most unnecessary allocation/deallocation that would benefit the most from a
faster memory allocator.

To paraphrase a common adage, the most efficient way to allocate memory is to
not do it at all. It probably helps reduce the chance of use-after-free bugs
too.
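A minimal sketch of that idea (the function names are hypothetical, made up for illustration): hoisting a single allocation out of a hot loop instead of paying malloc()/free() on every iteration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Naive version: one malloc/free pair per iteration. */
size_t process_naive(const char *src, int iterations)
{
    size_t total = 0;
    for (int i = 0; i < iterations; i++) {
        char *buf = malloc(strlen(src) + 1); /* allocation in the hot loop */
        strcpy(buf, src);
        total += strlen(buf);
        free(buf);                           /* ...and a free per iteration */
    }
    return total;
}

/* Reuse version: the single allocation is hoisted out of the loop. */
size_t process_reuse(const char *src, int iterations)
{
    size_t total = 0;
    char *buf = malloc(strlen(src) + 1);     /* one allocation, reused */
    for (int i = 0; i < iterations; i++) {
        strcpy(buf, src);
        total += strlen(buf);
    }
    free(buf);
    return total;
}
```

Both compute the same result; the second just never touches the allocator inside the loop.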

~~~
yason
What would be an unnecessary allocation? A program generally does something
with the allocated memory and at that point it becomes a necessary allocation.

Specifically, it makes a _lot_ of sense to optimize _system_ allocators.

If you can't rely on a fast system allocator, you are inclined to recreate
your own pools and arenas on top of malloc() and write all the management
boilerplate yourself, with your own bugs on top. This also eats up time that
you would otherwise spend writing your application. Oh yeah, you're also
likely to end up slower than any of the allocators published since the 2000s
or so.

On the other hand, if the system has some modern slab-style allocator that is
cache-aware and does automatic pooling of similarly sized objects, you get all
that basically for free by calling malloc() and free() in a naive and
"unnecessary" way, very much repeatedly. Well, applications need to manage
their memory somehow, hence the allocations and deallocations.

Optimizing the system memory allocator pays off as long as you never see the
allocator hogging too many cycles in your profiler. If you can get away with
lots of malloc(), free(), and whatnot because a smart allocator ideally turns
those into mere pointer bumps, _you win_.

Custom memory management is generally useful in some highly optimized loops
where you just can't pay the cost of a random bookkeeping round when calling
malloc() or free(), or in cases where you can spend some memory to save some
time. Then you might want to manage your own pool so that you can guarantee
all operations are O(1). Alternatively, your program might benefit from a
pattern where all the memory is allocated sequentially and never freed until
the very end of the operation. Processing one request or running one cycle of
an operation might be examples.
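The allocate-sequentially, free-all-at-once pattern is what an arena (bump) allocator gives you. A minimal sketch, with hypothetical names and a fixed-capacity arena that never grows:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal arena: every allocation is a pointer bump, and everything
 * is released at once when the request/cycle is done. */
typedef struct {
    char  *base;
    size_t used;
    size_t size;
} arena_t;

int arena_init(arena_t *a, size_t size)
{
    a->base = malloc(size);
    a->used = 0;
    a->size = size;
    return a->base ? 0 : -1;
}

void *arena_alloc(arena_t *a, size_t n)
{
    n = (n + 7) & ~(size_t)7;      /* round up to keep 8-byte alignment */
    if (a->used + n > a->size)
        return NULL;               /* fixed capacity in this sketch */
    void *p = a->base + a->used;   /* O(1): just bump the offset */
    a->used += n;
    return p;
}

void arena_release(arena_t *a)
{
    a->used = 0;                   /* "free" everything in one O(1) step */
}
```

There is no per-object free at all, which is exactly why this only fits workloads where all the allocations die together.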

~~~
millstone
A C++ program that copies around lots of temporary std::strings is an example
of unnecessary allocation.

The thing about optimizing allocators is that you very quickly run out of
ways to make them faster without a space tradeoff. Notice that the first
optimization detailed in the article was to retain a 'free page' instead of
returning it to the kernel: this speeds up allocation and deallocation, but
increases memory usage. And if retaining one page makes it faster, why not
two, or three, or fifty?
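The retained-free-page idea can be sketched in a few lines. This is an illustration, not the article's actual code: the names are made up, the cache is one page deep, and malloc()/free() stand in for the real mmap()/munmap() calls.

```c
#include <assert.h>
#include <stdlib.h>

#define FAKE_PAGE_SIZE 4096  /* hypothetical page size for the sketch */

/* One-deep page cache: keep the last freed page around instead of
 * returning it, so a free followed by an alloc skips the slow path. */
static void *cached_page;

void *page_alloc(void)
{
    if (cached_page) {
        void *p = cached_page;       /* fast path: hand back the retained page */
        cached_page = NULL;
        return p;
    }
    return malloc(FAKE_PAGE_SIZE);   /* slow path: really allocate */
}

void page_free(void *p)
{
    if (!cached_page) {
        cached_page = p;             /* retain instead of releasing */
        return;
    }
    free(p);                         /* cache already full: really release */
}
```

The space/speed tradeoff is visible right in the code: the cached page is memory the program holds even when it needs nothing, and making the cache two or fifty pages deep just scales that cost up.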

The same is true for that slab-style allocator. The basic idea is to separate
memory into separate pools of different sizes, and only allocate from the pool
for the given size. If that pool is full, enlarge it, even if other pools have
enough space to satisfy the allocation. That's wasted memory!
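One size class of such a slab-style scheme can be sketched as a free list of equal-sized blocks (hypothetical names; a real slab allocator keeps one pool like this per size class and adds locking, headers, and reclamation):

```c
#include <assert.h>
#include <stdlib.h>

#define OBJ_SIZE   64   /* this pool's size class */
#define POOL_OBJS  32   /* blocks carved out per refill */

typedef struct free_node { struct free_node *next; } free_node_t;

static free_node_t *free_list;

/* Enlarge this pool only -- even if other pools have room to spare,
 * which is exactly the wasted memory the comment above describes.
 * The chunk is never handed back to the system in this sketch. */
static void pool_refill(void)
{
    char *chunk = malloc((size_t)OBJ_SIZE * POOL_OBJS);
    for (int i = 0; i < POOL_OBJS; i++) {
        free_node_t *n = (free_node_t *)(chunk + i * OBJ_SIZE);
        n->next = free_list;         /* push each fresh block on the list */
        free_list = n;
    }
}

void *pool_alloc(void)
{
    if (!free_list)
        pool_refill();
    free_node_t *n = free_list;      /* O(1) pop */
    free_list = n->next;
    return n;
}

void pool_free(void *p)
{
    free_node_t *n = p;              /* O(1) push */
    n->next = free_list;
    free_list = n;
}
```

Allocation and deallocation are both a couple of pointer moves, but every pool hoards its own slack space.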

So we have to balance speed against memory usage. The right balance is domain-
specific: that Java app running on a monster server can afford to waste lots
of memory to speed up allocations, while the allocator on my iPhone needs to
be much more mindful of its space overhead.

------
chris_wot
Interesting article. I checked out [http://intersec.com](http://intersec.com)
to see what they do, but I got redirected to
[https://www-preprod.intersec.com/en/](https://www-preprod.intersec.com/en/)

Seems a bit odd?

