
A Story Of realloc (And Laziness) - janerik
http://blog.httrack.com/blog/2014/04/05/a-story-of-realloc-and-laziness/
======
optimiz3
Code in the article for realloc is dangerous and wrong:

    
    
      void *realloc(void *ptr, size_t size) {
        void *nptr = malloc(size);
        if (nptr == NULL) {
          free(ptr);
          return NULL;
        }
        memcpy(nptr, ptr, size); // KABOOM
        free(ptr);
        return nptr;
      }
    

Line marked KABOOM copies $DEST_BYTE_COUNT, rather than $SOURCE_BYTE_COUNT.

Say you want to realloc a 1 byte buffer to a 4 byte buffer - you just copied 4
bytes from a 1 byte buffer which means you're reading 3 bytes from
0xDEADBEEF/0xBADF000D/segfault land.

EDIT: Also, this is why the ENTIRE PREMISE of implementing your own
reallocator speced to just the realloc prototype doesn't make much sense. You
simply don't know the size of the original data with just a C heap pointer as
this is not standardized AFAIK.
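
To make the size problem concrete, here is a minimal sketch (not the article's code) of a replacement that is only correct because the caller passes the old size explicitly. The name `realloc_sized` and its extra `old_size` parameter are hypothetical; the standard realloc prototype has no such argument, which is exactly the point above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: old_size must be supplied by the caller,
 * because a bare heap pointer does not carry its own length. */
void *realloc_sized(void *ptr, size_t old_size, size_t new_size) {
    assert(ptr != NULL || old_size == 0);

    void *nptr = malloc(new_size);
    if (nptr == NULL) {
        return NULL;              /* ptr is left untouched on failure */
    }
    if (ptr != NULL) {
        /* Copy only the bytes that exist in BOTH buffers. */
        size_t n = old_size < new_size ? old_size : new_size;
        memcpy(nptr, ptr, n);
        free(ptr);
    }
    return nptr;
}
```

Growing a 1 byte buffer to 4 bytes with this version copies exactly 1 byte, so nothing is read past the end of the source.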

~~~
greenyoda
_"Also, this is why the ENTIRE PREMISE of implementing your own reallocator
speced to just the realloc prototype doesn't make much sense."_

If you're reimplementing realloc() it's pretty easy to know the size of the
allocated regions - you just need to store the size somewhere when you
allocate a block. One common method is to allocate N extra bytes of memory
whenever you do malloc() to hold the block header and return a pointer to
(block_address + N) to the user. When you then want to realloc() a block, just
look in the block header (N bytes before the user's pointer) for the size.
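
A rough sketch of the header technique described above, layered on top of the system malloc for brevity. The `hdr_` names and the `block_header` union are made up for illustration, and the union padding is only an assumption about what is needed to keep the user pointer aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Header stored just before the user's pointer. The extra union members
 * only pad the header so the user pointer stays suitably aligned. */
typedef union {
    size_t size;
    long double pad_ld;
    long long pad_ll;
    void *pad_ptr;
} block_header;

void *hdr_malloc(size_t size) {
    if (size > SIZE_MAX - sizeof(block_header)) {
        return NULL;                          /* header + size would overflow */
    }
    block_header *h = malloc(sizeof(block_header) + size);
    if (h == NULL) {
        return NULL;
    }
    h->size = size;                           /* remember the user-visible size */
    return h + 1;                             /* block_address + N */
}

void hdr_free(void *ptr) {
    if (ptr != NULL) {
        free((block_header *)ptr - 1);        /* step back to the real block */
    }
}

void *hdr_realloc(void *ptr, size_t size) {
    if (ptr == NULL) {
        return hdr_malloc(size);
    }
    block_header *old = (block_header *)ptr - 1;
    void *nptr = hdr_malloc(size);
    if (nptr == NULL) {
        return NULL;                          /* old block stays valid */
    }
    size_t n = old->size < size ? old->size : size;
    memcpy(nptr, ptr, n);                     /* copy min(old, new) bytes */
    hdr_free(ptr);
    return nptr;
}
```

Pointers from `hdr_malloc` must only ever go to `hdr_realloc` and `hdr_free`; mixing them with the plain libc calls would hand free() an address it never returned.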

The block header can store other useful stuff, like debugging information. I
once implemented a memory manager for debugging that could generate a list of
all leaked blocks at the end of the program with the file names and line
numbers where they were allocated.

~~~
ANTSANTS
That would require either replacing malloc as well, or programming to the
hairy details of your system's libc (i.e. knowing how and where it lays out the
buffer metadata). The point is not that either is impossible, but that you
can't replace realloc without doing one or the other.

------
CJefferson
I have also found people often underestimate realloc (but have never done the
same level of investigation to find out just how clever it is!)

On several occasions I have wanted to use mmap, mremap, and friends more often
to do fancy things like copy-on-write memory. However, I always find this
whole area depressingly poorly documented, and hard to do (because if you mess
up a copy-on-write call, it just turns into a copy it seems, with the same
result but less performance).

While it's good realloc is clever, I find it increasingly embarrassing how
badly memory allocation in C (and even worse, C++, which doesn't even really
have realloc, as most C++ types can't be bitwise moved) maps to what operating
systems efficiently support.

~~~
plorkyeran
C++ _really_ wants a realloc variant that extends an allocation if it can be
extended without a copy, and leaves the allocation unchanged if it can't. The
annoying thing is that there's no good reason why this can't exist beyond that
the STL allocator interface happens not to have it.

~~~
bodyfour
jemalloc's non-standard interface gives you some of what you want, especially
xallocx() [http://www.canonware.com/download/jemalloc/jemalloc-latest/d...](http://www.canonware.com/download/jemalloc/jemalloc-latest/doc/jemalloc.html)

There have been C++ templates written that use jemalloc-specific calls; for
instance see Folly from facebook. I haven't taken a close look, but I know
they do some jemalloc-specific code:
[https://github.com/facebook/folly/tree/master/folly](https://github.com/facebook/folly/tree/master/folly)

The other allocation-related thing that C++ really wants (and could benefit C
as well) is "sized deallocation". Most of the time you know the exact
size of the memory you allocated. If you could pass that to free() the
allocator could save some work determining it. In the case of C++ the compiler
can often do this on your behalf for many "delete" calls (at least in the
cases where it knows the _exact_ type). Google did an implementation of this
idea and got good results. They submitted a proposal to the standards body but
I don't know if there is any recent activity. I hope it does happen though:
[http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n353...](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3536.html)

~~~
plorkyeran
Folly does take advantage of jemalloc to expand allocations in-place when
possible, but afaik it doesn't do the more extreme optimization mentioned in
the article where pages are moved to a different virtual address without
actually paging into memory.

Sized deallocation made it into C++14.

~~~
bodyfour
Since 2.1, jemalloc does support using mremap to do large realloc()'s,
although it seems to be off by default. You need "./configure --enable-mremap"
to get it.

That's good news about sized deallocation, I hadn't noticed that there is an
updated "N3778" proposal which apparently was accepted. I still haven't seen
the dlmalloc work to support that show up in the main svn branch.

------
asveikau
This bothers me so much:

    
    
        buffer = realloc(buffer, capa);
    

Yeah, 'cause when it fails we didn't need the old buffer anyway... Might as
well leak it.
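
For comparison, a minimal sketch of the usual fix: keep the old pointer until realloc has succeeded. The helper name `grow_buffer` is hypothetical and `capa` is assumed to be non-zero.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: resizes *buffer to capa bytes (capa > 0).
 * Returns 0 on success, -1 on failure. On failure *buffer is unchanged
 * and still owned by the caller, so nothing leaks. */
int grow_buffer(char **buffer, size_t capa) {
    assert(buffer != NULL);

    char *tmp = realloc(*buffer, capa);
    if (tmp == NULL) {
        return -1;        /* old block is still valid; caller decides what to do */
    }
    *buffer = tmp;        /* only overwrite the pointer once we know it worked */
    return 0;
}
```

The caller can then free the old buffer, retry with a smaller size, or report the error, instead of losing the only reference to the block.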

~~~
rfrey
Serious question from a guy made soft by garbage collection: how frequent is
memory allocation failure nowadays, with large memories and virtual memory?
Were I to guess from my state of ignorance I'd think that if allocs began to
fail, there was no recovery anyhow... so leaking in this case would be one
leak right before a forced quit.

Wrong? Are there lots of ways allocation can fail besides low memory
conditions?

~~~
fprawn
Memory allocation failures are virtually non-existent in modern desktop
computers. Good practice is to not test return values from malloc, new, etc.

Memory can be allocated beyond RAM size, so by the time a failure occurs your
program really should crash and return its resources.

Embedded systems have fewer resources and some will not have virtual memory
and so the situation will be different. But unless you know better, the best
practice is still to not check the return from allocators. Running out of
memory in a program intended for an embedded platform should be considered a
bug.

~~~
fprawn
I'll clarify this. I'm not saying you shouldn't ever check return values,
that's obviously not the right thing to do. And of course there are exceptions
to the general rule. If you're allocating a large chunk of memory and there's
a reasonable expectation that it could fail, that should be reported, of
course.

In the general case, however, if allocating 100 bytes fails, reporting that
error is also likely to fail. An actual memory allocation failure on a modern
computer running a modern OS is a very rare and very bad situation. It's
rarely recoverable.

It's not bad to handle allocation failures, but in the vast majority of cases
it's very unreasonable to do so. You can write code for it if you want, have
fun.

And just to be completely clear, I am ONLY talking about calls to malloc, new,
realloc, etc. NOT to OS pools or anything like that. Obviously, if you
allocate a 4 MB buffer for something (or the OS does for you), you expect that
you might run out. This is ONLY in regards to calls to lower level heap
allocators.

I don't think you'll find any experienced programmer recommending that you
always check the return from malloc. That's completely absurd. There are
always exceptions to the rule, however.

~~~
asveikau
> In the general case, however, if allocating 100 bytes fails, reporting that
> error is also likely to fail. An actual memory allocation failure on a
> modern computer running a modern OS is a very rare and very bad situation.
> It's rarely recoverable.

I call BS on this. First of all, it's not the 100 byte allocation that is
likely to fail; chances are it's going to be bigger than 100 bytes and the 100
byte allocation will succeed. (Though that is not 100% either.) Second, the
thing you're going to do in response to an allocation failure? You're going to
unwind the stack, which will probably lead to some temporary buffers being
freed. That already gets you more space to work with. (It's also untrue that
you can't report errors without allocating memory but that's a whole other
story...)
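
As a small illustration of what "unwind and free the temporaries" can look like in C, here is a sketch using the common goto-cleanup idiom. Everything in it is hypothetical: `copy_twice`, the `alloc_fn` parameter (injected only so the failure path can be exercised), and the `always_fail` stub.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef void *(*alloc_fn)(size_t);

/* Test stub: an allocator that is permanently out of memory. */
void *always_fail(size_t n) {
    (void)n;
    return NULL;
}

/* Builds a buffer holding src twice. Returns 0 and sets *out on success,
 * returns -1 and leaves *out untouched if any allocation fails. */
int copy_twice(alloc_fn alloc, const char *src, size_t n, char **out) {
    int rc = -1;
    char *scratch = NULL;
    char *result = NULL;

    assert(alloc != NULL && src != NULL && out != NULL);
    if (n == 0 || n > SIZE_MAX / 2) {
        return -1;
    }

    scratch = alloc(n);
    if (scratch == NULL) goto cleanup;
    result = alloc(2 * n);
    if (result == NULL) goto cleanup;

    memcpy(scratch, src, n);
    memcpy(result, scratch, n);
    memcpy(result + n, scratch, n);

    *out = result;
    result = NULL;        /* ownership moved to the caller */
    rc = 0;

cleanup:
    /* On failure this releases whatever was allocated so far, which is
     * what hands memory back to the rest of the program. */
    free(result);
    free(scratch);
    return rc;
}
```

Note that the error path itself allocates nothing: it only frees and returns a status code.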

I suspected when I wrote in this thread that I'd see some handwavy nonsense
about how it's impossible to cleanly recover from OOM, but the fact is I've
witnessed it happening. I think some people would just rather tear down the
entire process than have to think about handling errors, and they make up
these falsities about how there's no way to do it in order to self justify...
Although, when I think back to a time in which I shared your attitudes, I
think the real problem was that I hadn't yet seen it being done well.

~~~
fprawn
If you have time, can you expound on this? Is there, perhaps, an open source
project that handles NULL returns from malloc in this way you could point me
to?

~~~
asveikau
My first instinct is to say look at something kernel-related. If an allocation
fails, taking down the entire system is usually not an option (or not a good
one anyway). Searching [http://lxr.linux.no/](http://lxr.linux.no/) for
"kmalloc" you see a lot of callers handling failure.

------
ctz
The realloc implementation in this blog is incorrect: the passed in pointer
must not be freed if realloc is called with a non-zero length and returns
NULL. This will cause a double free in correct callers.

As someone else pointed out, the example call of realloc is also incorrect.

edit: also, malloc is incorrect for three reasons: 1) sbrk doesn't return NULL
on failure, 2) a large size_t length will cause a contraction in the heap
segment rather than an allocation, and 3) sbrk doesn't return a pointer
aligned in any particular way, whereas malloc must return a pointer suitably
aligned for all types.
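
A hedged sketch of an sbrk-based bump allocator that addresses the three points above. It is not the blog's code: `bump_alloc` and `BUMP_ALIGN` are made-up names, the 16 byte alignment is an assumption about the target platform, and there is no free() at all.

```c
#define _DEFAULT_SOURCE   /* glibc: expose sbrk() even under strict -std=c99/c11 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Assumption: 16 bytes is enough alignment for every type on the target. */
#define BUMP_ALIGN ((uintptr_t)16)

void *bump_alloc(size_t size) {
    /* (2) Reject sizes that would turn negative as intptr_t and
     * shrink the heap instead of growing it. */
    if (size > (size_t)INTPTR_MAX - (BUMP_ALIGN - 1)) {
        return NULL;
    }

    /* Over-allocate so the result can be rounded up to BUMP_ALIGN. */
    void *base = sbrk((intptr_t)(size + (BUMP_ALIGN - 1)));

    /* (1) sbrk signals failure with (void *)-1, not NULL. */
    if (base == (void *)-1) {
        return NULL;
    }

    /* (3) Round up to the next multiple of BUMP_ALIGN. */
    uintptr_t aligned = ((uintptr_t)base + (BUMP_ALIGN - 1)) & ~(BUMP_ALIGN - 1);
    assert(aligned % BUMP_ALIGN == 0);
    return (void *)aligned;
}
```

This wastes up to 15 bytes per call and is only meant to show the checks; a real allocator would also track block sizes so memory can be reused.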

~~~
xroche
I fixed the double free. I must admit that the code was typed as I wrote the
blog entry, and is horribly wrong :)

------
kabdib
I fixed a crippling bug on another platform that was taking down whole
servers, because someone was depending on a clever realloc to behave well.

This is implementation coupling at its worst. Don't do it.

~~~
xroche
Yes and no. The real error here would be to realloc without any geometric
progression IMHO, i.e. reallocating one more byte each time, which would
behave well on Linux (except the libc call cost of course) but not on other
implementations (such as some versions of Microsoft's MSVCRT). Assuming realloc
has no catastrophic performance impact is not something too daring.

------
picomancer
This is really neat. Somehow I always assumed realloc() copied stuff instead
of using the page table.

But say you have a 4K page size. You malloc() in turn a 2K object, a 256K
object, and another 2K object, ending up with 2K, 256K, 2K in memory. Then
your 256K is not aligned on a page boundary. If you realloc() the 256K it has
to move since it's surrounded by two objects. When you do that, you'll wind up
with the two pages on the end being mapped to multiple addresses. Which is
actually just fine...Interesting...

~~~
nhaehnle
The libc memory allocator does not simply hand out memory contiguously. In
your example, the 256K block will end up being 4K aligned.

In fact, that's what the article already explains: the large alloc will just
end up being passed through to the kernel, which only deals at page
granularity.

~~~
whoopdedo
What the article revealed to me is that there is no guarantee a contiguous
block of allocated virtual memory will be backed by contiguous physical
memory. In hindsight, that should be obvious.

But what does this mean for locality? Will I be thrashing the cache if I use
realloc frequently? Do I even have the promise that malloc will return
unfragmented memory?

~~~
nhaehnle
_Do I even have the promise that malloc will return unfragmented memory?_

What do you mean by this? malloc returns memory that is contiguous in the
virtual address space. It may not be contiguous in the physical address space,
but that should be irrelevant for cache behavior.

_Will I be thrashing the cache if I use realloc frequently?_

I suppose. But if you use realloc, you should anyway ensure that you realloc
geometrically growing chunks of memory (e.g., whenever you need a new buffer,
you _multiply_ its size by a constant factor like 1.2 instead of just adding
an element at a time). As a result, realloc() should be infrequent enough that
it normally doesn't matter.
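
A minimal sketch of geometric growth, for illustration. The `bytebuf` type is hypothetical, and the factor of 2 and the initial capacity of 16 are arbitrary choices (a smaller factor such as the 1.2 mentioned above works the same way, with more reallocations and less slack).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    char *data;
    size_t len;       /* bytes in use */
    size_t cap;       /* bytes allocated */
    size_t reallocs;  /* how many times realloc was called */
} bytebuf;

/* Appends one byte, doubling the capacity when full.
 * Returns 0 on success, -1 on failure (the buffer stays valid). */
int bytebuf_push(bytebuf *b, char c) {
    assert(b != NULL);

    if (b->len == b->cap) {
        if (b->cap > SIZE_MAX / 2) {
            return -1;                        /* doubling would overflow */
        }
        size_t ncap = b->cap ? b->cap * 2 : 16;
        char *tmp = realloc(b->data, ncap);
        if (tmp == NULL) {
            return -1;
        }
        b->data = tmp;
        b->cap = ncap;
        b->reallocs++;
    }
    b->data[b->len++] = c;
    return 0;
}
```

Starting from a zero-initialized `bytebuf`, appending 1000 bytes goes through capacities 16, 32, ..., 1024, which is 7 calls to realloc rather than 1000.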

------
crackerz
And this is why OpenSource is awesome.

~~~
__david__
Agreed. Not sure why people are downvoting you. It blows my mind every time I
think, "I wonder how _<some program>_ works?" and I'm able to just "apt-get
source some_program" and check it out.

Working with Linux and having the source (and the ability to change it) for
the entire stack all the way down to and including the kernel is liberating. As a
programmer it feels like the entire world is open to me.

I guess that's GNU's dream brought to life, really.

------
mjcohen
In the original #define, the parameter is lower case "c" and the expansion
uses upper case "C".

