How does it prevent such allocation? Can you not just mark the first read from a...

DSMan195276 · on Jan 8, 2016

Internally, an OS like Linux backs every page in a newly requested piece of memory with a single shared, read-only page of zeros. When you attempt to read from such a page, it works fine. When you attempt to write to it, the MMU raises a page fault, the OS puts a freshly allocated page where the zero page was, and then your program continues executing. Thus, it doesn't actually have to allocate any memory for you if you never write to the new pages.

But the OS/MMU doesn't distinguish between a regular write and a write of zero. Thus, if you manually zero every page you get (and thus write zeros to the read-only zero page), each write page-faults back to the OS, and the OS has to allocate a new page so that the write succeeds. If you hadn't zeroed the memory yourself, you would have gotten the same effect, a bunch of zeros in memory, without having to allocate any new pages for your process.

kqr · on Jan 8, 2016

Isn't that just saying that calloc is compatible with lazy allocation?

DSMan195276 · on Jan 8, 2016

Kinda. Reading my comment a second time, I'm not exactly happy with my description: while it's 'right', it's a very simplistic description that ignores some of the finer points.

Since malloc/calloc are generally used for smaller allocations, the chances that you can actually avoid allocating some of the pages you ask for are pretty slim, since a bunch of objects get placed into the same page (and thus writing to any of them will trigger a new page being allocated). There's also no guarantee that malloc doesn't keep headers or similar bookkeeping around your piece of memory, which makes the point moot: in that case, just using malloc triggers writes to at least the first page. So while calloc/malloc are kinda compatible with lazy allocation, you really shouldn't rely on it being a thing, and it probably won't matter.

It's worth understanding, but the chances it actually comes into play aren't huge. If your program does lots of small mallocs and frees, then it basically won't matter, because you won't be asking the kernel for more memory, just reusing what you already have.

If you care about taking advantage of lazy allocation for one reason or another, the bottom line is probably that you shouldn't be using malloc and calloc for it. Just use mmap directly and you'll have a better time: you get more control, a nice page-aligned address to start with, and certainty that the memory is untouched. malloc and calloc are good for general allocations, but using mmap takes out the guesswork when you have something big and specific you need to allocate.