I agree with most of this, but I’m not sure about tracking the size metadata becoming a required task for the caller.
The cost of storing the size of every allocation is relatively high, at least in the cases where it isn’t implied by the usage. Meanwhile the caching system for allocations can store it very efficiently: a 4KB block of 8-byte allocations will contain over 500 allocations that can all share their metadata. Once they’re handed out by the allocator their shared origin is obscured, so they’d need individual tracking.
I do acknowledge that when size is inherent to the context (new or allocating for a specific struct) then maybe an allocator that doesn’t track size could allow for some clever optimisations, though I’m doubtful it could overcome the loss of shared metadata, which is so much more efficient.
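To make the shared-metadata point concrete, here is a minimal sketch, assuming a 4 KiB page that is aligned to its own size and holds fixed-size slots. All names (`page_header`, `page_create`, `slot_alloc`, `slot_size_of`) are made up for illustration, and it only bump-allocates with no reuse. The slot size lives once in the page header and is recovered from any slot pointer by masking off the low bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE 4096u

/* One header per page: every slot in the page shares this metadata. */
struct page_header {
    uint32_t slot_size;  /* size of each slot in bytes */
    uint32_t next_free;  /* index of the next never-used slot */
};

/* Allocate a page aligned to PAGE_SIZE so the header can be found by masking. */
static struct page_header *page_create(uint32_t slot_size) {
    assert(slot_size >= 8 && slot_size % 8 == 0);
    struct page_header *p = aligned_alloc(PAGE_SIZE, PAGE_SIZE);
    if (!p) return NULL;
    p->slot_size = slot_size;
    p->next_free = 0;
    return p;
}

/* Number of slots that fit after the header. */
static uint32_t page_capacity(const struct page_header *p) {
    return (uint32_t)((PAGE_SIZE - sizeof *p) / p->slot_size);
}

/* Hand out the next slot, or NULL when the page is full. */
static void *slot_alloc(struct page_header *p) {
    if (p->next_free >= page_capacity(p)) return NULL;
    return (char *)p + sizeof *p + (size_t)p->next_free++ * p->slot_size;
}

/* Recover the size of any slot from its address alone. */
static size_t slot_size_of(const void *ptr) {
    uintptr_t base = (uintptr_t)ptr & ~(uintptr_t)(PAGE_SIZE - 1);
    return ((const struct page_header *)base)->slot_size;
}
```

With 8-byte slots and an 8-byte header this gives 511 slots per page, so the per-allocation cost of the size field is a small fraction of a byte. A caller holding only the pointer can't reproduce that unless it stores the size itself.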
The Rust allocator APIs require layout information on dealloc[0]. In the majority of cases it's a non-issue. Take Vec (like std::vector) for instance: it has pointer+length+capacity. It needs the capacity value so it knows when to realloc, so it can equally use it to dealloc with size information.
The only case I know of where this is an issue is when downsizing from Vec<T> to Box<[T]> (size optimisation for read-only arrays). Box<[T]> only stores the ptr+length, so the first step is calling shrink in order to make the capacity and length equal.
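A rough C analogue of that Vec to Box<[T]> step, as a sketch only: `int_vec`, `int_slice` and `sized_free` are hypothetical names, and `sized_free` is a stand-in that ignores the size (where C23's `free_sized` is available it could be used instead). The point is that once the capacity field is dropped, the allocation has to be shrunk so that the length alone describes the block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct int_vec   { int *ptr; size_t len, cap; };  /* growable: needs cap */
struct int_slice { int *ptr; size_t len; };       /* fixed: len only */

/* Hypothetical sized free; this stand-in simply ignores the size. */
static void sized_free(void *ptr, size_t size) {
    (void)size;
    free(ptr);
}

/* Append a value, doubling the capacity when full. Returns 0 on success. */
static int vec_push(struct int_vec *v, int value) {
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *p = realloc(v->ptr, new_cap * sizeof *p);
        if (!p) return -1;
        v->ptr = p;
        v->cap = new_cap;
    }
    v->ptr[v->len++] = value;
    return 0;
}

/* Convert to a slice: shrink first so that len equals the allocated size. */
static int vec_into_slice(struct int_vec *v, struct int_slice *out) {
    if (v->len == 0) {
        sized_free(v->ptr, v->cap * sizeof *v->ptr);
        out->ptr = NULL;
    } else if (v->len < v->cap) {
        int *p = realloc(v->ptr, v->len * sizeof *p);
        if (!p) return -1;
        out->ptr = p;
    } else {
        out->ptr = v->ptr;
    }
    out->len = v->len;
    v->ptr = NULL;
    v->len = v->cap = 0;
    return 0;
}

/* The slice can now report an exact size on free without storing cap. */
static void slice_free(struct int_slice s) {
    sized_free(s.ptr, s.len * sizeof *s.ptr);
}
```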
When it comes to type-erasure, it happens to work just fine. A type-erased heap pointer like Box<dyn Any> will have the size+align info stored in the static vtable. Yes, it's some extra space stored, but only in the binary's read-only data and not as part of any allocations.
On this topic, I've linked a short post on allocator ideas[1] by a Rust std-lib maintainer, which lists some of the other things we might add to Rust's upcoming (non-global) Allocator trait.
> The cost of storing the size of every allocation is relatively high
Thus it would be great if we don't push the burden to the allocator. It'll also need to store the size somewhere, adding to the cost for every allocation. Pay for only what you use.
Fortunately C++17 and C23 (free_sized) have already fixed this.
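For what it's worth, here is a sketch of why a caller-supplied size can help. It assumes a toy allocator with a few size classes and made-up names (`pool_alloc`, `pool_free_sized`); plain malloc stands in for whatever backing store a real allocator would carve blocks from. Because free is told the size, it can pick the right free list directly and the blocks need no header of their own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_CLASSES 4
static const size_t class_size[NUM_CLASSES] = {16, 32, 64, 128};

/* A freed block is reused as a list node, so no separate header is needed. */
struct free_node { struct free_node *next; };
static struct free_node *free_lists[NUM_CLASSES];

/* Map a request size to a size class index, or -1 if it is too large. */
static int class_for(size_t size) {
    for (int i = 0; i < NUM_CLASSES; i++)
        if (size <= class_size[i]) return i;
    return -1;
}

static void *pool_alloc(size_t size) {
    int c = class_for(size);
    if (c < 0) return malloc(size);      /* large requests bypass the pool */
    struct free_node *n = free_lists[c];
    if (n) {                             /* reuse a previously freed block */
        free_lists[c] = n->next;
        return n;
    }
    return malloc(class_size[c]);        /* stand-in for a real backing store */
}

/* The caller passes the size it asked for; no per-block lookup is needed. */
static void pool_free_sized(void *ptr, size_t size) {
    if (!ptr) return;
    int c = class_for(size);
    if (c < 0) { free(ptr); return; }
    struct free_node *n = ptr;
    n->next = free_lists[c];
    free_lists[c] = n;
}
```

The flip side is the one raised elsewhere in the thread: if the caller passes a size that maps to the wrong class, the block silently lands on the wrong list.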
The point is that they’re simple and direct. Replacing them with a big data structure and modern syntax is not as good as what people have done for decades: allocate a pool of blocks to your exact liking.
I went into the article expecting the standard criticism of any and all things C: stupid people can do stupid things with them, so nobody should be allowed to use them. He instead pointed out some legitimate gotchas with malloc and free and offered a reasonable solution. I can't say that I see any problems with his proposed solution - his proposed API looks as simple and direct as old-style malloc and free to me.
Malloc never fails on Linux in most configurations, problem #1. Overcommit and other "defaults" make this a non-starter.
If the API can't be trusted as accurate, the rest of the issues are not worth addressing. No matter how the 'api' is presented, it will still have the same problem.
Correct, it also fails on iOS the same way ( https://www.mikeash.com/pyblog/friday-qa-2010-12-17-custom-o... ). I haven't checked on more recent iOS versions, but it used to. I imagine that sooner rather than later Apple may merge the iOS and OS X kernels, who knows.
If the proposed solution won't work on three of the common operating systems (iOS, Android, Linux), adoption may be a problem.
> malloc can fail,
Yes, but only if you allocate more than the virtual address space of a process, or if you don't use an optimistic allocator (or you disable overcommit). I suggest you try this to find some REAL shady offenders that just abuse memory use, lodge bugs, and then record the response (or lack thereof).
I have tried this for a bunch of utilities, and have had a hard time convincing software developers that they don't need to do insane things.
> and should fail.
I absolutely agree. Fixing this problem means 'fixing and solving the problem for all software', which is a very hard problem to solve. If you do find a way, let me know.
I don't see the point of passing the size to a "free" function. I don't see how it could be used to speed up de-allocation. Additionally most usage would probably not want to keep the size around.
But I concur that realloc is mostly pointless. For code that wants to grow or shrink, I think it's much better for it to know the data block size. I think there's very little opportunity to happen to have free memory next to your allocation that can be "grown into", at least for slab-like allocators, so the growing room is minimal.
It's a bit difficult to unify all APIs because data will be needlessly passed around, when in most cases you don't care. Aligned allocation may also need a slightly different implementation anyway.
> it would be great if std::malloc() could return how big the allocated memory block actually is, so we can leverage any extra space we might have gotten “for free”.
I remember one of the Windows allocation functions doing this, but I believe they eliminated that behavior as it led to old applications that didn't handle it correctly crashing.
That is the danger with, say, adding a length value to free. Sometimes an off by one value will work fine, until someone tweaks how the allocator works.
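As a sketch of what "return how big the block actually is" might look like, here is a hypothetical `alloc_at_least` that rounds the request up to a 16-byte granule and reports the rounded size. The granule and the names are assumptions for illustration, not any real allocator's behavior. It also shows the hazard: the reported size depends on allocator policy, so code that comes to rely on a particular value can break when that policy changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Result of an allocation: the pointer plus the size actually reserved. */
struct sized_block {
    void  *ptr;
    size_t size;
};

/* Round up to the next multiple of 16 (an assumed allocator granule). */
static size_t round_up_16(size_t n) {
    return (n + 15u) & ~(size_t)15u;
}

/* Allocate at least min_size bytes and report how many were reserved.
   Returns {NULL, 0} on failure or for a zero-sized request. */
static struct sized_block alloc_at_least(size_t min_size) {
    struct sized_block b = { NULL, 0 };
    if (min_size == 0 || min_size > SIZE_MAX - 15u) return b;
    size_t actual = round_up_16(min_size);
    b.ptr = malloc(actual);
    if (b.ptr) b.size = actual;
    return b;
}
```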
I don't actually agree that it's a bad api, although it certainly has shortcomings. It's a low-level api to a library that is intended to be as slim as possible.
The sorts of things the author wants are indeed valuable and important, but also belong at a higher level of abstraction. The malloc() subsystem would even be a reasonable base to implement that on top of.
The problem described in the article is the opposite of what you say. malloc() is too high level - it hides the allocation size and simply returns a nice neat void*. This leads to implementation hacks like sneaking a capacity field into the allocation, and hiding this capacity from the user.
I've built such things on top of malloc/free. There's nothing about them that prevents that, but it requires more than just putting some sort of wrapper around malloc() and free() calls.
My own view on this is that a hardened allocator API should separate the functions of an allocation identifier/cookie and the actual pointer to the allocated memory:
where free() maybe also should take ptr, strictly for validation purposes.
A design like this encourages segregation of allocator metadata and the allocated memory, though it is possible to achieve such a design with the classic C malloc/free API.
However, a design like this is even more helpful against use-after-free because cookies can be unique for the lifetime of a program, whereas pointers naturally get reused when a block of memory is reallocated. So the traditional API can never be fully resilient against UAF, whereas an API like this can.
The underlying observation here is that malloc/free couples two different things (access to memory and identifying a previously made allocation) in a way that creates an API which is far less able to mitigate misuse in a safe way. IMO, these functions should be separated in new designs.
The downside with always-unique cookies would be that you'd necessarily need some lookup data structure on both alloc and free, which is gonna be pretty expensive, both in memory usage (at least 16-byte entries, multiplied by load factor) and performance (essentially guaranteed cache misses on both alloc and free, unless you have generational lookup tables). Or, worse, some tree structure if you don't want some allocations to have to resize the entire hashtable. That's two things from the GC world - generational allocations, and stop-the-world pauses vs even more significant overhead :)
What it solves is double-free, not use-after-free; potential corruption (even if not of the allocator state) is always gonna be a problem with any allocator that ever reuses memory.
Indeed, double-free, not UAF; I should know better than to write comments while sleep-deprived...
I suppose a cookie could be used in a "trust, but verify" approach if the free function takes both a pointer and a cookie. You would have the usual sidecar data next to the allocated region, but verify that the cookie matches. This would avoid the lookup issues you discuss.
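A minimal sketch of that "trust, but verify" variant, assuming a sidecar header in front of each block and a simple counter as the cookie source. The names `cookie_alloc` and `cookie_free` are hypothetical, and the whole thing sits on top of plain malloc. Free takes both the pointer and the cookie and refuses to release the block when they don't match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t alloc_cookie;

/* Sidecar header placed directly before the user's memory. The alignment
   keeps the user pointer suitably aligned for any object type. */
struct block_header {
    _Alignas(max_align_t) alloc_cookie cookie;
};

/* Cookies are never reused within a run; 0 is reserved for "released". */
static alloc_cookie next_cookie = 1;

/* Allocate size bytes; the matching cookie is written to *cookie_out. */
static void *cookie_alloc(size_t size, alloc_cookie *cookie_out) {
    if (size > SIZE_MAX - sizeof(struct block_header)) return NULL;
    struct block_header *h = malloc(sizeof *h + size);
    if (!h) return NULL;
    h->cookie = next_cookie++;
    *cookie_out = h->cookie;
    return h + 1;
}

/* Free only if the cookie matches the header. Returns 0 on success,
   -1 on mismatch (nothing is freed in that case). */
static int cookie_free(void *ptr, alloc_cookie cookie) {
    if (!ptr) return -1;
    struct block_header *h = (struct block_header *)ptr - 1;
    if (h->cookie != cookie) return -1;
    h->cookie = 0;  /* invalidate before handing the block back */
    free(h);
    return 0;
}
```

Because this sketch hands the header back to malloc as well, it can't safely inspect the header after a successful free. A real allocator would have to keep that metadata in memory it still owns in order to catch the second free reliably.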
SIMD is no mallot. It's absurd and wasteful to allocate page-aligned memory by creating an even larger malloc allocation, which will allocate extra pages via mmap underneath. Use mmap to directly allocate page-aligned memory. That's what it is designed for.
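For reference, a minimal sketch of getting page-aligned memory straight from mmap on a POSIX-like system. The wrapper names `map_pages` and `unmap_pages` are made up; `MAP_ANONYMOUS` is widely supported but, as far as I know, is not spelled that way on every platform.

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map len bytes of zero-filled, page-aligned, private memory.
   Returns NULL on failure. */
static void *map_pages(size_t len) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Unmap a region from map_pages. The caller must supply the length,
   since the kernel interface is itself a "sized free". */
static int unmap_pages(void *p, size_t len) {
    return munmap(p, len);
}
```

Note that munmap needs the length back, which is the same caller-tracks-the-size contract debated above.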
Most post-C languages fix it because it's a pretty well understood problem and the main complexity with fixing it in C is just backwards compatibility.