If you don't pass the size, the allocation subsystem has to track the size somehow, typically by either storing the size in a header or partitioning space into fixed-size buckets and doing address arithmetic. This makes the runtime more complex, and often requires more runtime storage space.
If your API instead accepts a size parameter, you can ignore it and still use these approaches, but it also opens up other possibilities that require less complexity and runtime space by relying on the client to provide this information.
The way I've implemented it now was indeed to track the size in a small header above the allocation, but this was only present in debug mode. I only deal with simple allocators like a linear, pool, and normal heap allocator. I haven't found the need for something super complex yet.