> If you make the array extremely large, your program will still compile just fine.
It may compile, but it might not link. With very large static arrays, in the link step, you might get something like this:
my_program.o: In function `my_function':
my_program.c:(.text+0xf38): relocation truncated to fit: R_X86_64_PC32 against `.bss'
It's due to the linker assuming statically allocated things can be reached with a 32-bit (4-byte) offset -- the default small code model on x86_64, which is where I encountered this. Using malloc to allocate the array will fix this.
That being said, using malloc once to allocate a very large array gets you the same on-demand allocation of backing RAM as described in the article on Linux, assuming overcommit is enabled.
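For concreteness, a minimal sketch of the two variants (the 8 GiB size is made up; the static line is left commented because, with the default small code model, it may fail to link as described above):

    #include <cstddef>
    #include <cstdio>
    #include <cstdlib>

    // static char big[8ull * 1024 * 1024 * 1024];  // ~8 GiB in .bss: may fail to link

    int main()
    {
        const std::size_t n = 8ull * 1024 * 1024 * 1024;   // 8 GiB
        char *big = static_cast<char *>(std::malloc(n));   // address space only (with overcommit)
        if (!big) { std::fprintf(stderr, "malloc failed\n"); return 1; }
        big[0] = 1;                                        // first touch faults in a single page
        std::free(big);
        return 0;
    }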
I don't know about other Unixes, but Linux allows the sysadmin to decide if overcommit is allowed or not. I believe Windows does as well, but the default is to not overcommit.
Windows does not allow overcommit. The commit limit is the total size of RAM plus the size of the page file; attempting to commit more than that will fail.
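A hedged Windows sketch of what that means in practice: reserving address space is cheap, but only committed bytes count against the commit limit (the 1 GiB figure is arbitrary):

    #include <windows.h>
    #include <cstdio>

    int main()
    {
        const SIZE_T size = 1ull << 30;  // reserve 1 GiB of address space
        void *p = VirtualAlloc(nullptr, size, MEM_RESERVE, PAGE_NOACCESS);
        if (!p) { std::printf("reserve failed\n"); return 1; }

        // Commit only the first 64 KiB; this is what gets charged against
        // RAM + page file. Committing the whole gigabyte could fail.
        if (!VirtualAlloc(p, 64 * 1024, MEM_COMMIT, PAGE_READWRITE)) {
            std::printf("commit failed\n");
            return 1;
        }
        static_cast<char *>(p)[0] = 1;

        VirtualFree(p, 0, MEM_RELEASE);
        return 0;
    }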
This behaviour is platform-specific, unfortunately. For instance, I'm pretty sure that Emscripten-compiled WASM running in browsers will fail if the array doesn't fit into the pre-allocated INITIAL_MEMORY size (even when ALLOW_MEMORY_GROWTH is enabled) - not sure about other WASM runtime environments though. Similar problems probably exist on embedded platforms without an MMU.
I’ve tinkered around with Wasm some but am certainly no expert. Were you referring to the initial memory that is allocated from the Wasm code and/or JS, etc.? Or is this something set by the Wasm runtime?
I haven’t run it myself, but I’ve heard others mention that you can probably get several hundred MB to a GB or so, but shouldn’t count on getting near the 4GB limit.
Nonetheless, that’s plenty for my use case and random hacking.
I'm not sure if it's a WASM requirement or specific to some WASM runtimes or compilers. At least in Emscripten you have an INITIAL_MEMORY linker option which describes the initial size of the WASM heap. The program, its static data and the stack need to fit into this initial size. From that point on, any dynamically allocated data is fine because that will grow the WASM heap as needed (up to 4 GB for wasm32).
The main disadvantage of this approach is that to free memory, you must terminate the program. This is fine for a program which runs for a second, but is unusable for long-running processes and servers where you want to free memory after finishing any memory-intensive task so that everything else that your machine is running concurrently gets memory too.
That’s not a problem for cases like a dispatcher that forks off a child per request and the memory intensive work is done in child space. In fact it can result in admirably simple and stable software. Things have probably changed but that used to be how telco software managed trunks.
madvise(MADV_FREE) (on *BSD, and on Linux since 4.5) lets the kernel drop or keep the pages in the given range at its discretion, while madvise(MADV_DONTNEED) on Linux drops them immediately; either way the range stays mapped, and reclaimed anonymous pages read back as zero-filled.
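A minimal sketch of that pattern (Linux flavour; the 1 GiB size is arbitrary):

    #include <cstddef>
    #include <cstdio>
    #include <sys/mman.h>

    int main()
    {
        const std::size_t size = 1ull << 30;  // 1 GiB of address space
        void *p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        static_cast<char *>(p)[0] = 1;  // touching a page faults in backing RAM

        // Hand the backing pages back without unmapping: MADV_DONTNEED drops
        // them now (later reads see zeroes); MADV_FREE, where available, lets
        // the kernel reclaim lazily under memory pressure.
        if (madvise(p, size, MADV_DONTNEED) != 0) perror("madvise");
        return 0;
    }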
This is also fine for any long-running processes and servers where the "vector" in question is not expected to shrink its allocation, which I would guess is the main use of vectors. Shrinking a vector's allocation is a niche use, with some finicky APIs, that most programmers never needed or touched.
It is elegant in that the eviction only happens if memory pressure warrants it. If the vector happens to be small enough, nothing special happens.
And you save on having to write some kind of allocator for it.
I share a dislike for std::vector type automatically reallocating arrays -- because of the "reallocating" part. I like preallocated chunks and stable pointers.
To a degree I agree with the premise of the post: we can preallocate huge chunks, and actual resource consumption will only happen on demand when the memory is first used. But note that there is still a resource whose allocation we can't delay: virtual address space, which obviously must be acquired up front. When trying to be compatible with 32-bit machines (less than 4GB of virtual address space available), preallocating huge chunks is not practical.
As to using static globals, I'm not sure it's the best idea. One can use mmap() (or VirtualAlloc() on Windows) to the same effect. That might be better from an architectural view, with regards to encapsulation and such, and allows a little more dynamism, like making multiple separate (but still large) allocations. It's possible to give debugger-visible names to these allocations as well, for example by way of the memfd_create() API.
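A hedged sketch of the memfd_create() variant (Linux-only; "my_big_array" is just an illustrative name that shows up in /proc/<pid>/maps):

    #define _GNU_SOURCE   // for memfd_create with glibc
    #include <cstddef>
    #include <cstdio>
    #include <sys/mman.h>
    #include <unistd.h>

    int main()
    {
        const std::size_t size = 1ull << 30;          // reserve 1 GiB
        int fd = memfd_create("my_big_array", 0);     // needs Linux >= 3.17 / glibc >= 2.27
        if (fd < 0) { perror("memfd_create"); return 1; }
        if (ftruncate(fd, size) != 0) { perror("ftruncate"); return 1; }

        char *arr = static_cast<char *>(
            mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
        if (arr == MAP_FAILED) { perror("mmap"); return 1; }

        arr[0] = 42;  // pages still get backed lazily on first touch
        return 0;
    }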
Somewhere I ran across the idea that programs should distinguish between errors and apologies in their messaging.
An error indicates the user's fault: they need to correct their input before retrying.
An apology indicates the program's fault: either logic for what had been incorrectly thought to be a rare^2 case is missing, or the program has only been configured to deal with N foos, and the user's input requires more. In the former case, the user needs to get a programmer fix before retrying, but in the latter, maybe they can just reconfigure and rebuild (even better, restart with different env vars?) then retry.
The relevant point of this is that when the user is likely to be another programmer, "reconfigure" can be as simple as editing a static allocation.
Isn't that basically the distinction in HTTP error codes? The 4xx range indicates the user did something wrong, the 5xx indicates the program did something wrong.
I like the idea of "apologies" -- sometimes the best thing a program can do is to try, and if it doesn't work then re-try, maybe with a small change guided by the user.
However, many programs require a bit more planning for how to deal with unfortunate situations, at least without entirely restarting. It's good not to rule out repairing an unfortunate situation from the start by baking in constants whose right values can't practically be assumed up front.
Better than editing a static allocation is if the program can cope with that situation automatically in some way. The next best thing is to have it runtime-configurable (e.g. command line arguments, no recompilation needed). I don't see a huge benefit to preferring global arrays over memory maps generally.
The annoying thing about "reallocating" is that it's mostly the price of having a flat address space (and not dealing with 128-bit-wide fat pointers). The virtual memory subsystem already juggles physical pages around to allow two non-overlapping virtual address ranges to transparently map to overlapping physical address ranges. Implementations of std::deque in the C++ standard library usually use a similar technique, but it's arguably a bit silly since it essentially pushes an MMU-like system on top of the existing hardware one!
I don't think our current memory architecture is fundamentally incompatible with a paging-based vector. Doubling the size of pointers is expensive. However, given current RAM sizes, our existing 48-bit effective pointer size is already large enough. It isn't enough to give every allocation a massive amount of virtual address space to grow into, but it should be more than enough to separate actual memory allocation from virtual address space allocation.
This would mean a more complicated malloc, where you ask for both an actual memory allocation and an amount of address space to be allocated. And it will still require you to have some sense of the maximum size you might need in a given vector.
Also, it forces memory allocations to be multiples of the page size, which isn't always ideal.
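A rough sketch of that split, built from mmap/mprotect on Linux (all names and the 4 KiB page assumption are made up for illustration):

    #include <cstddef>
    #include <sys/mman.h>

    // Reserve a large range of address space up front, commit pages as the
    // "vector" grows; data never moves, so pointers into it stay valid.
    struct Vec {
        char       *base      = nullptr;  // start of the reserved range
        std::size_t reserved  = 0;        // bytes of address space reserved
        std::size_t committed = 0;        // bytes actually backed by RAM
    };

    bool vec_reserve(Vec &v, std::size_t reserved)
    {
        // PROT_NONE: address space only, no readable/writable pages yet.
        void *p = mmap(nullptr, reserved, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (p == MAP_FAILED) return false;
        v.base = static_cast<char *>(p);
        v.reserved = reserved;
        return true;
    }

    bool vec_grow(Vec &v, std::size_t new_size)
    {
        const std::size_t page = 4096;  // assume 4 KiB pages for the sketch
        std::size_t commit = (new_size + page - 1) / page * page;
        if (commit > v.reserved) return false;   // out of reserved space
        if (commit > v.committed) {
            // Commit in place: already-written elements never move.
            if (mprotect(v.base, commit, PROT_READ | PROT_WRITE) != 0) return false;
            v.committed = commit;
        }
        return true;
    }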
> I share a dislike for std::vector type automatically reallocating arrays -- because of the "reallocating" part. I like preallocated chunks and stable pointers.
Request a capacity at creation for 99.9% of use cases. Still have realloc there to stop any sneaky buffer overflows.
std::deque doesn’t invalidate existing elements when new ones are added to either end of it. It’s not a panacea, and has its own set of tradeoffs, but if non-invalidation is a requirement for you it is available.
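A quick illustration of that guarantee: references/pointers to existing elements stay valid across push_back/push_front, even though iterators don't:

    #include <deque>
    #include <cassert>

    int main()
    {
        std::deque<int> d{1, 2, 3};
        int *p = &d[1];
        for (int i = 0; i < 10000; ++i) {
            d.push_back(i);   // inserting at either end...
            d.push_front(i);  // ...never moves existing elements
        }
        assert(*p == 2);      // the element p points to is still there
        return 0;
    }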
I made a std::vector-like wrapper[1] based on this principle; it was fun. I also focused on some of the other advantages you get, like never having to move data when you grow the array, and not invalidating pointers/iterators. Never really used it, though.
I'd really want a vector variant that is still on the heap but init-allocated. That's most of the use cases for a std::vector in my experience. It's trivial to write yourself, of course, but non-reserved vector usage is a very common perf problem.
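For what it's worth, a trivial version of what I mean (names made up): a helper that reserves the capacity at construction, so the hot path never reallocates:

    #include <cstddef>
    #include <vector>

    template <typename T>
    std::vector<T> make_reserved(std::size_t capacity)
    {
        std::vector<T> v;
        v.reserve(capacity);   // one allocation up front
        return v;
    }

    int main()
    {
        auto v = make_reserved<int>(1 << 20);   // room for ~1M ints
        for (int i = 0; i < 1000; ++i)
            v.push_back(i);                     // no reallocation on this path
        return 0;
    }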
Like many (or most) C++ features, std::array is a logical idea with terrible ergonomics, bordering on unusable. While they are potentially a little safer to use than plain C arrays (though compilers could easily add runtime checking for plain C arrays too), they are much less ergonomic, both in verbosity of declaration and in errors.
They can't replace an init-allocated std::vector either, because they have a size fixed at compile time.
And unique_ptr itself is TERRIBLE to use too. If RAII is your thing, then still, very often it's totally worth the boilerplate to wrap individual objects in type-specific explicit "Holder" classes instead of relying on std::unique_ptr, which has terrible ergonomics too, again both for declaration and use.
Oh, and if you're using a free-standing destructor function for the managed object (which is often a very good idea, because exposing the class definition in typical C++ style leads to TERRIBLE TERRIBLE compile times), you'll have to bake in a function pointer that is carried by the unique_ptr at runtime (I suspect that is because they couldn't make the function a templated parameter, since in C++ this leads to ambiguity if it is a static inline function declared in a header (terrible!)). So you're paying all the drawbacks of the templated type and get none of the benefits (except for saving a few lines of boilerplate).
TERRIBLE.
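To spell out what that looks like (Foo and foo_destroy are made-up stand-ins for an opaque type with a free-standing destroy function):

    #include <cstdio>
    #include <memory>

    struct Foo { int x; };

    void foo_destroy(Foo *f)
    {
        std::printf("destroying\n");
        delete f;
    }

    // The deleter is part of the type; the function pointer travels with
    // every instance instead of being baked in at compile time.
    using FooPtr = std::unique_ptr<Foo, void (*)(Foo *)>;

    int main()
    {
        FooPtr p(new Foo{42}, &foo_destroy);  // pointer carried per object
        return 0;
    }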
And now you propose combining these two terrible things? How would you use this?
#include <cstdio>
#include <memory>
#include <array>

struct Foo
{
    ~Foo()
    {
        printf("Bye!\n");
    }
};

int main()
{
    // std::unique_ptr<std::array<Foo, 42>> foo = std::make_unique<std::array<Foo, 42>>(); // my fingers hurt and my eyes bleed
    auto foo = std::make_unique<std::array<Foo, 42>>(); // fingers hurt a little less but my brain will melt on next compiler error
    foo->at(0); // valid
    // foo[0]; // doesn't compile: no operator[] on the unique_ptr, you'd need (*foo)[0]. TERRIBLE!
    return 0;
}
I wouldn't be so upset about your advice if I didn't know you often don't validate your claims and "references", but keep on suggesting and criticising existing practice that has evolved to certain points (and keeps evolving!) for a reason.
I validate my claims in production code, running in places like CERN and Nokia Networking infrastructure, and not being part of the anti-C++ brigade, rather the nuke-C one, since 1993.
Nope, I don't buy that this is a tried and tested pattern in your code.
> running in places like CERN and Nokia Networking infrastructure
Is there no bad code in these places? And, as an avid follower of your posts, how long ago has this been? 15 years? And now you're selling people ideas on HN instead?
You couldn't even be bothered to actually use a type alias to make the example simpler, or take into consideration build configurations, only to make the typical C rant.
A type alias doesn't save me from typing that stuff at least once, plus I have to come up with a name, which adds indirection (and a dependency that I have to manage!) that I have to recompute in my head every time I read it.
It's also (mostly?) transparent to the compiler, which will still spew the same unreadable errors and types at me as before.
The C++ modules that you've been advocating for years as a workaround for slow compile times still haven't arrived in my (and most everybody else's) practice, sorry.
I do use precompiled headers in some settings to work around slowness, which does help a lot with build performance but is again terrible (it breaks modularity; it's basically a choice between #include everything and a slow build, with little possibility for making a tradeoff).
Mutable global state isn't a problem, only shared mutable global state may be (for instance when the global state is writable from unexpected places in the code base or from other threads).
There's a lot of problems that can be solved just fine with an upfront defined fixed memory layout in a global static variable.
Dismissing globals is basically the same problem as sweeping generalizations like "goto considered harmful", "avoid for-loops" or "almost always auto" - such statements can serve as advice for newbies, but following the advice blindly can also lead to less elegant and less maintainable solutions.
Global mutable state is still terrible design unless actually necessary. Any function at all in your entire program can change the state of the thing which makes it very difficult to debug and reason about.
In all languages I'm aware of, you can limit the visibility of global variables to the current compilation unit / source file. It's still 'mutable global state' from the POV of functions in that file, but non-existent for anything outside the source file.
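For example, in C++ an anonymous namespace (or 'static' in C) gives the variable internal linkage, so it's mutable global state inside the file but invisible everywhere else (counter name made up):

    #include <cstdio>

    namespace {              // internal linkage: nothing outside this file sees it
        int g_request_count = 0;
    }

    void count_request()     // only functions in this translation unit can touch it
    {
        ++g_request_count;
    }

    int main()
    {
        count_request();
        count_request();
        std::printf("%d requests\n", g_request_count);
        return 0;
    }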
> (for instance when the global state is writable from unexpected places in the code base or from other threads)
...or from a function that calls itself, directly or indirectly. This automatically becomes possible basically whenever the function accesses a user-supplied callback. So you'll run into issues unless you have full control over your callees, or only access the global array in logically-atomic patterns.
(To give an ancient example: to enable function calls, PDP-8 programs allocated a word of static memory before the top of each subroutine to store the return address. However, if a subroutine tried to recurse into itself, then the old return address would be overwritten, and it would never be able to return properly. Supposedly this could result in very gnarly errors to debug.)
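A modern sketch of the same failure mode, with a global scratch buffer silently clobbered by a callback that re-enters the function (all names made up):

    #include <cstdio>

    static char g_scratch[64];

    void format_id(int id, void (*callback)(const char *))
    {
        std::snprintf(g_scratch, sizeof g_scratch, "id-%d", id);
        callback(g_scratch);            // whatever the callback does with g_scratch...
    }

    void nested(const char *)           // ...including re-entering format_id
    {
        format_id(999, [](const char *s) { std::printf("inner: %s\n", s); });
    }

    int main()
    {
        format_id(1, [](const char *outer) {
            nested(outer);
            std::printf("outer: %s\n", outer);  // prints "id-999", not "id-1"
        });
        return 0;
    }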
And did you know that if you put your entire program in the main function, you can use labels and goto to have something like exceptions and coroutines?
Zonk. This is a very astute observation. It's awesome behavior, but it's good by accident and luck. Therefore...
Programming languages should directly support a data structure which is guaranteed to have this behavior. Not one that just allocates a new array which is 4k bigger and then copies everything over, but one which grows by having the OS map another 4k block at the addresses past the old end of the array.
Looks cute on the surface, but if the program ever has to dump a core, it will eat up a fair bit of space, so root-cause analysis (RCA) will be a massive pain.
I made a small experiment with a 1-million-int array which has a single element written to at index 100000, and then forced a core dump by writing through a dereferenced int pointer initialized to 0.
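If I understood the setup right, it was roughly this (a reconstruction, with core dumps enabled via ulimit -c unlimited):

    static int big[1000000];       // ~4 MB of zero-initialized .bss, mostly untouched

    int main()
    {
        big[100000] = 7;           // fault in a single page
        int *volatile p = nullptr;
        *p = 1;                    // write through a null pointer -> SIGSEGV -> core dump
        return 0;
    }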
I feel the use of "vectors" and "arrays" together makes it a bit confusing but I guess the title "Static arrays are the best dynamic arrays" would be even more confusing...
"People should always be used when a collective noun referring to the entirety of a group or nation (i.e., "the French People") is called for. For references to groups of a specific or general number, either people or persons may be used. However, modern style guides tend to prefer people where earlier guides preferred persons, especially for countable groups. "
Genius. But I have an even better lifehack: if you compute the result of your program locally, you can replace the whole program with a single call to printf()!