Perhaps I'm misunderstanding, but do many C programmers understand not only the current state of malloc at any given moment in their code but exactly how it works?
I think not.
A lot of the things you do in C++ to reduce memory management overhead are the same things you can do in Java, C#, and Go to reduce memory management overhead. That effort is neither special nor universal.
HLLs often have to be careful about using language features that unexpectedly create garbage, but in terms of actual management and collection it's not like ANY competent modern language is slow at it.
People often seem to neglect the fact that Java is still surprisingly fast despite spending lots of time on memory management just because many developers are so insensitive to how things alloc. Modern GC systems can manage allocs as well as deallocs, so with care from the programmer and the runtime authors you can reach the kind of performance people talk about as critical for "embedded systems" (even though in practice SO many things do not deliver on this promise in shipped products!).
Good programmers understand how malloc works. What, are you kidding, or am I misunderstanding?
Performance-oriented programmers do not use malloc very much. As you say, you can also try to avoid allocations in GC'd languages. The difference is that in a language like C you are actually in control of what happens. In a language that magically makes memory things happen, you can reduce allocations, but not in a particularly precise way -- you're following heuristics, but how do you know you got everything? Okay, you reduced your GC pause time and frequency, but how do you know GC pauses aren't still going to happen? Doesn't that depend on implementation details that are out of your control?
> even though in practice SO many things do not deliver on this promise in shipped products!
But, "in practice" is the thing that actually matters. Lots and lots of stuff is great according to someone's theory.
First of all, it's not like most mallocs DON'T have heuristics they're following. Without insight into what it wants to do it is equally as opaque to how Java or the CLR manages memory.
And your behavior absolutely can and does influence how much things are allocated, deallocated, and reused. If you think that the JVM cannot be tuned to that level, you're dead wrong and I can point to numerous projects written for virtual machines that reach levels of performance that are genuinely difficult to reach no matter your approach.
> Good programmers understand how malloc works.
"Good programmers know how their GC works. What, are you kidding, or am I misunderstanding?"
> But, "in practice" is the thing that actually matters.
"In practice" Kafka is the gold standard of pushing bits through distributed systems as fast as possible. "In practice" distributed concurrency systems (that are often the speed limit of anything you want to build on more than one computer, e.g., etcd, Zookeeper, Consul) are I/O limited long before their collected nature impacts their performance.
And if we can eventually liberate ourselves from operating systems that give priviledged status to C and C++, that ecosystem will diminish further because its performance benefits come at too high a cost, and are generally oversold anyways.
> "Good programmers know how their GC works. What, are you kidding, or am I misunderstanding?"
I think you are not understanding what I am saying.
You link your allocators into your code so you know what they are. You see the source code. You know exactly what they do. If you don't like exactly what they do, you change them to something different.
A garbage-collector, in almost all language systems, is a property of the runtime system. Its behavior depends on what particular platform you are running on. Even 'minor' point updates can substantially change the performance-related behavior of your program. Thus you are not really in control.
As for your other examples, apparently you're a web programmer (?) and in my experience it's just not very easy for me to communicate with web people about issues of software quality, responsiveness, etc, because they have completely different standards of what is "acceptable" or "good" (standards that I think are absurdly low, but it is what it is).
In my experience, most C/C++ devs know what malloc/free or new/delete does, but how? They don't care as long as it works and doesn't get in their way. Sure in larger applications, the allocator/deallocator can consume quite some time - but even then it rarely is the bottleneck.
I happen to have a more hands-on experience with allocators, I had to port one a long time ago, but in C or C++, I rarely knew how the one I was using was implemented (except for the one I ported). Seeing the source code? Sorry, that's not always available and even if it is, not too accessible - not that many devs actually ever look into the glibc code...
And linking your allocator? Most of the times you just use the default-one provided by your standard library - so that happens 'automagically' without most developers realizing this. I yet have to see a modern C or C++ app that specifically has to link it's own allocator before it could actually allocate memory. Most compilers take care of this.
For most stuff I do - I like gc's. In most real-world situations, they are rarely the bottleneck, most applications are I/O bound. For most stuff, a GC's benefits outweigh it's disadvantages by a huge margin. And if the gc could be become a bottleneck, you should have been aware of that up front, and maybe avoid something using a GC, although I'm not a fan of premature optimization.
This is unrelated to GC.
> The cost of having GC is that you pay with higher memory usage for doing GC in batches thanks to which you do not have to pay for single deallocations. So this performance advantage cannot be used in embedded space because there is no free memory for it, you would need to GC all the time which would kill the performance.
The same is true for C. You don't get to make frequent allocations for free in any language. You have to trade space for performance; in GC-land, the answer is tuning the collector. In Go, this is one knob: the GOGC env var.
For example, languages that use closures have to have very smart compilers or even innocuous functions can create implications for the working set size, which puts the allocator and deallocater under a lot more pressure.
And that's not even the most subtle problem you might run into! A common cause of memory constraints in Java 1.7 and earlier stemmed from subarrays of large arrays. Java made a fairly innocuous decision regarding methods like String.substring that ends up biting a lot of people later on, even as it is the right decision for a slightly different set of performance considerations.
A very well know, Aonix, now part of PTC, used to sell real time Java implementations for military scenarios like missile guidance systems.