Nice, they are saying exactly the same thing as those pesky game developers.
"It's interesting that many games can afford a constant 10x interpretation overhead for scripts, but not a spikey 1% for garbage collection."
Which engine was it again that Nintendo, Microsoft and Google chose as a first-party option for their 3D APIs?
The anti-GC crowd in the games industry is no different from the ones who fought the adoption of C/Modula-2/Pascal over Assembly, and then fought the adoption of C++ and Objective-C over C.
Eventually they will suck it up when the major platform owners tell them it is time to move on.
Why is that surprising?
Games are basically about humans predicting things, and random spikes prevent that from happening in time-sensitive games. Beyond the gameplay implications, I suspect there's also something about jerkiness in movement that bugs human senses.
It's not a very satisfactory answer (and there are likely much better tradeoffs to be made), but given the 10x and 1% comparison (not entirely apples to apples though) the comment sounds a bit more interesting.
Performance-oriented code implies everything is on the stack and/or packed into large arrays, at which point you don't need a GC after all.
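To make "packed into large arrays" concrete, here's a tiny hypothetical Rust sketch (made-up names, not anyone's engine code): the hot data lives in one contiguous array, so the loop walks memory linearly and there are no per-object heap allocations for a collector to trace.

    // Hypothetical sketch: particles packed into one contiguous array instead
    // of being individually heap-allocated and traced by a collector.
    #[derive(Clone, Copy)]
    struct Particle {
        pos: [f32; 3],
        vel: [f32; 3],
    }

    // One flat array: cache-friendly iteration, no per-object allocation.
    fn step(particles: &mut [Particle], dt: f32) {
        for p in particles.iter_mut() {
            for i in 0..3 {
                p.pos[i] += p.vel[i] * dt;
            }
        }
    }

    fn main() {
        let mut particles =
            vec![Particle { pos: [0.0; 3], vel: [1.0, 0.0, 0.0] }; 1_000];
        step(&mut particles, 0.016);
        println!("{}", particles[0].pos[0]);
    }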
As the truism goes, if you require your software to be performant, you must first make Performance a Requirement.
In my mind performance is always a requirement (and a feature). We just stopped caring a long time ago, because it is easier to think about performance as a hardware or capacity problem, memory as a GC problem, and so on.
That's how I've been thinking about it with Haskell at least (lots of GC knobs, a manual performGC hook, compact regions for long-lived data, good FFI, as high-level as any scripting language you could hope for).
The point was, they said "do manual memory management" if you want speed.
It seems to me that this is where the GC “set and forget” model really shines, since otherwise you just have to do all that work manually using an allocation and a free, or a constructor and a destructor, or some similar pattern. Perhaps Rust has some clever answer for this?
The issue is not really the scope, the issue is how many owners the data needs to have. A variable that needs to live longer than a scope, but only has a single owner, can be directly returned from that scope. Once you have the need for multiple owners, the most straightforward answer is "reference count them," but it depends on your exact requirements.
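A minimal Rust sketch of both cases (hypothetical names, nothing from a real codebase): a single owner just moves out of the scope by being returned, and multiple owners fall back to reference counting with Rc, which is roughly the moral equivalent of C++'s std::shared_ptr.

    use std::rc::Rc;

    struct Widget {
        label: String,
    }

    // Single owner: the value outlives the scope simply by being returned (moved).
    // No GC, no explicit free; it is dropped when its one owner goes away.
    fn make_widget() -> Widget {
        Widget { label: "ok".to_string() }
    }

    // Multiple owners: fall back to reference counting.
    fn share_widget() -> (Rc<Widget>, Rc<Widget>) {
        let shared = Rc::new(make_widget());
        let other_owner = Rc::clone(&shared); // bumps the refcount
        (shared, other_owner) // freed when the last Rc is dropped
    }

    fn main() {
        let (a, b) = share_widget();
        println!("{} {} owners={}", a.label, b.label, Rc::strong_count(&a));
    }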
> A typical example here would be a UI framework, where memory has to be managed for the window and its widgets, and then the entire application is suspended until the next event.
GTK heavily uses reference counting. People are also investigating what a "rust-native" UI toolkit would look like; taking strong influences from ECSes. It's an open question if those architectures end up better than refcounts.
Similar to C++'s std::shared_ptr
If not, is there something about OCaml that makes this strategy more suitable than it is for other languages?
If not, is this a case of this being the best strategy they have the resources to implement, rather than the best possible strategy?
I'm pretty sure that would not have performed well without the aggressive prediction logic in modern processors.
Java 1's object accesses always read through an indirect pointer, but that went away in the name of performance, either when HotSpot was introduced or in the next round of GC improvements.
Indirect pointers, or Brooks pointers as they are called, were used in Shenandoah v1 so that an application thread performing a read did not have to move the object during the evacuation phase.
This strategy was removed in Shenandoah v2 for better throughput, so now both reads and writes by the application move the object during the evacuation phase.
ZGC has never used Brooks pointers.
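For anyone wondering what a Brooks pointer actually looks like, here is a toy Rust sketch (purely illustrative; Shenandoah does this on raw heap words inside the JVM, not with safe types): each object carries an extra forwarding word that initially points at the object itself, the collector redirects it when it evacuates the object, and every access goes through it.

    use std::cell::Cell;

    struct Obj {
        forward: Cell<*const Obj>, // the Brooks/forwarding pointer
        value: i32,
    }

    impl Obj {
        fn new(value: i32) -> Box<Obj> {
            let obj = Box::new(Obj { forward: Cell::new(std::ptr::null()), value });
            obj.forward.set(&*obj as *const Obj); // initially forwards to itself
            obj
        }
    }

    // The read barrier: chase the forwarding pointer before touching any field.
    fn read_value(obj: &Obj) -> i32 {
        unsafe { (*obj.forward.get()).value }
    }

    // "Evacuation": copy the object and leave a forwarding pointer behind,
    // so stale references still reach the live copy.
    fn evacuate(old: &Obj) -> Box<Obj> {
        let copy = Obj::new(read_value(old));
        old.forward.set(&*copy as *const Obj);
        copy
    }

    fn main() {
        let a = Obj::new(42);
        let _copy = evacuate(&a);
        assert_eq!(read_value(&a), 42); // old handle still resolves via the barrier
    }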
What hasn't been explored very well is how to formulate these solutions so mere mortals can comprehend how they work. Algorithm accessibility is, I believe, the limiting factor on building systems any bigger than the ones we have now. When there is one tricky bit in the code, you can get away with asking people to dive in and learn it. When there are 50? 100? Just figuring out the consequences of how those systems interact is a full-time job, let alone how they function internally.
Give me a SAT solver, or a parser, or half a dozen other things, that decomposes a problem the way a human would do it, just faster and with far more accuracy, and I could learn it in a week. Take all my Paxoses and replace them with Raft, then swing by and make a second pass on a few.
Let's take WebAssembly as an example. Its only reason for existence is convention. You need everyone to agree on an IR that is not only cross-platform and stable but also low level and allows execution of untrusted code within a sandbox.
If you look at competitors then it becomes obvious that they are not following these core principles. LLVM IR just isn't meant to be used in a browser. JVM bytecode just wasn't meant to be used in a browser. So what are we going to do? Use them anyway? That's how you get situations like the horribly insecure JVM plugin. You can restrict the JVM plugin to a subset that is secure enough for browsers and add additional opcodes for low level behaviors but then you are no longer using JVM bytecode. It's a completely new invention at that point but Oracle will still hunt you down.
The majority were not bytecode related, but were bugs in the interface to the outside world. There is no reason why WebAssembly is better in this regard, e.g. WebGL vs Java's graphics APIs. This mainly comes from corporate culture prioritizing security. If financial hardship or other stresses befall the browser maker, WebAssembly will probably give the same trouble as Java did.
The minority were related to the soundness of the JVM itself. Most of these have been fixed. These bugs are in general nasty, as the basics have been validated by a mathematical proof. A few are still there and very hard to fix, like locking system objects such as Thread.class. I think this was a learning experience for all secure VMs that follow it, and WebAssembly knew what to avoid because of Java. Only time will tell how well WebAssembly withstands the nasty ideas humanity throws at it.
In contrast, the "danger" of heuristics is that they fail to dig up an exceedingly clever combination of archaic package versions that technically fit the user's specified requirements. It's such a small problem that it might even be considered a feature, since said exceedingly clever combinations are likely to be the result of poor version definitions and unlikely to be what the user actually wants.
Of course, if the only people who can be persuaded to write package managers are people doing research in the subject, then I suppose letting them inflict their pet projects on us is one way to compensate them for an otherwise thankless task, and perhaps in that sense it's fair.
Typical malloc implementations today use slabs:
A variety of allocation-classes is defined; for example, 1B, 2B, 4B, 8B, ...
Each allocation-class is essentially its own independent heap.
Slabs are really good:
Allocation is fast: a few cycles to determine the slab, then pick the first available cell, done.
Compaction is easy: all cells have the same size!
And I repeat, all cells within an allocation-class have the same size! This means things like pin cards, etc...
Compared to the pointer-chasing inherent to a splay-tree... I do wonder.
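To make the size-class idea above concrete, here is a heavily simplified sketch in Rust (hypothetical, nothing like production jemalloc/tcmalloc internals): round the request up to a power-of-two class, and each class hands out fixed-size cells from its own free list.

    // Toy size-class ("slab") allocator: every class recycles cells of one
    // fixed size from its own free list. Real allocators are far more involved;
    // this only illustrates the shape of the idea.
    struct SizeClass {
        cell_size: usize,
        free_cells: Vec<Vec<u8>>, // recycled cells, all exactly cell_size bytes
    }

    struct SlabAllocator {
        classes: Vec<SizeClass>, // 1B, 2B, 4B, 8B, ... as described above
    }

    impl SlabAllocator {
        fn new(max_pow: u32) -> Self {
            let classes = (0..=max_pow)
                .map(|p| SizeClass { cell_size: 1 << p, free_cells: Vec::new() })
                .collect();
            SlabAllocator { classes }
        }

        // Picking the class is just "round up to the next power of two".
        fn class_index(size: usize) -> usize {
            size.next_power_of_two().trailing_zeros() as usize
        }

        fn alloc(&mut self, size: usize) -> Vec<u8> {
            let class = &mut self.classes[Self::class_index(size)];
            let cell_size = class.cell_size;
            class.free_cells.pop().unwrap_or_else(|| vec![0u8; cell_size])
        }

        fn free(&mut self, cell: Vec<u8>) {
            let idx = Self::class_index(cell.len());
            self.classes[idx].free_cells.push(cell);
        }
    }

    fn main() {
        let mut slabs = SlabAllocator::new(12); // classes up to 4 KiB
        let a = slabs.alloc(24);                // lands in the 32-byte class
        assert_eq!(a.len(), 32);
        slabs.free(a);
        let b = slabs.alloc(30);                // reuses that same 32-byte cell
        assert_eq!(b.len(), 32);
    }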