It's not related to your English or your ability to express yourself.
Anyway, thanks for the effort
Personally I think it just doesn't matter. 50k lines of code is 50k lines of code. Whether it is in one file or 10 or 100, only changes whether you jump around via a file tree or other editor tools.
There are some practical problems when a file gets too huge though, like not being able to view it on GitHub!
Java's new collectors, ZGC and Shenandoah, are state of the art, perhaps the best GCs in the world. There are collectors with lower latency and collectors with higher throughput, but nothing out there achieves both the way Java's new collectors do.
It's a large part of why you hear that Java "performs better in production" than Go and C# even though those languages are faster in microbenchmarks. Java doesn't have value types and generates a ton of garbage, but the collectors are good enough that it's still competitive with those faster languages.
Some will tell you that Go's GC is state of the art but I disagree. Last time it came up I started a huge flamewar in the comments so I'll avoid comparing them :).
.NET has made a ton of performance progress in the last 4 years, and a lot of people's ideas about the performance differences are based on .NET from the pre-.NET Core days.
¹ Plus a number of algorithmic improvements and the ability to leverage specific CPU instructions that help as well.
I haven't found a comparison of JVM vs CLR GC, but I know the .NET team has intended to introduce a ZGC-style collector for a while.
The need is not as severe in CLR because it produces less garbage.
But CLR stop-the-world pauses are long, and that's definitely a problem for some use cases. Many companies focus on their P99 latency these days, and a 1-second GC pause will ruin those metrics. And stuff like HFT just can't be done in C# without turning the GC off.
Yes, good point, this is likely a big difference. Maybe Java records will help a lot.
Even if a collector has the same name across vendors, that doesn't mean the implementations are 1:1, as each vendor fine-tunes its own.
This also applies to .NET, actually, as there are other implementations besides .NET Framework/Core.
So it always boils down to profiling.
Isn't the Shenandoah GC supposed to be able to collect concurrently, without pauses?
I'm assuming you're trying to allocate a large array, since otherwise it's pretty difficult to have a large object in .NET.
One thing you can do is use Marshal.AllocHGlobal (essentially an unsafe malloc), which gives you a pointer (not a reference) to a chunk of unmanaged memory that you access via unsafe pointers. This is pretty messy.
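A minimal sketch of the Marshal.AllocHGlobal approach (the 1 MB size is arbitrary; Marshal.WriteByte/ReadByte are used here to avoid needing an `unsafe` compile flag, but a raw `byte*` works too):

```csharp
using System;
using System.Runtime.InteropServices;

// Allocate a 1 MB unmanaged buffer the GC never sees; freeing it is
// entirely up to you, and is deterministic.
IntPtr buffer = Marshal.AllocHGlobal(1024 * 1024);
byte value = 0;
try
{
    Marshal.WriteByte(buffer, 0, 42);   // or use a byte* in an unsafe block
    value = Marshal.ReadByte(buffer, 0);
    Console.WriteLine(value);           // 42
}
finally
{
    Marshal.FreeHGlobal(buffer);        // the deterministic "delete"
}
```

The try/finally matters: if anything throws while you hold the pointer, nothing else will ever free that memory.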
The other, more modern option is MemoryPool<T>, which gives you a manually managed "array" (a Span, .NET's version of a slice) of structs of type T, which you can manually release when you're done with it.
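A sketch using the System.Buffers types (Rent returns an IMemoryOwner<T>; disposing it releases the buffer back to the pool deterministically):

```csharp
using System;
using System.Buffers;

int tenth = 0;
// Rent a buffer of at least 1000 ints from the shared pool.
using (IMemoryOwner<int> owner = MemoryPool<int>.Shared.Rent(minBufferSize: 1000))
{
    Span<int> span = owner.Memory.Span;
    for (int i = 0; i < span.Length; i++)
        span[i] = i;
    tenth = span[10];
    Console.WriteLine(tenth); // 10
} // Dispose returns the memory to the pool right here
```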
The third option is to just allocate it using new, abandon all references, then create a different copy. The memory pressure of allocating a big object will probably trigger a GC. This is dangerous, since there can still be dangling references to the old object (possibly not even present in the source, but compiler-generated), leading you to retain both the old and the new memory.
If you said "Hey, I'm done with this" that's great, but .NET can't actually delete it until it has checked for itself that nothing else is using it. Otherwise you'd inject an error into the memory manager by deleting an object that's still in use.
So you can kinda fake a delete by releasing the last reference to an object and then forcing a garbage collection (in .NET you can call GC.Collect()). It's worth noting that the .NET docs specifically say the runtime might ignore your request, so even calling that is more of a suggestion than a guaranteed garbage collection.
I had instances where I knew only one of these large structures could fit in memory and had to call GC.Collect() before allocating a new one, as I would get an OutOfMemoryException before the garbage collector kicked in by itself.
You can do that with unmanaged objects, but it doesn't look like you can with managed objects (other than GC.Collect(), which I saw in other videos is not recommended by Microsoft).
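The "drop the last reference, then nudge the GC" pattern described above looks roughly like this (GC.Collect and GC.WaitForPendingFinalizers are real .NET APIs; the 10 MB size is just an arbitrary example):

```csharp
using System;

byte[] big = new byte[10 * 1024 * 1024];
big[0] = 1; // ... use the array ...

big = null;                    // release the last reference
GC.Collect();                  // request a full collection (a suggestion, not a command)
GC.WaitForPendingFinalizers(); // let any finalizers run before continuing

// Now allocate the replacement; the old memory has (probably) been reclaimed.
byte[] replacement = new byte[10 * 1024 * 1024];
Console.WriteLine(replacement.Length);
```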
There's probably something to be said for allowing such a thing in explicitly "unsafe" code, but not in the normal runtime. In fact, it might even be possible now by leveraging some of the existing unsafe bits in .NET.
- Extra space for every object to have a refcount
- Extra refcount bookkeeping every time you (re)assign a reference (possibly triggering cache thrashing/false sharing in some multithreaded scenarios), interlocked xchgs instead of plain movs, etc.
- Pointless if you're using bump allocators (they can't reuse the 'freed' memory until the next GC cycle compacts memory anyways), so you're forced to use more complicated allocator designs if you're to reuse said memory.
Cycles mean it still doesn't give you 100% deterministic object destruction either, so you want extra mechanisms for disposing unmanaged resources at controlled times, à la IDisposable, anyway.
When you're done with an object (it will almost certainly be an array), push it onto a stack; when you want to allocate, try to pop one off first before allocating fresh. Use a stack per thread, or locking, as appropriate; use multiple stacks bucketed by size if there's a lot of variance. Use suballocation to reduce reallocating, e.g. allocate 1MB, 2MB, 4MB and so on arrays and keep track of the length separately via a slice (array segment).
(ArrayPool in .NET encapsulates most of this for you these days, but it's a thing I implemented myself back in the .NET 2 days.)
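ArrayPool<T> is the built-in version of that hand-rolled stack-of-arrays; a minimal sketch (the 1,000,000 size is arbitrary):

```csharp
using System;
using System.Buffers;

int first = 0;
// Rent pops a cached array (possibly larger than requested);
// Return pushes it back for the next caller.
int[] buffer = ArrayPool<int>.Shared.Rent(1_000_000);
try
{
    // buffer.Length may exceed 1_000_000, so track the logical length
    // yourself, e.g. with a Span slice, as described above.
    Span<int> data = buffer.AsSpan(0, 1_000_000);
    data[0] = 7;
    first = data[0];
}
finally
{
    ArrayPool<int>.Shared.Return(buffer); // back onto the "stack" for reuse
}
Console.WriteLine(first);
```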
That's a GC bug surely?
Refcounting will be instantaneous, of course, but if you have a large heap with a lot of objects and the GC kicks in, you can get very long GC pauses (even if next to nothing will be reclaimed).
You only pay a price for live objects (as in, objects something still holds a reference to); the rest is free.
A generational GC like the one in .NET allocates memory up front, then hands out references/pointers into that already-allocated memory.
When the GC gets close to the end of the pre-allocated memory, it analyses all living objects (the objects it can reach from the stack and global variables, plus objects referenced by those objects) and copies them over to a different area of memory (generation).
The area the memory was copied from is now all garbage and can be overwritten by new objects.
I guess you could say that instead of collecting garbage, it de-fragments your living objects.
If the GC still doesn’t have enough memory, it will try to allocate more.
In any case, the cost of a generational garbage collector is associated with living objects, not dead ones, so manually deleting doesn't make sense.
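You can watch the promotion described above with the real GC.GetGeneration API; a small demo (the generation numbers shown are typical for the default .NET GC, not guaranteed):

```csharp
using System;

// A surviving object gets promoted to an older generation on each
// collection it lives through.
object survivor = new object();
Console.WriteLine(GC.GetGeneration(survivor)); // 0: freshly allocated

GC.Collect();                                  // survivor is copied and promoted
Console.WriteLine(GC.GetGeneration(survivor)); // typically 1 now

GC.Collect();
Console.WriteLine(GC.GetGeneration(survivor)); // typically 2, the oldest generation
```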
This is true in the context of the discussion, and in general for copying garbage collectors, but it's not always true. Copying is the most common way (and the current .NET way) to implement at least the Eden generation of generational collection, but it could be implemented in other ways.
But I find it most helpful to focus on the current context; otherwise I'd spend all day typing.
For instance, you could have a mark-and-sweep collector that would mark everything and then first sweep just the most recently created arena, and only do a full sweep if not enough space was freed. It wouldn't be perfectly generational unless it was also compacting, but the youngest arena might be a decent heuristic. Or, for the cost of one pointer in every object header, the GC could keep a singly linked list of the youngest generation.
I don't think it's a great idea, but you can do a non-moving generational GC if you want to interact with C/C++ without forcing pinning/un-pinning of objects (or forcing C/C++ to only interact with GC'd objects via pinned handles).
I believe objects with destructors (finalizers) can keep an object alive for an extra collection phase, but if that's the case it can easily be solved with a using block before allocating the next object.
A System.Drawing.Image holds operating system resources (GDI+) and these resources are released when the Image instance is disposed. If the instance isn't explicitly disposed then the finalizer will do it but this only happens when the garbage collector collects the Image instance.
Allocating many Image instances without disposing them might exhaust the available resources (Windows bitmap handles or whatever) but the garbage collector doesn't see any memory pressure and does not perform any collection so the finalizer does not dispose Image instances that are no longer used.
The garbage collector has an API (GC.AddMemoryPressure) where an object that has unmanaged resources can signal that it's consuming additional memory to inform the garbage collector's decision of when to perform a collection.
It's true that in that case it makes sense to "delete" an object. But in that case you can either use a using block or manually call Dispose, right?
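A hypothetical wrapper (the NativeBuffer class is invented for illustration) tying the two points above together: `using` gives deterministic cleanup via Dispose, and GC.AddMemoryPressure tells the collector about memory it can't see:

```csharp
using System;
using System.Runtime.InteropServices;

using (var buf = new NativeBuffer(10 * 1024 * 1024))
{
    // ... use buf.Pointer for unmanaged work ...
} // Dispose runs right here, no waiting for a finalizer

sealed class NativeBuffer : IDisposable
{
    private readonly long _size;
    public IntPtr Pointer { get; private set; }

    public NativeBuffer(long size)
    {
        _size = size;
        Pointer = Marshal.AllocHGlobal((IntPtr)size);
        GC.AddMemoryPressure(size); // "I'm holding this much unmanaged memory"
    }

    public void Dispose()
    {
        if (Pointer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(Pointer);
            GC.RemoveMemoryPressure(_size); // pressure must be removed symmetrically
            Pointer = IntPtr.Zero;
        }
    }
}
```

A production version would also add a finalizer as a safety net for callers who forget the using block, as System.Drawing.Image does.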
For unmanaged objects I am surprised it would keep them around for a while, since Dispose is also the means by which you release any lock on a file or a connection. If you don't execute it straight away, you potentially create bugs.
I think the reason Microsoft tells people not to call GC.Collect() is that it interferes with the optimisations and heuristics the garbage collector maintains to decide when to do a collection. But I must say I didn't notice any abnormal behaviour when I did. I would only do it if I absolutely had to.
Then you can allocate and deallocate at will.
Or just use a struct.
This feature is enabled in the official builds though.
Isn't that a relatively young, not-so-mature interface, designed with a specific GC design in mind?
To start, GC abbreviates garbage collection.
Going way back, some programming languages have permitted dynamic storage allocation: a programmer using such a language could, during execution of the program, ask for storage, that is, bytes in main memory, to be allocated, i.e., made available for use, and later free that storage. E.g., early versions of the programming language Fortran did not offer dynamic storage allocation, but some programmers would implement their own, say, in a Fortran array; then, for pointers to the allocated storage, just use a subscript on the array name. The array might be in storage called COMMON, which to the linkage editor was external, thus permitting all parts of the program to use the dynamic storage. The programming language PL/I had the dynamic storage classes AUTOMATIC, BASED, and CONTROLLED. The programming language C has storage allocation via the function malloc and freeing via free.
As a first cut, you can intuitively think of garbage collection (GC) as automated dynamic storage freeing.
In the case of the original post (OP) of this thread, what is going on is that the .NET languages, C#, Visual Basic (VB), F#, etc., can, e.g., in a function, allocate storage (say, with the VB statement ReDim), use that storage, let flow of control leave that function with the storage still allocated, and then have garbage collection automatically notice when that storage can never be used again and free it, i.e., make it available again for allocation and use. In addition, the code behind some programming language features needs dynamic storage and may rely on GC for the freeing.
The broad idea of garbage collection is old and in several programming languages goes back decades. E.g., in PL/I, AUTOMATIC gave automatic storage freeing.
Why should .NET implement garbage collection; that is, why bother? Because otherwise programmers sometimes forget to free storage themselves, resulting in allocated storage growing until it is too large. One of the old examples was from handling exceptional conditions: in some cases the code that got control did not have the data to know what storage should be freed.
GC has some challenges:
First, in a rich language, it can be hard to know what storage can safely be freed, so there can be bugs in GC implementations.
Second, GC takes time, and in some situations, that is, in some programs, it takes too much time and results in, say, noticeably slower response times. One place where GC tends to be unwelcome is real-time programming, where you want the software to respond within a few milliseconds to external events that occur at unpredictable times.
One of the main ideas for GC implementation is reference counting, where the compiler inserts extra code that, for each appropriate instance of allocated storage, keeps track, say, with just a count, of essentially how many variables in the code (for each thread of execution) might reference the storage. Then, when the reference count of such an instance reaches zero, the storage is freed.
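A toy sketch of that reference-counting idea (this is not how the .NET GC works; the RefCounted class is invented for illustration): the storage is freed the instant the count reaches zero, with no separate collection pass.

```csharp
using System;
using System.Threading;

var shared = new RefCounted();
shared.AddRef();   // a second variable now references the object
shared.Release();  // count drops to 1, object stays alive
shared.Release();  // count drops to 0, "freed" happens immediately

sealed class RefCounted
{
    private int _count = 1; // the creator holds the first reference

    public void AddRef() => Interlocked.Increment(ref _count);

    public void Release()
    {
        // Interlocked keeps the count correct across threads; this is the
        // per-assignment bookkeeping cost refcounting is criticised for.
        if (Interlocked.Decrement(ref _count) == 0)
            Console.WriteLine("freed"); // really: return the storage here
    }
}
```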
Even today people do the same thing in languages like Java and Rust as workarounds for performance or semantic constraints of the environment while still nominally obeying language semantics. I assume the same phenomenon is true in the C# universe.
When you bring this up many users of those languages are quick to explain why array indices are safer than managing raw memory addresses from outside the language's object model, but let's not go there ;)
Pointers require hardware support for the same purpose, which Oracle, ARM, Microsoft, Apple, and Google are putting money into as a means to fix C and C++.
I myself didn't know anything about this person, so I poked through their blog a bit. They are clearly marketing their education business (and I am not their target audience) but also seem to know what they're talking about. More to the point: their material appears to be sincerely helpful to intermediate (and even advanced) .NET devs. This relatively short post on unsafe array access in C#, although not especially deep, does at least dive into the runtime/IL internals enough to give a clear picture of what the unsafe code is actually doing.
I haven't watched the video either, but they are creating good-faith tutorials, they are obviously capable of understanding the .NET GC, and there's no reason to suspect they would change tack and BS their way through a new video series.
Don't spend your time on anything you don't want to, but if you want to learn about .NET GC then either research the author or research other deep dives into .NET GC.
To research the author, you could:
Go to the About page on their blog...
You could also look them up in Stack Overflow. https://stackoverflow.com/users/2894974/konrad-kokosa
You could watch the first 10 minutes where he addresses this.
* David Fowler (ASP.NET Core creator)
* Andy Gocke (Lead developer for the CLR)
* Jared Parsons (Lead developer for C# Compiler team)
* Miguel de Icaza
Presumably if he wasn't saying anything worth listening to, they would have unfollowed him by now.
Miguel de Icaza is the creator of Gnome and Mono.
Select his name, right-click, "search in Google".
Or spend an hour listening to this person. Who knows, his videos might have some bits of important information.