.NET GC Internals mini-series (tooslowexception.com)
198 points by GordonS 3 months ago | hide | past | favorite | 74 comments

Is this the GC that’s contained in one single 50,000 loc file?

Is this somehow concatenated together in a build process, or is it actually written and maintained like that?

Originally it was written in Lisp and then translated using a tool. But I believe it's maintained like that.

I think this Lisp part was in this talk, but I'm not 100% sure what he said

"I'm not 100% sure what he said" - this makes me a little worried. What wasn't clear exactly? I'd hoped that the GC history from 48:50 covered "the Lisp part" well enough.

I've been jumping through the video to see what this is about, and I heard something about the lisp2cpp transpilation myth (which I've heard before)

It's not related to your English, your ability to express yourself, or anything like that.

Anyway, thanks for the effort

Lol this might be a common thing in MS language projects. I heard the TypeScript compiler is also a single, notoriously large file

When you get into more "low level" domains, huge single files are more common. Some feel there is a correctness justification, being able to follow the code from top to bottom rather than jumping around, see: http://number-none.com/blow/blog/programming/2014/09/26/carm...

Personally I think it just doesn't matter. 50k lines of code is 50k lines of code. Whether it is in one file or 10 or 100, only changes whether you jump around via a file tree or other editor tools.

There are some practical problems when you get too huge though, like not being able to view the thing in github!

Anyone have a good comparison of .NET vs Java GCs? I never had trouble with the former but Java has been terrible. I'm not sure if it's just my project or having the -Xmx memory cap.

.NET GC is nothing to write home about, pretty standard with a long stop-the-world phase. It's not super interesting so nobody raves about it.

Java's new collectors, ZGC and Shenandoah, are state of the art, perhaps the best GCs in the world. There are collectors with lower latency and collectors with higher throughput, but nothing out there achieves both like Java's new collectors do.

It's a large part of why you hear that Java "performs better in production" than Go and C# even though those languages are faster in micro benchmarks. Java doesn't have value types and generates a ton of garbage but the collectors are good enough that it's still competitive with these faster languages.

Some will tell you that Go's GC is state of the art but I disagree. Last time it came up I started a huge flamewar in the comments so I'll avoid comparing them :).

Citation needed. Overall .NET performance usually beats Java in benchmarks today, so if the GC is substantively less good, that shouldn't be possible. I've not seen GC comparisons specifically though, so if you know of some recent ones let me know.

.NET has made a ton of performance progress in the last 4 years, and a lot of people's ideas about the performance differences are based on .NET from the pre .NET Core days.

Many recent improvements in .NET performance are enabled by reducing or removing allocations in hot paths, so the GC has to do less work¹. The point was mostly that the .NET GC can get away with not being as advanced since it commonly has to do less work than in Java anyway. C# gives the developer more low-level control over what actually runs on the CPU in the end, so there's a slight shift in responsibility for performance from the JIT/GC towards the developer (although the latter two also improve, of course).


¹ Plus a number of algorithmic improvements and the ability to leverage specific CPU instructions that help as well.
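For readers who want to see that "developer-controlled allocation" point concretely, here's a minimal, hypothetical C# sketch (the type and method names are mine, just for illustration) that measures the garbage produced by a value type versus a reference type:

```csharp
using System;

struct PointStruct { public int X, Y; }  // value type: no heap allocation
class  PointClass  { public int X, Y; }  // reference type: heap-allocated

static class AllocDemo
{
    // Returns (bytes allocated while using structs, bytes allocated while using classes).
    public static (long StructBytes, long ClassBytes) Measure(int n)
    {
        var keep = new PointClass[n];  // holder array allocated *before* measuring

        long before = GC.GetAllocatedBytesForCurrentThread();
        long sum = 0;
        for (int i = 0; i < n; i++) { var p = new PointStruct { X = i }; sum += p.X; }
        long structBytes = GC.GetAllocatedBytesForCurrentThread() - before;

        before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < n; i++) keep[i] = new PointClass { X = i };
        long classBytes = GC.GetAllocatedBytesForCurrentThread() - before;

        return (structBytes, classBytes);
    }

    static void Main()
    {
        var (s, c) = Measure(1000);
        // The struct loop typically allocates nothing; each class instance is
        // roughly 24 bytes on 64-bit, so the class loop allocates ~24 KB.
        Console.WriteLine($"structs: {s} bytes, classes: {c} bytes");
    }
}
```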

Micro benchmarks are written to avoid creating garbage since code is faster that way. C# wins these comparisons easily.

I haven't found a comparison of JVM vs CLR GC but I know the .NET team has intended to introduce a ZGC style collector for a while.

The need is not as severe in CLR because it produces less garbage.

But CLR stop-the-world pauses are long and it's definitely a problem for some use cases. Many companies focus on their P99 latency these days, and a 1 second GC pause will ruin those metrics. And stuff like HFT just can't be done in C# without turning the GC off

> Java doesn't have value types and generates a ton of garbage

Yes good point, this is likely a big difference. Maybe Java records will help a lot.

I am quite interested in this also and can confirm it from an empirical standpoint - I never had a single episode of .NET GC trouble, but it was a usual occurrence on Java apps. We even had a three-month session on GC optimization directly with the guy working on it at Sun.

Although there are many valid comments here, I will just leave here a voice from the .NET GC architect herself - https://devblogs.microsoft.com/dotnet/how-to-evaluate-info-y....

Regarding Java GCs, it depends pretty much on which JVM implementation, and which GC is chosen from the plethora that each vendor packages.

Even if it has the same name as another vendor's, it doesn't mean it is 1:1, as each vendor fine-tunes their own implementation.

This also applies to .NET actually, as there are other implementations besides .NET Framework/Core.

So it always boils down to profiling.

I don't use Java but I was under the impression that it has the most sophisticated gc's available.

Isn't the Shenandoah GC supposed to be able to gc asynchronously without pauses?

ZGC is the state of the art

Thank you for all the comments, here's the link to the second episode if you are interested: https://www.youtube.com/watch?v=OXvT9f5PPbs

Please forgive my ignorance, but why can't I manually delete an object? Lots of times I know one object is pretty big and want to delete it when I am done with it, but I have to force the GC to run to clean it up.

Well, this is tricky.

I'm assuming you're trying to allocate a large array, since otherwise it's pretty difficult to have a large object in .NET.

One thing you can do is use Marshal.AllocHGlobal (essentially an unsafe malloc), which gives you a pointer (not a reference) to a chunk of unmanaged memory that you can access with unsafe pointers. This is pretty messy.

The other, more modern thing is using MemoryPool<T>, which rents you a manually managed buffer (exposed as Memory<T>/Span<T>, .NET's version of a slice) of structs of type T, which you can manually return after you're done with it.

The third option is to just allocate it using new and abandon all references, then create a different copy. The memory pressure of allocating a big object will probably trigger a GC. This is dangerous, since there can still be dangling references to the old object (ones that might not even be present in the source, but compiler generated), leading you to retain both the old and the new memory.
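To make the first option concrete, here's a minimal sketch of the AllocHGlobal approach (names are illustrative; must be compiled with unsafe code enabled). The memory lives entirely outside the GC heap and is freed exactly when you say so:

```csharp
using System;
using System.Runtime.InteropServices;

static class UnmanagedBufferDemo
{
    // Allocates a buffer outside the GC heap, fills it with squares,
    // sums them, and frees the buffer deterministically.
    public static unsafe long SumOfSquares(int count)
    {
        IntPtr buffer = Marshal.AllocHGlobal(count * sizeof(int));
        try
        {
            int* p = (int*)buffer;
            for (int i = 0; i < count; i++) p[i] = i * i;

            long sum = 0;
            for (int i = 0; i < count; i++) sum += p[i];
            return sum;
        }
        finally
        {
            Marshal.FreeHGlobal(buffer); // freed now, no GC involved
        }
    }

    static void Main()
    {
        Console.WriteLine(SumOfSquares(4)); // 0 + 1 + 4 + 9 = 14
    }
}
```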

Because they're memory objects that are managed by .Net.

If you said "Hey, I'm done with this" that's great, but .Net can't actually delete it until it's checked for itself that nothing else is using it. Otherwise you'll inject an error into the memory manager when you delete an object that's still in use.

So you can kinda fake a delete by releasing the last reference to an object and then forcing a garbage collection (in .NET you can call GC.Collect()). It's worth noting that the .NET docs specifically say the runtime might ignore your request to do a GC, so even calling that is more of a suggestion than a guaranteed garbage collection.
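Roughly, that "fake delete" looks like this (a sketch with illustrative names; note that parameterless GC.Collect() does force a blocking collection, while some overloads with GCCollectionMode.Optimized may decline to collect):

```csharp
using System;

static class FakeDeleteDemo
{
    public static long Run()
    {
        var big = new byte[100_000_000]; // ~100 MB, lands on the large object heap
        big[99] = 1;                     // pretend we used it

        big = null;                      // drop the last reference...

        // ...then request a collection. This is expensive: it stops the world
        // and does a full collection of all generations.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        // Heap usage after the collection; the 100 MB should be gone.
        return GC.GetTotalMemory(forceFullCollection: false);
    }

    static void Main() => Console.WriteLine(Run());
}
```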

But GC.Collect is quite expensive. I think the parent question was: I have a large object in memory, and I want to delete it now so that I can load another large object without running into memory limits. But I don't want to call GC.Collect, which will stop my whole application, interfere with the garbage collector's heuristics, and do all sorts of unnecessary steps.

I had instances where I knew only one of these large structures could fit in memory and had to call GC.Collect before allocating a new one, as I would get an OutOfMemoryException before the garbage collector would kick in by itself.

You can do that with unmanaged objects but it doesn't look like you can with managed objects (other than GC.Collect, which I saw in other videos is not recommended by Microsoft).

There's an inherent issue with doing that while still being safe: what if there's still a reference to your "big object" somewhere? The only way for the runtime to know for certain it's safe to delete the object is to effectively run the GC anyway. The alternative is that the object gets deleted without any checks and any references to it will now (probably) cause a crash - and it'll be a hard native crash rather than a .net exception since it's outright accessing invalid memory rather than just a "managed" null pointer.

There's probably something to be said for allowing such a thing in explicitly "unsafe" code, but not in the normal runtime. In fact, it might even be possible now by leveraging some of the existing unsafe bits in .net.

True, but all it takes to fix this is to add a check on the reference count of the object. You only need a full scan to handle circular references. You could have a GC.Collect(obj) which would collect this object and all of its dependencies, provided their reference counts have gone to zero, and otherwise do nothing until the next full garbage collection.

There are GCs that take this approach (refcount + sweeps to break cycles). It has tradeoffs however.

- Extra space for every object to have a refcount

- Extra refcount bookkeeping every time you (re)assign a reference (possibly triggering cache thrashing/false sharing in some multithreaded scenarios), xchgs instead of movs, etc.

- Pointless if you're using bump allocators (they can't reuse the 'freed' memory until the next GC cycle compacts memory anyways), so you're forced to use more complicated allocator designs if you're to reuse said memory.

Cycles mean it still doesn't give you 100% deterministic object destruction either, so you want extra mechanisms for disposing unmanaged resources at controlled times ala IDisposable anyway.
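For readers less familiar with .NET, that "controlled time" disposal mechanism is just IDisposable plus the using statement. A minimal sketch (names are mine):

```csharp
using System;
using System.IO;

static class UsingDemo
{
    // Writes and reads back a temp file. The `using` block guarantees the
    // StreamWriter's Dispose (and thus release of the OS file handle) runs
    // at the closing brace, regardless of when the GC collects the object.
    public static string RoundTrip()
    {
        string path = Path.GetTempFileName();
        try
        {
            using (var writer = new StreamWriter(path))
            {
                writer.WriteLine("hello");
            } // Dispose() runs here, deterministically

            return File.ReadAllText(path).Trim();
        }
        finally
        {
            File.Delete(path);
        }
    }

    static void Main() => Console.WriteLine(RoundTrip()); // prints "hello"
}
```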

There's no reference count on the object. That's not how garbage collection works in .NET, or Java for that matter.

They don't define which algorithms are to be used, and at least as far as Java is concerned, there are some implementations that experimented with reference-counting-based GCs.


But to get the reference count you’d need to run the entire mark phase.

If you have large objects which you know you want to deallocate, the easiest way to speed them up is to effectively do manual allocation on top of GC.

When you're done with an object (it will almost certainly be an array), push it onto a stack; when you want to allocate, try to pop one off first before allocating fresh. Use a stack per thread or locking as appropriate; use multiple stacks with bucketing by size if there's a lot of variance. Use suballocation to reduce reallocating - e.g. allocate 1MB, 2MB, 4MB and so on arrays and keep track of length separately via a slice (array segment).

(ArrayPool in .net encapsulates most of this for you these days, but it's a thing I implemented myself back in .net 2 days.)
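For reference, the ArrayPool version of that rent/return pattern looks something like this (a sketch; the class and method names are mine):

```csharp
using System;
using System.Buffers;

static class PoolDemo
{
    public static int SumFirst(int n)
    {
        // Rent may hand back a *larger* array than requested, so track the
        // logical length yourself (or wrap it in a Span/ArraySegment).
        int[] buffer = ArrayPool<int>.Shared.Rent(n);
        try
        {
            for (int i = 0; i < n; i++) buffer[i] = i + 1;
            int sum = 0;
            for (int i = 0; i < n; i++) sum += buffer[i];
            return sum;
        }
        finally
        {
            // Returning puts the array back in the pool for reuse,
            // instead of leaving it as garbage for the GC.
            ArrayPool<int>.Shared.Return(buffer);
        }
    }

    static void Main() => Console.WriteLine(SumFirst(4)); // 1+2+3+4 = 10
}
```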

> I had instances where I knew only one of these large structures could fit in memory and had to call gc.collect before allocating a new one, as I would get an outofmemory exception before the garbage collector would kick in by itself.

That's a GC bug surely?

In extreme cases you might be able to emulate arena allocation by spawning an external process: create the huge object, do some work on it, throw away the process. How viable this is depends on the programming language you are using. In Erlang it can work great, because GC is per process (no shared memory) and you can use green threads; in languages with a very crummy GC like Python it can also be worth considering[1], even with the cost of spawning a system process. I haven't kept up to date with .NET, but my guess would be it's generally not a very attractive strategy.

[1] Refcounting will be instantaneous of course, but if you have a large heap with a lot of objects and GC kicks in you can have very long gc pauses (even if next to nothing will be reclaimed).

With a GC like the one used in .NET, «deleting» an object is a noop. It wouldn’t give you any benefits over not deleting it.

You only pay a price for living things (as in, some object holding a reference to it), the rest is free.

I guess I could provide a bit more info.

A generational GC like the one in .NET allocates memory up-front, then hands out references/pointers from that already-allocated memory.

When the GC is getting close to the end of the pre-allocated memory, it will analyse all living objects (the objects it can reach from the stack and global variables, and objects referenced by those objects) and copies them over to a different area of memory (generation).

The area the memory was copied from is now all garbage, and can be overwritten by new objects.

I guess you could say that instead of collecting garbage, it de-fragments your living objects.

If the GC still doesn’t have enough memory, it will try to allocate more.

In any case. The cost of a generational garbage collector is associated with living objects, not dead ones, so manually deleting doesn’t make sense.

> The cost of a generational garbage collector is associated with living objects, not dead ones, so manually deleting doesn’t make sense.

This is true in the context of the discussion, and in general for copying garbage collectors, but it's not always true. Copying is the most common way (and the current .NET way) to implement at least the Eden generation of generational collection, but it could be implemented in other ways.

Just as a clarification, "Copying is the most common way (and the current .NET way)" is not the case - .NET GC is generational but it does not promote through generations by copying.

True. Go, for instance, does not use a generational garbage collector. I also believe that Java's ZGC is not generational, yet.

But I find it most helpful to instead focus on the current context, otherwise I'd be spending all day typing.

I'm talking about generational non-copying collectors (possible, but certainly not common). Examples of non-generational collectors are non sequiturs. You're talking about the diagonally opposite corner of the square diagram. (What's the name of those 4-part square diagrams? I forget the name of the guy they're named after.)

For instance, you could have a mark-and-sweep collector that would mark everything and then first sweep just the most recently created arena, and only do a full sweep if not enough space was freed. It wouldn't be perfectly generational unless it was also compacting, but the youngest arena might be a decent heuristic. Or, for the cost of one pointer in every object header, the GC could keep a singly linked list of the youngest generation.

I don't think it's a great idea, but you can do a non-moving generational GC if you want to interact with C/C++ without forcing pinning/un-pinning of objects (or forcing C/C++ to only interact with GC'd objects via pinned handles).

No I had instances where I deleted an object, wanted to allocate a new one, only one of these would fit in memory, and got an outofmemory exception because the GC didn't kick in between the two. So it is not equivalent.

The GC kicks in when allocating a new object and it decides it needs more memory. GC likely kicked in as you allocated the last object, and after the collection phase, still didn’t have enough memory and so failed.

I believe objects with destructors can keep an object alive for an extra collection phase, but I believe if that’s the case it can easily be solved with a using block before allocating the next object.

No, that wasn't the problem. You should be able to test it yourself, for instance by opening and dereferencing a lot of System.Drawing images but not disposing them. You will most likely get an OutOfMemoryException (unless the behavior changed in the last couple of years; I have not tested it on .NET Core). Unfortunately the GC doesn't immediately kick in when you have memory pressure, and you will get an OutOfMemoryException even though there is a lot of garbage ready to be collected (and in my case it was managed objects, not images).

I want to point out that disposing resources (using the IDisposable interface) does not free memory (unless the memory isn't managed, but then it's not related to garbage collection).

A System.Drawing.Image holds operating system resources (GDI+) and these resources are released when the Image instance is disposed. If the instance isn't explicitly disposed then the finalizer will do it but this only happens when the garbage collector collects the Image instance.

Allocating many Image instances without disposing them might exhaust the available resources (Windows bitmap handles or whatever) but the garbage collector doesn't see any memory pressure and does not perform any collection so the finalizer does not dispose Image instances that are no longer used.

The garbage collector has an API (GC.AddMemoryPressure) where an object that has unmanaged resources can signal that it's consuming additional memory to inform the garbage collector's decision of when to perform a collection.

Ideally such handles are also wrapped in SafeHandle classes.
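Here's a sketch of how such a wrapper might combine unmanaged allocation with AddMemoryPressure (NativeImage is a hypothetical name for illustration, not a real .NET type):

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper over a native pixel buffer: the managed object is
// tiny, so we tell the GC how many unmanaged bytes it keeps alive.
sealed class NativeImage : IDisposable
{
    private IntPtr _pixels;
    private readonly long _bytes;

    public bool IsDisposed => _pixels == IntPtr.Zero;

    public NativeImage(int width, int height)
    {
        _bytes = (long)width * height * 4; // 4 bytes per pixel
        _pixels = Marshal.AllocHGlobal((IntPtr)_bytes);
        GC.AddMemoryPressure(_bytes); // GC now weighs this in its heuristics
    }

    public void Dispose()
    {
        if (_pixels != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_pixels);
            GC.RemoveMemoryPressure(_bytes); // must mirror AddMemoryPressure
            _pixels = IntPtr.Zero;
            GC.SuppressFinalize(this);
        }
    }

    ~NativeImage() => Dispose(); // safety net if Dispose is never called
}
```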

Yeah, which is why I mentioned the thing about destructors. The first GC will schedule an object to be disposed/destroyed, but still keep it around for a bit.

It’s true that in that case it makes sense to «delete» an object. But in that case you can either use a using block or manually call Dispose, right?

For an unmanaged object you can use Dispose/using, but for a managed object, I am not aware of any way to explicitly delete it from memory once it has been dereferenced, other than calling GC.Collect.

For an unmanaged object I am surprised it would keep it in memory for a bit, since Dispose is also the means by which you release any lock on a file or a connection. If you don't execute it straight away, you potentially create bugs.

I think the reason why Microsoft was telling people not to call GC.Collect is that it interferes with the optimisations and heuristics that the garbage collector maintains to decide when to do a GC. But I must say I didn't notice any abnormal behaviour when I did. Still, I would only do it if I absolutely had to.

Allocate it via Marshal.AllocHGlobal, and use unsafe to place it there.

Then you can allocate and deallocate at will.

Or just use a struct.

What will deleting it achieve if the GC doesn't run?

The runtime can't guarantee no use after free if it also allows manual, unchecked free.

Why does .NET still have no interface for serious, custom GCs?

CoreCLR does have the ability to compile the GC as a DLL and then choose different GCs at runtime by loading different DLLs. Search for FEATURE_STANDALONE_GC in the code base.

This feature is enabled in the official builds though.

You meant "local GC" initiative?

Isn't that a relatively young, not-so-mature interface, designed with a specific GC design in mind?

It is. A few key assumptions are baked into the runtime, namely that the GC is a generational GC with contiguous regions and that the GC doesn't need a read barrier, among many other things. There's a bunch of assembly in the runtime that embeds some implementation details of the GC in a way that's hard to decouple. The API came about by untangling the GC and the rest of the runtime; it took a lot of work and the resulting API probably isn't what we'd choose if we were building a GC with it in mind from day 1, but the whole scheme of sideloading a GC works pretty well.

You may be interested in watching my talk about it at NDC - https://www.youtube.com/watch?v=zVbTmgbiZsA&t=23s

Maybe for some readers:

Likely GC abbreviates garbage collection.

Going way back, some programming languages have permitted dynamic storage allocation, that is, a programmer using that language could, during execution of the program, ask for storage (bytes in main memory) to be allocated, i.e., made available for use, and later free that storage.

E.g., early versions of the programming language Fortran did not offer dynamic storage allocation, but some programmers would implement their own, say, in a Fortran array; for pointers to the allocated storage, they would just use a subscript on the array name. The array might be in storage called COMMON, which to the linkage editor was external, thus permitting all parts of the program to use the dynamic storage.

The programming language PL/I had the dynamic storage classes AUTOMATIC, BASED, and CONTROLLED. The programming language C has storage allocation via the function malloc and freeing via free.

Well, first cut, intuitively can think of garbage collection (GC) as automated dynamic storage freeing.

In the case of the original post (OP) of this thread, what is going on is: in the .NET languages (C#, Visual Basic (VB), F#, etc.) a function can allocate storage (e.g., with the VB statement ReDim), use that storage, return with the storage still allocated, and then have garbage collection notice automatically when that storage cannot be used again and free it, i.e., make it available again for allocation and use. In addition, the code implementing some programming language features needs dynamic storage and might rely on GC for freeing.

The broad idea of garbage collection is old and in several programming languages goes back decades. E.g., in PL/I, AUTOMATIC gave automatic storage freeing.

Why should .NET implement garbage collection, that is, why bother? Because otherwise programmers sometimes forget to do the freeing themselves, resulting in allocated storage growing until it is too large. One of the old examples was from handling exceptional conditions: in some cases the code that got control did not have the data to know what storage should be freed.

GC has some challenges:

First, in a rich language, it can be hard to know what storage should be freed. So there can be some bugs in GC implementations.

Second, GC takes time, and in some situations, that is, in some programs, it takes too much time and results in, say, noticeably slower response time. One place where GC tends to be unwelcome is real-time programming, where you want the software to respond in no more than a few milliseconds to external events that occur at unpredictable times.

One of the main ideas for GC implementation is reference counting, where the compiler inserts extra code that, for each appropriate instance of allocated storage, keeps a count of essentially how many variables in the code (for each thread of execution) might reference the storage. When such an instance's reference count reaches zero, the storage is freed.
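A toy illustration of that bookkeeping (to be clear, this is not how .NET works - its GC is tracing, not reference counting; the names here are mine):

```csharp
using System;

// Each handle carries a count of how many owners point at the value;
// the value is "reclaimed" the moment the count hits zero.
class RefCounted<T>
{
    private T _value;
    private int _count = 1;   // the creator holds the first reference

    public bool Freed { get; private set; }

    public RefCounted(T value) => _value = value;

    public RefCounted<T> Retain()
    {
        _count++;             // a new owner appeared
        return this;
    }

    public void Release()
    {
        if (--_count == 0)    // last owner gone: free immediately,
        {                     // no tracing collection needed
            Freed = true;     // a real runtime would return the memory here
            _value = default!;
        }
    }
}

static class RefCountDemo
{
    static void Main()
    {
        var h = new RefCounted<string>("big data");
        h.Retain();                    // second owner
        h.Release();                   // first owner done; still alive
        Console.WriteLine(h.Freed);    // False
        h.Release();                   // last owner done; freed now
        Console.WriteLine(h.Freed);    // True
    }
}
```

As the sibling comments note, the catch is cyclic references: two objects pointing at each other never reach count zero, which is why refcounting systems still need a cycle detector or tracing pass.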

> E.g., early versions of the programming language Fortran did not offer dynamic storage allocation, but some programmers would implement their own, say, in a Fortran array. Then for pointers to the allocated storage, just use a subscript on the array name.

Even today people do the same thing in languages like Java and Rust as workarounds for performance or semantic constraints of the environment while still nominally obeying language semantics. I assume the same phenomenon is true in the C# universe.

When you bring this up many users of those languages are quick to explain why array indices are safer than managing raw memory addresses from outside the language's object model, but let's not go there ;)

Because at least array indices are bounds checked.

Pointers require hardware support for the same purpose, which Oracle, ARM, Microsoft, Apple and Google are putting money into as a means to fix C and C++.

Well, without hardware support, you need either a huge overhead ([cached] binary search for a bounding allocation for a pointer every time you use it) or a moderate overhead with fat pointers. It can be done purely in software, but it's not efficient.
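A rough sketch of what a software fat pointer looks like (essentially what Span<T> and ArraySegment<T> already are; FatPtr is my illustrative name): the reference carries its bounds, at the cost of a larger pointer and a check per access.

```csharp
using System;

// A "fat pointer": base + offset + length travel together, so every
// dereference can be bounds-checked in software.
readonly struct FatPtr<T>
{
    private readonly T[] _base;   // stand-in for a raw base address
    private readonly int _offset;
    private readonly int _length;

    public FatPtr(T[] array, int offset, int length)
    {
        if ((uint)offset > (uint)array.Length ||
            (uint)length > (uint)(array.Length - offset))
            throw new ArgumentOutOfRangeException();
        _base = array;
        _offset = offset;
        _length = length;
    }

    public T this[int i]
    {
        get
        {
            if ((uint)i >= (uint)_length)          // the software bounds check
                throw new IndexOutOfRangeException();
            return _base[_offset + i];
        }
    }
}

static class FatPtrDemo
{
    static void Main()
    {
        var fp = new FatPtr<int>(new[] { 10, 20, 30, 40 }, 1, 2);
        Console.WriteLine(fp[0]); // 20; fp[2] would throw, fp[-1] would throw
    }
}
```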

How do I know this person really knows about .NET GC internals for me to spend any time watching some videos?

You could make a similar argument about... any video, song, essay, etc, created by anyone. So I’m not sure what the point of your comment is. I think you are getting at the fact that there’s a lot of bad-faith grifters out there who prey on entry-level devs. This is a legitimate concern! But in my experience it’s pretty easy to sniff those folks out with a superficial investigation. You don’t want to “spend any time” watching the video but I think you can spend two minutes skimming their other work, their CV, etc.

I myself didn’t know anything about this person so I poked through their blog a bit. They are clearly marketing their education business (and I am not their target audience) but also seem to know what they’re talking about. More to the point: their material appears to be sincerely helpful to intermediate (and even advanced) .NET devs. This relatively short post on unsafe array access in C#[1], although not especially deep, does at least dive into the runtime / IL internals enough to give a clear picture of what the unsafe code is actually doing.

I haven’t watched the video either, but they are creating good-faith tutorials, they are obviously capable of understanding the .NET GC, and there’s no reason to suspect they would change tack and BS their way through a new video series.

[1] https://tooslowexception.com/getting-rid-of-array-bound-chec...

While actually a good question, I am reading a bit of snark here.

Don't spend your time on anything you don't want to, but if you want to learn about .NET GC then either research the author or research other deep dives into .NET GC.

To research the author, you could:

Go to the About page on their blog...

You could also look them up on Stack Overflow. https://stackoverflow.com/users/2894974/konrad-kokosa

You could watch the first 10 minutes where he addresses this.

Not that this proves anything, but this author's twitter account is followed by:

* David Fowler (ASP.NET Core creator)

* Andy Gocke (Lead developer for the CLR)

* Jared Parsons (Lead developer for C# Compiler team)

* Miguel de Icaza

Presumably if he wasn't saying anything worth listening to, they would have unfollowed him by now.

* Maoni Stephens (Works on the .NET Garbage Collector ;)

And that is the key fact :). And btw: Thanks David and friends for .NET Core

> Miguel de Icaza

Miguel de Icaza is the creator of Gnome and Mono.

The author is a Microsoft MVP in .NET and the author of the book Pro .NET Memory Management.

Yep, and this info is 2 clicks away, literally.

Select his name, right-click, "search in Google".

Or hamburger menu -> About.

And that book is as big as War and Peace by Tolstoy and Atlas Shrugged by Rand. So if you want to know the role of an individual memory allocation in the history of the entire application's performance, it's the go-to thing.

Just read the source code. It has everything you'll ever need:


Or spend an hour listening to this person. Who knows, his videos might have some bits of important information.

He's very well known in the community. If one has been to any talks regarding the .NET GC, chances are they've met him there. Oh yeah, he is also the author of an entire book about the .NET GC, Pro .NET Memory Management. So I trust he knows what he's doing.

A minimal amount of effort would lead to an author with very good reviews (5/5 on Amazon)

