A quick note: Shenandoah is not generational, according to the article. Most bog-standard web apps (including REST thingies; not sure why the author singles those out) strongly obey the generational hypothesis. For most web apps, in my experience, if you can tune your GC to serve the vast majority of your requests from the young generation, your latencies will be good, your performance will be good, your pauses will be infrequent and short, and plump unicorns and bunny rabbits will gather in your cubicle to share their rainbows.
Hi, author here. You are saying exactly what I was thinking before. But it turns out generational GCs have nasty failure modes when things don't go as expected. E.g., if an upstream service runs into its own difficulties and returns responses more slowly, our service has to keep all the in-flight requests in memory longer, so the heap fills up, G1 performs a few fruitless young GCs (without freeing much), and then tenures all those requests into the old generation, and now you have a big old-generation pause bomb waiting for you.
Non-generational GCs don't have this problem, and it's one of the reasons why Shenandoah suited us well there.
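To make that failure mode concrete, here's a minimal sketch (Handler and callUpstream are invented names for illustration, not from our codebase): the continuation captures the request, so a slow upstream keeps it alive past several young GCs.

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Illustrative sketch only; Handler and callUpstream are invented names.
    class Handler {
        private final ScheduledExecutorService upstream = Executors.newScheduledThreadPool(1);

        CompletableFuture<byte[]> handle(byte[] requestBody) {
            // The continuation captures requestBody, so it stays reachable
            // until the upstream responds. When the upstream slows down, the
            // request survives several young GCs, hits the tenuring threshold,
            // and is promoted to the old generation shortly before it finally
            // becomes garbage.
            return callUpstream(requestBody)
                    .thenApply(resp -> merge(requestBody, resp));
        }

        // Stub standing in for a real remote call; the delay simulates a slow upstream.
        private CompletableFuture<byte[]> callUpstream(byte[] body) {
            CompletableFuture<byte[]> pending = new CompletableFuture<>();
            upstream.schedule(() -> { pending.complete(new byte[128]); }, 5, TimeUnit.SECONDS);
            return pending;
        }

        private static byte[] merge(byte[] a, byte[] b) {
            byte[] out = new byte[a.length + b.length];
            System.arraycopy(a, 0, out, 0, a.length);
            System.arraycopy(b, 0, out, a.length, b.length);
            return out;
        }
    }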
If practically everything is collected in the young-gen GCs, as in most request/response applications, do you even gain anything from the GC being generational?
DirectByteBuffers allow Java programs to use unmanaged memory without needing to drop down to JNI or similar. There are open-source and commercial libraries that wrap that API with caching code. Using one of those solutions keeps your cache out of GC-managed memory.
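For a feel of the raw API, here's a minimal sketch; the libraries mentioned above layer eviction and serialization on top of something like this:

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    public class OffHeapDemo {
        public static void main(String[] args) {
            // Allocates 1 MiB outside the Java heap. The GC only sees the small
            // ByteBuffer wrapper object, not the megabyte of memory behind it.
            ByteBuffer buf = ByteBuffer.allocateDirect(1024 * 1024);

            byte[] value = "cached value".getBytes(StandardCharsets.UTF_8);
            buf.put(value);   // write into unmanaged memory
            buf.flip();       // switch from writing to reading

            byte[] readBack = new byte[value.length];
            buf.get(readBack);
            System.out.println(new String(readBack, StandardCharsets.UTF_8));
        }
    }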
Caches violate the generational hypothesis. Entries die in middle age: they live long enough to survive multiple young-generation collections, so they get promoted to older generations. The problem is that older generations (a) are not collected as frequently, (b) are often larger than the young generation, and (c) have a lower proportion of dead space to live objects, so the effort of tracing has lower value.
Caches that hold scalar data (e.g. byte arrays) or live off-heap aren't too bad: bytes and off-heap memory don't need to be traced. If you're caching an object graph dense with pointers, then not so great.
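As a sketch of the scalar-data option (serialization is left as a placeholder; the encoding is up to you): store values as byte arrays, so the collector traces one reference per entry instead of a whole object graph.

    import java.util.concurrent.ConcurrentHashMap;

    // Scalar-form cache sketch: the GC traces the map and the byte[]
    // references, but never has to walk inside the byte arrays themselves.
    class ScalarCache {
        private final ConcurrentHashMap<String, byte[]> entries = new ConcurrentHashMap<>();

        void put(String key, byte[] serializedValue) {
            entries.put(key, serializedValue);
        }

        byte[] get(String key) {
            // Caller deserializes; the wire format is an assumption left open here.
            return entries.get(key);
        }
    }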
It's completely the other way around: Shenandoah is much better for caching precisely because it is NOT generational. LRU caches go against the generational hypothesis because the oldest elements are evicted first.
I understand what you mean, but wouldn't the majority of allocations still happen during a request? For example, generational GC works really well with Elixir and Erlang caches.
> wouldn't the majority of allocations still happen during a request?
Could you please clarify this question? Do you mean that if cached objects are a small part of the total allocation rate, then generational GCs work well with that?
Well, if caching makes up only a small part of the overall workload, then you can't really say it's a "cache workload" or "cache-heavy workload", right?
My answer meant that Shenandoah would work well in a program where the cache occupies something like 70-80% of the heap, and generational GCs might not. But surely, neither is going to break from a cache that takes up 1% of the heap.