GC progress from JDK 8 to JDK 17 (kstefanj.github.io)
318 points by carimura 5 months ago | 190 comments

As Java is generally the fastest GC'd language, what's the current state of Java gamedev?

Once upon a time, this indie Java game called Minecraft became the most successful game of all time.

But from the few minutes of research I just did, Java cannot be deployed to many commercially important systems:

  - Nintendo Switch
  - PlayStation
  - iOS

It appears Java is only still viable for Windows and Android, and the 1% Linux desktop market.

There used to be the GCJ project which would in theory let you run Java anywhere you had a C/C++ compiler, but Oracle's litigiousness killed that because the Java[TM] "platform" must run the official Java[TM] bytecode.

It appears C# via Monogame lets you deploy to all desktops (Win/Mac/Linux), mobiles (iOS/Android), and consoles (PS/Switch/Xbox). So ironically C# seems to now be the "write once, run anywhere" fulfillment of the original Java promise.

[EDIT: grammar.]

Java's FFI and value type situation are the two major missing pieces for Java gamedev. C# has better stories for both, and has from the beginning.

Don't confuse Java having the fastest GC with Java being the fastest GC'd language (especially not in all situations)

> Don't confuse Java having the fastest GC with Java being the fastest GC'd language

I've always wondered, it's likely that Java has the fastest GC because it needs to have the fastest GC, otherwise it would be a bottleneck. Other popular languages probably don't depend as much on the performance of their memory allocation primitives.

Good point, the Debian Benchmarks Game for example shows C# matching or considerably beating Java at everything except the binary trees benchmark which stresses GC/mempool performance.


I haven't tried C#/MonoGame yet but all this discussion is considerably warming me up to it. The recent extremely successful indie game Hades was made with MonoGame.

Microsoft in typical MS fashion makes MonoGame higher friction if you aren't using a Windows box for development with MSBuild. On Linux it appears you need to use Wine to run DirectX effect compilation, though once compiled it works on OpenGL backends:


That's because most of the C# implementations in the benchmarks game are just calling into C libraries. For example the regex benchmark:

    [DllImport("pcre2-8", EntryPoint = "pcre2_compile_8", CharSet = CharSet.Ansi)]
    extern static IntPtr PcreCompile(string pattern, long length, uint options,
        out int errorcode, out long erroroffset, IntPtr ccontext);

To get a sample of more idiomatic code, I just looked at the F# vs Java benchmarks, and .NET is still faster on most of them. Looking at the code list, the F# codebase doesn't use a lot of externals.


Only the PiDigits benchmark uses externs, and only because it seems like a lazy port. Everything else is native F# code. It beats Java in all benchmarks except for the binary-tree one. To see a functional language match or beat benchmark-level Java in many cases feels kinda nice, at least for me. Many other real-world JVM vs .NET Core benchmarks I've seen, both in-house (tech choice evaluations) and third-party, have also shown .NET usually coming out on top recently.

The .NET GC isn't as good as the Java one. But I feel that's because the cost/benefit of improving the CLR's GC is less than Java's so the work is put elsewhere. The language (C#/even F#) generates less garbage in the first place with typical code. Any GC improvements there probably don't have the same bang for buck as in Java where allocations IMO are more frequent in day to day coding.

There was an article ages ago I read about the rewrite of C#'s Dictionary to the new generic one that went from being reliant on allocations in the older releases (C# 1.0 didn't have generics so iirc it behaved more like Java's) to using a generic struct array.

Doing so cut down the number of memory allocations to _3 allocations_ from something like _2 * N allocations_ (N being the number of elements in the map).

The improvement in performance was apparently staggering when it came to lessening the GC pressure. (Sadly I couldn't find it with a quick google).
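
A rough Java sketch of the same idea, for illustration only: an open-addressing map over parallel primitive arrays needs three backing allocations no matter how many entries it holds, whereas a `HashMap<Integer, Integer>` typically allocates a node per entry plus boxed keys and values. `IntIntMap` and its methods are hypothetical names, not a JDK class and not the .NET implementation, and the sketch assumes a fixed capacity with no resizing or removal.

```java
// Hypothetical sketch: fixed-capacity int -> int map using linear probing.
// Three allocations total (keys, values, used), independent of entry count.
final class IntIntMap {
    private final int[] keys;
    private final int[] values;
    private final boolean[] used;
    private final int capacity;
    private final int mask;
    private int size;

    IntIntMap(int capacity) {
        this.capacity = Math.max(1, capacity);
        // Table size: smallest power of two >= 2 * capacity, so a free slot always exists.
        int n = 1;
        while (n < this.capacity * 2) n <<= 1;
        keys = new int[n];
        values = new int[n];
        used = new boolean[n];
        mask = n - 1;
    }

    void put(int key, int value) {
        int i = slot(key);
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask; // linear probing
        }
        if (!used[i]) {
            if (size == capacity) throw new IllegalStateException("map is full");
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    int get(int key, int defaultValue) {
        int i = slot(key);
        while (used[i]) {
            if (keys[i] == key) return values[i];
            i = (i + 1) & mask;
        }
        return defaultValue;
    }

    int size() { return size; }

    private int slot(int key) {
        int h = key * 0x9E3779B9; // cheap hash mixing
        return (h ^ (h >>> 16)) & mask;
    }

    public static void main(String[] args) {
        IntIntMap m = new IntIntMap(1000);
        for (int k = 0; k < 1000; k++) m.put(k, k * k);
        System.out.println(m.get(12, -1));   // 144
        System.out.println(m.get(5000, -1)); // -1
    }
}
```

The trade-off is the usual one: no per-entry garbage for the GC to trace, at the cost of a specialized class per primitive key/value combination.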


Reified generics avoid a lot of the "boxing" that comes with standard Java, and that's only one of many features there that help. It's just easier to avoid allocations in .NET than in Java, IMO. Judging from recent articles and improvements to the platform, the team spends as much effort on reducing allocations in the first place as on improving the GC (e.g. ValueTask over Java/Scala futures, etc).

2 out of 10 tasks (regex-redux and pidigits) accept third-party libraries; the other 8 do not.

FYI, Hades was ported from C#/MonoGame to C++ late in development in order to ship on Switch. Their previous titles (Transistor, Pyre, etc) were C# from start to finish though. AFAIK some of their releases use FNA instead of MonoGame (a similar library with more of a compatibility focus.)

Great catch! The Wikipedia article for MonoGame lists Hades, but is evidently wrong. Hades used The Forge for graphics. Ctrl/Cmd-F "Hades" on the Github page has some info:


The Forge is graphics only; audio, input, etc., have to be handled by something else and Hades used a custom C++ engine apparently.

Wow, first time I've heard about Monogame, I wonder how it'd compare to Godot.

So I just tried to set up MonoGame as a result of this thread, but with a twist: only use non-Microsoft servers as sources, to see if it really is 'open source'.

It's impossible.

I just spent 30 minutes trying to find a single non-Microsoft mirror for the .NET Core dependency. If MonoGame is your Spice Melange then Microsoft's servers are the planet Arrakis, the only source in the known universe.

On Ubuntu you need to add a Microsoft server as a repo, you can't just "apt-get install dotnet-sdk-6.0".

So, yeah, MonoGame is not for me. I'll stick with Godot or SDL2.

This is absolutely true. I turn the kitchen upside down while making one simple dish. Cleaning up fast afterwards is not my awesome superpower; it's the bare minimum necessary to live amicably in the house :-)

Java’s new, better FFI is almost ready with Project Panama, and value types will come, but they are exceedingly hard to retrofit into a language (though the idea for the exact semantics might have formed already!)

Project Panama should greatly improve the FFI & game dev situation.

For native binaries, we now have https://www.graalvm.org/reference-manual/native-image/, but it probably doesn't yet work nicely with game frameworks? Not sure.

There are some engines, frameworks: https://jmonkeyengine.org/, https://litiengine.com/, https://libgdx.com/, https://www.lwjgl.org/.

But I have no real experience with any of those.

My (admittedly not huge) experience with Quarkus would tell me that GraalVM is nowhere near ready, most specifically because of reflection.

You either plan your work around zero reflection Java or pray to whatever god that your code passes.

At that point why even bother with Java or native images.

So I just spent some time looking at all of these you listed. With the exception of libGDX they are all essentially desktop-only (Win/Mac/Linux).

LibGDX appears to support iOS/Android and HTML5 desktop browsers, but not consoles (Switch/PS/Xbox).

I’ve used libGDX. Porting to iOS/Android/HTML5 is honestly a pain.

iOS is actually not that hard. I got it to work using RoboVM, but you have to do some research because the documentation is outdated. The HTML backend doesn’t work with Kotlin because it compiles directly from Java source, and because of that it probably has some extra quirks.

No official support (console makers other than MS are making things a PITA) but Robotality has ported their Pathway game (Based on LibGDX) to the Switch.

You can on iOS, for example with Codename One or Gluon Mobile.

They are also quite happy to sponsor possible console ports.

Seriously, with the sub-ms pause times of ZGC you could run a GC every frame and have tons of time to spare for actual processing.

It sounds very nice but if more than half an indie game's revenue comes from Nintendo Switch then you need a low friction way to get your .java files running there. You also need a proven audio mixer stack so you can load background_music.mp3 and play bang.wav and boom.wav without any popping or leaks. Right now Java seems sort of iffy on consoles, as in, you could maybe spend a ton of time and/or money to get an effort like GraalVM production-grade working there, but no one has bothered to do it yet. It seems to be much lower friction to just use C#/MonoGame, or SDL2 with your scripting language of choice that compiles with a C/C++ compiler, or a proven game engine.

A pause does not correspond to a full GC, of course. The work of GC is split into a lot of concurrent work and a lot of really tiny pauses on each application thread. The entire GC cycle might be split over hundreds or thousands of tiny pauses and take many seconds--there's just no particularly big pause. During the GC cycle the application is still running and allocating, and there is enough slack space that it doesn't run out of memory before the GC cycle finishes and transparently adds freed memory back into the usable pools.

It would be great if the whole cycle were that fast! But alas, there simply isn't enough memory bandwidth to GC 128GB of memory in 0.1 millisecond :)
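
To see what a collector actually reports on your own workload, the standard `java.lang.management` beans expose per-collector counts and accumulated times. The sketch below is illustrative: `GcStats` is a hypothetical class name, the collector names it prints depend on which GC the JVM was started with, and the reported time is whatever each collector chooses to expose, which is not necessarily pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reads GC statistics from the running JVM.
final class GcStats {
    /** Sums collection counts over all collectors; undefined counts (-1) are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) total += count;
        }
        return total;
    }

    /** One line per collector bean: name, collection count, accumulated time in ms. */
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Allocate short-lived garbage so the collectors have something to do.
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] junk = new byte[256];
            junk[0] = (byte) i;
            checksum += junk[0];
        }
        System.out.println("checksum=" + checksum);
        System.out.print(report());
        System.out.println("total collections: " + totalCollections());
    }
}
```

Running the same program with `-XX:+UseG1GC` and then `-XX:+UseZGC` is a cheap way to compare what each collector reports before reaching for full GC logging.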

That's fair, although you're not free()'ing 128GB in 0.1ms either :)

The amortized cost of GC is lower than that of malloc/free, but it's traditionally not good for latency-sensitive systems due to long pause times causing jank/dropped frames. Now that GC algorithms like ZGC have advanced to give sub-ms pause times, you no longer have that worry.

Technically we didn't need ZGC for this to begin with. IBM launched metronome a while back for real-time systems, but it was never as widely available as ZGC.

In a game you should keep allocations during gameplay to a minimum anyway. malloc() is not O(1); it has variable runtime based upon the current layout of free memory. Additionally, long-running malloc/free-based applications have unfixable memory leaks due to memory fragmentation.

In my opinion there are very few cases where you should be dynamically allocating memory and not using a garbage collector.
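
A minimal sketch of what keeping allocations out of the frame loop can look like in Java: preallocate objects up front and recycle them. `ParticlePool` and `Particle` are hypothetical names for illustration, not part of any engine mentioned in this thread.

```java
import java.util.ArrayDeque;

// Hypothetical sketch: a preallocated pool so the frame loop creates no garbage.
final class ParticlePool {
    static final class Particle {
        float x, y, vx, vy;
        void reset() { x = y = vx = vy = 0f; }
    }

    private final ArrayDeque<Particle> free;

    ParticlePool(int size) {
        free = new ArrayDeque<>(size);
        for (int i = 0; i < size; i++) free.push(new Particle());
    }

    /** Returns a pooled instance, allocating only if the pool has run dry. */
    Particle acquire() {
        Particle p = free.poll();
        return p != null ? p : new Particle();
    }

    /** Clears the particle and puts it back for reuse. */
    void release(Particle p) {
        p.reset();
        free.push(p);
    }

    int available() { return free.size(); }

    public static void main(String[] args) {
        ParticlePool pool = new ParticlePool(1024);
        for (int frame = 0; frame < 3; frame++) {
            Particle p = pool.acquire(); // no allocation while the pool has free entries
            p.vx = 1f;
            p.x += p.vx;
            pool.release(p);
        }
        System.out.println(pool.available()); // 1024
    }
}
```

Pooling trades GC work for manual lifetime management, so it reintroduces use-after-release bugs if a caller keeps a reference after calling `release`.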

> That's fair, although you're not free()'ing 128GB in 0.1ms either :)

Well I guess that's not technically true, free() could very well be a quick operation if you have a single 128GB object.

You can compile to iOS/Android with libgdx.

I've been doing this with a game I'm building that has a custom 3D OSM map renderer.

Well, Minecraft is notoriously slow, but I guess that's very much nothing to do with Java, but the un-optimized code... I have seen quite a few developers criticizing Mojang for this.

Is it really, though? Just for kicks, last year I got an ibook g4 for a few dollars, and I got Minecraft running on it, not the absolute latest version, but one a few years old (Minecraft dropped 32 bit LWJGL support a bit ago). With some tweaking, I managed a solid 40-50 FPS, on what would have been a pretty anemic processor back in ... 2002, 2003?

Although, maybe it doesn't count as I used mods like Optifine, which are made to ... replace said un-optimized code, but I thought it was a good showing for Minecraft and the JVM anyway.

Minecraft itself is good enough for... pure Minecraft gameplay most of the time.

But it can overwhelm itself. There's a piece of equipment called the Elytra that you can wear to glide through the air, and if you use a rocket while gliding, you gain a huge momentum boost.

If you keep boosting on a server, being fast enough to challenge the serverside world-loading, you can crash the server.

Another big defect is that its rendering is deeply tied to CPU time, and the game itself has limitations. My recent experience with a 200+ mod and shader setup is that it runs at the same FPS (20~30) on an RTX 3080 Ti (with a better CPU as well) as on a GTX 980 Ti.

> but I guess that's very much nothing to do with Java, but the un-optimized code

You guess right. There are a few fan-made mods (OptiFine, Sodium, etc.) that often improve performance by an order of magnitude, from tens of frames a second to hundreds.

Minecraft Bedrock, which is written in C++, runs dramatically faster than Minecraft Java. (That's the impression I get from the comparisons I watched on YouTube at least.)

It's a shame for Java since the increase in render distance makes the game much more immersive.

Lack of mods is a big problem...

Did Java already beat Common Lisp and C#? I doubt it, especially since the commercial Lisps with much better GCs are not listed on the benchmarks game. Neither are Shenandoah and the other commercial Java GCs.

> As Java is generally the fastest GC'd language, what's the current state of Java gamedev?

In my eyes, there are no truly viable options out there, mostly due to a lack of approachable GUI game development software or toolkits.

For example, compare the one option that comes close, jMonkeyEngine (https://jmonkeyengine.org/) to the likes of Unreal (https://www.unrealengine.com/en-US/) and Unity (https://unity.com/), or even Godot (https://godotengine.org/).

Sure, many out there enjoy developing games in a code first approach, or even writing their own engines (e.g. Randy, whose videos are pretty interesting and comedic: https://www.youtube.com/c/RandytheSequel/videos or https://www.youtube.com/c/RandallThomas/videos), but I'd argue that the success of an engine largely depends on the popularity that it gains, which is largely influenced by how easily approachable it is.

Java game development doesn't have such a tool or set of tools, even the activity on jMonkeyEngine's GitHub (https://github.com/jMonkeyEngine) is really low, compared to that of Godot (https://github.com/GodotEngine), even if the technologies themselves could be used to similar degrees of success in many situations.

Come to think of it, it would be nice to actually benchmark something like Unity (C#), Godot (C#), Godot (GDScript) and jMonkeyEngine (Java) in similar real world applications, to see how they fare, performance, resource usage and development speed wise.

My intuition tells me that Java would be faster than GDScript, which would make talking about its (and also Java's, and thus also C#'s) performance a moot point for many of the indie games out there, since GDScript's slowness doesn't prevent many wonderful games from being developed in Godot, here's their latest showcase reel: https://www.youtube.com/watch?v=iAceTF0yE7I

Is Java GC (“the best”) really that much better than something with an immutable heap, e.g. Haskell?

I love functional languages but is anyone actually using Haskell to develop for Nintendo Switch and PlayStation? Can you use Haskell on the Nintendo Switch to access a robust audio mixer stack?

F# can be used with C#/MonoGame which seems to run well everywhere so that's one route to functional programming gamedev. Another route appears to be to use SDL2 with a functional language that supports ANSI C bytecode runtime fallback. E.g. OCaml has a bytecode compiler and an ocamlrun runtime that can be compiled with an ANSI C compiler. But I don't know what the GC latency guarantees are for the bytecode runtime. OCaml's native low latency GC benefits from Jane Street's contributions because Jane Street uses OCaml for high frequency trading. But bytecode OCaml running on Nintendo Switch isn't the same thing as native Linux OCaml.

I don't think that anyone is using Haskell to develop any serious games yet.

The toy demos are quite elegant, although the community is probably lacking a good scene editor, e.g. Godot or Unity.

Edit: godot-haskell exists, but it's still a little edgy.

For what it's worth I've had a good experience doing gamedev in Scala using LibGDX. (Though not on consoles, and not pure FP)

> It appears Java is only still viable for Windows and Android, and the 1% Linux desktop market.

Funny how you frame Java "only" being available for some of the most popular platforms, which is billions of devices.

Shame this doesn't show the CMS collector, which was present in 8 and 11, but removed before 17. CMS was the go-to option for low-latency collection for a long time, so it would be good to see how it compares to the modern options.

It would be particularly interesting from the perspective of someone working in a shop which still has lots of latency-sensitive-ish workloads running on JDK 8 with CMS!

Here is the comparison I did with Cassandra workloads, comparing CMS with ZGC: https://jaxenter.com/apache-cassandra-java-174575.html.

(TLDR, ZGC is a huge improvement.)

I'm curious what your applications' allocation patterns are like.

I've worked on a couple projects where switching from CMS to G1 was a pretty big latency win. Most of them were pretty strongly request-response-based. Pretty quickly G1 would converge on having most of the regions being young, and, by the time G1 wanted to do a mixed collection, most of them would have no live objects and would be summarily killed.

Also, it would be interesting to see this progression from at least 1.4. By the time 1.8 was released I had the impression that GC is already pretty well optimized for throughput.

Although if your ulterior motive is to persuade people to upgrade beyond 8 and 11, you probably don't want to suggest that performance has basically plateaued.

I don't see how this is bad.

JVM is pretty well optimized and it is much closer to raw C performance than most other popular languages. You could also say that C is bad because its performance plateaued a long time ago.

C maps more closely to assembly/hardware than most other languages. Saying C has plateaued gets pretty close to saying it’s hardware performance that’s plateaued.

If C mapped to "hardware" so well then OpenCL, CUDA C/C++, SYCL, ispc, etc wouldn't be necessary. The rising importance of accelerators is a big issue for the future of C.

Those languages map to GPUs; C maps to CPUs. Those are on the same level as C: they aren't the real instructions that GPUs run, but they are a pretty good abstraction for GPU instructions.

ispc is explicitly designed to take advantage of SIMD on CPUs and GPUs, its existence is directly related to the shortcomings of C in this area. Likewise, SYCL exists to target accelerators because C isn't even close to supporting heterogeneous hardware or programming. In any case, C does not map well to a massive amount of hardware running in production right now.

A CPU can run any program written for a GPU, yes. But the languages you talk about are no closer to how a CPU works than C is. They might be more ergonomic if you want to take advantage of SIMD instructions, but they don't do anything you can't do in C, and there are a lot of things you can't do in them since GPUs are much more limited than CPUs.

C doesn’t map any more closely to assembly/hardware than most other low level languages. If anything, hardware tries to conform to C programmers as much as it can.

Hell, now Java has much better SIMD support than C, even as a high level language.

Not sure what you mean; SIMD instructions map perfectly well to C. You just call them like you call anything else. What C doesn't handle well is the exact CPU memory load order, how things are cached in the CPU, etc. But no language can do that, as you can't even control that in the machine code sent to the CPU. Most things you can do in machine code can also be done in C, and the things you can't can be done in inline assembly.

I suppose what they referred to is Java's (currently incubating) vector computation API. It lets you express vectorized algorithms in a high-level way, with the API's methods being translated to the corresponding SIMD instructions of the underlying platform. I.e. you'll get vectorized execution on x86 and AArch64 in a portable way, including transparent fallback to scalar execution if specific operations aren't supported on a specific target platform.

Right, but that would still mean that C is closer to the hardware than Java. Java has a high level but less powerful solution, since you can only use it on vectors and not arbitrary data anywhere. You can write a similar function in C which compiles differently depending on where you compile and falls back in the same way, just that C gives you the option to use the hardware dependent instructions anywhere if you want.

I’m not sure I understand you: in C there is no standard way to do SIMD, afaik. There are pragmas on for loops, or other compiler-specific tools, but the language itself doesn’t have any notion of lanes or SIMD instructions.

> but the language itself doesn’t have any notion of lanes or SIMD instructions.

C doesn't need it, you can just call CPU instructions like functions. SIMD is just another kind of CPU instruction, so C supports it. That works in C since you have a direct view of the memory layout. It doesn't work in higher level languages where memory is abstracted away from you, in those you need the higher level concepts you are talking about in order to take advantage of SIMD.

These instructions are ISA specific though; i.e. in C you'd have to implement your solution once using x86 SIMD instructions and once using the AArch64 counterparts. You'd also have to account for different vector lengths. Whereas the Java API does all that for you automatically, e.g. automatically taking advantage of longer vectors when running on AVX512, and shorter ones elsewhere.

I think that's what people consider "better" about the Java approach (at least I do). That's of course not to say that you cannot do all this in C as well, but I think having these capabilities in that portable way available in Java makes SIMD useable for a huge audience for the first time which didn't consider that a realistic option before.

Shenandoah GC is conspicuously missing despite having similar availability to ZGC and competitive performance characteristics.

I've used Shenandoah GC with great success and I recommend anyone considering ZGC (especially on Kube and container environments) to try Shenandoah first. You won't have to do additional work to enable large pages, yet your performance is pretty close.

It would be good to see the numbers, but I suspect the reason is just that the author used the Oracle build of OpenJDK which doesn't include it.

As a .NET developer I am a little envious of all the GC knobs that Java developers get to play with.

If I could have only 1 new GC thing from Microsoft, it would be the ability to totally disable GC during the lifetime of a process. I don't even want to be able to turn it back on.

I have a lot of scenarios where I could get away with the cruise missile approach to garbage collection. No reason to keep things tidy if the whole world is gonna get vaporized after whatever activity completes. Why waste cycles cleaning things up when you could be providing less jitter to your users or otherwise processing more things per unit time?

I haven't personally used this, but does this work?

GC.TryStartNoGCRegion Method

Attempts to disallow garbage collection during the execution of a critical path.


Pretty sure that is wired up like an elevator close button. Never seen it make a difference and not for lack of trying.

Is the .NET GC still contained in one gigantic source file?

yup, exactly

Just sharing this really great benchmark series on JVM GCs, though it doesn’t include the latest versions of OpenJDK:


I like how this reached the top page soon after "Go does not need a Java-style GC" (https://news.ycombinator.com/item?id=29319160).

Moreover, do the significantly lower GC pause times help improve the snappiness of GUI apps, such as IntelliJ IDEA?

IntelliJ uses the CMS collector (-XX:+UseConcMarkSweepGC), which has been deprecated since Java 9. JetBrains is still investigating the use of ZGC. https://youtrack.jetbrains.com/issue/IDEA-247824

It defaults to G1 now (I have version 2021.2.2).

I am running 2021.2.3 now. `ps aux | grep GC` still shows "-XX:+UseConcMarkSweepGC"

-XX:+UseG1GC on my install of clion. Maybe old JVM parameters have persisted in settings somewhere?

> Maybe old JVM parameters have persisted in settings somewhere?

This is an annoying behavior on at least Intellij Idea (I don't know about other Intellij products). If you ever increased its memory limit (it's an option on one of the top-level menus, and it will also occasionally suggest increasing it if it ever notices it's using too much memory), it copies the idea64.vmoptions file which contains not only the memory limit (-Xmx), but all the JVM parameters, from the binary directory to your configuration directory, and so the JVM parameters on it will be kept forever, instead of being changed when you update the IDE. The fix is easy: find that copied idea64.vmoptions file, write down the memory limit on it, remove that file, restart the IDE, and go to that top-level menu option to set the limit again. This will copy an updated set of VM options to your configuration directory.

They changed the default garbage collector to G1 last year.[1]

[1] https://github.com/JetBrains/intellij-community/commit/edf1c...

Honestly this is the most frustrating thing about Jetbrains. Stop investigating shit and just use the new JVMs already. There is no excuse for using deprecated JVMs.

Edit the idea64.vmoptions config file and use the GC of your choice. I've been running with ShenandoahGC for quite some time now and it's been working great.

Also, see https://github.com/JetBrains/JetBrainsRuntime/releases/tag/j... for a JDK17 runtime for Intellij products. I've been running this too and it works great (but does require some tweaks to the vmoptions file).

Text can look pretty bad if you're using other JVMs without their patches.

Yes, particularly on Linux, any non-JetBrains JDK running IntelliJ looks awful. I don't know if they consider it some kind of competitive advantage to not submit fixes upstream.

What are they doing in the JVM to make text look good?

I tried this last year but the version they shipped with IDEA didn't have ZGC and using an external JVM caused some weirdness in the GUI for me.

Curious before I try it, is the speed/performance any better?

It depends. There is less latency in the IDE due primarily to better GC (just as the article describes). I think the speed is better too on things like indexing the code base, but that could depend on your computer. I'm running on a 32core high end AMD system with lots of RAM, and I increase the heap setting in Intellij to 4G, etc. I think the IDE is subjectively better.

We run all our production applications on JDK17 too, which is why I push to run everything on 17.

I am pretty sure Jetbrains cares that their product works well for a lot of people, and they know better than you which JVM is better for their product.

It is not like it restricts what you can build with it.

We may not know their exact decision process but it is incredibly ignorant to assume they are just doing this out of laziness or incompetence.

They've already explained their reasoning, I don't have to assume anything. Their reasoning is bad:

* They've fixed long-known bugs in the JVM

* They've added sub-pixel anti-aliasing and other visual rendering enhancements.

Both of these are clear improvements on the JVM, and very likely to be accepted upstream, but instead of upstreaming them, they decided to create a long-running fork and package it as their own runtime. And like all long-running forks, they're struggling to keep up.

So we are left with two choices: abandon the hope of a functional IDE that only works well on a customized runtime, or abandon the litany of performance, capability, security, and stability improvements in the JVM standard and reference JVM implementation.

But what is your problem with this? Do you really care which JVM is being packaged or you are just picking on something that isn't a problem?

Have you had a problem running IntelliJ IDEA?

I am pretty sure that if I was distributing a very, very complex Java application for a very, very wide variety of audience (like people who aren't even developers because IDEA also caters to these people) I would distribute it with JVM packaged.

And then the user should not care which JVM exactly is being distributed. You only need to care about the end result.

Let me know the last time you cared which version of the JVM is used by any of the SaaS products you subscribe to.

Yes, I care. Because while jetbrains makes an excellent IDE in terms of functionality, they have extremely shitty startup times and garbage collection pauses. These are both areas that have received a lot of attention in the reference JVM in the versions subsequent to their fork. All of the performance problems literally disappear by using a newer OpenJDK version, but then the GUI starts to look and act weird.

I don't know why you would assume that I am asking for them to not package a JDK at all. That's a fucking absurd assumption, and nothing I have said has even suggested that. All I'm asking for is for them to upstream their improvements and package their IDEs with OpenJDK by default so we don't have to choose between Jetbrains improvements or OpenJDK improvements.

This is so removed from reality that it's difficult to know where to start.

For starters, sub-pixel anti-aliasing is a standard part of the JDK and isn't a Jetbrains addition.

What exactly is stopping you from using a standard JDK17 build with Intellij? How is the IDE not functional for you when doing so? How is it improved when using the JDK17 runtime they have on their github page?

I think the Jetbrains Runtime has sub-pixel AA for Linux in particular[0].

I just tried running idea with adoptopenjdk-17 on macos, and it failed to start up, so it doesn't seem that simple.

[0]: https://confluence.jetbrains.com/display/JBR/JetBrains+Runti...

Sub-pixel rendering was added in the Java "Mustang" 1.6 release 15 years ago, long before Jetbrains started providing a custom JDK. http://www.ffnn.nl/pages/articles/java/java-2-se-6.0-aesthet...

Trying to use a newer JDK on some applications, like Intellij, may require adding entries to the idea64.vmoptions file to relax the module restrictions that were tightened in JDK 16 and 17, if that app hasn't been updated for those changes. Entries like this: --add-opens=jdk.jdi/com.sun.tools.jdi=ALL-UNNAMED might be needed.

See https://youtrack.jetbrains.com/issue/IDEA-261033 for entries that might be needed.

This is easy:

Standard Jetbrains Runtime - slow as fuck, missing a whole host of performance improvements, security updates, and bug fixes since the fork occurred back in JDK8/9 Era.

OpenJDK - ugly as sin, gui becomes glitchy, lose out on customer support channels (first thing they tell you is to use the standard jetbrains runtime)

Newer versions of Jetbrains Runtime - have to reinstall manually every time you update your IDEs. You also lose out on customer support, as they are beta software.

This isn't really a hard problem to solve: find a way to upstream your improvements to the JVM, and then package the upstreamed version. All of these problems go away, and you're even relieved of the burden of maintaining a long-running fork.

Yes, this is easy:

- The current JBR is JDK11, not JDK8.
- You don't need to reinstall anything to use an alternative JDK with the IDE. Just set the IDEA_JDK envar to the JDK of choice.
- Ugly as sin is an opinion. I use the IDE with linux on a hidpi display. Looks perfectly acceptable to me...even running with JDK18.
- I just reported an issue while running with the JBR17 a few days ago. Jetbrains was perfectly responsive to my issue.

They are in bed with Google, so they need to spend their resources improving Kotlin use cases instead.

I don’t understand why they are not shipping their own bespoke GC to go with Intellij, Jetbrains is already building their own JVMs.

The GC might help, but it doesn't do magic when threads aren't used the correct way, too much stuff lands on the UI thread or synchronous IO is used all over the place.

That’s the magic of HN (or maybe it’s a form of the Baader-Meinhof Phenomenon.) When one topic reaches the front page it seems like a couple more based around it hit the front page the same or next day, which allows us to expand on what we’ve learned.

Snappiness problems are often due to lazy loading Java classes.

For example, loading the project settings takes 1+ seconds the first time and less than 0.5 seconds after that. The time may drop further later on if the JVM decides to optimize some parts of the UI application more.
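A minimal sketch of the one-time cost described above. It shows class initialization, which Java guarantees happens on first use, as a stand-in for lazy loading in general. `LazyInitDemo` and `SettingsPanel` are hypothetical names, not IntelliJ classes, and a real IDE also pays for JIT warm-up, which this does not model.

```java
import java.util.ArrayList;
import java.util.List;

final class LazyInitDemo {
    // Records initialization events so the ordering is visible.
    static final List<String> events = new ArrayList<>();

    // Hypothetical stand-in for a heavy UI class such as a settings dialog.
    static final class SettingsPanel {
        static {
            // Runs once, the first time the class is actually used.
            events.add("SettingsPanel initialized");
        }

        static String open() {
            return "settings opened";
        }
    }

    public static void main(String[] args) {
        System.out.println("before first use: " + events);
        SettingsPanel.open(); // first call pays the one-time initialization cost
        SettingsPanel.open(); // later calls reuse the already initialized class
        System.out.println("after two calls: " + events);
    }
}
```

The static initializer runs on the first call only, so any expensive setup placed there is felt once, which matches the "slow the first time, faster afterwards" pattern.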

I don't think that's where the issues are. IntelliJ freezes when reindexing a project when you change branches from the command line.

Current IntelliJ uses Java11 and has gotten slower over the past few years (anecdata from my own use) - so probably not.

This is what I've noticed too. Intellij has never exactly been speedy, but it had so many great features that it was worth it compared to other slow IDEs like Eclipse. And using a basic text editor like VIM or even VS Code with a big Java project isn't really feasible like it might be in other languages like Python or Node.

However, now it is really bogging down. I'm sure it doesn't help that I've got several IDEs open along with some monstrosity of a "microservice" framework running half a dozen docker services and DBs on an older Macbook Pro, but it's true that app and OS devs tend to use all increases in CPU and ram, and so absolute performance and battery life never seems to get better over the years, regardless of how much the underlying hardware itself has improved.

VS Code is fine with big projects, I've been using it for 2 years. The language server is Eclipse after all.

Red-Hat is behind most of the Java plugins, so their quality is quite good, there is only the pain to live with Electron.

Not just slower, also energy hungry. My M1 Max running PyCharm loses about 10% of charge an hour on their crappy bundled JVM, and around 4% on the latest Java 17-based ones, but it does act weird at times when using them. It's probably down to Metal vs OpenGL. Disappointing.

Since 2020.x it's definitely slower. The addition of jetbrains space doesn't seem to help either and removing the plugin itself makes the whole ide unstable

I wish they would just release a light version, without all the database, spring integration, and other crap. I just want a light language server.

wouldn't community edition fit the bill? Those things you mention are the main difference with ultimate.

Not sure. I moved to VS Code + Eclipse LS and it's such a more pleasurable experience. However, it doesn't support SBT which is the build tool my company uses.

The community edition is definitely enough for sbt projects.

Any reason you're using sbt for Java projects? That's an odd combination. Anyway, I believe Metals is getting close to supporting this particular use case if you want to keep using VS Code.

For some odd reason my company chose the play framework + Java. It's one of the most baffling choices I've ever seen. I've actually managed to get the Java plugins to work with sbt, but it doesn't understand when a Scala object is imported.

Just deactivate the plugins?

Try power saving mode (look in ctrl+shift+a).

I’ve given up on IntelliJ and java-based UI’s long ago. So disappointing.

That's the right way to go. Like IntelliJ or not, Java is a legacy technology. I wouldn't want to depend on that.

Lol, you are laughably bad to think that java is legacy.

The JVM is improving with never before seen speed, has state of the art GCs, a very good JIT compiler, upcoming green threads that will make blocking code automagically unblocking and once Valhalla hits with value types, there really will be very few areas where java would not be applicable.

And on top of that, there is also Graal, which is a novel way to run and optimize mixed-language code bases.

>Lol, you are laughably bad to think that java is legacy.

Seriously. It might not be one of the hip new languages with unique features but it’s a very mature language that is well battle tested, has plenty of libraries available, and the talent pool is very deep for Java hires. It’s also something that stands the test of time: code written in 1999 on an old Java version will still likely run today.

Note, only relevant for OpenJDK and the JVMs based on it, there are other GC available to other JVMs, including real time ones.

graalvm and graalvm's svm/native-image as well, the latter having only SerialGC, with G1 in EE only, and crashing ;(

Genuinely interested - could you share some good resources about it?

For example, as far as real-time GCs are concerned, these are good books:

"Hard Realtime Garbage Collection in Modern Object Oriented Programming Languages."


The author is one of the founders of Aicas real time JVM, https://www.aicas.com/wp/products-services/jamaicavm/

"Distributed, Embedded and Real-time Java Systems"


PTC is the other company alongside Aicas, that still sells real time Java systems, https://www.ptc.com/en/products/developer-tools/perc

IBM J9 has the evolution of the Metronome GC, https://www.researchgate.net/publication/220829995_The_Metro...

And it has extensions for value types, via packed object data, https://www.ibm.com/docs/en/sdk-java-technology/7.1?topic=ob...

Someone else already referred to Azul, they also have extensions for value types, called object layouts, https://www.slideshare.net/AzulSystems/jvm-language-summit-o...

Then there are special flavours like microEJ or the Android Java snowflake.

How much do you need to worry about the GC causing frame stuttering when writing games now? Object pools are still important?

Well, JDK 17 was released just a few months ago, so I doubt anyone has real experience with it for gamedev. But considering that game heaps are relatively small (as in <100G) and modern Java GCs can manage consistent <1ms pause times, I'd imagine Java is more viable for gamedev from that point of view. I'd also emphasize that there is a large spectrum of performance requirements for games: you can make quite a lot these days with relatively poor performance characteristics, and on the other hand, at the top end Java probably is still not good enough.

IMO java for gamedev is pretty much un-viable due to the lack of control over memory layout. You clearly can make it work (see Minecraft), but it just sounds painful, and there isn't a clear benefit.

Minecraft wouldn't be anywhere near where it is today if it was not written in Java (which made modding much easier compared to native code, and made heavy modding possible). So there is a strong benefit, it's just not very obvious.

Minecraft performance is a joke in 2021, when you see how much CPU you need to get 10 players in a world.

It is written in Java because the original devs probably did not know anything better.

It uses that much CPU because the original devs did not know how to program well and it allocates a shitton of objects. It is not a good example of a well-written java game.

There are many types of games. For 2D games Java is more than performant enough, without question.

Java can be ridiculously fast when only primitives are used so I wouldn’t put a quite performant game engine past it, but yeah, it would perhaps not be my first choice for a new AAA game engine.

C# has a similar memory model and it's the primary language used by Unity games... would you say that's un-viable?

The game engine is different from the game logic. This may mean Java is more viable for game logic, but still not viable for the engine itself

That said, C# has value types and has for years now. Which is hugely important for arrays, and a major, major missing piece to the Java performance puzzle. C#'s FFI is also way better than the disaster that is JNI, which also plays a role here.

You are right, but I would like to add that both of these concerns will be solved in the near-far future. Java’s project Panama tries to tackle the less than ideal FFI case with a tool that can create java classes from a C header and an API for manually managing memory regions.

Valhalla is in the works, but it takes, I quote, at least 3 PhDs’ worth of knowledge combined to figure it out in a backwards-compatible way with all the interactions of generics, etc.

For memory layout C# actually has an important difference in that it supports value types (so you can have an array of vertices without individually allocating each one, which has a lot of overhead and drastically reduces cache efficiency). They’ve been working on adding support for that to Java for some time but it’s not in yet.

There is no control over it in pure standard APIs, but if you are shipping a game you could do it with JVM specific ways or even ship a custom / customized GC.

But most games don't actually need this. I'd be more concerned about whether there's a JVM language you like over other non JVM options for a game.

What kind of games do you need to worry about memory layout for? From the outside, I'd expect that it's as diverse as most software, where sometimes you need control like C and other times python will do.

> From the outside, I'd expect that it's as diverse as most software, where sometimes you need control like C and other times python will do.

I develop games as a hobby, and my understanding is that this is literally true, but a little misleading. Games are as diverse as other areas of software, but they're skewed toward the more complex and demanding end. I don't think it's that atypical for a moderately complex game by an indie studio to get to the point where memory layout is a concern, which I don't think is as true for web apps, desktop apps, or other areas of development. That being said, I think worrying about memory layout is more than a lot of games need.

Why is control over memory layout important for games?

If you work on Android games for example would you get any kind of control over mem layout, irrespective of the stack you will be using?

For a simple example, if you make a simple type to store the color of a pixel

class RGB { float r; float g; float b; ... }

and then make an `ArrayList<RGB>`, the list will end up using over 2x the memory (128 bits for each object header, 64 bits for each pointer in the list) compared to a language with value types. What makes this even worse is that since you are using a list of pointers, you can't use SIMD for any of your computations, and accessing elements will be slow since the values are scattered across the heap and less likely to be in cache.

Just to add, the really performance oriented hot loops can be rewritten with Class-of-Arrays with three int arrays for r, g and b values, or even a single one with flattened ints, with a user-friendly wrapper RGB class provided for outside use. With good OOP-usage it would not even be ugly. Performant java code has been written like that for decades, and these will get really close to C-programs.

Also, there is now a Vector API that lets you use SIMD operations with configurable lane width (and a safe fallback to for loops for processors without the necessary instructions).
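The class-of-arrays layout with a friendly wrapper, as described above, could look roughly like this. It is a sketch under assumptions: `RgbBuffer` and `Pixel` are made-up names, and it uses float channels to match the `RGB` class quoted earlier rather than the int arrays the comment mentions.

```java
final class RgbBuffer {
    // One primitive array per channel: no per-pixel object header or pointer.
    private final float[] r;
    private final float[] g;
    private final float[] b;

    RgbBuffer(int size) {
        r = new float[size];
        g = new float[size];
        b = new float[size];
    }

    int size() { return r.length; }

    void set(int i, float red, float green, float blue) {
        r[i] = red;
        g[i] = green;
        b[i] = blue;
    }

    float red(int i)   { return r[i]; }
    float green(int i) { return g[i]; }
    float blue(int i)  { return b[i]; }

    // Hot loop over contiguous primitives, a shape the JIT can often auto-vectorize.
    void scale(float factor) {
        for (int i = 0; i < r.length; i++) {
            r[i] *= factor;
            g[i] *= factor;
            b[i] *= factor;
        }
    }

    // Convenience view for callers outside the hot path; allocates a small object.
    Pixel get(int i) { return new Pixel(r[i], g[i], b[i]); }

    static final class Pixel {
        final float r, g, b;

        Pixel(float r, float g, float b) {
            this.r = r;
            this.g = g;
            this.b = b;
        }
    }
}
```

Callers that want an object per pixel use `get(i)`, while the performance-sensitive code works on the arrays directly through methods like `scale`.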

To add to adgjlsfhk1’s answer, if you are writing Android games you are probably using a cross-platform game engine that doesn’t use Java (like Unity) which solves the issue but more importantly also allows you to sell your game on iOS without a re-write.

Look up "data oriented design" GDC talks or "AoS and SoA" for more in depth information to your question. The very brief tldr is that if you want to go fast, you have to design for cache locality and memory access patterns.

In particular for Java there's regularly dependent pointer loads, which are dreadfully slow on modern CPUs and also waste a significant amount of L1/L2 cache.

While not the most ergonomic, one can definitely create SoAs in Java and those will be just as fast as they are in low level languages.

These GCs have a 1GB overhead when working with a 16GB heap. That’s 6% of your available memory. I was expecting modern GCs to be way more efficient so I’m kinda shocked this is what state of the art is.

I also wonder how much of the progress to “Sub (200) millisecond” latency target is due to us just having faster machines. I honestly have no model to tie this to actual performance of my code. I guess it translates but not really sure how.

Not bagging on Java, I am just surprised how inefficient an industrial strength GC can be. I understand why manual memory management still holds its own now.

> These GCs have a 1GB overhead when working with a 16GB heap. That’s 6% of your available memory.

6% memory overhead while preserving throughput is actually quite excellent. Just a little more than the average fragmentation overhead under manual memory management.

Manual memory management can "hold its own" because it can tailor the allocation/release profile to the problem and aggregate some of those overheads.

> due to us just having faster machines

?? These benchmarks keep the machine constant. And I’m not sure we’ve seen faster machines in a long time. Clock speeds have remained fairly constant and gains are solely in more cores.

> Clock speeds have remained fairly constant and gains are solely in more cores.

Instructions per clock have risen steadily, even though the frequency stays the same. About doubled since 2011 according to cinebench[0].

That being said, I agree the faster machines argument doesn't hold much water.

[0]: https://cpugrade.com/a/i/articles/cbr15-ipc-comparison.png


Sub-millisecond is already here with ZGC, and it's not much to do with faster machines (machines haven't become that much faster over the past ten years) but with more sophisticated algorithms.

Seeing that RAM is the cheapest resource to scale (and that cleverly using it saves energy, given that Java uses the least energy out of managed languages) it seems to be a very good tradeoff.

Accessing RAM is very slow and CPU caches don't scale, though. So it's not as simple as you're presenting it

Well, for general computing I don’t think we have an answer either way. Yeah, for some hot loop with an array we get insane performance, but that is a very specific workload and not applicable everywhere. What about a chat application, which parts of the memory should be physically close to each other?

Also, Java’s GCs are compacting, putting similarly old objects relatively close to each other. The same random chat program in C might use a linked list with worse characteristics so it is really not that obvious to me what would be a good solution.

It's worth noting glibc isn't zero-overhead, either.

Manual memory management can also have overhead due to e.g. fragmentation. I don't know if it approaches 6% in realistic scenarios.

Wait, fragmentation is not included in that 6%. It is that a GC with no fragmentation will already suffer a 6% overhead. A region-based GC will suffer from additional temporary fragmentation on top of those 6% as well, because some segments can be filled only partially with live objects until they are compacted. That effect might be actually bigger because allocations are done only from contiguous space, and it can't just try to allocate from the "holes" until it compacts them. And you also need some additional room to do allocations from in order to avoid too frequent GCs. So in practice it is not 6% but sometimes 600%.

Yes, sure. But you do still need to compare the total overhead associated with manual memory management (which is not zero) to the total overhead associated with GC. It's an empirical question which is larger in any given case. And of course it depends on the implementation of malloc and the implementation of GC.

I'm not expressing a view as to whether the overheads of manual memory management are typically comparable or not. I really don't know.

The total memory overhead of manual memory management can be virtually zero if you're careful enough to avoid fragmentation (and you can because it is manual, so you control a lot of details). There is for example no constant header for each allocated chunk. In Java you pay additional 16B for each allocated object and you can't get away from that overhead.

Yes, manual memory management can always be better with arbitrary amounts of tuning. However, I think the more interesting question is how the overheads compare in typical applications written in a reasonably straightforward and maintainable style. In practice, fragmentation can be a difficult problem to avoid when using manual memory management. For example, Firefox struggled with it for a long time.

A web browser is a huge and complex app. The argument it struggled with memory issues is moot if you can't show a comparably featured app written in Java to compare memory use.

Anyways, I've got plenty of anecdotal evidence where Java apps take order of magnitude more memory than their close counterparts written in languages that use manual memory management. Not browsers, but things like benchmarking tools, webservers or even duplicate file finders (shameless plug: https://github.com/pkolaczk/fclones#benchmarks - there is one Java app there, see its memory use :D)

We’re talking at cross purposes here. I did say that I wasn’t expressing a view as to whether the overheads of manual memory management are typically larger than the overheads of GC. I don't know if they are or not.

The point I was making was just that you do need to compare empirically the typical overheads of each to make a meaningful comparison. The 6% figure in isolation doesn’t tell us very much.

As you point out, it is difficult to make these comparisons on the basis of anything other than anecdotal evidence, since it is rare for applications of significant size or complexity to be implemented in multiple languages.

According to the CppCon talk about Mesh [1] (an allocator that implements compaction for C++ programs), the overhead can be massive too (17% overhead measured on Firefox, 50% on Redis!).

[1] https://youtu.be/XRAP3lBivYM?t=1374

Nice link. I guess theoretically you can always optimise this in languages with manual management. With complex GC you have to figure out a way to tame the beast and I’m not sure if it’s easier to reason about

Looking at that, why isn't ZGC the standard GC in Java 17?

One of the graphs shows ZGC with almost 50% more overhead (in heap space) than G1, which could very well take some applications from "works fine" to "broken." So I can see why they'd leave it something to opt in to.

Latency and throughput are fundamentally opposite ends of the same axis. And perhaps the majority of Java applications prefer better throughput.

But the graph in the article shows that ZGC now gets better throughput than G1! [EDIT no it doesn't]

I suspect the answer is that ZGC is not considered mature enough, and has a higher memory overhead than G1.

No, that graph may be hard to understand. It shows the relative improvement each GC has made since Java 8, not the absolute performance.

Ah, i realised it was relative, but i had assumed they were all relative to the same baseline! Should have been obvious from the way the bars are all level for JDK 8, really. That seems a needlessly unhelpful graph.

They are essentially 3 graphs displayed together. It probably would have made sense to present them more separately.

The article does not compare the throughput of different GCs other than different improvements. The baseline for the G1 graph is JDK8 G1 performance and the baseline for the ZGC graph is JDK11 ZGC performance. There is really nothing that can be directly compared between the two.

There are also other GCs that are heavily competitive with ZGC, such as Shenandoah. It might make sense to see how they all pan out before choosing a new default.

Shenandoah isn't an Oracle maintained feature, so they'll never allow it to become the OpenJDK default (though another vendor could make it so in their distribution).

G1 is probably still a better choice as it "balances throughput and latency". ZGC also wasn't generational until recently.

I see that generational ZGC is under development, but have you been able to try it out? I don't see any EA releases for it.

I have not tried it. I just saw the commits (it was 5 months ago), so I had assumed it was merged.

Probably once ZGC gets the generation support (https://github.com/openjdk/zgc/tree/zgc_generational) which will reduce the memory footprint.

Absolute gc throughput and overhead.

Only stabilized in Java 15

What is not shown in those benchmarks is how much CPU is used by each GC. For example, the latency for ZGC is much lower, but does that mean it uses more CPU than G1?

I came here to say something similar. ZGC and Shenandoah run the GC in extra threads. I don't see any mention of the number of cores in the article, or in the linked SPECjbb2015 benchmark [1].

It would be interesting to see how much of the improvement is just down to the use of extra "unused" cores, and how much CPU is actually used by the GC. Equivalently, run a CPU-bound task on one core and measure GC and application performance, with an eye on how much the fancy GC slows down the program.

[1] https://www.spec.org/jbb2015/
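One rough way to start on that measurement from inside the JVM is the standard `java.lang.management` beans. This is a sketch, not what the article did: `GcProbe` and its `churn` workload are made up for illustration, and `getCollectionTime()` reports accumulated elapsed time, not CPU time burned by concurrent GC threads, so it understates the cost of collectors like ZGC and Shenandoah.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcProbe {
    // Total collections so far across all collectors; undefined (-1) values are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) total += count;
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds (elapsed, not CPU time).
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) total += millis;
        }
        return total;
    }

    // Hypothetical allocation-heavy workload; the checksum keeps the work observable.
    static long churn(int rounds) {
        long checksum = 0;
        for (int i = 0; i < rounds; i++) {
            byte[] chunk = new byte[1024];
            chunk[0] = (byte) (i % 100);
            checksum += chunk[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();
        long start = System.nanoTime();
        long checksum = churn(5_000_000);
        long wallMillis = (System.nanoTime() - start) / 1_000_000;

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("collector: " + gc.getName());
        }
        System.out.println("cores visible to the JVM: " + Runtime.getRuntime().availableProcessors());
        System.out.println("checksum: " + checksum);
        System.out.println("wall time (ms): " + wallMillis);
        System.out.println("collections: " + (totalCollections() - countBefore));
        System.out.println("collection time (ms): " + (totalCollectionMillis() - millisBefore));
    }
}
```

Running the same program with `-XX:+UseG1GC` and `-XX:+UseZGC`, and with the process restricted to one core, would give a first impression of how much the numbers depend on spare cores. A proper answer needs OS-level CPU accounting as well.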

CPU is almost never a real issue. When I worked with SQL Server we always had data compression enabled: a little more CPU, but much less IO to do.

Those latency and pause-time numbers are pretty sexy. 200ms is still probably too high for some applications, but for everything I work on, this is incredible.

Those are P99 latencies. Not sure how they measure, but my assumption is that, if you probe the JVM at a random point in time, ZGC (P99 = 0.1) will be unresponsive for 100us 1% of the time.

That’s… not incompatible with being useable in a low latency service, but 100us is 2.5 x longer than a sync disk write these days, and they don’t report max latency.

My take on the article is that GC has gotten quantifiably better since Java 8, but not qualitatively better:

The types of projects that had to abandon Java due to GC (and JIT) latency 10 (or 20) years ago still shouldn’t consider using Java.
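To make the P99-versus-max point above concrete, here is a small sketch with made-up numbers (not data from the article). `LatencyStats` is a hypothetical helper using the nearest-rank percentile definition; the article's benchmark may compute its percentiles differently.

```java
import java.util.Arrays;

final class LatencyStats {
    // Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
    static double percentile(double[] samples, int p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    static double max(double[] samples) {
        double result = samples[0];
        for (double s : samples) {
            result = Math.max(result, s);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up data: 99 pauses of 0.05 ms and a single 5 ms outlier.
        double[] pausesMs = new double[100];
        Arrays.fill(pausesMs, 0.05);
        pausesMs[42] = 5.0;

        System.out.println("P99 = " + percentile(pausesMs, 99) + " ms"); // prints "P99 = 0.05 ms"
        System.out.println("max = " + max(pausesMs) + " ms");            // prints "max = 5.0 ms"
    }
}
```

With this data the P99 looks excellent while the worst pause is 100 times longer, which is why a missing max figure matters for latency-critical services.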

I’m by no means an expert but even the OS’s scheduler can cause pauses on the range of 1ms — so the applications that can’t use Java, cannot really use ordinary OSs to begin with, can they?

Like, yeah, I would probably not write something like pipewire (linux’s new audio processing project) in Java, but other than that and like some really low latency trading niche (where FPGAs are dominant now as even general CPUs are slow), where would it preclude the usage of Java?

The Linux kernel has realtime and lowlatency versions that help a lot here.

The throughput & latency charts have JDK8 pegged at 100%. Since that's meaningless and the value is in the difference between the JDK versions, it should be represented as a delta.

Which does not matter because most of stuff still rides Java 8.

...and they're missing out, that's exactly the message. By the way, I'd be shocked if Java 8 was still more than half of the running server-side JVMs. There's a lot of them for sure, but I'd be willing to bet it's gonna be less than a half now, and declining.

They continue to use Java 8 as they see no reason to change. Articles like this are meant to motivate those laggards.

Used to work on an application that mostly did string manipulation and we couldn't upgrade it from 8 to 9 (or 11). Java 9 got compact strings (8 bits instead of 16 per character) for strings that only use characters from the Latin-1 charset, a subset of what UTF-16 can represent (English text qualifies). We would have gotten a free speed and memory improvement but the ops refused to support it, even though we did everything ourselves ):

So even if there's a reason to change, people won't do it.
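As a rough illustration of the compact strings rule mentioned above, the sketch below estimates the size of a string's character data under that rule. `CompactStringEstimate` is a hypothetical helper that mirrors the rule as described in the comment; it does not inspect the JVM's actual internal representation, and it ignores object headers.

```java
final class CompactStringEstimate {
    // True if every char fits in Latin-1, i.e. has a value of at most 0xFF.
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Character data only: 1 byte per char if Latin-1, otherwise 2 bytes per char.
    static long estimatedCharBytes(String s) {
        return fitsLatin1(s) ? s.length() : 2L * s.length();
    }

    public static void main(String[] args) {
        String english = "hello";
        String polish = "z\u0142oty"; // contains U+0142, which is outside Latin-1

        System.out.println(english + ": " + estimatedCharBytes(english) + " bytes");
        System.out.println(polish + ": " + estimatedCharBytes(polish) + " bytes");
    }
}
```

A single character outside Latin-1 switches the whole string to 2 bytes per char, so the saving depends heavily on the data an application handles.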

Might wanna save some bandwidth with that 1.25meg yet small-in-visual-size portfolio.png, yikes! XD
