
Shame this doesn't show the CMS collector, which was present in 8 and 11, but removed before 17. CMS was the go-to option for low-latency collection for a long time, so it would be good to see how it compares to the modern options.

It would be particularly interesting from the perspective of someone working in a shop which still has lots of latency-sensitive-ish workloads running on JDK 8 with CMS!

Here is the comparison I did with Cassandra workloads, comparing CMS with ZGC: https://jaxenter.com/apache-cassandra-java-174575.html

(TLDR, ZGC is a huge improvement.)

I'm curious what your applications' allocation patterns are like.

I've worked on a couple of projects where switching from CMS to G1 was a pretty big latency win. Most of them were strongly request-response-based. Pretty quickly, G1 would converge on having most of the regions be young, and by the time it wanted to do a mixed collection, most of them would have no live objects and would be summarily killed.

Also, it would be interesting to see this progression from at least 1.4. By the time 1.8 was released, I had the impression that GC was already pretty well optimized for throughput.

Although if your ulterior motive is to persuade people to upgrade beyond 8 and 11, you probably don't want to suggest that performance has basically plateaued.

I don't see how this is bad.

The JVM is pretty well optimized, and it is much closer to raw C performance than most other popular languages. You could just as well say that C is bad because its performance plateaued a long time ago.

C maps more closely to assembly/hardware than most other languages. Saying C has plateaued gets pretty close to saying it’s hardware performance that’s plateaued.

If C mapped to "hardware" so well then OpenCL, CUDA C/C++, SYCL, ispc, etc wouldn't be necessary. The rising importance of accelerators is a big issue for the future of C.

Those languages map to GPUs; C maps to CPUs. They are on the same level as C: they aren't the real instructions GPUs run, but they are a pretty good abstraction over GPU instructions.

ispc is explicitly designed to take advantage of SIMD on CPUs and GPUs, its existence is directly related to the shortcomings of C in this area. Likewise, SYCL exists to target accelerators because C isn't even close to supporting heterogeneous hardware or programming. In any case, C does not map well to a massive amount of hardware running in production right now.

A CPU can run any program written for a GPU, yes. But the languages you mention are no closer to how a CPU works than C is. They might be more ergonomic if you want to take advantage of SIMD instructions, but they don't do anything you can't do in C, and there are a lot of things you can't do in them, since GPUs are much more limited than CPUs.

C doesn't map any more closely to assembly/hardware than most other low-level languages. If anything, hardware tries to conform to C programmers as much as it can.

Hell, now Java has much better SIMD support than C, even as a high-level language.

Not sure what you mean; SIMD instructions map perfectly well to C. You just call them (via intrinsics) like you call anything else. What C doesn't control well is the exact CPU memory load order, how the CPU caches things, etc. But no language can do that, since you can't even control it in the machine code sent to the CPU. Most things you can do in machine code can also be done in C, and the things that can't can be done in inline assembly.

I suppose what they referred to is Java's (currently incubating) vector computation API. It lets you express vectorized algorithms in a high-level way, with the API's methods being translated to the corresponding SIMD instructions of the underlying platform. I.e., you'll get vectorized execution on x86 and AArch64 in a portable way, including transparent fallback to scalar execution if specific operations aren't supported on a given target platform.

Right, but that would still mean C is closer to the hardware than Java. Java has a high-level but less powerful solution, since you can only use it on vectors, not on arbitrary data anywhere. You can write a similar function in C that compiles differently depending on the target and falls back in the same way; C just also gives you the option of using the hardware-dependent instructions anywhere you want.

I'm not sure I understand you: in C there is no standard way to do SIMD, afaik. There are pragmas on for loops and other compiler-specific tools, but the language itself doesn't have any notion of lanes or SIMD instructions.

> but the language itself doesn't have any notion of lanes or SIMD instructions.

C doesn't need it: you can call CPU instructions like functions through intrinsics. SIMD is just another kind of CPU instruction, so C supports it. That works in C because you have a direct view of the memory layout. It doesn't work in higher-level languages where memory is abstracted away from you; there you need the higher-level concepts you're talking about in order to take advantage of SIMD.

These instructions are ISA-specific, though: in C you'd have to implement your solution once using x86 SIMD instructions and once using their AArch64 counterparts. You'd also have to account for different vector lengths. Whereas the Java API does all that for you automatically, e.g. automatically taking advantage of longer vectors when running on AVX-512 and shorter ones elsewhere.

I think that's what people consider "better" about the Java approach (at least I do). That's of course not to say that you cannot do all this in C as well, but having these capabilities available in Java in that portable way makes SIMD usable, for the first time, by a huge audience that didn't consider it a realistic option before.
