Java (JIT) vs C++ (AOT) for a simple hash table (github.com/coralblocks)
4 points by joas_coder 7 days ago | 8 comments
The goal of this research is to explore the performance differences between JIT (just-in-time compilation) and AOT (ahead-of-time compilation) strategies and to understand their respective advantages and disadvantages. The intent is not to claim that one language is slower or worse than the other.

In our tests, we observed better results with the HotSpot JVM 23 using JIT compilation. We got slower results with C++ (compiled with Clang 18), GraalVM 23 (compiled with native-image), and HotSpot JVM 23 with the -Xcomp flag. We are seeking to understand why this happens, and if possible, identify ways to improve the performance of the C++ version to match the results of Java's JIT compilation.

Our benchmark involves comparing a hash table (map) implementation in Java to an equivalent implementation in C++. We made every effort to ensure consistency between the two implementations, but it’s possible that some nuances were overlooked.

The hash table implementation itself is simple, and we aimed to make the Java and C++ code as equivalent as possible, including how memory is managed. For instance, we ensured that the C++ hash table values are stored by reference, not by value, to avoid unnecessary copying.
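
For illustration only, here is a minimal sketch (not the actual CoralBench code; the class and member names are made up) of what "values stored by reference" can look like in a chained hash map in C++. Each bucket entry keeps a pointer to the caller's value object, so put/get never copy the value itself:

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    template <typename V>
    class IntMap {
        struct Entry {
            int64_t key;
            V*      value;   // stored "by reference" (pointer), not by value
            Entry*  next;    // linear chain within the bucket
        };
        std::vector<Entry*> buckets_;

    public:
        explicit IntMap(std::size_t capacity) : buckets_(capacity, nullptr) {}

        ~IntMap() {
            for (Entry* e : buckets_)
                while (e) { Entry* n = e->next; delete e; e = n; }
        }

        // Inserts or replaces; returns the previous value pointer, if any.
        V* put(int64_t key, V* value) {
            Entry*& head = buckets_[static_cast<std::size_t>(key) % buckets_.size()];
            for (Entry* e = head; e != nullptr; e = e->next) {
                if (e->key == key) { V* old = e->value; e->value = value; return old; }
            }
            head = new Entry{key, value, head}; // prepend to the bucket chain
            return nullptr;
        }

        // Returns the stored pointer, or nullptr if the key is absent.
        V* get(int64_t key) const {
            for (Entry* e = buckets_[static_cast<std::size_t>(key) % buckets_.size()]; e; e = e->next)
                if (e->key == key) return e->value;
            return nullptr;
        }
    };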

The benchmark creates a hash table with 5,000,000 buckets and inserts 10,000,000 objects, minimizing collisions. The linear search in each bucket is kept to a maximum of two iterations to avoid discrepancies in measurements due to varying collision behavior across different elements.
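
As a rough sketch of that shape (assuming sequential keys, which is only our illustration and not necessarily the benchmark's actual key scheme): with keys 0..9,999,999 and 5,000,000 buckets, key % 5,000,000 places exactly two keys in every bucket, so each lookup scans at most two entries. Building on the hypothetical IntMap sketch above:

    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    int main() {
        constexpr std::size_t kBuckets = 5000000;
        constexpr std::size_t kEntries = 10000000;

        IntMap<int64_t> map(kBuckets);           // sketch map from above
        std::vector<int64_t> values(kEntries);   // values live outside the map

        for (std::size_t i = 0; i < kEntries; ++i) {
            values[i] = static_cast<int64_t>(i);
            map.put(static_cast<int64_t>(i), &values[i]);
        }

        long long misses = 0;
        for (std::size_t i = 0; i < kEntries; ++i) {
            int64_t* v = map.get(static_cast<int64_t>(i));
            if (v == nullptr || *v != static_cast<int64_t>(i)) ++misses;
        }
        std::printf("misses: %lld\n", misses);
        return 0;
    }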

The Bench class, which handles the measurements, should also be equivalent in both implementations.
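
For reference, here is a minimal sketch of the kind of timing such a class might do (the method names are hypothetical; the actual Bench class in the repo may measure and report more than this):

    #include <chrono>
    #include <cstdint>
    #include <cstdio>

    class Bench {
        std::chrono::steady_clock::time_point start_;
        int64_t totalNanos_ = 0;
        int64_t count_ = 0;

    public:
        void mark() { start_ = std::chrono::steady_clock::now(); }

        void measure() {
            auto end = std::chrono::steady_clock::now();
            totalNanos_ +=
                std::chrono::duration_cast<std::chrono::nanoseconds>(end - start_).count();
            ++count_;
        }

        void printResults() const {
            double avg = count_ ? static_cast<double>(totalNanos_) / count_ : 0.0;
            std::printf("iterations: %lld | avg: %.2f ns per op\n",
                        static_cast<long long>(count_), avg);
        }
    };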

Given these details, does anyone have insights into why the C++ version of the benchmark was slower than the Java version? Could there be something we overlooked or an aspect of the C++ implementation that could be optimized further? Perhaps specific Clang optimization options we should explore?

The link to the project is here => https://github.com/coralblocks/CoralBench?tab=readme-ov-file...

But what were the results?

You can find all the results and code in our GitHub repository => https://github.com/coralblocks/CoralBench?tab=readme-ov-file...

Is there a summary somewhere, i.e. what are the overall conclusions from the measurement results?

It is hard to draw a final conclusion; this is more of a research exercise to learn from. All we can say is that, as the code stands right now, Java JIT is faster than C++ Clang AOT for this simple map implementation. All the code, compilation scripts, execution scripts, details, etc. are available in our GitHub repository so that other people can run and play with these benchmarks in their own environment and draw their own conclusions.

There is also an ongoing discussion on SO => https://stackoverflow.com/questions/79268109/c-implementatio...

The GitHub repository for the project is at https://www.github.com/coralblocks/CoralBench


It's not about a "final conclusion", just about a summary of the measurement results at hand. Instead of making the user trawl through a lot of documents and figures and compute geomeans and factors themselves, the author who publishes measurement results should do this. Here is an example of how this could look: https://github.com/rochus-keller/Oberon/blob/master/testcase.... It is immediately recognizable how much less time, on average, the C++ implementation used compared to the reference (LuaJIT in this case).

I see your point. Sorry about that. The example you gave looks very cool; we'll try to do something like that. For now, you can see the results here: https://gist.github.com/coralblocks/21523d73f460924874685f11...

Or on GitHub: https://github.com/coralblocks/CoralBench?tab=readme-ov-file...


It would be interesting to compile the C++ version with PGO (it should catch up with the JIT).

I agree it would be interesting and a fun project! Correct me if I'm wrong, but the problem with PGO is that every time the code changes, the profile needs to be regenerated as well. Also, in my past experience with PGO (for the GraalVM native-image), it improves things a bit, but not by much, maybe 50% at best. But again, the only way to know for sure is to do it and measure it. That's something we should definitely dig into to see what kind of difference it can make in practice.


