It is hard to draw a final conclusion. This is more of a research exercise to learn from. All we can say is that, as the code stands right now, Java with JIT is faster than C++ compiled ahead of time with clang for this simple map implementation. All the code, compilation scripts, execution scripts, and details are available on our GitHub so that other people can run and play with these benchmarks in their own environment and draw their own conclusions.
It's not about a "final conclusion", just about a summary of the measurement results at hand. Instead of making readers trawl through a lot of documents and figures and calculate the geomeans and factors themselves, the author who publishes the measurement results should do this. Here is an example of how this could look: https://github.com/rochus-keller/Oberon/blob/master/testcase.... It is immediately recognizable how much less time on average the C++ implementation used compared to the reference (LuaJIT in this case).
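To make the kind of summary I mean concrete, here is a minimal sketch in Python. The benchmark names and timings are made up purely for illustration, and `geomean` and `summarize` are hypothetical helper names, not taken from any of the linked repositories. It computes a per-benchmark factor (candidate time divided by reference time) and then the geometric mean of those factors.

```python
import math


def geomean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    if not values:
        raise ValueError("need at least one value")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def summarize(reference_times, candidate_times):
    """Return per-benchmark factors (candidate / reference) and their geomean."""
    factors = {
        name: candidate_times[name] / reference_times[name]
        for name in reference_times
    }
    return factors, geomean(list(factors.values()))


# Hypothetical timings in seconds, invented for this example only.
reference = {"bench_a": 2.0, "bench_b": 4.0, "bench_c": 1.0}
candidate = {"bench_a": 1.0, "bench_b": 1.0, "bench_c": 1.0}

factors, overall = summarize(reference, candidate)
for name, factor in factors.items():
    print(f"{name}: {factor:.2f}x of reference time")
print(f"geomean: {overall:.2f}x of reference time")
```

A factor below 1 means the candidate took less time than the reference. The geometric mean is the usual choice for averaging such ratios because it gives the same ranking no matter which implementation is picked as the reference.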
I agree it would be interesting and a fun project! Correct me if I'm wrong, but the problem with PGO is that every time the code changes, the profile needs to be updated as well. Also, my past experience with PGO (for the GraalVM native-image) is that it improves things a bit, but not much. Maybe 50% at best. But again, the only way to know for sure is to do it and measure it. That's something we should definitely dig deeper into, to see what kind of difference it can make in practice.