gcj had a lot of problems beyond needing configuration of reflection metadata. It used a full reimplementation of the standard library, and it was never adopted by the wider Java community being largely just Red Hat's strategy for creating a fully open source Java implementation rather than something offering specific benefits to Java developers. In particular people thought it'd lead to faster code, but GCC was never designed for Java and the results were actually a fair bit slower iirc.
Native image is quite different. With this new release the compiled images can not only be faster than JIT compiled Java (wow) but also use way less memory and start instantly. At a stroke this is resolving one of the biggest complaints people have always had against JVM languages.
And as a consequence you're seeing adoption by the wider community. All the modern Java web frameworks support it now, and there's a metadata repository where it's collected for projects that haven't accepted it upstream yet [1].
Valhalla and the vector API for SIMD, yes. In theory, at that point, the only major performance difference between them would be GC vs manual memory management. Also: code size.
One interesting experiment that'd be worth trying is generating value types that wrap memory segments, as that's the new manual memory allocation API. You can do pretty sophisticated manual allocation with arenas and stuff, but you can't allocate Java objects into those segments. What you can do though, is create a "struct" using VarHandles that you can cast the segment to which then reads/writes through to the segment. This leads to the question of whether you can make a nice wrapper around such value types that yield C++ style manual memory allocation.
I'm not sure any of those are actually quite right.
Valhalla supports flattening, that's the whole point of it. To get that you need to have a type declared as a value type, and then put it into a non-null variable, field or parameter. At least that's how the current prototype works. If you do that then the compiler will fully inline the allocation into arrays, containing objects/structs, or the stack.
Valhalla started with monomorphization prototypes, although that went silent years ago. I don't know if it'll be delivered in the first version. But, it's an intended goal of the project to deliver.
As for the final point, could you evidence that? It's very hard to do direct comparisons here because they're usually compiling very different languages and Graal hasn't been optimized for C++. For example, Graal can compile many languages LLVM just doesn't even try to. Also, you'd really need to compare against the Oracle GraalVM (née GraalVM Enterprise) to see the best of what it can do. I'm curious what comparison you're thinking of when you say that, as Graal is an extremely advanced compiler by any measure.
There are JVM implementations, like Azul, that use LLVM for their AOT/JIT compilers, so the last point isn't really a plus for C++.
Java like C and C++, enjoys multiple implementations.
Regarding monomorphization, it is still open how much Valhala will go into that front.
Even if Java will never be as good as C++ in generics, there are several other issues that usually tend to have Java chosen for instead of C++, even if C++ wins in the microbenchmark games.
Likewise most options for running Java on the GPU aren't as feature rich as C++, and that is certainly an area where C++ will be dominating in decades to come.
Native image is quite different. With this new release the compiled images can not only be faster than JIT compiled Java (wow) but also use way less memory and start instantly. At a stroke this is resolving one of the biggest complaints people have always had against JVM languages.
And as a consequence you're seeing adoption by the wider community. All the modern Java web frameworks support it now, and there's a metadata repository where it's collected for projects that haven't accepted it upstream yet [1].
[1] https://github.com/oracle/graalvm-reachability-metadata