The point is that compiler optimisations are a black box and not guaranteed. They can be very brittle with respect to seemingly harmless source changes (even something as simple as introducing an extra intermediate assignment). You are at the mercy of the 'fuel' of the optimisation passes. With staging you get to control exactly what gets inlined/partially evaluated. Of course, to get good results you need to know what to optimise for.
> With staging you get to control exactly what gets inlined/partially evaluated.
I want to stress that this is not true. Sure, sometimes it might work, but compilers can also uninline, as well as reorder the way things are evaluated. Compilers don't do a 1:1 mapping of lines of code to assembly instructions anymore; instead they are designed to take your program as input and generate the best executable that has the same observable effect as your code. So whatever optimisation you perform in the source code is going to be very brittle as well, with respect to seemingly harmless compiler changes (changing compiler flags, updating to a new compiler version, and so on).
While indeed nothing is guaranteed, at this point in time the compiler is vastly better at optimizing code than humans are. If you want to make a point that multi-stage programming helps optimize code, you have to do much better than an example of raising x to some power.
I think you are missing the point a bit. With staging you can build up arbitrary levels of compile-time abstractions and be sure that they will not appear in the final executable. Of course, an optimising compiler will reorder/rearrange code regardless. But it won't reintroduce the abstraction layers that have been staged away. Stack up enough abstraction layers and, without staging, even an aggressively optimising compiler won't know to evaluate them away.
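To make the power example from upthread concrete, here is a minimal sketch of the idea in Python, simulating staging by generating specialised source at "compile time". This is not how any particular multi-stage language spells it (MetaOCaml uses brackets and escapes); the function names are made up for illustration. The point is that the recursion is an abstraction that exists only at generation time and leaves no trace in the code that actually runs:

```python
# Hypothetical sketch of staging via source generation: specialise
# x**n for a fixed n. The recursive helper is "staged away" -- it
# runs once at generation time and emits a flat product expression.

def gen_power(n):
    def body(k):
        # Recursion happens here, at generation time only.
        if k == 0:
            return "1"
        return f"x * {body(k - 1)}"

    src = f"def power{n}(x):\n    return {body(n)}"
    namespace = {}
    exec(src, namespace)  # run the generated stage
    return namespace[f"power{n}"], src

power5, src = gen_power(5)
print(src)        # return x * x * x * x * x * 1  -- no recursion left
print(power5(2))  # 32
```

A real optimising compiler might or might not unroll the recursive version for you; the staged version guarantees it by construction.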
Let's put it another way: do you think there is utility in macros at all? And do you think that type-safe code is better than untyped code? If you say yes to both, you must also think that staging is useful, since it essentially gives you type-safe macros. Now lots more things can be macros instead of runtime functions, and you don't need to deal with the ergonomic issues that macros have in other languages. For a more real-world example, see Jeremy Yallop's work on fused lexing and parsing.
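In the same hypothetical Python sketch as above (the combinator names are invented for illustration, not from any real library), here is the "layers of abstraction staged away" point: each combinator contributes an expression fragment at generation time, and the composition itself, the list of stages and the fold over it, never exists at runtime:

```python
# Hypothetical sketch: macro-like staged combinators. The composed
# pipeline is fused into one flat expression at generation time.

def add(c):
    return lambda x: f"({x} + {c})"

def double():
    return lambda x: f"({x} * 2)"

def compose(*stages):
    # The fold over stages runs once, here; the generated lambda
    # contains no function calls, just the fused expression.
    expr = "x"
    for s in stages:
        expr = s(expr)
    src = f"lambda x: {expr}"
    return eval(src), src

f, src = compose(add(1), double(), add(3))
print(src)   # lambda x: (((x + 1) * 2) + 3)
print(f(4))  # 13
```

This is the shape of Yallop-style fusion in miniature: the runtime code pays for none of the combinator layers used to build it.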
Incredible. One would expect this to be theoretically possible, since TypeScript's type system is Turing complete, but it is quite another thing to see it done in practice. Wow!
This is just plain wrong. There's nothing you can do in C/C++ (or C+, as you put it) that you can't do in Rust when it comes to parallelism. There's always unsafe if you really need it.