
Proposal: C++ Should Support Just-in-Time Compilation - kbumsik
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1609r1.html
======
m0zg
This could be pretty cool. Right now you have to drop down to non-portable,
assembly-style JIT like Xbyak
([https://github.com/herumi/xbyak](https://github.com/herumi/xbyak)) or VIXL
on ARM
([https://community.arm.com/developer/ip-products/processors/b...](https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/announcing-vixl-a-dynamic-code-generation-toolkit-for-armv8)).
It might not be obvious why you'd want something like that, so I'll explain.
Basically, think of this as templates that you don't have to pre-instantiate.
You can specify exactly how long your loops are, so the compiler can do a much
better job of vectorizing, eliminating branches, etc. In tight
math-kernel-style code this could easily improve perf by 2x or more. Combined
with intrinsics, this could eliminate the need for assembly in many cases.
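
A minimal sketch of the "how long your loops are" point, in plain C++ with no JIT involved. The names `dot_fixed` and `dot_dynamic` are made up for illustration and do not come from the proposal. With today's templates the loop bound has to be a compile-time constant, so every size you want specialized must be instantiated ahead of time:

```cpp
#include <cstddef>

// The loop bound N is a template parameter, i.e. a compile-time constant,
// so the optimizer can fully unroll or vectorize the loop for that size.
template <std::size_t N>
float dot_fixed(const float* a, const float* b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < N; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// The same kernel with the bound known only at run time. The compiler has
// to emit generic code that works for any n.
float dot_dynamic(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Calling `dot_fixed<4>(a, b)` only compiles because `4` is known when the program is built. The idea described above is to let that argument be a run-time value and have the specialization generated on demand.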

This would be exciting if it came to fruition at some point. Unfortunately,
the C++ standard being what it is, we're looking at something like 2025 before
we see the first standard-compliant implementation. Perhaps other languages
could take this idea and run with it.

~~~
jcelerier
> This could be pretty cool. Right now you have to drop down to non-portable,

Do you? I've been doing JIT in C++ with the Clang API for quite some time
already. Sure, it's heavy in terms of "compile time" and binary size, since
you have to ship Clang and LLVM, but it enables some pretty cool stuff.

~~~
m0zg
Xbyak works with both Clang and GCC, though, and presumably with other
compilers as well. As far as I can tell, Clang-based JIT only works with
Clang. Clang, unfortunately, usually produces measurably slower non-JIT code.

------
ece
There is Cling: [https://root.cern.ch/cling](https://root.cern.ch/cling)

You can also use Cling as a Jupyter C++ kernel.

------
rurban
Pragmas are now in [[ ]] brackets, not normal #pragma anymore? That doesn't
look backward compatible to me.

~~~
brianush1
You're confusing backward compatibility with forward compatibility. Old code
still works, since old code doesn't use [[ ]].

------
okigan
As per the example, the templates are in the compute-intensive part of the
app. Pull that out into a library, compile it once, and you're done; there's
no need for a JIT in the standard or the runtime (did they mention the +75MB!
runtime?)

~~~
DannyBee
I think you don't understand the goals Hal has, which he lays out pretty
explicitly in the paper
([https://arxiv.org/pdf/1904.08555.pdf](https://arxiv.org/pdf/1904.08555.pdf)):

1\. Be able to replace very complex template dispatch mechanisms with simple
JIT ones.

Your suggestion is not possible, because you can't compile it once (maybe you
can compile it a ton of times though!)

That is, unless someone has implemented a very, very ridiculous set of dynamic
detection and dispatch mechanisms, _and_ either your compiler supports
changing microarchitecture (etc.) optimization settings per function, or you
split it out/ifdef it and compile the same file 5000 ways.

Even then, you will likely miss something the JIT will not, and you will be
unable to support new architectures/microarchitectures without recompiling.

2\. Be able to JIT things without changes to application source code.

3\. Be able to easily integrate things like CUDA into the JIT model.
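
To make goal 1 concrete, here is a rough sketch of the kind of hand-written template dispatch being referred to. It is not code from the paper; `sum_fixed`, `sum_generic`, and `sum_dispatch` are hypothetical names, and the set of sizes is arbitrary:

```cpp
#include <cstddef>

// Specialized kernel: N is fixed at compile time.
template <std::size_t N>
float sum_fixed(const float* x) {
    float s = 0.0f;
    for (std::size_t i = 0; i < N; ++i) {
        s += x[i];
    }
    return s;
}

// Generic fallback for sizes that were not pre-instantiated.
float sum_generic(const float* x, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        s += x[i];
    }
    return s;
}

// Hand-written dispatch: every specialization has to be listed here and
// compiled ahead of time. Anything not listed takes the generic path.
float sum_dispatch(const float* x, std::size_t n) {
    switch (n) {
        case 4:  return sum_fixed<4>(x);
        case 8:  return sum_fixed<8>(x);
        case 16: return sum_fixed<16>(x);
        default: return sum_generic(x, n);
    }
}
```

The table only covers the cases someone thought of in advance. Multiply it by element types, layouts, and target microarchitectures and you get the "compile the same file 5000 ways" situation described above.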

(FWIW: Hal is a very long time and well respected LLVM contributor, so he
generally wouldn't suggest something trivially stupid :P)

