I assume the performance profile is similar to compiling regular Python code with Cython (which it can do). There is a decent but not world-changing speed-up. This makes sense, because pretty much the only thing you are removing from the equation is the interpreter's instruction-dispatch and stack overhead. You still have to perform all the equivalent work that CPython does; otherwise you would end up with different semantics. And this work is substantial.
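To make "this work is substantial" a bit more concrete, here is a rough sketch of what a single `a + b` has to do to keep Python semantics. `dynamic_add` is a hypothetical name of mine, and the logic is deliberately simplified (the real rules have more cases, such as subclass priority for reflected methods), so treat it as an illustration rather than what CPython literally executes.

```python
def dynamic_add(a, b):
    """Simplified sketch of the dispatch behind `a + b` (not the full rules)."""
    # Look up __add__ on the type of the left operand, not on the instance.
    add = getattr(type(a), "__add__", None)
    if add is not None:
        result = add(a, b)
        if result is not NotImplemented:
            return result

    # Fall back to the reflected method on the right operand, but only
    # when the operand types differ.
    radd = getattr(type(b), "__radd__", None)
    if radd is not None and type(a) is not type(b):
        result = radd(b, a)
        if result is not NotImplemented:
            return result

    raise TypeError(
        f"unsupported operand type(s) for +: "
        f"{type(a).__name__!r} and {type(b).__name__!r}"
    )


print(dynamic_add(1, 2.5))      # int.__add__ declines, float.__radd__ handles it
print(dynamic_add("a", "b"))    # str.__add__ handles it directly
```

An ahead-of-time compiler that knows nothing about the operand types still has to emit the equivalent of all these lookups and checks for every `+`.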
In contrast, a tracing JIT can dynamically elide most of this work without changing the semantics.