> RPython makes you insert these hints to the trace compiler (can_enter_jit, jit_merge_point) about when to start/stop running a trace, does this buy you anything?
As I understand it, those are hints as to domain-specific optimizations that can be made by the trace compiler. The author mentioned that a lot of the low-hanging fruit in the converge vm was down to correctly hinting to the trace compiler what assumptions it could make.
"The second tactic is to tell the JIT when it doesn't need to include a calculation in a trace at all. The basic idea here is that when creating a trace, we often know that certain pieces of information are fairly unlikely to change in that context. We can then tell the JIT that these are constant for that trace: it will insert an appropriate guard to ensure that is true, often allowing subsequent calculations to be optimised away. The use of the word constant here can mislead: it's not a static constant in the sense that it's fixed at compile-time. Rather, it is a dynamic value that, in the context of a particular trace, is unlikely to change. Promoting values and eliding functions are the main tools in this context: Carl Friedrich Bolz described examples in a series of blog posts. The new Converge VM, for examples, uses maps (which date back to Self), much as outlined by Carl Friedrich."
I'm sure that you could provide such an interface to LuaJIT but it wouldn't be as trivial as just writing the interpreter in Lua.
I don't think that can_enter_jit and jit_merge_point are hints for optimization. I'm not particularly familiar with JIT compilation, or RPython, but it looks like RPython provides other methods for doing trace optimization.
For example, there is a decorator `@purefunction` which hints to the translator that a function will always return the same output given the same input, even if it is operating on a data structure that could be considered "opaque".
I'm not familiar either, I'm just going off the article.
"One thing the trace optimiser knows is that because program_counter is passed to jit_merge_point as a way of identifying the current position in the user's end program, any calculations based on it must be constant. These are thus easily optimised away, leaving the trace looking as in Figure 3."
I think you're right that jit_merge_point affects performance. My point was that this isn't an optimization of the JIT compilation, so much as it is a requirement for a JIT compiler to be able to operate. The compiler has to know what defines an execution frame. Once you've done that, there are definitely other optimizations that go in to reducing the opacity of your interpreter to the JIT compiler.
So, you're probably right that a fair amount of optimization can come from doing a good job of defining what identifies an execution frame. I'm curious though, if these are the sorts of "low hanging fruit" the author was referring to, or if they're included in the straightforward port of the original C interpreter.
From the section on "Optimizing an RPython JIT" it seems that the he's doing a lot more than just defining his execution frame.
> The first tactic is to remove as many instances of arbitrarily resizable lists as possible. The JIT can never be sure when appending an item to such a list might require a resize, and is thus forced to add (opaque) calls to internal list operations to deal with this possibility.
I guess my orginal point was that RPython has lots of hooks (eg the decorators that you mentioned) which allow the JIT to effectively trace the interpreted program. You would probably have to extend LuaJIT with similar hooks in order to use it in the same way. As tomp pointed out, the interpreter loop itself is not a good candidate for tracing. The JIT needs to at least know about the interpreters program counter to be able to identify loops in the interpreted program. You're right that referring to these as optimizations is inaccurate.
jit_merge_point isn't an optimization. It tells the PyPy JIT how to separate the parts of the trace that are part of your interpreter loop and the parts of the trace that are implementation of each bytecode.