To me, it feels like there is a very thick wall between high-level languages and languages with raw data access like C, C++, and D. You either throw out every convenience feature completely, or go all in on them.
In C, a lot of data access turns into a single-digit number of load/store or register access instructions. It is easy to see that it is close to impossible to add fancy data access functionality on top of that without going from single cycles to kilocycles.
I was once told: "When you try improving a programming language's performance, it eventually turns into C."
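To make the "single-digit number of instructions" point concrete, here is a minimal sketch. The `particle` struct and both function names are made up for illustration. With optimization enabled, a typical compiler turns each accessor into roughly one load, because the address is just the base pointer plus a constant or scaled offset; exact output depends on compiler and target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type, used only for illustration. */
struct particle {
    float x, y, z;
    uint32_t id;
};

/* Field read through a pointer: base address plus a constant offset.
   An optimizing compiler typically emits a single load for this. */
uint32_t particle_id(const struct particle *p) {
    return p->id;
}

/* Indexed read: base + i * sizeof(struct particle) + offset of id.
   No bounds check, no type tag, no hash lookup. */
uint32_t nth_id(const struct particle *ps, size_t i) {
    return ps[i].id;
}
```

Anything "fancier" (bounds checks, dynamic field lookup by name, boxing) has to be paid for on top of that one load, which is where the wall comes from.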
P.S. on JIT: it is not a given that a JIT-compiled language will automatically be faster than a well-written interpreter on a modern CPU. One of the early tricks for making fast interpreters was to keep as much of the interpreter in cache and as much data in registers as possible, to benefit from the more or less linear execution flow of unoptimised code compared with the unpredictable flow of JIT-generated code. Today, with 16MB caches, I think the benefit will be even bigger.
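For a sense of how small such an interpreter core can be, here is a rough sketch of a switch-dispatched stack machine. The opcode names and the `run` function are invented for this example, and it assumes well-formed bytecode (no stack overflow or underflow checks). The whole loop compiles to a small amount of machine code, which is the property the cache argument relies on; it says nothing by itself about how it compares with a JIT.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a toy stack machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Runs bytecode until OP_HALT and returns the value on top of the stack.
   Assumes well-formed code: the stack never overflows or underflows.
   Returns 0 on an unknown opcode. */
int64_t run(const uint8_t *code) {
    int64_t stack[64];
    size_t sp = 0;                 /* number of values currently on the stack */
    const uint8_t *pc = code;      /* program counter */

    for (;;) {
        switch (*pc++) {
        case OP_PUSH:              /* operand: one signed byte following the opcode */
            stack[sp++] = (int8_t)*pc++;
            break;
        case OP_ADD:               /* pop two values, push their sum */
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:               /* pop two values, push their product */
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            return stack[sp - 1];
        default:
            return 0;
        }
    }
}
```

For example, the bytecode `PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT` evaluates (2 + 3) * 4.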
Which is kind of ironic given how bad the code generated by C compilers was during the mid-80s, compared with other mainframe languages.
I actually think Java will one day evolve very close to that goal.
I doubt a competent JIT is ever slower than a competent interpreter, but it may not be that much faster, or worth the effort.
It depends on the size of the primitives. An interpreted array language could be close to 1:1 with a JIT, while with CPU-level instructions as primitives you will struggle to reach 1/6 of JITted performance.
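A rough way to picture the "size of the primitives" argument: interpreter dispatch is a roughly fixed cost per primitive, so the more work one primitive does, the less that cost matters. The sketch below uses made-up opcodes and function names and only counts dispatches; it does not measure anything, and the real ratios depend on the workload.

```c
#include <stddef.h>

/* Hypothetical opcodes: one scalar-sized primitive, one array-sized. */
enum { OP_ADD_SCALAR, OP_ADD_ARRAY };

/* Executes one primitive. Returns how many element additions
   this single dispatch performed. */
size_t exec(int op, double *dst, const double *src, size_t n) {
    switch (op) {
    case OP_ADD_SCALAR:            /* one addition per dispatch */
        dst[0] += src[0];
        return 1;
    case OP_ADD_ARRAY:             /* n additions per dispatch */
        for (size_t i = 0; i < n; i++)
            dst[i] += src[i];
        return n;
    }
    return 0;
}

/* CPU-level style: n additions cost n dispatches. Returns the dispatch count. */
size_t add_with_scalar_ops(double *dst, const double *src, size_t n) {
    size_t dispatches = 0;
    for (size_t i = 0; i < n; i++) {
        exec(OP_ADD_SCALAR, dst + i, src + i, 1);
        dispatches++;
    }
    return dispatches;
}

/* Array-language style: n additions cost one dispatch. Returns the dispatch count. */
size_t add_with_array_op(double *dst, const double *src, size_t n) {
    exec(OP_ADD_ARRAY, dst, src, n);
    return 1;
}
```

Both functions produce the same result, but the first pays the dispatch overhead n times and the second pays it once, with the rest of the time spent in a plain compiled loop.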