But agreed that the amount of perf JS engines can achieve is truly impressive.
Also, inlining across virtual method calls is just old hat. See the post’s related work section to learn some of the history. Most of the post is about techniques that are more involved than inlining and devirtualization.
Additionally, that functionality has been extended over time with the invokedynamic bytecode and its optimizations.
That’s just one snarky example. There are lots of others. I’m sure you could identify them easily, if you are familiar with HotSpot and you read the post.
Like, really, one non-snarky example is all I'm looking for. As someone who knows the internals of HotSpot fairly well and has read the post.
HotSpot has three tiers in the sense that you get interpreter, client, then server.
Both VMs have additional execution engines available behind flags. JSC has a different interpreter (CLoop, based on LLInt but different) for some systems, so it’s like a “fifth tier” if you think of “tier” as just an available execution engine. I think of tier as a stage of optimization that you get adaptively, without having to specifically configure for it. I get that HotSpot can alternatively AOT or Graal, but those aren’t tiered with the rest to my knowledge (like there is no interpreter->client->server->graal config but if there was then that would be four tiers).
I'll give you that levels 1-3 use the same IR, but that says more about the generality of the C1 IR than JSC being more advanced for using different IRs IMO.
Again, not the same as what JSC does, and not nearly as aggressive. Most notably, there is no baseline jit. Also, C1 in any config compiles slower than DFG.
That really colors the conversation in a different way retroactively that I don't really appreciate.
It now looks like I'm berating you out of nowhere, when really you originally made an assertion and I'm just trying to get you to back it up.
Maybe it’s more advanced at collecting garbage or supporting threads, but it is not comparable in the field of type inference, because Azul’s VM and other JVMs do not do any of the kind of type inference described in this post. And no, invokedynamic is nothing like inline caching for JS - not even close. Most of this post is about what type inference for dynamic languages really looks like when you invest HotSpot-like efforts to that specific problem. Saying that Azul is more advanced at type inference is far off from the truth at a very fundamental level.
So I gave you snark. Being snarky is fun sometimes!
The “number of tiers” difference doesn’t seem quite substantive enough, I would say, although I’m definitely no expert.
It’s not the purpose of this post to enumerate differences to HotSpot, but it does show some cases where the two approaches are alike.
All of these things are extremely typical for a JS VM to profile and speculate on.
I do not work on JSC, but I did work on V8 and it does all of these things.
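For readers following along, here's a toy illustration (not any engine's actual internals) of the kind of property-access speculation being discussed: when every object reaching an access site has the same layout ("shape" or "hidden class"), the VM can cache a single field offset; objects built with differently ordered fields defeat that cache.

```javascript
// Toy illustration, not engine code: objects made by the same
// constructor share one shape, so `getX` can stay monomorphic...
function Point(x, y) { this.x = x; this.y = y; }
function getX(p) { return p.x; } // one cached shape suffices here

// ...while literals with differently ordered fields get distinct
// shapes, so the same access site would need a polymorphic cache.
const a = { x: 1, y: 2 };
const b = { y: 2, x: 1 }; // same fields, different creation order/shape

getX(new Point(3, 4)); // 3
getX(a);               // 1 — still works, but the cache goes polymorphic
```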
Those are just modeled as invokedynamic getters and setters for dynamically laid out objects. It's handled great.
> HotSpot doesn't need to infer whether (and where) methods lie in the prototype chain of an object.
How is that different than vtable lookups?
> HotSpot does not need to speculate that arithmetic fits in integer range.
Here's where they added that to the JVM. https://bugs.openjdk.java.net/browse/JDK-8042946
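To make the speculation concrete: JS numbers are doubles, but engines would rather use int32 machine arithmetic where possible, so they guess int32 and bail out on overflow. A minimal sketch of the boundary where that guess fails:

```javascript
// Sketch of why the speculation exists: JS semantics always give you a
// double, but an engine would like to run this on int32 hardware math.
function inc(n) { return n + 1; }

const fits = inc(2147483646); // 2147483647, still in int32 range
const over = inc(2147483647); // 2147483648 — no longer fits in int32,
                              // so speculated int32 code has to bail
                              // out to double arithmetic here
```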
> HotSpot does not need to speculate that fields are (not) deleted from objects.
Once again, handled via invokedynamic getters and setters.
> HotSpot does not need to do scope analysis and closure conversion. HotSpot does not need to speculate that the arguments object does not escape.
It absolutely does, ever since lambdas were supported.
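A small JS sketch of the distinction being argued, under the usual assumption that engines optimize the non-escaping case: if `arguments` never leaves the function, the VM can avoid materializing it as a heap object; once it escapes, a real object must exist.

```javascript
// Sketch: `arguments` stays local here, so an engine can read the
// actual argument slots without allocating an arguments object.
function sumNoEscape() {
  let s = 0;
  for (let i = 0; i < arguments.length; i++) s += arguments[i];
  return s;
}

// Here it escapes: a real object must be materialized, because it
// outlives the call.
function leak() { return arguments; }

sumNoEscape(1, 2, 3); // 6
leak(1, 2, 3).length; // 3
```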
> How is that different than vtable lookups?
I wish you would step back and appreciate more of the shared context here to recognize that you don't need to explain (and exaggerate) the inner workings of JVMs to people who have worked on them for quite some years in the past. We'd probably be friends and have fun if you'd drop the HotSpot schtick. I went through that phase too, about 10 years ago.
And don't even get me started on HotSpot's startup time.
DynASM is actually written in Lua. There's a good guide by Peter Cawley on how to run it here:
The PHP JIT is actually much easier to understand: SSA for all tiers, a CFG, but not much type speculation at all. The advantage of PHP, Perl, and Ruby over JS is that objects don't vary that much, methods are not overridden that much, and arrays and hashes are much easier.
The PHP JIT is at least 10x easier and smaller than JSC. Also C, not C++.
... cause it's I/O bound? Just curious.
So, I think it is cpu bound for at least some people.
The JIT does dramatically speed up microbenchmarks like Mandelbrot, etc., by around 5x.
You can see an example of working around the limited optimization scope by templating Lua here: https://github.com/LuaJIT/LuaJIT-test-cleanup/blob/master/be...
This turns variables that would otherwise have to be loaded at the start of the trace into constants in the recorded trace. A more complex compiler can just do the same optimizations without the fuckaround.
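A JS analogue of the same trick (illustrative only, not the linked Lua code): bake a would-be variable into generated source text so that, inside the generated function, it is a literal constant a compiler can fold rather than a value it must load.

```javascript
// Illustrative sketch: `factor` is interpolated into the source text,
// so inside the generated function it is a constant, not a captured
// variable that has to be loaded on each entry.
function makeScaler(factor) {
  return new Function('x', `return x * ${factor};`);
}

const triple = makeScaler(3);
triple(7); // 21
```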
PyPy is a tracing JIT though, which is quite different. JSC and V8 compile a method at a time. SpiderMonkey used to use tracing, but switched to methods too (though I think it still does limited tracing in some situations).
EDIT: It looks like a couple trace-like techniques are still used in IonMonkey (maybe the devs have some input there though).
I don’t work for Mozilla but spent a ton of time studying TraceMonkey for building my own tracing JIT.
That can work great for long-running applications such as its main niche of scientific computing, but would be terrible for JS since you want the page to be interactive ASAP.
This is why talking about JIT performance is so complicated. Not only do you need to worry about compilation speed and the speed of the generated code, you also have to worry a lot about impact on memory and on concurrently running code. Plus, most JITs also need to have some sort of profiling system running all the time as part of those constraints, to only spend compilation resources on hot paths.
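A toy sketch of that always-on profiling part (the threshold is an assumed value, and no real engine works at the JS level like this): count invocations and mark a function "hot" once it crosses a threshold, which is the point where a VM would spend compilation resources on it.

```javascript
// Toy sketch, not real engine code: an invocation counter that decides
// when a function is "hot" enough to be worth optimizing.
const HOT_THRESHOLD = 1000; // assumed value; real engines tune this

function makeProfiled(fn) {
  let count = 0;
  const wrapped = (...args) => {
    if (!wrapped.hot && ++count >= HOT_THRESHOLD) {
      wrapped.hot = true; // a real VM would queue fn for an optimizing tier
    }
    return fn(...args);
  };
  wrapped.hot = false;
  return wrapped;
}
```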
So, it just depends on how important the benefits of dynamic types are versus the benefits of static types. I don’t think static types or dynamic types are better; they are just good at different things.
For example, your type system may tell you that you have an int32, but you can speculate that only the lowest bit is ever set, with a kind of synthetic type you could call int32&0x1 which isn't expressible in the type system the user uses.
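Written out as explicit source code (a sketch of the idea only, since a real JIT does this in its IR), such a synthetic-type speculation is a guard on the speculated shape of the value, with a fall-back to a generic slow path playing the role of a deoptimization:

```javascript
// Generic slow path: works for any value (the "deoptimized" code).
function slowPathIsSet(flags) {
  return (flags & 1) === 1;
}

// Speculative fast path: guard that `flags` is an int32 with only
// bit 0 possibly set (the synthetic type "int32 & 0x1"), and bail
// to the slow path when the guard fails.
function speculativeIsSet(flags) {
  if ((flags | 0) === flags && (flags & ~1) === 0) {
    return flags === 1; // fast path under the speculation
  }
  return slowPathIsSet(flags); // guard failed: "deoptimize"
}
```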
> dynamic typing doesn't make your job any easier
Yeah, it makes millions of application programmers' jobs easier at the expense of a small group of experts - sounds like the right tradeoff?
I don't think it's that simple. Large programs get unwieldy, no matter what language you write them in, and a large body of evidence suggests that having static types for both safety and documentation is a big win, because it makes programs more robust and ironically makes programmers more productive in the long run. As you and I both know, this is a long discussion that stretches back decades, so it probably isn't going to be productive to hash it out here.
A more important discussion which is not being had is the question of the size of the trusted computing base. Framed this way, it makes sense to minimize the size of the trusted computing base and not have a complicated dynamic language implementation at the bottom. Instead we should have layers, with a very strict statically-typed target that is easy to make go fast at the bottom. This is why I want to put WebAssembly under everything. Yes, even JS. (Fil would probably not agree here :-))
> a large body of evidence suggests that having static types for both safety and documentation is a big win
Citation needed. A review of studies on static vs. dynamic languages concluded "most studies find very small effects, if any". https://danluu.com/empirical-pl/
"The summary of the summary is that most studies find very small effects, if any. However, the studies probably don't cover contexts you're actually interested in."
I am beginning to think that cutting your large untyped program into pieces and typing it only at the boundaries will get you all the benefits. That probably means most, if not all, types inside those pieces can be inferred.
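A minimal sketch of that idea in plain JS (names and checks are mine, purely illustrative): validate types once at the module boundary, and let everything inside stay untyped, relying on the boundary guarantee.

```javascript
// Boundary check: the only place types are enforced.
function assertNumberArray(xs) {
  if (!Array.isArray(xs) || !xs.every((x) => typeof x === 'number')) {
    throw new TypeError('expected an array of numbers');
  }
}

// Internal code: no checks needed, the boundary guaranteed the type,
// so all types here could in principle be inferred.
function internalSum(xs) {
  let s = 0;
  for (const x of xs) s += x;
  return s;
}

// The exported entry point is the typed "surface" of the piece.
function publicSum(xs) {
  assertNumberArray(xs);
  return internalSum(xs);
}

publicSum([1, 2, 3]); // 6
```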
I think there was some proposal already but it's probably dead now because I haven't heard about it for a while.