
A New Bytecode Format for JavaScriptCore - stablemap
https://webkit.org/blog/9329/a-new-bytecode-format-for-javascriptcore/
======
cogman10
I'm unclear on why they transform to a bytecode as a first step. Wouldn't it be
simpler to instead transform to an AST and work against that for everything?
Wouldn't it make sense to generate the bytecode as a sort of last step, just
before doing heavy-duty optimizations?

Seems like with JS you would be constantly transforming from bytecode -> AST
-> bytecode -> machine code, such as every time a method adds a new field to
an object or some optimization assumption is violated. It doesn't seem like it
would be easier or faster to work with. I'm guessing there wouldn't be a whole
lot of memory benefits either.

Would someone mind enlightening me about this design choice? (Granted, I've not
taken a compilers course, so feel free to call me an idiot for not knowing
something basic about how compilers work.)

~~~
wahern
* Adding a new field or method to an object at runtime doesn't change the AST.

* You don't translate from bytecode to AST.

* In tiered JIT architectures--typical for dynamically-typed languages like JavaScript where as you point out effective types can mutate at runtime--most code is executed by an interpreter, not as compiled machine code.[1] Interpreters are more efficient when executing a bytecode than walking an AST.

* Translating an AST straight to machine code is much more difficult than translating a low-level intermediate representation, and such representations are often easy to express as simple bytecodes. Compilers like GCC and LLVM use intermediate representations: RTL and LLVM IR, respectively. Both are effectively assembly languages, and like all assembly languages they are trivial to express as bytecode. GCC also has GIMPLE, a higher-level intermediate representation earlier in the pipeline, derived from its GENERIC tree representation. The fact that GIMPLE and RTL coexist drives home the point that you don't want to be translating tree-like representations straight to machine code.

* You don't need to take my word for this, or the word of anyone else. Every programmer should have experience writing parsers, compilers, and interpreters, regardless of whether they took a class or even went to university. By doing this most of the reasons for the way things are done will become immediately clear to you. Note that splitting strings with regular expressions does not count as parsing, not for these purposes. A good project, however, would be to write a regular expression parser, compiler, and interpreter. There are many examples to follow and you can copy their designs exactly; just don't copy+paste as actually going through the motions and understanding the complexities inherent in the implementations is how everything will become clear.

[1] How this mutation is handled and why tiered architectures are preferable
is another topic entirely.

~~~
eridius
Swift and Rust also have their own higher-level intermediate representations
that they translate their AST to before even going to LLVM IR (SIL and MIR
respectively). Rust used to translate their AST directly into LLVM IR but they
switched to going through MIR first because it unlocks a lot of optimizations
as well as improved user-facing functionality such as non-lexical lifetimes in
the borrow checker.

~~~
steveklabnik
Rust has AST -> HIR -> MIR -> LLVM-IR. Two internal IRs and a third external
one!

~~~
eridius
Good point, I forgot about HIR, though isn't HIR largely just a desugared
representation of the AST?

~~~
steveklabnik
Yes, but also post macro expansion and name resolution.

------
akling
Kudos to Tadeu Zagallo and the JSC team for landing this awesome patch!

I've worked on WebKit memory performance in the past, so I'm well aware that
these aren't low hanging fruits we're talking about. The type safety bonus
features look great too. :)

------
twoodfin
Re: Direct vs. indirect threading (aka ordinary dispatch). I had the
impression that recent Intel x86 chips did enough trace caching to render any
performance distinction basically irrelevant. Is that right?

(Of course, Apple has their own ARM implementations to consider.)

EDIT: Here’s the paper I was thinking of in this regard:

[https://hal.inria.fr/hal-01100647/document](https://hal.inria.fr/hal-01100647/document)

~~~
wahern
My first thought when they said they switched from direct threading to
indirect threading was that it may have been the worst possible time to do
that given Spectre.

The negligible performance difference on modern Intel chips between direct and
indirect threading is largely a result of their deep, heavily buffered branch
predictors. Spectre mitigations are going to increase the performance
differences. Most userspace applications will forgo mitigations in favor of
keeping performance, but the browser was the big exception as it has to be
concerned about in-process side-channels.

~~~
eridius
Haven't all the browser engines already implemented their own mitigations for
Spectre/Meltdown for JavaScript code?

~~~
wahern
I can't speak to their existing mitigations directly, but more generally
Spectre will be an ongoing saga for years to come.

I seriously doubt the mitigations in place, whatever they are, are
comprehensive even for _existing_ proven Spectre channels. The engines haven't
even solved RowHammer. It's a very difficult problem domain, particularly for
an application JIT'ing random, untrusted code from the Internet. In many
respects they're (IMHO) quietly punting because there just aren't satisfactory
solutions. Browsers are really stuck between a rock and a hard place, more so
than VM providers like AWS EC2.

One of the more general solutions is simply to stop trusting the browser, if
not your entire operating system. Keep your most sensitive secrets
inaccessible to software to the greatest degree possible. Use smartcards and
other hardware tokens for authentication, for example.

~~~
eridius
My point is you can't say "oh this approach may have poor behavior once
Spectre mitigations are put in place", because the browser has _already_
implemented Spectre mitigations.

~~~
tedunangst
There's more than one mitigation, and not all at the browser level. The kernel
may also flush the branch cache on syscall entry, which is going to hurt
performance if you depend on predictions being fast. Though as far as I know
pretty much nobody uses that mitigation precisely because it's too slow.

~~~
BeeOnRope
Direct threading and indirect threading both rely heavily on the indirect
predictor. You might even say that direct threading relies _more_ heavily on
indirect predictor state, and in particular on the contents of the IBTB: it
has many more branch sites, and so more opportunity to store per-location
history for particular instructions, and hence more to lose if the IBTB is
flushed.

Indirect threading relies more on strong and deep history pattern matching at
a single location, and probably uses less state overall.

------
the_duke
Impressive improvements.

I wonder how performance and memory use compare nowadays between V8,
SpiderMonkey, and JavaScriptCore.

Does anyone have a link to recent trustworthy and thorough benchmarks? (Google
doesn't really spit out anything noteworthy...)

~~~
pizlonator
I believe that JSC is in the lead when it comes to throughput and latency. But
that is at least partly based on benchmarks that I had a hand in designing.

I don’t know how the VMs stand against each other on memory. JSC is improving
in this area a lot recently but I don’t know if it’s just catching up or
leaping ahead or whatever.

------
nielsbot
When they talked about getting rid of their threaded interpreter, I was
reminded of this article from 2008 about writing fast interpreters, if that's
interesting to this audience:

[https://news.ycombinator.com/item?id=2593095](https://news.ycombinator.com/item?id=2593095)
[hn link]

------
_alastair
This is interesting! If anyone from WebKit is in the comments, can you provide
link(s) to the bugs discussing the bytecode caching API? I'd love to take a
look but my Bugzilla search skills are evidently weak.

~~~
tadeuzagallo
There are a few bugs related to caching, but the main ones are:

Initial implementation of the underlying infrastructure:
[https://bugs.webkit.org/show_bug.cgi?id=192782](https://bugs.webkit.org/show_bug.cgi?id=192782)

Initial C++ and Obj-C APIs:
[https://bugs.webkit.org/show_bug.cgi?id=193401](https://bugs.webkit.org/show_bug.cgi?id=193401)

WIP bug to integrate the cache with WebKit:
[https://bugs.webkit.org/show_bug.cgi?id=194047](https://bugs.webkit.org/show_bug.cgi?id=194047)

------
riotman
Do they have a different bytecode and runtime from WASM? Why not unify
everything to web assembly byte code?

~~~
ori_b
WASM bytecode is structured as a tree, which means an interpreter loop
wouldn't perform well. You'd really want to flatten it into a simpler bytecode
before interpreting it -- and you'd want to do other transforms on it before
optimizing it.

I don't think WASM bytecode is a good format for execution, and it's only
mediocre as a compilation target.

~~~
riotman
Then why not abandon wasm, and make this byte code a target for llvm?

~~~
ori_b
LLVM bitcode is CPU and ABI dependent, and isn't even stable between LLVM
releases.

~~~
riotman
Bad question on my part. More relevant: Why wasm? From what I gather, js seems
to already have a bytecode for their JIT runtime. Why not expose that so that
we can have C++ in the web?

~~~
singularity2001
I asked the same question, see below for interesting thread

~~~
riotman
_sigh_ I think the modern web was a mistake from hacks of the 90s to share
documents. Can we plz get a fresh restart from scratch?

------
singularity2001
Now please expose a loadFromByteCode api so that we can target bytecode
instead of transpiling to js.

~~~
bepvte
Surprised no one has posted [https://blog.cloudflare.com/binary-
ast/](https://blog.cloudflare.com/binary-ast/) . It's not bytecode, but it's
still a great idea.

~~~
singularity2001
Oh wow!! In a way it's even better than bytecode because it allows marking
unused functions for lazy loading, giving speedups of 97% (in theoretical
settings;)

Must read.

[https://github.com/binast/binjs-ref](https://github.com/binast/binjs-ref)
from below is part of this!

------
DiseasedBadger
WebKit2 Qt when? Also, full QNetworkConnection support. That would be great.

------
pier25
Is it possible to pregenerate that bytecode?

For example for hybrid desktop/mobile apps.

~~~
eridius
If you're going to precompile your JS, why would you want to emit this
bytecode instead of just going to WASM?

~~~
pier25
Because wasm still doesn't support many features that JS does. For example
accessing the DOM.

------
novok
You mean like binast? [https://github.com/binast/binjs-
ref](https://github.com/binast/binjs-ref)

~~~
olliej
BinAST is/was an attempt to make a more compact version of JS that was easier
to parse.

But parsing JS isn't a significant bottleneck; it's the subsequent work to
produce the stuff that actually runs. IIRC proponents also claimed it meant
you didn't have to worry as much about validation, which isn't true because
it's untrusted content that comes from the internet. It must be validated.

BinAST also is not designed to be any more readily executable than JS, so
even if it were supported, step one would still be "produce a bytecode that is
fast".

------
ksec
And it has already shipped in Safari 12.1, which means many are already using
it.

Time and time again, it seems WebKit is the only team that wants to make the
Web better for Web _Pages_ with JavaScript; all the others seem to want the
Web to be fat _Apps_.

In the hope that someone on the Safari team is reading this:

Please make Tab Overview cache the thumbnails or use a list format; currently,
pressing Tab Overview reloads all the tabs in the background. I don't know if
this is for generating thumbnails or some other reason, but it kills my
machine when I have 300 tabs, most of them "cold" and not loaded.

~~~
saagarjha
> Time and Time again it seems WebKit are the only team that wants to make the
> Web with better Web Pages ( And we are far from perfecting it ), All the
> others seems to want the Web to be Apps.

I don’t see how this blog post supports that view, since it’s talking about
optimizing JavaScript.

~~~
ksec
By Web pages I mean pages with minimal JavaScript: faster initial JS loading,
lower latency, lower memory use, and in general a much better UX. To me,
Chrome seems to be optimizing for the wrong things, like maximum throughput,
WASM, and being super fast in compute-intensive usage while eating memory like
crazy.

~~~
dchest
[https://v8.dev/blog/ignition-interpreter](https://v8.dev/blog/ignition-
interpreter)

"V8 team has built a new JavaScript interpreter, called Ignition, which can
replace V8’s baseline compiler, executing code with less memory overhead and
paving the way for a simpler script execution pipeline."

[https://v8.dev/blog/preparser](https://v8.dev/blog/preparser)

"Lazy parsing speeds up startup and reduces memory overhead of applications
that ship more code than they need."

[https://v8.dev/blog/embedded-builtins](https://v8.dev/blog/embedded-builtins)

"V8 built-in functions (builtins) consume memory in every instance of V8. The
builtin count, average size, and the number of V8 instances per Chrome browser
tab have been growing significantly. This blog post describes how we reduced
the median V8 heap size per website by 19% over the past year."

[https://v8.dev/blog/improved-code-caching](https://v8.dev/blog/improved-code-
caching)

"V8 uses code caching to cache the generated code for frequently-used scripts.
Starting with Chrome 66, we are caching more code by generating the cache
after top-level execution. This leads to a 20-40% reduction in parse and
compilation time during the initial load."

Etc, etc.

