
A Deep Introduction to JIT Compilers: JITs are not very Just-in-time
https://carolchen.me/blog/jits-intro/
======
cs702
Very nice. I just looked at the author's resume.[a] It appears she is still
in, or only very recently graduated from, _high school_. Is that right?
Impressive!

[a] [https://carolchen.me/](https://carolchen.me/)

~~~
kipply
Yep, class of 2019, jit to actually have a physical prom and graduation

~~~
brabel
How does a high school kid get so interested in (and manage to pull off a
great article about) stuff like Intermediate Representations, compiler
optimisation techniques, and JITs as different as those of Julia and GraalVM,
that would easily scare even a seasoned programmer with years of experience
and a CompSci degree under their belt?

~~~
enriquto
You don't know many nerds, do you?

EDIT: When I was in high school, MS-DOS and Windows viruses were all the rage:
self-modifying polymorphic code that injected itself into the RAM of other
processes, with no identical strings longer than a few bytes between any two
instances of the virus, etc. Way more useless than JIT stuff, but a comparable
level of "depth", and 100% assembly code.

------
kipply
It's also in the post, but [https://carolchen.me/blog/jits-impls/](https://carolchen.me/blog/jits-impls/) has more advanced concepts

------
Cu3PO42
Entirely off-topic, but the author's name when written without a space (i.e.
Carolchen, as it is in the URL) just happens to be the diminutive of Carol in
German. Except that Carol is not a very common name here at all, so rare in
fact that I've not once met a German Carol. This had me very confused for a
moment.

------
joe_the_user
This article is really nice for not making things harder than they need to
be. Trace, then substitute the trace - no magic.

It seems like a big part of all these strategies is making sure the checks for
the optimizations (checking if they're possible, checking if they're done)
don't take more time than the optimizations save.

Or could there be a way to alter the execution graph so that once you add in
the optimization, you never have to check if it's there?
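
One way to picture that last idea (a minimal sketch, not how any real JIT is
implemented - the counter, threshold and `compile_fn` below are all made up):
count calls on the slow path, and once a function is hot, rebind the name that
callers resolve, so later calls land directly on the optimized version with no
check left behind.

    HOT_THRESHOLD = 1000  # made-up heuristic

    def hot_swap(compile_fn):
        # compile_fn is a hypothetical "compiler": it takes a Python function
        # and returns a faster implementation of it
        def decorator(func):
            calls = 0
            def counting(*args, **kwargs):
                nonlocal calls
                calls += 1
                if calls == HOT_THRESHOLD:
                    # patch the name callers resolve, so later calls go straight
                    # to the optimized version - no counter, no check
                    func.__globals__[func.__name__] = compile_fn(func)
                return func(*args, **kwargs)
            return counting
        return decorator

    @hot_swap(compile_fn=lambda f: f)  # identity "compiler", just for the demo
    def square(x):
        return x * x

    for i in range(2000):
        square(i)  # from call #1001 on, the global name points at the "compiled" version

The catch is that anything still holding the old reference keeps going through
the slow path, which is roughly why real JITs patch machine-level call sites
rather than names.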

~~~
the8472
> It seems like a big part of all these strategies is making sure the checks
> for the optimizations (checking if they're possible, checking if they're
> done) don't take more time than the optimizations save.

You can't really check that, you can only hope your heuristics are right. You
don't know in advance how many times a method will be executed and thus how
much optimization budget is worth it. If you boot up a Java application that
runs for months and handles tons of traffic then you could theoretically
justify many minutes of CPU cycles for optimization.

And that's not all that makes JITs complex. Some also do speculative
optimizations that require guards and deoptimization points so they can fall
back and recompile if those assumptions are violated. If the language has a
garbage collector then the GC and the JIT have to cooperate for ideal
performance.
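
A toy sketch of that guard/deoptimize idea, in Python purely for illustration
(real JITs emit machine-code guards that deoptimize back into the interpreter;
all names below are invented): speculate that `add` only ever sees ints, guard
the assumption, and fall back to a generic path if the guard ever fails.

    class Deoptimize(Exception):
        """Raised when a speculative assumption baked into the fast path fails."""

    def generic_add(a, b):
        return a + b  # fully dynamic path: works for strings, lists, floats, ...

    def specialized_add(a, b):
        # guard: profiling (hypothetically) showed both arguments were always ints
        if type(a) is not int or type(b) is not int:
            raise Deoptimize()
        return a + b  # stand-in for machine code specialized to int + int

    current_impl = specialized_add

    def add(a, b):
        global current_impl
        try:
            return current_impl(a, b)
        except Deoptimize:
            # assumption violated: throw away the speculative code and fall back
            current_impl = generic_add
            return current_impl(a, b)

    print(add(1, 2))      # fast, guarded path
    print(add("a", "b"))  # guard fails once, then we stay on the generic path
    print(add(3, 4))      # generic from now on; a real JIT might re-profile and respecialize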

------
saagarjha
I just realized that the link color changes every time I load the page–I
clicked on the article again and it was different from the last time I looked
at it!

~~~
kipply
picking colours is really hard so I decided not to

~~~
saagarjha
And now it is all inverted because I have Dark Mode on, even the images!
Without even getting to the content the site is a joy :)

------
pjmlp
Interesting articles, my only remark is that they don't cover JIT caches and
PGO across process executions, but I guess that might come in a later post.

~~~
kipply
[https://carolchen.me/blog/jits-intro/#pogo](https://carolchen.me/blog/jits-intro/#pogo)

It's there though not very in depth c:

~~~
pjmlp
This is not what I am talking about; JIT caches with PGO data work in a
different way.

[https://openjdk.java.net/jeps/310](https://openjdk.java.net/jeps/310)

[https://www.eclipse.org/openj9/docs/xcodecachetotal/](https://www.eclipse.org/openj9/docs/xcodecachetotal/)

[https://docs.oracle.com/cd/E13188_01/jrockit/docs142/usergui...](https://docs.oracle.com/cd/E13188_01/jrockit/docs142/userguide/codecach.html)

[https://source.android.com/devices/tech/dalvik/jit-compiler#...](https://source.android.com/devices/tech/dalvik/jit-compiler#architectural-overview)

~~~
kipply
I don't think I'm familiar with this (I may have come across it but didn't
identify it by name when I worked with Graal?), could you provide a link? :o
Thanks!

~~~
pjmlp
I have provided links above. There are also a couple of Java Language Summit,
Code/Java ONE and Google IO talks about them, but I would need to search for
those; I can provide them later.

~~~
kipply
Thanks, I know of this stuff but I don't think I would've been qualified to
form an explanation/overview of them. It would be nice to have these in a post
though!

------
vlovich123
Is anyone aware of any efforts to get clang/rustc to use lli to speed up
iteration times? My thought is you lower everything to LLVM bitcode first but
skip all the codegen/linking. Then you can codegen or run interpreted. Might
help make debug builds much faster to get to execution. Probably could
trivially parallelize the background compilation & auto-build in the
background on every file change to get the best of all worlds (most code gen
is ready, the remainder is executable & will get compiled as needed). If you
can do it with live substitution (which I think is what lli does) then you'd
progressively get a faster program.

~~~
saagarjha
How much time actually goes into instruction selection and register
allocation? I thought a lot of the time went into optimization passes at the
IR level.

------
why_only_15
This is a nit, but it looks like she says .pyc files aren't around in Python 3
anymore -- they're still there, just in the __pycache__ directory (as of 3.2)
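
For anyone curious, CPython itself will tell you where that cache lives
(`example.py` and the version tag in the comment below are just placeholders):

    import importlib.util
    print(importlib.util.cache_from_source("example.py"))
    # prints something like: __pycache__/example.cpython-38.pyc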

~~~
saagarjha
I know Python does the dunders and all, but couldn’t they have at least stuck
a dot in front of it if they wanted to move it out of the way :(

~~~
ygra
That won't work on other operating systems anyway.

~~~
dkersten
Those files are typically generated on each platform anyway, so just have the
dot on platforms that allow it and not on platforms that don’t.

------
jkubrynski
There is also an interesting presentation about how Azul implemented their JIT,
called Falcon, which is fully based on LLVM.
[https://www.youtube.com/watch?v=Uqch1rjPls8](https://www.youtube.com/watch?v=Uqch1rjPls8)
Generally LLVM is a nice idea that allows vendors to reuse major components.

~~~
saagarjha
The one problem with LLVM is that it’s not very well suited to JIT compilers,
in large part because it is slow.

~~~
monadic2
That’s not really accurate. It has strict requirements around stack traversal
in order to trace memory roots appropriately. This is incidental to JIT vs
AOT.

~~~
saagarjha
Did you reply to the wrong comment?

~~~
monadic2
Sorry, what did you read?

~~~
saagarjha
This comment:
[https://news.ycombinator.com/item?id=23744360](https://news.ycombinator.com/item?id=23744360),
where you're talking about JIT vs AOT in response to my LLVM one. I couldn't
really understand how it followed.

------
jakozaur
Tl;dr: a JIT is a compiler that optimizes certain parts of the code after the
interpreter sees them as hot (e.g. executed many times). Thanks to runtime
information it can occasionally exceed the performance of a statically
compiled language.

~~~
dialamac
> Thanks to runtime information it can occasionally exceed the performance of
> a statically compiled language.

Interestingly, in 30 years I have not once heard of a case where this
theoretical benefit has manifested as a clear advantage in any real world
application when looking at the system as a whole... Amdahl's law and all
that.

You can always hand-tune the 1-10% hotspots at reasonable cost most of the
time, and even static tools can do PGO, which generally gets you where a JIT
would anyway.

~~~
chrisseaton
> in 30 years I have not once heard of a case where this theoretical benefit
> has manifested as a clear advantage in any real world application

Today you can make a very direct empirical comparison to see this - using the
Graal compiler. This lets you compile exactly the same Java code either ahead-
of-time or just-in-time, but using the same compiler logic except for the
runtime information available when running just-in-time. The just-in-time code
is (ignoring startup and warmup time) in my experience _always_ faster, due to
the extra runtime information.

~~~
ptx
Isn't this because Java is designed for JIT compilation, or at least not
designed (with the appropriate tweaking knobs) for AOT compilation?

Languages built with AOT compilation in mind (e.g. Rust or Nim) usually give
you lots of ways to make choices at compile time and give hints to the AOT
compiler that the JIT compiler would instead try to infer at runtime in Java.

But by inferring these things at runtime instead, maybe the JIT approach makes
it easier to get fast code in those cases where you (as an application
developer) don't want to put a lot of effort into optimization?

~~~
pjmlp
Java has had commercial implementations of AOT compilers since the early 2000s.

Most compilers for embedded systems have always offered that option, and as
far as enterprise JVMs are concerned, their JIT compilers have long been able
to cache JIT code and PGO data between runs.

Both options have now come to OpenJDK, OpenJ9 and Graal.

Android also learned the hard way that switching to pure AOT did not achieve
the performance improvements they expected, while on-device compilation hit
C++-like compile times whenever it was time to update all apps - hence the
multi-tier interpreter/JIT/AOT with PGO introduced in Android 7.

The main problems with AOT compilation plus PGO are that, first of all, one
needs a good dataset so that the optimizations are in line with the actual
behaviour in production; that it still doesn't work across dynamic libraries,
so optimizations like devirtualization are not possible; and that most of the
time the tooling is quite cumbersome to use.

~~~
seanmcdirmid
Pretty sure Asymetrix was doing this in the late 90s, 1998 or so; they started
out doing educational software and pivoted to ahead-of-time compilation for
some weird reason. Here is a link to when it failed in 1999:
[https://www.cbronline.com/news/sepercede_exits_java_sells_to...](https://www.cbronline.com/news/sepercede_exits_java_sells_to_instantiations/)

------
wikunia
First of all! Nice post :) Just some random things: The first paragraph needs
some commas, I think. In the first code block you probably want to have
`print(variable_x)`

