
JVM JIT optimization techniques - sashee
https://advancedweb.hu/2016/05/27/jvm_jit_optimization_techniques/
======
zmmmmm
It's interesting how most of the optimizations are only really available if
you essentially write your programs in a functional style. That is, they all
depend on the compiler being able to analyse the local scope to decide whether
the state changes are confined enough that it can perform an optimization
without changing behavior. If a reference escapes, even into a private
instance variable, the compiler will (admittedly, based only on the shallow
analysis in this article) more or less give up, because it can no longer
predict how that variable might be mutated from another scope.
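
A minimal sketch of the distinction (my own example, with made-up names, not
from the article): the first method keeps the object local to the frame, so
escape analysis can scalar-replace it; the second stores the reference into a
field, so it escapes and the allocation has to stay.

```java
public class EscapeDemo {
    static class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    private Point cached; // any write here lets a reference escape

    int noEscape(int a, int b) {
        Point p = new Point(a, b);   // never leaves this frame
        return p.x + p.y;            // JIT can reduce this to a + b, no allocation
    }

    int escapes(int a, int b) {
        Point p = new Point(a, b);
        cached = p;                  // escapes into instance state
        return p.x + p.y;            // must allocate for real
    }

    public static void main(String[] args) {
        EscapeDemo d = new EscapeDemo();
        System.out.println(d.noEscape(1, 2) + " " + d.escapes(3, 4));
    }
}
```

Both methods compute the same thing; only the optimizer's freedom differs.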

~~~
Alphasite_
Escape analysis is pretty damn capable if you can give it enough time to run.

~~~
laerad
The analysis is fine, but "escape analysis" is often conflated with stack
allocation, and at that HotSpot is actually really terrible. The constraints
on allocating to the stack mean it happens very infrequently, even where the
literal _escape_ analysis would otherwise permit it.

~~~
chrisseaton
Calling it 'stack allocation' is confusing - object fields are replaced with
scalars in the IR. The whole object is not allocated on the stack in the
alloca sense, the way it would have been on the heap.

But I don't think the constraints are that bad are they? Can you give
examples? Modern compilers like Graal can even scalar replace objects if they
will escape in the future.
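
To illustrate what that means in practice (a hand-written sketch of the
transformation, with hypothetical names - the real rewrite happens in the
compiler's IR, not in source):

```java
public class ScalarReplacement {
    static class Pair {
        int a, b;
        Pair(int a, int b) { this.a = a; this.b = b; }
    }

    // What the programmer writes: a short-lived, non-escaping object.
    static int before(int x) {
        Pair p = new Pair(x, x + 1);
        return p.a * p.b;
    }

    // Roughly what the JIT produces after scalar replacement: the object
    // disappears entirely and its fields become plain locals/registers.
    static int after(int x) {
        int pA = x;
        int pB = x + 1;
        return pA * pB;
    }

    public static void main(String[] args) {
        System.out.println(before(4) + " " + after(4)); // both 20
    }
}
```

No contiguous "object on the stack" ever exists in the optimized version,
which is why 'stack allocation' is a misleading name for it.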

~~~
sievebrain
Given that the HotSpot compilers are not Graal and HotSpot's scalar
replacement optimisation is notoriously fragile, I think the original
statement was fair. Once Java 9 rolls around and Graal is just a plugin
instead of a whole separate VM build, it'll be less fair, and if Graal ever
becomes the default compiler that replaces C2 then it'll be even more fair,
but I guess that is years away at best.

The biggest constraint by far is the reliance on inlining. Inter-procedural
escape analysis could potentially remove the need for value types in many
places, but it doesn't seem like the JVM engineers believe that's feasible.
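
A sketch of the inlining dependence (my own illustration, hypothetical names):

```java
public class InlineDemo {
    static class Box {
        final long v;
        Box(long v) { this.v = v; }
    }

    // If makeBox is inlined into sum, the Box never escapes the combined
    // scope and can be scalar-replaced. If inlining fails (e.g. the callee
    // is too big or the call site is megamorphic), the allocation must
    // stay: HotSpot's escape analysis does not look across call boundaries.
    static Box makeBox(long v) {
        return new Box(v);
    }

    static long sum(long n) {
        long total = 0;
        for (long i = 0; i < n; i++) {
            total += makeBox(i).v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(10)); // 45
    }
}
```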

~~~
Alphasite_
HotSpot's escape analysis is based on a relatively old paper at this point,
from what I remember, and there are alternatives which purportedly perform
better.

~~~
sievebrain
Can you name one?

~~~
chrisseaton
Partial escape analysis is one:
[http://www.ssw.uni-linz.ac.at/Research/Papers/Stadler14/Stadler2014-CGO-PEA.pdf](http://www.ssw.uni-linz.ac.at/Research/Papers/Stadler14/Stadler2014-CGO-PEA.pdf)

~~~
sievebrain
Ah, I meant "alternatives to HotSpot" which is what the parent was implying.

I've read the Graal papers and part of the source code. So I am familiar with
PEA. It's too bad that optimisation didn't make it into C2 yet. I wonder if it
ever will.

~~~
chrisseaton
Someone told me that Zing also does PEA, but I don't know that to be a fact.

~~~
chrisseaton
Oh and I think JRockit might have done it as well.

------
orf
There is something about this site that absolutely destroys Firefox when
scrolling. Expensive work in a scroll handler isn't good, guys :(

~~~
jsnell
I would not be too quick to blame the authors of the website. It seems likely
to be the same Firefox bug as all the other cases of HN comments complaining
about bad scrolling performance in Firefox:
[https://bugzilla.mozilla.org/show_bug.cgi?id=1250947](https://bugzilla.mozilla.org/show_bug.cgi?id=1250947)

~~~
pygy_
Bad link, I'm afraid... It's about a box-shadow glitch.

~~~
jsnell
It's the correct link (if you read the last 15 or so comments, it'll be more
obvious that there is a performance component to the issue).

~~~
mediumdeviation
It seems you're correct. Specifically, disabling this rule from the inspector
instantly fixes the performance issue.

    .widewrapper.main {
      box-shadow: inset 1px 3px 1px -2px #ababab;
    }

------
userbinator
_for most of the time, many performance considerations are invisible from the
higher abstraction levels, so you can concentrate writing simple, elegant and
maintainable applications in Java, Scala, Kotlin or anything that runs on the
JVM._

Funny statement, at least for Java. I don't know anyone who would say the
majority of Java applications _aren't_ huge, slow, memory-hogging beasts
compared to equivalent native ones, and whose codebases are just as
unoptimised from the perspective of the humans who have to work with them.
That's mainly my perception from when I briefly worked with Enterprise Java
many years ago --- no one wrote "simple" or "elegant" Java applications, and
stuffing in as many design patterns and abstractions as you could was
considered "best practice".

There are certainly examples of simple and small Java applications (see
[https://en.wikipedia.org/wiki/Java_4K_Game_Programming_Conte...](https://en.wikipedia.org/wiki/Java_4K_Game_Programming_Contest)
), and I've written a few, but the overwhelming culture seems to be that of
anti-optimisation and bureaucratic excess. In some sense, it's almost like
working _against_ or "challenging" the JVM's optimiser is the norm.

In my experience, a "simple, elegant" design --- which does _not_ necessarily
mean "highly abstract" --- tends to be very close to optimal anyway, with the
compiler's optimiser doing the remaining work. That makes me wonder whether
claiming a language/runtime/compiler has powerful optimisation abilities is
actually a reflection of how much the typical source code in the language has
to be "cleaned up" by the optimiser in order to be decently efficient.

Also, does anyone else notice that the "Machine code" given for the JIT
example is _completely unrelated_ to either the Java or the bytecode? It's
16-bit real-mode code --- I recognise the access to the 40h segment, it seems
to be timing-related, and a bit of Googling finds that it's actually part of
an old TSR clock utility (with unknown source):

[http://assembly.happycodings.com/code24.html](http://assembly.happycodings.com/code24.html)

~~~
pron
> I don't know anyone who would say the majority of Java applications aren't
> huge, slow, memory-hogging beasts compared to equivalent native ones, and
> whose codebases are just as unoptimised from the perspective of the humans
> who have to work with them.

I would, and I've been developing in Java for over ten years now, after ten
years of C/C++.

I think that your complaints have nothing to do with Java, and much to do with
a coding style that became popular in the '90s. You could see the very same in
C++ applications back then, but then Java took over. The big enterprise
applications of the '90s and '00s reflect the prevailing mindset of the time.
Java applications written today don't look like that.

> In my experience, a "simple, elegant" design --- which does not necessarily
> mean "highly abstract" --- tends to be very close to optimal anyway, with
> the compiler's optimiser doing the remaining work. That makes me wonder
> whether claiming a language/runtime/compiler has powerful optimisation
> abilities is actually a reflection of how much the typical source code in
> the language has to be "cleaned up" by the optimiser in order to be decently
> efficient.

In that case I think your experience may be limited, or that you're unaware
how big "the remaining work" may actually be, especially on modern hardware.
HotSpot has proven extremely adept at running many different code styles very
efficiently. HotSpot's next-gen compiler, Graal[1], is IMO the biggest
breakthrough in compilation technology of the past decade. It runs Java faster
than C2 (HotSpot's current compiler), Python faster than PyPy, and JS as fast
as any other VM out there. It even has a C frontend, which, while not yet
matching gcc, performs surprisingly well considering how little effort has
been put into it.

This kind of cross-language optimization can be very important. A recent
paper[2] found that re-writing parts of SQLite in Python (!), instead of the
original C, can result in a 3x (!) performance boost when running inside an
application, because the application and DBMS code can be optimized as one
unit.

[1]:
[https://wiki.openjdk.java.net/display/Graal/Publications+and...](https://wiki.openjdk.java.net/display/Graal/Publications+and+Presentations)

[2]: [http://arxiv.org/abs/1512.03207](http://arxiv.org/abs/1512.03207)

~~~
geodel
Well, if Graal is faster, why is it not made part of the JDK distribution?

~~~
pron
As of JDK 9, Graal will be a pluggable JIT that you can opt to use (it's just
a Java library). It is not the default JIT because it is still experimental
and not productized yet.

------
karankamath
Maybe a noob observation, but can anyone explain why the Lock Coarsening
example is valid? Seems to me that two locks on A with a lock on B in between
are not equivalent to two locks on A and then one on B... Unless the
programmer made a cognitive error, the cases seem incomparable, unlike the
other examples.

~~~
taspeotis
"Seems to me that two locks on A with a lock on B in between are not
equivalent to two locks on A and then one on B"

You're looking at:

    public void canNotBeMerged()

The locks can't be merged.
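
For contrast, here is the kind of case the JIT _can_ coarsen (a sketch of my
own, not the article's code): back-to-back blocks on the same monitor with no
other lock in between, where merging them cannot change observable behaviour.

```java
public class Coarsening {
    private final Object lock = new Object();
    private int counter;

    // Two adjacent synchronized blocks on the same monitor: the JIT may
    // merge them into a single lock/unlock pair, since no other monitor
    // intervenes and any thread observing counter under the lock sees a
    // legal interleaving either way.
    public void canBeMerged() {
        synchronized (lock) { counter++; }
        synchronized (lock) { counter++; }
    }

    public int value() {
        synchronized (lock) { return counter; }
    }

    public static void main(String[] args) {
        Coarsening c = new Coarsening();
        c.canBeMerged();
        System.out.println(c.value()); // 2
    }
}
```

The article's `canNotBeMerged` example exists to show where this reasoning
breaks down when a different lock sits between the two blocks.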

~~~
karankamath
Yeah, but doesn't "merged" there refer to merging by the optimization? What's
the point of including incomparable examples?

~~~
Alphasite_
Just to demonstrate an example of invalid code?

