

On Languages, VMs, Optimization, and the Way of the World - mwcampbell
http://blog.headius.com/2013/05/on-languages-vms-optimization-and-way.html

======
rayiner
One thing that gets lost in these discussions is implementation complexity.
The JVM is amazing, but it's also a phenomenally complex piece of code.

My recent interest has been in gradually typed languages:
[http://ecee.colorado.edu/~siek/gradualtyping.html](http://ecee.colorado.edu/~siek/gradualtyping.html),
[http://www.mypy-lang.org](http://www.mypy-lang.org).

Gradual typing is somewhat different from optional typing as found in Dylan or
Common Lisp, because it offers a consistent type system wherein a program whose
types are fully declared is statically typed. Dylan and Common Lisp
compilers use extensive type inference systems to try to eliminate as many
dynamic type checks as possible, but type inference is fairly complicated, and
it's hard for the programmer sitting at his keyboard to guess where type checks
will be generated unless he has a lot of familiarity with a particular
implementation. With a gradually typed language, you can be pretty sure that
if you annotate certain syntactic elements with types, the compiler will
generate code that does not check types at runtime. Or, in code that only
partially declares types, you can be sure that type errors will ultimately be
traceable to code that does not have declarations
([http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.142....](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.142.5873)).
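The distinction can be sketched with mypy-style Python (cited above), where annotated code is checked statically and unannotated code stays dynamic; the function names here are illustrative:

```python
# Sketch of gradual typing in mypy-style Python: annotated code can be
# checked statically, while unannotated values are treated as the
# dynamic "Any" type and checked only at runtime.

def typed_add(a: int, b: int) -> int:
    # Fully annotated: a checker verifies this statically, so no
    # runtime type checks are needed on a and b.
    return a + b

def untyped_add(a, b):
    # Unannotated: parameters default to Any, so any type error
    # surfaces only at runtime, inside this dynamic code.
    return a + b

print(typed_add(1, 2))        # statically checked: 3
print(untyped_add("x", "y"))  # dynamically typed: xy
```

Any type error in a mixed program is then traceable to the unannotated half, which is the guarantee the cited paper formalizes.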

The upside of gradual typing is two-fold: 1) you can generate pretty good code
without a very sophisticated compiler; and 2) you can generate pretty good
code without having a runtime optimization framework in place (as with the
JVM, or the Dart VM or V8).

~~~
evincarofautumn
We did this in an ActionScript 3 compiler at my last job. The problem is that
since code is dynamic by default, you still need the “fallback” operations
mentioned in this post, because people can (and do) make use of dynamic
operations and you need to preserve behaviour. Also, ActionScript’s type
system isn’t expressive enough to actually get to full static typing—there is
only one function type, for example, which says nothing about its parameter or
return types. But in principle, and in a language that was designed to support
it at the outset, it’s a good way to do typing.
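The limitation described above can be made concrete: where ActionScript 3 has a single opaque Function type, a gradual type system can describe a function's parameter and return types. A sketch using Python's `typing.Callable` (names here are illustrative):

```python
# Contrast sketch: ActionScript 3 has one opaque Function type, while
# a richer gradual type system can spell out parameter and return
# types via Callable.
from typing import Any, Callable

# AS3-style: all we can say is "it's some function".
opaque: Callable[..., Any]

# Expressive: a function taking an int and returning a str.
precise: Callable[[int], str]

def describe(n: int) -> str:
    return f"value={n}"

precise = describe
print(precise(42))  # value=42
```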

~~~
rayiner
> ActionScript’s type system isn’t expressive enough to actually get to full
> static typing—there is only one function type, for example, which says
> nothing about its parameter or return types

Yeah, that's the problem in Dylan too. See: opendylan.org/~hannes/coll.pdf

------
azakai
The points all sound reasonable. I wonder about the details though, for
example

> Dart is dynamically typed (or at least, types are optional and the VM
> doesn't care about them), but types are of a fixed shape. If programmers can
> tolerate fixed-shape types, Dart provides a very nice dynamic language that
> still can achieve the same optimizations as statically-typed Java or
> statically-compiled C/++.

Again, the principle sounds reasonable - predictable types allow Java/C-like
performance - but it would be nice to see this supported by empirical results.
For example

[http://benchmarksgame.alioth.debian.org/u32/benchmark.php?te...](http://benchmarksgame.alioth.debian.org/u32/benchmark.php?test=all&lang=dart&lang2=java&data=u32)
,
[http://benchmarksgame.alioth.debian.org/u32/benchmark.php?te...](http://benchmarksgame.alioth.debian.org/u32/benchmark.php?test=all&lang=dart&lang2=gpp&data=u32)

and

[http://j15r.com/blog/2013/07/05/Box2d_Addendum](http://j15r.com/blog/2013/07/05/Box2d_Addendum)

do not support the article's point. Dart's performance there is much like
JavaScript's.

The one dynamic language that can in many cases compete with C and Java is
Lua, in the LuaJIT implementation. But Lua is very dynamic, much like
JavaScript, so again this is a real-world result that does not seem to
square with the article's point.

------
noelwelsh
_The important thing is for language users to recognize that nothing is free,
and to understand the implications of language features and design decisions
they make in their own programs._

I can agree with this statement. Other than that, I have some qualms with the
post. For a start, it doesn't consider that compilation takes time. The great
thing about JIT compilation is you know everything about the code being run.
The bad thing is you have no time to make use of this information.

Then there is this:

 _Traditionally, static typing was the best way to guarantee we produced good
CPU instructions. It gave us a clear picture of the world we could ponder and
meditate over, eventually boiling out the secrets of the universe and
producing the fastest possible code._

Huh, what? Any compiler author knows that many compiler optimization problems
are NP-complete. For example, graph-colouring register allocation is, as the
name suggests, NP-complete. You have to do register allocation whether you JIT
compile or AOT compile, and you have to do it approximately in both cases.
(IIRC typical JIT compilers use a linear scan register allocation algorithm,
which is almost as good as graph colouring but much faster.)
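The linear scan algorithm mentioned above (after Poletto and Sarkar) can be sketched in a few lines; this is a simplified version that, when out of registers, spills the current interval rather than the furthest-ending one, and the two-register example at the end is hypothetical:

```python
# Minimal sketch of linear scan register allocation: walk live
# intervals in order of start point, expire finished ones, and spill
# when no register is free. Simplified: spills the current interval.

def linear_scan(intervals, num_regs):
    """intervals: list of (name, start, end) live intervals."""
    intervals = sorted(intervals, key=lambda iv: iv[1])  # by start point
    free = list(range(num_regs))
    active = []      # (end, name, reg), kept sorted by end point
    assignment = {}  # name -> register index, or "spill"
    for name, start, end in intervals:
        # Expire intervals that ended before this one starts,
        # returning their registers to the free pool.
        while active and active[0][0] <= start:
            _, _, reg = active.pop(0)
            free.append(reg)
        if free:
            reg = free.pop(0)
            assignment[name] = reg
            active.append((end, name, reg))
            active.sort()
        else:
            assignment[name] = "spill"  # no register left
    return assignment

# Three overlapping intervals, two registers: one must spill.
print(linear_scan([("a", 0, 10), ("b", 2, 8), ("c", 4, 6)], 2))
# {'a': 0, 'b': 1, 'c': 'spill'}
```

The single pass over sorted intervals is what makes it cheap enough for a JIT, versus building and colouring an interference graph.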

Furthermore, in the languages he considers there are loads of optimization
opportunities that cannot be statically expressed in the language (e.g. stack
allocation of memory, immutability, SIMD operations). You can infer many of
these statically, but statically typed languages also benefit from JIT
optimization using extra information available at runtime in the same way that
dynamically typed languages do.

Finally, a type theorist would have a fit over how he uses the word "typed"
(and they would object to this comment as well). Types don't exist at runtime
-- representations do.

 _Update_ AOT compilers also allow code to be targeted to specific machines.
An obvious example is a Linux distro available in 32-bit and 64-bit versions.
GCC allows much finer distinctions than this.

AOT compilers can also make use of runtime information -- so-called "profile
guided optimisation."

~~~
qznc
I agree with you, although it comes across as unnecessarily rude.

However, JIT does not necessarily mean fast compilation (e.g. linear scan). A
JIT might identify hot spots in the code and then optimize the hell out of that
part with the same optimizations that GCC -O3 uses. Plus, it might have
additional runtime information. Advanced JIT engines usually have multiple
stages of optimization.

~~~
nickik
It's a good point about JIT time. It is however also true that even if you do
it in the background (as, for example, Azul does, I think) you are not completely
free of its problems. It is often hard to figure out whether the coordination
of the background work is worth it. I'm not completely sure, but I think V8 does
advanced optimisation on a background thread.

Also, I am sure that LuaJIT does everything on a single thread. LuaJIT is a
special case most of the time, since it works so fundamentally differently from
almost all other high-performance JITs in the wild: it only has one compiler
and still works with an interpreter.

------
m0th87
> You could probably make the claim that static optimization is a halting
> problem, and dynamic optimization eventually can beat it by definition since
> it can optimize what the program is actually doing.

It's not as if static type systems and JIT compilation are mutually exclusive
concepts, so this argument makes no sense.

If JITing were a panacea, dynamically typed languages would be just as fast as
statically typed, and lower-level languages like C would be long dead. This is
clearly not the case.

~~~
mwcampbell
I think his point there is that a JIT compiler can handle dynamism more
effectively than an AOT compiler. A corollary is that the more static the
language is, the less advantage JIT compilation has over AOT compilation.

~~~
PommeDeTerre
Part of what I see m0th87 saying, though, is that practice has shown that
these claims about JIT compilation handling certain cases more effectively
than AOT compilation just don't really hold true.

JIT compilation may be better than some of the other approaches for
implementing VMs or interpreters. It may also theoretically have the potential
to be better than AOT compilation in some cases, too. But that potential
doesn't really matter if we never see such benefits in real systems.

Real-world code written in C, C++, and Fortran and compiled ahead of time
still basically always outperforms Java bytecode executed by HotSpot, or
JavaScript code executed by V8, or Lua code executed by LuaJIT, by a huge
margin. This is even after a huge amount of effort has gone into systems like
HotSpot and V8, for instance.

The theoretical benefits of JIT compilation do us no good if it's too costly
or difficult to build effective implementations of such systems in a timely
manner, for example. It does the wider community no good if such optimizations
only really apply in a very small number of highly specific cases, as well.

A lot of us have heard these claims about JIT compilation for over a decade
now, especially with respect to HotSpot, but we've yet to see these techniques
consistently and reliably match AOT compilation in general, never mind
outperform it. There is a lot of doubt out there due to this.

~~~
simula67
As much as I like C, it would be better if language choice did not depend on
the performance it can deliver. It would be nice to see the JVM pull up closer
to AOT-compiled languages, but as this post painfully exposes, this seems
implausible.

------
cwzwarich
One thing that dynamic optimizations have a problem dealing with is
eliminating the overhead of indirection in data representation, e.g. from
arrays of objects or nested objects. If all of the objects involved have a
limited local lifetime, then you can just inline all of the functions involved
and apply standard optimization techniques. But if the objects have a more
global scope, then a JIT really can't do that much. There is some research
into this (see
[http://ssw.jku.at/Research/Papers/Wimmer08PhD/](http://ssw.jku.at/Research/Papers/Wimmer08PhD/)
for an example), but as far as I know nothing is production quality.
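The indirection in question can be illustrated by contrasting an array of boxed objects with a flattened struct-of-arrays layout, the transformation a JIT can rarely apply to globally visible data; the `Point` class and field names here are illustrative:

```python
# Sketch of the data-representation indirection: an "array of objects"
# stores pointers to heap-allocated objects, while a flattened
# struct-of-arrays layout stores the fields inline and contiguously.
from array import array

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Array of objects: each element is a pointer to a boxed Point,
# so reading p.x chases a pointer per element.
points = [Point(i, i * 2) for i in range(3)]
sum_aos = sum(p.x for p in points)

# Struct of arrays: x fields stored contiguously, no per-point object.
xs = array("d", (float(i) for i in range(3)))
ys = array("d", (float(i * 2) for i in range(3)))
sum_soa = sum(xs)

print(sum_aos, sum_soa)  # both sum the x fields: 3 3.0
```

An AOT compiler (or the programmer) can commit to the second layout up front; a JIT would have to prove no escaped reference depends on the boxed representation, which is what makes the global case hard.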

------
trimbo
> Given appropriate dynamic optimizations, there's no reason Java code can't
> compete with or surpass statically-typed and statically-compiled C/++,

Garbage collection.

I'm not sure I completely understand what he's trying to say, though I find it
hard to believe an article about "languages, VMs and optimization" doesn't
once mention GC. You can heavily optimize any language to JIT, vectorize
instructions, use SSE, whatever... but then that pesky garbage collector stops
the world and takes 100 milliseconds...

~~~
barrkel
The primary cost of GC is in memory, not time. GC doesn't need to stop the
world; but to be efficient, it needs to be free to generate a lot of garbage
to amortize the cost of tracing the live set. This is why GC is asymptotically
faster than malloc/free (though not necessarily other approaches, like
arenas). GC is also easier, in principle, to parallelize.
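The amortization argument rests on tracing cost being proportional to the live set, not to the garbage; a toy mark phase makes that visible (all names here are illustrative, not any real collector's API):

```python
# Toy mark phase of a mark-and-sweep collector: the work done is
# O(live objects), independent of how much unreachable garbage exists.
# Giving the collector more memory headroom means more garbage per
# collection, amortizing this fixed tracing cost over more allocations.

class Obj:
    def __init__(self, *children):
        self.children = list(children)
        self.marked = False

def mark(roots):
    """Mark everything reachable from roots; returns objects visited."""
    stack = list(roots)
    visited = 0
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            visited += 1
            stack.extend(obj.children)
    return visited

# 3 live objects, 1000 dead ones: marking still touches only 3.
leaf = Obj()
root = Obj(Obj(leaf))
dead = [Obj() for _ in range(1000)]  # unreachable garbage, never traced
print(mark([root]))  # 3
```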

~~~
nickik
> GC doesn't need to stop the world

As far as I know, almost all GCs eventually have a full-on stop-the-world case.
The only exception I know of is the Azul C4 collector, but I think they're the
only ones.

~~~
barrkel
Yes, you are agreeing with me, also mentioning an existence proof.

------
lmm
With a sufficiently advanced language (Scala), fully static typing is exactly
what I as a programmer want - it gives me a lightweight way to make assertions
about my values everywhere in my program, which makes debugging a lot easier.
The fact that values can change type in JavaScript or Ruby I find to be a
disadvantage.

~~~
sethhochberg
I agree. For me, I really do enjoy _writing_ Ruby programs - but when
something isn't working as expected and it's time to do some serious debugging,
I prefer C++ all day.

------
cromwellian
There is also an impact on memory use and startup latency imposed by JITs vs
AOT. None of the JS VMs so far seem to do any kind of snapshotting. Since the
types collected by runtime profiling are likely to be the same for most runs
of a well-behaved app, it seems like you could send down the profile to the
client JIT to allow it to converge quicker.

------
kodablah
I have been playing w/ cross-compiling a language to Dart. There is no
bytecode[1], which actually makes it a tad easier. Dart does give you
method_missing semantics w/ noSuchMethod. Granted, a runtime method lookup
table (if you were building Ruby on Dart) would lose the compilation
optimizations on dynamic methods, but you'd still get them w/ the explicitly
defined ones (I guess the same way as JRuby). When Dart has more runtime
compilation abilities (beyond just spawnUri) I think it will be a great target
for dynamic languages.

1 - [http://www.dartlang.org/articles/why-not-bytecode/](http://www.dartlang.org/articles/why-not-bytecode/)
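The trade-off described above - explicitly defined methods keep their optimizations, while a runtime lookup table forgoes them - can be sketched with Python's analogous `__getattr__` fallback hook (the class and method names are illustrative):

```python
# Sketch of the noSuchMethod / method_missing pattern: explicitly
# defined methods can be resolved (and optimized) statically, while
# methods in a runtime lookup table go through a dynamic fallback.

class DynamicObject:
    def __init__(self):
        self._table = {}  # runtime method lookup table

    def define(self, name, fn):
        self._table[name] = fn

    def greet(self):
        # Explicitly defined: a compiler can resolve this statically.
        return "hello"

    def __getattr__(self, name):
        # Fallback for missing attributes, like Dart's noSuchMethod:
        # only invoked when normal (static) lookup fails.
        if name in self._table:
            return self._table[name]
        raise AttributeError(name)

obj = DynamicObject()
obj.define("shout", lambda: "HELLO")
print(obj.greet())  # hello  (static lookup)
print(obj.shout())  # HELLO  (dynamic fallback)
```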

~~~
seanmcdirmid
This is similar to the design of the dynamic language runtime under
Microsoft's CLR, which is able to accomplish this with bytecode by objects
that implement IDynamicMetaObjectProvider.

------
sanxiyn
I also recommend "Why Python, Ruby, and Javascript are slow".
[https://speakerdeck.com/alex/why-python-ruby-and-javascript-...](https://speakerdeck.com/alex/why-python-ruby-and-javascript-are-slow)

Summary: Things that take time are hash table lookups, allocations, copying.
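The hash-table-lookup cost is easy to see in CPython, where plain instance attributes literally live in a per-object dict, so every attribute access is conceptually a hash lookup:

```python
# Sketch of why attribute access in a dynamic language costs a hash
# table lookup: in CPython, plain instance attributes are stored in
# the object's __dict__.

class Point:
    def __init__(self, x, y):
        self.x = x  # stored in the instance's __dict__
        self.y = y

p = Point(1, 2)
print(p.__dict__)               # {'x': 1, 'y': 2}
print(p.x == p.__dict__["x"])   # True: p.x is a dict lookup underneath
```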

