
Announcing Pyston: an upcoming, JIT-based Python implementation - tweakz
https://tech.dropbox.com/2014/04/introducing-pyston-an-upcoming-jit-based-python-implementation/
======
haberman
> For instance, the JavaScript world has switched from tracing JITs to method-
> at-a-time JITs, due to the compelling performance benefits.

This is a weird way of putting it. Method-at-a-time JITs have been around much
longer and represent a more traditional approach to JIT compilers. Tracing
JITs have only become popular in the last 5-10 years. And during that time
they've been seen as the sort of hot new thing, so much that there is a
classic LtU thread from 2010 titled "Have tracing JIT compilers won?"
([http://lambda-the-ultimate.org/node/3851](http://lambda-the-ultimate.org/node/3851))

While it's true that V8 has always been method-at-a-time and Mozilla has
abandoned their tracing JIT TraceMonkey, LuaJIT is one of the fastest dynamic
language implementations out there and is a tracing JIT. Unfortunately the
benchmark game dropped LuaJIT so it's not easy to find benchmarks, but last I
saw LuaJIT was pretty dominant speed-wise among dynamic language JIT
implementations.

Mike Pall (LuaJIT author) argues that TraceMonkey's lack of compelling
performance was more a result of trying to bolt tracing onto an existing VM as
opposed to any shortcoming of tracing as an approach ([http://lambda-the-ultimate.org/node/3851#comment-57643](http://lambda-the-ultimate.org/node/3851#comment-57643)).

~~~
saurik
Dalvik (the Android VM), when they added a JIT, claimed to be "starting" with
a tracing JIT which they would "supplement" with a method later, which they
seemed to feel could provide more performance. Their argument is that the
method JIT operates over a larger code window; but Dalvik only seemed to
operate on extremely short traces, so it has always been unclear to me whether
this was just a weird limitation of their specific design :/.

------
_halgari
I'd love to hear more about their issues with PyPy; it sounds like they wrote
off PyPy simply because they don't understand why it works so well. Not to
mention that this is mostly a re-hash of stuff found in unladen-swallow.

I mean, if your end goal is to write another Python, sure go for it. But it
really sounds like these people haven't done their research. I see nothing to
write home about.

--- EDIT ---

Not to mention that JS is a completely different language from Python.
Every time you add two objects in Python you have the possibility of hitting a
system-defined add, or __add__, or __getattr__, or __getattribute__, or
__radd__, or __getattr__ (looking for __radd__), etc. That'll be fun....
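A minimal sketch of the dispatch the comment describes (the `Meters` class is purely illustrative): when the left operand's `__add__` returns `NotImplemented`, the interpreter falls back to the right operand's `__radd__`.

```python
class Meters:
    def __init__(self, n):
        self.n = n
    def __add__(self, other):
        # Returning NotImplemented tells the interpreter to try the
        # reflected operation on the other operand instead.
        if isinstance(other, Meters):
            return Meters(self.n + other.n)
        return NotImplemented
    def __radd__(self, other):
        # Called for `other + self` after type(other).__add__ gives up.
        if isinstance(other, (int, float)):
            return Meters(other + self.n)
        return NotImplemented

a = Meters(2) + Meters(3)   # hits Meters.__add__
b = 1 + Meters(4)           # int.__add__ fails, falls back to Meters.__radd__
print(a.n, b.n)             # 5 5
```

And this is before metaclasses, `__getattribute__` overrides, and the rest of the lookup machinery get involved, which is the commenter's point: a compiler has to account for all of these paths on every `+`.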

~~~
rayiner
Their description does not suggest they don't understand how PyPy works, but
rather that they don't think they can tackle the as-yet-unsolved failure modes
(trace blowup, etc.) of trace compilation.

The approach they're describing is one that already works in V8, JavaScriptCore, and
IonMonkey, which is to mix type prediction, type analysis, and runtime
handling of unexpected cases. Basically, you use type feedback information to
get an initial set of types for a method, use type inference techniques to
squeeze out type checks, and then compile the method in a way that handles the
expected types in a fast path and traps into a slow path as necessary.
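The type-feedback idea above can be sketched in pure Python as a toy monomorphic inline cache; all names here are illustrative, not Pyston's or V8's actual machinery.

```python
# Toy sketch of type feedback: record the type seen at a call site,
# dispatch through a fast path while the cached type still matches,
# and trap into a slow generic path on a mismatch.

def generic_add(a, b):
    # Slow path: full dynamic dispatch.
    return a + b

class CallSite:
    def __init__(self):
        self.cached_type = None   # type observed so far
        self.fast_path = None     # "specialized" handler for that type

    def add(self, a, b):
        t = type(a)
        if t is self.cached_type:
            return self.fast_path(a, b)          # expected-type fast path
        # Unexpected type: take the slow path and re-record feedback.
        self.cached_type = t
        if t is int:
            self.fast_path = lambda x, y: x + y  # stand-in for compiled code
        else:
            self.fast_path = generic_add
        return generic_add(a, b)

site = CallSite()
print(site.add(1, 2))        # slow path, caches int
print(site.add(3, 4))        # fast path
print(site.add("a", "b"))    # type miss: slow path again, re-caches str
```

A real JIT does the equivalent in machine code with a type guard and a deoptimization exit, but the control flow is the same shape.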

~~~
sanxiyn
None of V8, JavaScriptCore, SpiderMonkey do allocation removal, which is the
single most important optimization PyPy and LuaJIT do, which also goes back to
Psyco. I think it is unknown how to do this well in method JITs.

Allocation removal by partial evaluation in a tracing JIT:
[http://dl.acm.org/citation.cfm?id=1929508](http://dl.acm.org/citation.cfm?id=1929508)

Allocation Sinking Optimization: [http://wiki.luajit.org/Allocation-Sinking-Optimization](http://wiki.luajit.org/Allocation-Sinking-Optimization)
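For concreteness, this is the shape of code allocation removal targets (the `Vec` class is illustrative): each iteration allocates an intermediate object that never escapes the loop, so a tracing JIT like PyPy's can replace it with unboxed floats on the trace instead of performing a heap allocation per iteration.

```python
class Vec:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __add__(self, other):
        # Each addition allocates a fresh Vec.
        return Vec(self.x + other.x, self.y + other.y)

def sum_steps(n):
    acc = Vec(0.0, 0.0)
    step = Vec(1.0, 2.0)
    for _ in range(n):
        acc = acc + step   # temporary Vec is immediately dead next iteration
    return acc.x, acc.y

print(sum_steps(1000))  # (1000.0, 2000.0)
```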

~~~
mraleph
> None of V8 .. do allocation removal

This is not correct. V8 does sink allocations into deoptimization exits. It
does not sink allocations out of the loops at the moment though.

> I think it is unknown how to do this well in method JITs

I don't think it is unknown. The main simplification for tracing JITs comes
from the fact that deoptimization and loop exit can be elegantly treated
within a uniform framework, which is a little bit harder for a method JIT:
you need to find the right place to insert the materialization instruction
after the loop, based on post-dominance. Nothing hard or unsolvable though.

------
sanxiyn
Those who do not learn history are doomed to repeat it.

[http://qinsb.blogspot.kr/2011/03/unladen-swallow-retrospective.html](http://qinsb.blogspot.kr/2011/03/unladen-swallow-retrospective.html)

~~~
kmod
I can't speak to the non-technical reasons mentioned in the post, but there's
good reason to believe that LLVM is in a very different state than the time
that Reid wrote this, particularly wrt JIT support: the JIT engine has been
completely replaced. Both of the things that he mentions (lack of back-
patching, lack of gdb support) have been added to LLVM mainline.

~~~
sanxiyn
LLVM's JIT support has changed a lot, but one thing didn't change: "LLVM code
generation and optimization is good but expensive."

Even LLVM's "fast" code generator is slower than most JITs' code generators.

~~~
kmod
Definitely true; there are some overheads that could be low-hanging fruit to
optimize, but the real solution is most likely to use a tiered-compilation
system, i.e. only invoke the full LLVM code generator once a function has been
called 10,000 times. It's definitely an open question how to get faster
compilation times out of it; we have a simple LLVM interpreter, but since LLVM
isn't designed for interpretation it's pretty slow.
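The call-count trigger described above can be sketched as follows; the threshold and tier names are illustrative, not Pyston's actual design.

```python
# Toy sketch of tiered compilation: run a cheap tier until a function
# gets hot, then pay for the expensive compile exactly once.

HOT_THRESHOLD = 10_000

class TieredFunction:
    def __init__(self, slow_impl, compile_fast):
        self.slow = slow_impl              # interpreter / baseline tier
        self.compile_fast = compile_fast   # expensive "full LLVM" tier
        self.fast = None
        self.calls = 0

    def __call__(self, *args):
        self.calls += 1
        if self.fast is not None:
            return self.fast(*args)        # already compiled: fast tier
        if self.calls >= HOT_THRESHOLD:
            self.fast = self.compile_fast()  # hot: compile once
            return self.fast(*args)
        return self.slow(*args)            # cold: cheap tier

# compile_fast here is a stand-in for real code generation.
square = TieredFunction(lambda x: x * x,
                        lambda: (lambda x: x * x))
for i in range(10_001):
    square(i)
print(square.fast is not None)  # True: the hot tier kicked in
```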

I think you can look at the people who are interested in adding LLVM tiers to
their JITs (ex Apple, Facebook) to see that we're not the only ones that think
there can be a place for "good but expensive" code generation in a JIT.

~~~
jbs3982
Like this?

[https://github.com/dropbox/pyston#compilation-tiers](https://github.com/dropbox/pyston#compilation-tiers)

------
asb
There are some more technical details here:
[https://github.com/dropbox/pyston#technical-features](https://github.com/dropbox/pyston#technical-features)

Right now they have no baseline compiler but will interpret (un-optimised?)
LLVM IR at first; the second tier is unoptimised LLVM compilation, then LLVM
compilation with type recording hooks, and finally a fully optimised compile.
Given the history of the Unladen Swallow project and others using the LLVM
JIT, they're likely to find they have a lot of work on their hands,
particularly as PyPy is really rather good these days.

EDIT: There's some more info here in a post to the LLVM mailing list by one of
the Pyston developers
[http://article.gmane.org/gmane.comp.compilers.llvm.devel/71870](http://article.gmane.org/gmane.comp.compilers.llvm.devel/71870).
They've added a simple escape analysis pass for GCed memory among other
things.

As always, if you're interested in LLVM or compiler stuff you should subscribe
to [http://llvmweekly.org](http://llvmweekly.org) (disclaimer: I write it) and
follow @llvmweekly

------
chrismonsanto
I was really hoping that they would support Python 3. Unfortunately, Python
2.7 only.

Please. Python 3 is ready. :(

~~~
durin42
Python 3 is still very much not ready for a variety of complicated things, and
some large stacks will likely take years of effort to be ported.

Mercurial is an example; I think Twisted is another.

~~~
herge
From what I gather, one of the things holding Mercurial back is its support
for Python 2.4, although the arrival of %-formatting for bytestrings will
move the Mercurial port forward.
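For context, this is the feature in question: %-formatting on bytestrings was absent from early Python 3 and came back in Python 3.5 via PEP 461, which matters for byte-oriented codebases like Mercurial's. The example values below are purely illustrative.

```python
# Bytes %-formatting (PEP 461, Python 3.5+): format directly into
# bytes without round-tripping through str and an encoding.
path = b"/repo"
rev = 42
line = b"changeset %d in %s" % (rev, path)
print(line)  # b'changeset 42 in /repo'
```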

------
rayiner
Between Dart at Google, Hack at Facebook, and now Pyston at Dropbox, it's
really neat to me that there is a resurgence in interest in language
implementations. The field seemed quite moribund through most of the late
1990s and early 2000s.

~~~
ahomescu1
Dart and Hack are new languages, not new implementations of an existing
language. Pyston seems like a new VM for Python (more comparable to HHVM, PyPy
and V8).

~~~
rayiner
By "Dart" and "Hack" I meant the Dart VM and HHVM. What I mean to say is that
it's neat to see a renewed emphasis on serious, high-performance language
implementations, whether for existing or new languages. For a while it had
seemed that there was just JVM/CLR on one side and a bunch of interpreted
languages on the other.

~~~
ahomescu1
Hack and HHVM are completely separate projects (they're even written in
different languages: Hack in OCaml, HHVM in C++). They're both really
interesting projects though.

------
mamcx
Why not build this on top of LuaJIT?

I've also toyed with the idea of building a language, and my main contender is
LuaJIT (perhaps with Terra). On the other hand, Julia has been done on LLVM...

~~~
beagle3
LuaJIT is married so hard to Lua that it makes no sense to do so. Did you look
inside?

------
meemo
The "How it works" paragraph almost sounds like it was taken from a
description of how Julia works. (Not implying anything negative, btw.)

~~~
krick
It's pretty natural; we don't really have anything better than this stack, do
we? If you're writing a JIT, why invent your own backend if LLVM is good
enough? I mean, it's probably possible to write something better than
LLVM, but it would be quite an exceptional piece of work. And what that
section describes would be pretty typical for any duck-typed language lowered
to LLVM. The devil's in the details, anyway. Actually, at this level of detail
you can say that almost every cooking recipe sounds the same: "chop the
ingredients; put them in the kettle and put it on the fire for some time;
season".

~~~
sanxiyn
Many people (including myself) think PyPy is a better stack. We will see.

