
Building Oil with the OPy Bytecode Compiler - yorwba
http://www.oilshell.org/blog/2018/03/04.html
======
jnbiche
So he writes Oil Shell in Python because he wants to get it done quicker than
he would in C. And yet now he's found that the performance of Python is
inadequate for a shell to compete with Bash (pretty predictable), and so
he's basically writing a new Python interpreter (both bytecode compiler and
VM), which (as he'll find out) is a huge task, regardless of how simple he
makes the interpreter.

That said, I think Oil Shell the shell language looks pretty cool, so I wish
he had done it in a language like C, Rust, or even OCaml (which he apparently
considered).

~~~
chubot
(author here)

Well, a key point is that it doesn't have to be an entire Python interpreter.
It just has to be the subset that Oil uses. (This applies to both Python
syntax and semantics.)

I also forked the 8K LOC bytecode compiler -- I didn't write it from scratch.
That's the point of "Cobbling together a Python interpreter".

So for a cartoonish view, compare these implementation strategies.

1\. Clone bash in C, which (charitably) would involve writing around 80K lines
of C code from scratch, given that bash is 160K lines. You can probably do
better than that by not accreting code over 30 years :-)

2\. Write 16K lines of Python, fork a bytecode compiler in Python (8K lines),
and then either fork CPython's VM or write your own.

The post implies that #2 is probably easier than #1. I haven't done the last
part of #2, so I could be wrong. I think it's an interesting experiment to
find out whether you can write a bespoke compiler/VM for a specific program in
a high level language.

\-----

It's a little bit like TeX and Pascal. I discovered a while ago that TeX
is written in an abstract subset of Pascal. Apparently the version you use on
most Linux distros is compiled to machine code via a Pascal-to-C translator.
I'm not sure if that translator was written specifically for TeX or not, but
if anyone knows, please chime in!

But I have seen enough of Knuth's code to know that he does not rely on
implementation details. That is roughly the case with Oil and Python. As I
mention in the post, Oil is written in Python+ASDL, not Python. And I would
even call it Python + ASDL + some abstract regex subset. That subset could be
called the intersection of Python regexes + re2c regexes, e.g. at the end of
this comment:
[https://news.ycombinator.com/item?id=16525378](https://news.ycombinator.com/item?id=16525378)
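To make that concrete, here is a hedged sketch (the rule and its name are illustrative, not Oil's actual lexer) of a regex written in that common subset -- plain character classes and repetition, no backreferences or lookaround -- so the same pattern works under Python's `re` and as an re2c rule:

```python
import re

# Illustrative, not Oil's actual code: a lexer rule written in the
# intersection of Python and re2c regex syntax -- only character
# classes and repetition, no backreferences or lookaround.
VAR_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'

pat = re.compile(VAR_NAME)
print(pat.match('foo_bar=1').group(0))  # foo_bar
print(pat.match('9abc'))                # None
```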

Basically when writing Oil, I try to be clear about the algorithms, and not
think in terms of implementation details of a particular platform. This is
possible for a shell because it's mostly string handling, and the only
libraries it uses are a handful of libc calls (which are 40 years old,
standardized, and well understood.)
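As a sketch of that point (not Oil's code, and assuming a POSIX system): the runtime core of any shell boils down to a few of those decades-old libc calls, which Python exposes directly:

```python
import os

# Sketch, not Oil's code: a shell's core runtime is a handful of
# old, standardized libc calls -- fork(2), execvp(3), wait(2) --
# all exposed directly by Python's os module (POSIX only).
def run(argv):
    pid = os.fork()
    if pid == 0:                   # child: replace this process
        os.execvp(argv[0], argv)
    _, status = os.waitpid(pid, 0) # parent: wait for the child
    return os.waitstatus_to_exitcode(status)

print(run(['true']))  # 0
```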

~~~
jnbiche
> Clone bash in C, which (charitably) would involve writing around 80K lines
> of C code from scratch, given that bash is 160K lines.

Or, third option, you could have used a language like Rust that would allow
you to write using a high-level programming language, and yet still you could
dip down to low-level code whenever needed for performance reasons.

Don't get me wrong, I'm a big Python fan for certain types of projects. And
now with Python 3 and asyncio, Python is great for writing I/O-bound code. But
it's not ideal, even if you confine your use to a Python subset, for a project
for which parsing is a critical part.

Finally, you may prove me wrong, but even using a subset of Python, I don't
think you're going to eke out the performance you're looking for using
CPython's VM. I say this simply because it's not a very performance-oriented
VM (partly because of the bytecode instruction set but also because it's
stack- instead of register-based, and some other architectural issues). So
then you're back to writing your own VM from scratch -- a significant
undertaking.

In any case, I wish you luck. Oil Shell's a cool project, so I hope it
succeeds.

~~~
chubot
I find Rust more interesting than I did at the start of the project, since I
ended up using algebraic data types all over the place, which Rust has. And it
doesn't have the runtime issues I mentioned with Go.

But the compile times are a dealbreaker. As mentioned in the post, if I had a
C++ compiler in my edit-run cycle, I would never have gotten the project
done. From what I understand, having a Rust compiler in there would be even
worse.

I can understand why you think Python is not ideal, and I listed many reasons
in the post why it's not. But that is why I used the name OPy -- OPy
purposefully diverges from Python. See the TeX analogy.

I agree it's an experiment. I didn't make any claims in the post; I
specifically mentioned that I would like to prove a point.

I agree that I will need to change the Python VM. If I write an entirely new
VM, the potential speedups are unbounded (and a bespoke VM for Oil's subset is
easier than rewriting the very flexible and capable Python VM from scratch).
Another thing I didn't mention: it is probably possible to make the shell VM
and the OPy VM converge. So Oil's runtime could be faster than any other
shell's, because every other shell is a tree interpreter rather than a
bytecode VM. (The parser might still be slower, but I don't expect that to be
an issue.)

\----

EDIT: On the other hand, I doubt you can write bash in Rust in less than, say,
50K to 80K lines of code. I think Rust will still be at least 3x more verbose
than Python -- it inherently expresses more detail, so this is fundamental. So
even if compile times were not an issue, I still probably wouldn't choose Rust.

Also, build dependencies matter a lot for a shell, since it's used to
bootstrap embedded systems. I conjecture that shells run all sorts of systems
that Rust programs have never run on. Rust isn't as portable as C or C++.

~~~
jnbiche
> I conjecture that shells run all sorts of systems that Rust programs have
> never run on.

Rust now runs on all kinds of embedded systems, including MSP430, most current
ARM architectures, MIPS, PowerPC, and even (via a forked version of Rust soon
to be merged) AVR. So it can run on most, if not all, major architectures that
shells run on.

------
montecarl
This post has more details on what OPy is:
[http://www.oilshell.org/blog/2017/04/09.html](http://www.oilshell.org/blog/2017/04/09.html)

~~~
abecedarius
I see the author didn't know about another Python-subset-in-Python,
[https://github.com/darius/tailbiter](https://github.com/darius/tailbiter) \--
I should've promoted it a little more, and maybe it'd have been helpful.

~~~
chubot
Actually I did find it a couple months ago! Someone asked me about it
yesterday:

[https://www.reddit.com/r/ProgrammingLanguages/comments/81wkg...](https://www.reddit.com/r/ProgrammingLanguages/comments/81wkgv/building_oil_with_the_opy_bytecode_compiler/dv6930s/)

I didn't know about it when I did the initial work for OPy in 2017 though.

I find the "expression-based" style quite interesting and I suspect it will
help me understand the "compiler2" code better. I plan to look at it in more
detail as I'm optimizing Oil.

This post links to a few posts that mention "byterun". These two pieces of
work are probably what pushed me over the edge to apply to Recurse Center! I'm
going from May-August this year. I'd love to connect with anyone interested in
this kind of thing (e-mail in my profile).

\-----

And if you have any advice on how to compile Python to more optimized code,
I'm interested. I assume this will involve creating some new VM instructions
(i.e. it can't just be done with the bytecode compiler).

This is a very concrete task -- the OSH parser is around 5,000 lines of code
that would be very annoying to port to another language. Essentially, it's 3
interleaved recursive descent parsers and a Pratt parser.
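For readers unfamiliar with the technique, here's a minimal Pratt-style (precedence-climbing) expression parser -- a generic sketch of the idea, not Oil's actual code:

```python
import re

# Minimal Pratt (precedence-climbing) parser -- a sketch of the
# technique only, not Oil's parser. Higher number = binds tighter.
PRECEDENCE = {'+': 10, '-': 10, '*': 20, '/': 20}

def tokenize(s):
    return re.findall(r'\d+|[+\-*/()]', s)

def parse(tokens, pos=0, min_prec=0):
    # Parse a primary expression: a number or a parenthesized expr.
    tok = tokens[pos]
    if tok == '(':
        left, pos = parse(tokens, pos + 1, 0)
        pos += 1  # skip ')'
    else:
        left, pos = int(tok), pos + 1
    # Consume operators of at least min_prec, recursing for the
    # right-hand side with a higher minimum precedence.
    while pos < len(tokens) and PRECEDENCE.get(tokens[pos], -1) >= min_prec:
        op = tokens[pos]
        right, pos = parse(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, pos

tree, _ = parse(tokenize('1+2*3'))
print(tree)  # ('+', 1, ('*', 2, 3))
```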

I already have benchmarks that show it's 40-50x too slow:

[http://www.oilshell.org/release/0.5.alpha2/benchmarks.wwz/os...](http://www.oilshell.org/release/0.5.alpha2/benchmarks.wwz/osh-parser/)

Leaving aside the rest of the shell (which is not big either), I think it's an
interesting question if you can recover that factor of 40-50 without rewriting
the code. It's written in a pretty "static" style without much dynamism. You
don't need any special language features in a recursive descent parser.

I already did something like this. I wrote a whole bunch of Python regular
expressions for the lexer, then compiled it to C code via re2c:

 _When are Lexer Modes Useful?_
[http://www.oilshell.org/blog/2017/12/17.html](http://www.oilshell.org/blog/2017/12/17.html)

re2c code: [http://www.oilshell.org/blog/2017/12/files/osh-lex.re2c.h.ht...](http://www.oilshell.org/blog/2017/12/files/osh-lex.re2c.h.html)

(And to anticipate a question from passers by: it does not make sense to use a
parser generator here -- I wrote about this extensively on the blog, e.g.
[http://www.oilshell.org/blog/tags.html?tag=parsing#parsing](http://www.oilshell.org/blog/tags.html?tag=parsing#parsing))

~~~
abecedarius
Neat! I'll probably be around for the alumni week in May at Recurse Center,
and maybe we could get to chat then.

For optimization I guess it comes down to making productive restrictions on
the Python dialect to rule out some of the extreme dynamism. This sounds like
a really cool project, one that's too big for me to have much idea what'd help
without investing more time. What you're doing in stripping down CPython
reminds me a little of how Luke Gorrie started adapting LuaJIT to his own
purposes:
[https://github.com/raptorjit/raptorjit](https://github.com/raptorjit/raptorjit)

~~~
yorwba
In my experience, most uses of dynamism in Python are concentrated at
initialization time. Dynamic features like namedtuple or decorators are
usually executed only once, when a module is imported, and essentially
implement a kind of code generation that directly creates the intended
objects.

I wonder whether it might be feasible to import a Python module (running all
the initialization code) and then walk the reachable object graph to serialize
it into code in a more static subset.
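A hedged sketch of that idea (every name here is made up for illustration): run the import-time code, then walk the module's namespace and emit the already-computed simple values back out as static literals:

```python
import types

# Sketch only: after a module's import-time metaprogramming has run,
# walk its namespace and serialize the results as static code.
def freeze(name, value):
    # Only simple, already-computed values are handled here;
    # anything richer would need its own serializer.
    if isinstance(value, (int, float, str, bytes, tuple, frozenset)):
        return '%s = %r' % (name, value)
    return None

def freeze_module(mod):
    lines = []
    for name, value in sorted(vars(mod).items()):
        if name.startswith('_') or isinstance(value, types.ModuleType):
            continue
        line = freeze(name, value)
        if line:
            lines.append(line)
    return '\n'.join(lines)

# Simulate a module whose import-time code built a lookup table.
mod = types.ModuleType('lexer_tables')
mod.KEYWORDS = tuple(sorted(['if', 'then', 'fi']))
print(freeze_module(mod))  # KEYWORDS = ('fi', 'if', 'then')
```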

~~~
chubot
Yes absolutely! I've long been interested in multi-stage programming for this
reason.

Oil is filled with this pattern: do a bunch of metaprogramming at startup to
make some data. Then use that immutable data for the rest of the program.
There are very much two stages.
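A miniature of that pattern (the names are illustrative, not Oil's): stage one runs once at import time and builds immutable data; stage two, the rest of the program, only reads it:

```python
import re

# Stage 1: "metaprogramming" at import time -- derive a combined
# pattern from a token table and compile it once.
_TOKEN_SPECS = [('NUMBER', r'\d+'), ('WORD', r'\w+'), ('SPACE', r'\s+')]
LEXER = re.compile('|'.join('(?P<%s>%s)' % spec for spec in _TOKEN_SPECS))

# Stage 2: ordinary runtime code that treats LEXER as a constant.
def tokens(line):
    return [(m.lastgroup, m.group()) for m in LEXER.finditer(line)]

print(tokens('echo 42'))
# [('WORD', 'echo'), ('SPACE', ' '), ('NUMBER', '42')]
```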

I think Lua-Terra might be closest to the thing I want, although I haven't had
a chance to play with it:

[http://terralang.org/](http://terralang.org/)

I mention Bob Nystrom's language Magpie, which has an interesting model. Do
the type checking at main(), not after parsing! But everything that happens
before main(), at import time, is metaprogramming!

 _A Problem with Type Checking_
[http://www.oilshell.org/blog/2016/11/30.html](http://www.oilshell.org/blog/2016/11/30.html)

I think I want to do compilation/optimization right before main(), not just
type checking. It's all a bit vague right now, but I think OPy can go in this
direction. Having the compiler written in its own language facilitates this.
You can run code first, and then compile.

This post is also related:

 _Type Checking vs. Metaprogramming; ML vs. Lisp_
[http://www.oilshell.org/blog/2016/12/05.html](http://www.oilshell.org/blog/2016/12/05.html)

Also note that C++ is a two-stage language too. In fact, Herb Sutter just
proposed unifying the two stages, e.g. letting you use the STL with
constexpr at compile time. Link on this page:

[https://github.com/oilshell/oil/wiki/Metaprogramming](https://github.com/oilshell/oil/wiki/Metaprogramming)

I think your suggestion is exactly what I've been thinking, so if you want to
talk more about it / work on it, feel free to mail me :) The code needs a bit
of work but I think it's a promising direction.

