
Compilers for Free - grawprog
https://codon.com/compilers-for-free
======
abeppu
Why is it that 35 years after the Futamura projections were first described,
when we want to produce a new compiler, we don't write an interpreter and pass
it through a high quality, general partial evaluator?

Whenever the Futamura projections are described, it's always at a conceptual
level, in general terms. Reading a blog post like this one, I'm always left to
wonder (a) why does this not end with a link to a github implementation of a
general partial evaluator and (b) why is it the case that none of the
compilers I use were produced with these techniques?

There must be some set of tradeoffs preventing this approach from living up to
the "compilers for free" promise, but I haven't found anyone with a real
explanation. Either there must be hard challenges which make implementing a
general partial evaluator very difficult, or the results that we'd get from
using such a partial evaluator are poorer than what we get through other
means, right?

Perhaps the greatest success in this direction is PyPy. At some level the
RPython toolchain is general enough that people have used it to produce JITted
versions of Ruby, PHP and a few other languages. But these are a handful of
alternative implementations to existing languages, where the predominant
implementation doesn't use this toolset. And PyPy's own docs note that the JIT
compiler needs some amount of hinting from the interpreter; you can't just
throw any interpreter (written in RPython) at it.
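For anyone who has only seen the projections described conceptually, here is a deliberately toy sketch in Python (the mini-language and all names are invented for illustration, not taken from the article): specializing an interpreter to a fixed program yields something that behaves like a compiled version of that program, which is the first Futamura projection in miniature.

```python
# Toy illustration of the first Futamura projection (hypothetical
# mini-language): fixing an interpreter's program argument gives a
# function that acts like a compiled version of that program.
from functools import partial

def interp(program, env):
    """Interpret nested tuples: ('+', a, b), ('*', a, b),
    ('var', name), or ('lit', n)."""
    op = program[0]
    if op == 'lit':
        return program[1]
    if op == 'var':
        return env[program[1]]
    a = interp(program[1], env)
    b = interp(program[2], env)
    return a + b if op == '+' else a * b

# The "static" input: a program computing x * x + 1.
square_plus_one = ('+', ('*', ('var', 'x'), ('var', 'x')), ('lit', 1))

# Degenerate "partial evaluation": fix the program argument.
compiled = partial(interp, square_plus_one)

print(compiled({'x': 5}))  # -> 26
```

Of course, `functools.partial` only fixes the static argument; it performs no specialization. A real partial evaluator would also unfold `interp`'s dispatch on `program`, leaving residual straight-line code, and producing that residual program well is exactly the hard part being asked about.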

~~~
dtech
> Why is it that [...] when we want to produce a new compiler, we don't write
> an interpreter and pass it through a high quality, general partial
> evaluator?

As far as I know, it has been tried numerous times and was an area of active
research in the past, but the results were unsuccessful, mainly because good
error reporting and good performance, both compile-time and run-time, are hard
to achieve.

It's a little bit like pure functional programming: super elegant in
principle, hard to get right and make usable in practice.

~~~
chrisseaton
> As far as I know, it has been tried numerous times and has been an area of
> active research in the past, but results turned out unsuccessful.

TruffleRuby is a Ruby implementation that works exactly this way: by writing
an interpreter and passing it through a general partial evaluator. It's
successful enough to pass almost all of the Ruby spec and to run real Ruby
applications, including their C extensions.

[https://github.com/oracle/truffleruby](https://github.com/oracle/truffleruby)

------
sifoobar
The more I learn about interpreters and compilers, the more everything looks
the same.

Snigl [0] is designed as a VM interpreter. But it has a separate compile stage
where code may be evaluated using the same interpreter and the results reused
at runtime.

And it traces code before running it, evaluating as much as possible
statically.

Finally, it supports emitting C code for an interpreter preloaded with the
current, traced program, which is where compile-time evaluation really shines,
as the results are embedded in the executable.

[0] [https://gitlab.com/sifoo/snigl](https://gitlab.com/sifoo/snigl)
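The "evaluating as much as possible statically" step above can be sketched as simple constant folding over an AST. This is an illustrative Python sketch, not Snigl's actual mechanism, and all names are hypothetical:

```python
# Compile-stage evaluation as constant folding: subtrees whose operands
# are all known at compile time are replaced by their results, so only
# the parts that depend on runtime values survive into the executable.
def fold(node):
    if not isinstance(node, tuple):
        return node  # a literal int or a variable name
    op, *args = node
    args = [fold(a) for a in args]          # fold children first
    if op in ('+', '*') and all(isinstance(a, int) for a in args):
        return args[0] + args[1] if op == '+' else args[0] * args[1]
    return (op, *args)                      # rebuild with folded children

# (2 * 3) + x folds to 6 + x; only the variable part is left for runtime.
print(fold(('+', ('*', 2, 3), 'x')))  # -> ('+', 6, 'x')
```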

~~~
gizmo686
I have written far more compilers/interpreters in my career than I probably
should have, and I reached the same conclusion. For most of them, I probably
couldn't even tell you whether or not what I had involved a compiler. At this
point, my working distinction is that a compiler produces a serialized
representation. If you feed the AST directly into the runtime, then you just
have an interpreter. I don't see any practical benefit to this distinction,
but it seems to match up with what people consider a "compiler".

As an aside, I had a similar insight when I built a computer from logic
gates [0]. Essentially, the control unit had an EEPROM of microcode, and the
logic to execute an actual instruction was to use the upper 4 bits of the
instruction as an index into a lookup table, then write the result to the
microcode program counter.

Eg:

    
    
   switch (OP >> 4):
    
   case 1: jmp &add
   case 2: jmp &sub
   ...
    

Which looks a lot like an interpreter.

It wouldn't surprise me if modern CPUs actually added an additional step,
where they first translate (compile?) the input stream into some internal
instruction set, and then have microcode that acts as an interpreter for
that.

[0] Also EEPROM, and various composite circuits.
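The opcode dispatch described above can be sketched as a jump table; this Python sketch uses hypothetical opcodes and handler names, not real microcode:

```python
# Microcode-style dispatch: the high nibble of the instruction indexes a
# lookup table of handlers (standing in for the EEPROM microcode table),
# and the low nibble is the operand.
def add(cpu): cpu['acc'] += cpu['arg']
def sub(cpu): cpu['acc'] -= cpu['arg']

HANDLERS = {1: add, 2: sub}  # index -> "microcode" routine

def step(cpu, instruction):
    cpu['arg'] = instruction & 0x0f      # low 4 bits: operand
    HANDLERS[instruction >> 4](cpu)      # high 4 bits: which routine

cpu = {'acc': 10, 'arg': 0}
step(cpu, 0x13)  # opcode 1 (add), operand 3
step(cpu, 0x25)  # opcode 2 (sub), operand 5
print(cpu['acc'])  # -> 8
```

The fetch/decode/dispatch loop here is exactly the shape of a bytecode interpreter's main loop, which is the resemblance the comment points out.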

~~~
gmueckl
All modern x86 CPUs do this to some extent. The details have changed over the
decades, but the basic principle these days is that the CPU decodes the x86
instruction set into a secret, internal-only, more orthogonal sequence of
instructions. My understanding is that these sequences can be quite long for
the more involved CISC operations. And according to the manuals written by
Agner Fog, some instructions even have special cases based on their operands
that do something equivalent, but much faster. One particular example that I
remember is xor ax,ax, which is often used to set ax to zero because of its
compact encoding. Modern CPUs don't compute the xor in the ALU, but just
rename ax to a register that is set to 0.

~~~
teddyh
So in practice, nobody uses actual x86 assembly anymore, but everybody uses it
as a target for compilation. It’s just like _The Birth & Death of
JavaScript_¹, but non-fictional, and on a lower level – a _Birth & Death of
x86 assembly_, if you will.

1\. [https://www.destroyallsoftware.com/talks/the-birth-and-
death...](https://www.destroyallsoftware.com/talks/the-birth-and-death-of-
javascript)

~~~
gameswithgo
>nobody uses actual x86 assembly anymore,

Well, almost.

------
dmux
The description immediately reminded me of a book called "Understanding
Computation." Scrolling down I see the video is by Tom Stuart, the author of
that book. I'd really recommend it (the book) for anyone interested in
learning about "big step" (operational) and "small step" (structural)
semantics.

------
Dangeranger
Tom Stuart, the author of the post, also has written an interesting article
called "Programming with Nothing" in which he demonstrates how to use first-
class functions in Ruby to explain the Lambda Calculus.[0]

[0] [https://codon.com/programming-with-
nothing](https://codon.com/programming-with-nothing)
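The core trick of that article can be sketched with Python lambdas instead of the article's Ruby blocks (a minimal illustration, not the article's code): numbers are encoded purely as functions (Church numerals), so arithmetic needs nothing but function application.

```python
# Church numerals: the number n is "apply f, n times". No built-in
# integers are used except in the decoder at the end.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add  = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_int(n):
    """Decode a Church numeral by counting applications of f."""
    return n(lambda k: k + 1)(0)

two = succ(succ(zero))
three = succ(two)
print(to_int(add(two)(three)))  # -> 5
```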

------
fjfaase
I have implemented this method of parsing an expression into objects with an
evaluation method in the scripting language that I built into an editor for
performing complex manipulations of images. See
[http://www.iwriteiam.nl/MySample.html](http://www.iwriteiam.nl/MySample.html)

------
iamNumber4
Nothing to see here, peaked my interest since “who would pay for a compiler??”
In the modern age of open source.

Post has a missleading title, and is just rambles about ruby and compilers.

~~~
4thaccount
It depends on what you want. If I wrote aerospace code I'd probably buy
AdaCore's Ada and SPARK compilers and tools.

Since I mostly do scripting work, I usually stick to Python, Bash, &
Powershell though.

Some commercial compilers are very affordable with extremely sane licenses.
One that has interested me lately is called 8th and is $250 for a professional
license which lasts forever and can create encrypted binary executables on
Linux, Mac, Windows, Raspberry Pi, iPhone, Android, or embedded, all with a
cross-compiler, builtin graphics, sound, database support, full REPL, good
support...etc. There is also a free version that just has some features
missing (not a dumb temporary trial) and a $50 hobby license with many of the
features. The Enterprise license would be expensive for me, but fairly cheap
for most companies and includes priority support and the source code, so that
is neat. I'm all for open-source, but if we're honest, some commercial tools
are great and don't abuse users. It is not an only option situation.

I will say closed source can really stink too. I once had to use a product
that all of the vendor's customers had to use, but it wasn't officially
supported (although it was critical for infrastructure) and I was running into
a bug they couldn't figure out or didn't care to look into. The terminal would
tell me the problem was in line x of the encrypted script, but wouldn't let me
know what line x did so I could get the app working.

~~~
vram11
>One that has interested me lately is called 8th

Link, please? Did not find a page about the makers of 8th, though I googled
and saw some other related pages I'm reading.

~~~
reitzensteinm
It's probably this one:

[https://8th-dev.com/](https://8th-dev.com/)

~~~
4thaccount
Yeah, it's this one. The main developer wrote an open source Forth called Reva
Forth a while back. I think he does a lot of paid development across various
fields, from phone app development to embedded work and machine learning
stuff. When he was trying to do phone apps he realized (like most of us) that
it is a pain to need to know Java for Android (+ API) and Objective-C or Swift
for iOS, and then there is WinForms or Gtk or wxWidgets for desktop
development, so you have to learn a lot of different technologies if you want
to write software to run on all major platforms; 8th helps solve most of
that problem. 8th is also a Forth, but he has added some features like using
JSON as a generic data structure (not generally done in Forth), bundled the
C++ JUCE GUI framework and associated licenses, added ODBC/SQLite/MySQL
support, sound, cross-compilation to many platforms...etc.

Most of my coding needs fall into either general scripting (Python,
Powershell, Bash, Perl...etc) or numerical work (Python + Numpy or Julia). 8th
seems to be pretty speedy, but Forths in general usually aren't lightning fast
due to their emphasis on stack usage, and you need really fast code for
numeric work. He has made it a priority to work on performance, though, and
continuously adds scientific features like adjacency graphs, so he is on the
right track and I'll keep my eyes on it. For general scripting work and
utility apps it should work pretty nicely for my needs and distributing a tiny
.exe to users with a GUI is nice. I'm not sure how good the file I/O &
filesystem support is though (need to look at the doc), because I use those
common features a lot. I also need to check if he wrote functions (called
words in Forth parlance) for web scraping, as there is a project I'd like to do
where 8th would be ideal.

If you've never seen Forth before, try to keep an open mind with the syntax,
as it can look alien at first. There are a lot of 8th examples on Rosetta
Code, and although they are short, it might look like gobbledygook if you're
not used to thinking about the stack, where parameters are popped off and used
in functions which then return data to the stack so functions can be chained
(kind of like Unix pipes). Leo Brodie's books on Forth (classics and free
online) are called "Starting Forth" and "Thinking Forth" and are recommended
on HN sometimes.

~~~
vram22
Thanks a lot for the detailed answer.

I did try out Forth a little, much earlier, and had read some of the Starting
Forth book. I had bought the hardcover edition; it looked really good,
including the cover: dark brown, either leather or a leather effect (more
likely the latter). And I liked Leo's writing. Will check out 8th at some
time. The cross-platform GUI thing with tiny EXEs is a pro (Rebol and Red are
similar, except I'm not sure Rebol supported EXEs, except maybe via 3rd-party
tools), although not having good I/O and filesystem support would be a con,
since a lot of the stuff I do involves file I/O.

Still, it may be worth trying 8th out.

~~~
4thaccount
I was going to mention that it is a lot like Rebol and Red in philosophy, but
you beat me to it.

Rebol was mostly built by one person (Carl Sassenrath) but has to package the
entire interpreter (I think maybe 5 MB) to make an .exe if you have the paid
version. Red can already make executables and has made fantastic progress, but
it is a larger effort. I check up on Red once a month and think 1.0 will be
amazing.

~~~
vram22
>I was going to mention that it is a lot like Rebol and Red in philosophy, but
you beat me to it.

I agree, the philosophies are similar. Minimalism, relying more on your own
code than libraries, not generalizing your code for a lot of cases that might
not happen, etc.

One thing that puzzles me about Forth, though:

I did read a few books about it (much earlier, and I'm not a language
designer, so what I say next could be wrong). Some of them talked a bit about
Forth's internal design. I understood from that, that when multiple words
(Forth subroutines) are defined, they are "threaded", from which I understood
that they are added to a linked list of word definitions. Wondered why it
needed to be a linked list. If it was a lookup table of some kind, like a
Python dict or a Ruby hash (basically an associative array), looking up word
definitions to run them would be faster than in a linked list, which is
linear, I'd think.

Edit: I did just have an idea of what the reason for that may be, but will
wait a bit to see if anyone replies with their explanation. Hint about my
idea: it may have to do with Charles Moore's Forth philosophy - just a guess
on my part.
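One classic answer (a guess at the idea hinted at above, not an authoritative account of any particular Forth): a linked list searched newest-first makes a redefined word automatically shadow the old one, and makes forgetting words a matter of truncating the list, at the cost of linear lookup. A minimal Python sketch, with all names invented:

```python
# A Forth-style dictionary as a linked list of entries, newest first.
class Entry:
    def __init__(self, name, body, prev):
        self.name, self.body, self.prev = name, body, prev

latest = None  # head of the dictionary: the most recently defined word

def define(name, body):
    global latest
    latest = Entry(name, body, latest)   # push onto the front of the list

def find(name):
    entry = latest
    while entry is not None:             # linear search, newest first
        if entry.name == name:
            return entry.body
        entry = entry.prev
    raise KeyError(name)

define('inc', lambda n: n + 2)   # oops, a buggy definition
define('inc', lambda n: n + 1)   # the fix shadows the old entry
print(find('inc')(4))  # -> 5
```

A hash table would need extra machinery to get this shadowing and forgetting behavior, and on the tiny machines Forth grew up on, the simplicity of a linear list fits Moore's minimalism; many Forths reportedly add hashing on top only as an optimization.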

~~~
4thaccount
I'm very much a noob when it comes to Forth and I've asked myself the same
question before. I think "threaded" is generic enough to not necessarily
dictate the data structure. It more or less means that a word (if it isn't
being defined and added to the dictionary; not a dictionary as in Python, but
what Forth calls its list of words) will read from the stack, modify the
stack, and then call the next word.

Someone on here or on the Forth subreddit should be able to answer though.

