
Monads to Machine Code (Part 1) - primodemus
http://www.stephendiehl.com/posts/monads_machine_code.html
======
pka
What's cool about monads is that the computation (assembly code in this
particular case) is just a plain old value.

So if you have a fragment like this:

    
    
        factorial :: Int64 -> X86 ()
        factorial n = do
          mov rcx (I n)
          mov rax (I 1)
          l1 <- label
          mul rcx
          loop l1
          ret
    

you can just compose this into a bigger program:

    
    
        two_factorials :: Int64 -> Int64 -> X86 ()
        two_factorials n1 n2 = do
          factorial n1
          factorial n2
    

Notice how the labels won't get mixed up. This concept (representing
computation by values) is incredibly powerful: one can compose, combine,
abstract values into bigger values. For example, compare the compositional
properties of RethinkDB's query language to SQL.
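A minimal sketch of why the labels can't get mixed up (this is an invented simplification, not the article's actual X86 type): if `label` draws from a counter threaded through the monad, every fragment gets fresh labels no matter how fragments are composed:

```haskell
import Control.Monad.State

-- Invented simplification of the article's X86 type: a State monad
-- threading a counter, so every call to `label` yields a fresh name.
newtype Label = Label Int deriving (Show, Eq)

type X86 = State Int

label :: X86 Label
label = do
  n <- get
  put (n + 1)
  return (Label n)

-- Two fragments each allocate a label; composed, they never collide.
twoLabels :: X86 (Label, Label)
twoLabels = do
  l1 <- label
  l2 <- label
  return (l1, l2)

main :: IO ()
main = print (evalState twoLabels 0)  -- prints (Label 0,Label 1)
```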

~~~
anon4
Correct me if I'm wrong, but isn't this basically the same as an HTML DSL,
i.e. functions which transform something like (in LISP)

    
    
      (html
        (head
          (title "hello"))
        (body
          (p "world")))
    

into
"<html><head><title>hello</title></head><body><p>world</p></body></html>"?
Except this takes assembly keywords and outputs machine code.
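The analogy holds even without monads: plain functions returning strings already give you computation-as-a-value (the helper names here are invented for illustration, not a real library):

```haskell
-- Invented helper for illustration: each tag is just a function from
-- a body string to a wrapped string, so pages compose as plain values.
tag :: String -> String -> String
tag t body = "<" ++ t ++ ">" ++ body ++ "</" ++ t ++ ">"

page :: String
page = tag "html" (tag "head" (tag "title" "hello")
                ++ tag "body" (tag "p" "world"))

main :: IO ()
main = putStrLn page
-- <html><head><title>hello</title></head><body><p>world</p></body></html>
```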

~~~
dmytrish
Yes, there is some similarity in the expressive power of Lisp macros and
Haskell monads (edit: raw Lisp macros are even more powerful, since they can
glue and transform anything into anything). However, there is a huge
difference in essence: Lisp macros work at the syntactic level, while monads
compose at the semantic level.

Lisp macros are not bound by types, so they can be combined regardless of a
program's semantics, potentially producing any kind of incorrect code as a
result. Such errors get especially tricky with several layers of
quasiquotation.

In Haskell, a program's meaning is partially encoded in its types, so the
compiler can understand your intention much better and assist you in getting
it right. Getting some kinds of inter-part optimization is also much easier in
Haskell: you know exactly what kind of pieces you are gluing together.
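As a tiny illustration of that point (the operand types here are invented for this sketch, not the article's actual definitions): once registers and immediates have distinct types, swapping them is a compile-time error rather than bad output:

```haskell
-- Invented operand types for illustration: giving registers and
-- immediates distinct types makes nonsense combinations uncompilable.
newtype Reg = Reg String
newtype Imm = Imm Int

mov :: Reg -> Imm -> String
mov (Reg r) (Imm i) = "mov " ++ r ++ ", " ++ show i

main :: IO ()
main = putStrLn (mov (Reg "rax") (Imm 1))
-- mov (Imm 1) (Reg "rax")  -- rejected by the type checker
```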

------
ch
Always fun to see this type of thing implemented in Haskell.

Here are some other examples.

This one uses indexed monads to keep the instruction stream correct:

[https://intoverflow.wordpress.com/2010/05/21/announcing-pote...](https://intoverflow.wordpress.com/2010/05/21/announcing-potential-x86-64-assembler-as-a-haskell-edsl/)

And this one is very similar to the OP, but from 2013, and emits 6502 assembly:
[http://wall.org/~lewis/2013/10/15/asm-monad.html](http://wall.org/~lewis/2013/10/15/asm-monad.html)
(of course without the JIT portion).

~~~
iheartmemcache
A bunch of people on HN were asking where to learn the conventions of x86 ASM
- this is actually a pretty good place to start. He picked the 8 important
registers and the 3 forms of addressing, demonstrated the hi/lo (ah/al) -> ax
-> eax -> rax evolution, and kept to the acc, count, data, GP conventions for
eax, ecx, edx, ebx, which is often overlooked (you can tell which era of x86
asm I grew up in ;)).

I'm scared this has the potential to be information overload for people who
aren't familiar with both an assembly language and Haskell, but I thought this
was a terribly neat (both in the "interesting" and the "structurally well
done" sense) article. As usual, fantastic work by Stephen. Up there with
Bartosz and Yang in quality of write-ups.

~~~
rdc12
I don't know Haskell (yet) but do have a small amount of assembly experience,
and I managed to follow along fairly well. However, I deliberately didn't
worry about fully understanding the monadic side of things.

------
brandonbloom
Awesome article!

Seems like a reasonably complete assembler too, with one notable exception:
forward jumps, conditionals specifically. These may be a tad tricky to fit
into a monad because they require either knowing the future (e.g. how far away
the jump target is) or rewriting the past (e.g. leaving a hole and filling it
in later).

I'd really like to see both approaches in Haskell, since they're kinda
different. My guess is that in the former case you'd have nested/delimited
monads with buffers. However, I don't know Haskell well enough to figure out
how to accomplish the hole-filling approach.

EDIT: On second thought, I guess both approaches (traditionally "two-pass" vs.
"one-pass" assemblers) can be accomplished by changing labels to be opaque
rather than addresses, and storing a symbol table in either JITMem or the
types that would replace it for a two-pass approach.
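The two-pass version of that idea can be sketched like this (all names invented for illustration): labels stay symbolic in pass one, and pass two resolves them against a symbol table:

```haskell
import qualified Data.Map as Map

-- Invented mini-ISA for illustration: every real instruction occupies
-- one address unit; label definitions emit nothing.
data Instr = Op String      -- ordinary instruction
           | Jmp String     -- jump to a named (opaque) label
           | Lbl String     -- label definition
  deriving Show

-- Pass 1: walk the stream and record each label's address.
addresses :: [Instr] -> Map.Map String Int
addresses = go 0 Map.empty
  where
    go _ m []           = m
    go a m (Lbl l : is) = go a (Map.insert l a m) is
    go a m (_     : is) = go (a + 1) m is

-- Pass 2: replace symbolic targets with resolved addresses.
assemble :: [Instr] -> [String]
assemble is = concatMap emit is
  where
    table = addresses is
    emit (Lbl _) = []
    emit (Op s)  = [s]
    emit (Jmp l) = ["jmp " ++ show (table Map.! l)]  -- partial: assumes the label is defined

main :: IO ()
main = mapM_ putStrLn (assemble [Op "mov", Jmp "end", Op "add", Lbl "end", Op "ret"])
```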

~~~
paulajohnson
I haven't looked at this one in detail, but MonadFix and the recursive do
syntax
([https://wiki.haskell.org/MonadFix](https://wiki.haskell.org/MonadFix)) will
probably solve this problem by allowing forward references.
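A sketch of what recursive do buys you for the one-pass flavor (the `Asm` monad here is an invented stand-in, not the article's X86 type): `jmp end` can mention a label bound further down, and MonadFix ties the knot because the address counter never depends on the jump target's value:

```haskell
{-# LANGUAGE RecursiveDo #-}
import Control.Monad.State

-- Invented stand-in for the article's X86 type: the state is
-- (current address, emitted code); every emit is one address unit.
type Asm = State (Int, [String])

emit :: String -> Asm ()
emit s = modify (\(a, out) -> (a + 1, out ++ [s]))

label :: Asm Int
label = gets fst

jmp :: Int -> Asm ()
jmp a = emit ("jmp " ++ show a)

-- `end` is used before it is bound; mdo (via MonadFix) makes this
-- legal as long as the jump target's value is only inspected lazily.
prog :: Asm ()
prog = mdo
  emit "mov"
  jmp end
  emit "add"
  end <- label
  emit "ret"

main :: IO ()
main = mapM_ putStrLn (snd (execState prog (0, [])))
```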

~~~
brandonbloom
That'll do for the hole-filling approach :-)

------
anyfoo
You can do even more with Arrows. This very short presentation goes from
Monads to Arrows, and explains why:
[https://www.dropbox.com/s/e3la5wueg1zsid7/Arrows%20for%20CPU...](https://www.dropbox.com/s/e3la5wueg1zsid7/Arrows%20for%20CPU%20Emulation.pdf?dl=0)

