
An x86 Assembler in 256 LOC (2017) - peter_d_sherman
http://blog.jeff.over.bz/assembly/compilers/jit/2017/01/15/x86-assembler.html
======
userbinator
Although not as regular as some RISCs, x86 is still quite regular which makes
it relatively easy to write an assembler for; the main opcode space is octal-
structured
([http://www.dabo.de/ccc99/www.camp.ccc.de/radio/help.txt](http://www.dabo.de/ccc99/www.camp.ccc.de/radio/help.txt))
and nearly 1/4 of the single-byte opcodes, 00-3F (0xx in octal) are where all
the frequently-used ALU ops (reg-reg, reg-mem, mem-reg) reside. The ability to
easily generate vast swaths of the opcode space from a compact description is
what makes a table-driven (dis)assembler easy to write.
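
To make the octal structure concrete, here is a minimal sketch in Python (not taken from the linked article). It assumes 32-bit operand size, and the names `ALU_OPS`, `FORMS`, `build_alu_table`, and `encode_reg_reg` are made up for illustration. The middle octal digit selects the operation and the low digit selects the operand form, so two short lists expand into 48 opcodes:

```python
# The eight ALU operations, indexed by the middle octal digit of the opcode.
ALU_OPS = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]

# Operand forms, indexed by the low octal digit (digits 6 and 7 are used for
# other things, such as segment pushes/pops and prefixes, so they are skipped).
FORMS = ["r/m8, r8", "r/m32, r32", "r8, r/m8", "r32, r/m32", "al, imm8", "eax, imm32"]

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def build_alu_table():
    """Expand the two lists above into an opcode -> (mnemonic, operands) map."""
    table = {}
    for op_index, mnemonic in enumerate(ALU_OPS):
        for form_index, operands in enumerate(FORMS):
            table[(op_index << 3) | form_index] = (mnemonic, operands)
    return table


ALU_TABLE = build_alu_table()


def encode_reg_reg(mnemonic, dst, src):
    """Encode a 32-bit `op dst, src` using the `r/m32, r32` form (low digit 1)."""
    opcode = (ALU_OPS.index(mnemonic) << 3) | 1
    # ModRM with mod=11 (register direct): reg field = src, r/m field = dst.
    modrm = 0xC0 | (REG32.index(src) << 3) | REG32.index(dst)
    return bytes([opcode, modrm])
```

With this sketch, `encode_reg_reg("xor", "eax", "eax")` yields the bytes `31 C0`, and `ALU_TABLE[0x29]` (octal 051) maps to `sub` in the `r/m32, r32` form.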

Related, an x86 (self-)disassembling demo in 256 _bytes_ :
[http://www.pouet.net/prod.php?which=16930](http://www.pouet.net/prod.php?which=16930)

------
huntie
(2017)

It's probably worth noting that this is only a runtime assembler, i.e. it does
no parsing. It also doesn't support all of the addressing modes for the
instructions that it does support. Nonetheless it does show that assemblers
aren't that hard. Adding support for the various addressing modes and amd64
does complicate things but not too badly. Moving forward from this you'd
probably want a better scheme for handling the various "extra" bytes (SIB,
REX, etc.).
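
As a rough illustration of one such "extra" byte, the sketch below (Python, not from the article; `REG64`, `rex`, and `mov_r64_r64` are hypothetical names) computes the REX prefix for a 64-bit register-to-register `mov`. The prefix carries the fourth bit of each register number, which does not fit in the 3-bit ModRM fields:

```python
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def rex(w, reg, rm):
    """Build a REX prefix byte, laid out as 0100WRXB.

    W selects 64-bit operand size, R extends ModRM.reg and B extends ModRM.rm.
    X (the SIB index extension) is left at 0 because no SIB byte is used here.
    """
    return 0x40 | (w << 3) | ((reg >> 3) << 2) | (rm >> 3)


def mov_r64_r64(dst, src):
    """Encode `mov dst, src` with opcode 0x89 (MOV r/m64, r64)."""
    reg = REG64.index(src)   # source goes in the ModRM reg field
    rm = REG64.index(dst)    # destination goes in the ModRM r/m field
    modrm = 0xC0 | ((reg & 7) << 3) | (rm & 7)
    return bytes([rex(1, reg, rm), 0x89, modrm])
```

For example, `mov_r64_r64("rax", "rcx")` gives `48 89 C8`, while `mov_r64_r64("r8", "rax")` gives `49 89 C0`: only the prefix changes when an extended register is involved.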

~~~
mhh__
Never having written an assembler for a proper ISA, I was under the impression
that assemblers for real CPUs are extremely simple until you start writing
them.

~~~
chrisseaton
Assemblers are one of those things that's very simple for the simple cases,
and then when you add more complex cases you start to think it's actually very
hard until you go back and add the proper abstractions, which would have
seemed needlessly complicated for the simple cases.

So people doing something trivial think they're easy (like this example),
people trying to do a bit more think they're really hard, and people doing an
entire assembler think they're easy again.

~~~
univerio
What are the proper abstractions necessary?

~~~
chrisseaton
Most simple cases of instructions seem straightforward to just go ahead and
emit the bytes for. Then when you start to use more addressing modes you
realise it gets to be a lot of code and it turns out there's a common pattern
for everything.

Here's a concrete example from an industrial assembler.

Almost every simple instruction boils down to this helper method, which is
parameterised by a bunch of flags and can then deal with all of them.

[https://github.com/oracle/graal/blob/f3a3576493f87abdea1045d...](https://github.com/oracle/graal/blob/f3a3576493f87abdea1045d6156a633e332550dd/compiler/src/org.graalvm.compiler.asm.amd64/src/org/graalvm/compiler/asm/amd64/AMD64BaseAssembler.java#L549)

Even that looks complicated! But writing this per instruction (the simple
case) would be insanely complicated.
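
To illustrate the general idea (this is a much-simplified sketch in Python for 32-bit mode, not the Graal helper linked above; `emit_mem_operand`, `mov_load`, and `add_load` are hypothetical names), a single function can produce the ModRM, SIB, and displacement bytes for any `[base + index*scale + disp]` operand, and each instruction then only supplies its opcode:

```python
import struct

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
SCALE_BITS = {1: 0, 2: 1, 4: 2, 8: 3}


def emit_mem_operand(reg_field, base, index=None, scale=1, disp=0):
    """Return ModRM [+ SIB] [+ disp] bytes for [base + index*scale + disp]."""
    base_n = REG32.index(base)
    # r/m=100 means "SIB follows", so esp as a base always needs a SIB byte.
    need_sib = index is not None or base == "esp"
    # mod=00 with base ebp means "disp32, no base", so ebp needs an explicit disp.
    if disp == 0 and base != "ebp":
        mod, disp_bytes = 0b00, b""
    elif -128 <= disp <= 127:
        mod, disp_bytes = 0b01, struct.pack("<b", disp)
    else:
        mod, disp_bytes = 0b10, struct.pack("<i", disp)
    rm = 0b100 if need_sib else base_n
    out = bytearray([(mod << 6) | (reg_field << 3) | rm])
    if need_sib:
        if index == "esp":
            raise ValueError("esp cannot be used as an index register")
        # index=100 in the SIB byte means "no index".
        index_n = 0b100 if index is None else REG32.index(index)
        out.append((SCALE_BITS[scale] << 6) | (index_n << 3) | base_n)
    return bytes(out) + disp_bytes


def mov_load(dst, base, **mem):
    """mov r32, [mem] (opcode 0x8B)."""
    return b"\x8b" + emit_mem_operand(REG32.index(dst), base, **mem)


def add_load(dst, base, **mem):
    """add r32, [mem] (opcode 0x03)."""
    return b"\x03" + emit_mem_operand(REG32.index(dst), base, **mem)
```

Here `mov_load("eax", "esp", disp=4)` produces `8B 44 24 04`. The special cases for `esp` and `ebp` live in one place, which is the kind of shared pattern described above.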

------
nathell
Here's my x86 assembler in 250 lines of Clojure code that uses Clojure data
structures as its input:
[https://github.com/nathell/lithium/blob/master/src/lithium/a...](https://github.com/nathell/lithium/blob/master/src/lithium/assembler.clj)

And a sample program:
[https://github.com/nathell/lithium/blob/master/examples/stri...](https://github.com/nathell/lithium/blob/master/examples/stripes.li.clj)

~~~
616c
That's super cool. I will definitely check this out. I live with a lot of
pentest folk and the traditional toolset with msfvenom and stuff doesn't teach
you to be creative or self-sufficient, so a lot is left to be desired; this is
an interesting opportunity to build a cooler tool for mediated shellcode
building!

------
stevekemp
Do check out the follow-up post on testing the code:

[http://blog.jeff.over.bz/assembly/2017/02/15/finding-machine...](http://blog.jeff.over.bz/assembly/2017/02/15/finding-machine-language-encodings.html)

And then how to execute the binary which is output from the assembler:

[http://blog.jeff.over.bz/assembly/compilers/jit/2017/03/30/e...](http://blog.jeff.over.bz/assembly/compilers/jit/2017/03/30/executing-dynamically-generated-machine-code.html)

------
akavel
As mentioned in other comments, it's maybe worth noting that it's an assembler
in library form, i.e. without code for a parser. But then, looking at it in
another way, it's a clever reuse of the C compiler's parser!

Now, in something of a "shameless plug": incidentally, I took a very similar
approach[1] recently, going for a "library form" too when writing an
experimental assembler for Dalvik (i.e. Android VM) bytecode. Not having to
write a parser helped me to iterate/prototype/hack faster!

[1]:
[https://github.com/akavel/dali/blob/d79cec81293abc5f3a87f4d8...](https://github.com/akavel/dali/blob/d79cec81293abc5f3a87f4d879a225991395b62b/src/dali.nim#L382-L422)

------
akkartik
Since we're discussing novel syntaxes for x86, here's one I've been working on
that should soon be self-hosting and come in under 2kLoC source and under 30KB
executable (excluding tests):
[https://github.com/akkartik/mu/blob/master/subx/Readme.md](https://github.com/akkartik/mu/blob/master/subx/Readme.md)

Stats:
[https://raw.githubusercontent.com/akkartik/mu/master/subx/st...](https://raw.githubusercontent.com/akkartik/mu/master/subx/stats.md)

------
rurban
This is typically used as a JIT, not as an assembler. My little JIT looks
similar. A JIT doesn't need to encode all ops, only a few. For jumps you want
to abstract named labels, and you want to patch in IPs you don't know yet.
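
A minimal sketch of that label-and-patch idea, in Python (the `CodeBuffer` class and its method names are hypothetical, and only `jmp rel32` is handled): forward jumps emit a placeholder and record a fixup, and the placeholders are filled in once every label's offset is known.

```python
import struct


class CodeBuffer:
    def __init__(self):
        self.code = bytearray()
        self.labels = {}   # label name -> offset in self.code
        self.fixups = []   # (offset of a rel32 field, label name)

    def emit(self, *byte_values):
        self.code.extend(byte_values)

    def label(self, name):
        """Bind a label to the current position."""
        self.labels[name] = len(self.code)

    def jmp(self, name):
        """Emit JMP rel32 (opcode 0xE9) with a placeholder displacement."""
        self.emit(0xE9)
        self.fixups.append((len(self.code), name))
        self.emit(0, 0, 0, 0)

    def finish(self):
        """Patch every recorded displacement and return the final bytes."""
        for pos, name in self.fixups:
            # rel32 is relative to the end of the jump instruction.
            rel = self.labels[name] - (pos + 4)
            self.code[pos:pos + 4] = struct.pack("<i", rel)
        return bytes(self.code)
```

Usage: calling `jmp("end")`, then `emit(0x90)`, then `label("end")`, then `emit(0xC3)` and finally `finish()` yields `E9 01 00 00 00 90 C3`, a jump over the single `nop`. Backward jumps go through the same path, since the label is simply already bound when `finish()` runs.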

------
bin0
The real question is: can you re-write it in assembly and make it self-
hosting?

~~~
HappyJoy
Good question - give it a shot

------
tempodox
There should be a contest between assembly languages where they're rated by
the simplicity and shortness of their respective assembler / disassembler
code.

------
ttflee
I wonder if it would be possible to train an assembler in a recurrent neural
network.

~~~
mruts
What do you mean?

~~~
ttflee
I mean an assembler like this decompiler:

[https://www.cs.unm.edu/~eschulte/data/katz-saner-2018-prepri...](https://www.cs.unm.edu/~eschulte/data/katz-saner-2018-preprint.pdf)

Since all these compilers/decompilers are transforming one language into
another, I cannot stop wondering whether they could be fit into some nonlinear
networks, or TensorFlow-as-a-(de)compiler.

------
ngcc_hk
HN never ceases to amaze me. Good work!!

