Flat assembler: x86 assembler that does multiple passes to optimize machine code (flatassembler.net)
71 points by vmorgulis on Jan 3, 2016 | 20 comments

See also: The newer flat assembler g, a generic assembler not tied to a particular architecture. Download http://flatassembler.net/fasmg.zip and read docs/manual.txt. There are examples for 8086, 8052, AVR, and Java Virtual Machine. http://board.flatassembler.net/topic.php?t=17952

Another see also: STOKE: https://github.com/StanfordPL/stoke-release

"STOKE is a stochastic optimizer for the x86_64 instruction set. STOKE uses random search to explore the extremely high-dimensional space of all possible program transformations. Although any one random transformation is unlikely to produce a code sequence that is both correct and an improvement over the original, the repeated application of millions of transformations is sufficient to produce novel and non-obvious code sequences that have been shown to outperform the code produced by general-purpose and domain-specific compilers, and in some cases expert hand-written code."
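The random-search idea in that description can be illustrated with a toy sketch in Python. Everything here is hypothetical and is not STOKE's code or API: the four-instruction "machine", the `mutate` and `search` helpers, and the cost function are made up for illustration. It is also a simplification in two ways: it greedily rejects every candidate that scores worse, and it only compares outputs on a handful of test inputs rather than proving equivalence.

```python
import random

# A made-up single-accumulator instruction set (an assumption for this sketch).
OPS = {
    "add1": lambda x: x + 1,
    "dbl":  lambda x: x * 2,
    "neg":  lambda x: -x,
    "nop":  lambda x: x,
}

def run(program, x):
    """Execute the program on input x and return the accumulator."""
    for op in program:
        x = OPS[op](x)
    return x

def cost(program, expected):
    """Wrong answers dominate; among correct programs, shorter is better."""
    wrong = sum(run(program, x) != y for x, y in expected.items())
    return 1000 * wrong + len(program)

def mutate(program, rng):
    """Apply one random transformation: replace, delete, or insert."""
    p = list(program)
    move = rng.choice(["replace", "delete", "insert"])
    if move == "replace" and p:
        p[rng.randrange(len(p))] = rng.choice(list(OPS))
    elif move == "delete" and p:
        del p[rng.randrange(len(p))]
    else:
        p.insert(rng.randrange(len(p) + 1), rng.choice(list(OPS)))
    return p

def search(reference, tests, steps=5000, seed=0):
    """Keep a candidate whenever it is no worse than the current best."""
    rng = random.Random(seed)
    expected = {x: run(reference, x) for x in tests}
    best, best_cost = list(reference), cost(reference, expected)
    for _ in range(steps):
        candidate = mutate(best, rng)
        c = cost(candidate, expected)
        if c <= best_cost:
            best, best_cost = candidate, c
    return best

# A deliberately wasteful program that computes 2 * (x + 2).
reference = ["add1", "nop", "add1", "neg", "neg", "dbl", "nop"]
tests = range(-8, 9)
best = search(reference, tests)
```

Because an incorrect candidate always costs at least 1000, the result agrees with `reference` on every test input and is never longer than it.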

If I understand this right, fasm_g is the macro-language of fasm with some additions and the ability to output bytes.

That's really quite something. Is it comparable to the current fasm yet?

See the board post in my parent comment for a comparison to fasm 1 by the author.

The most interesting points of flat assembler are its elegant syntax, its "Same source - same output" philosophy, and a very powerful macro system that consists of a preprocessing stage and an assembling stage; the fasm preprocessor has been shown to be Turing-complete [0].

In 2008 I played with it: I created macros that patch a binary file (and used them to extend a closed-source Windows program [1]), built the PE format from scratch [2], and wrote a simple PE encryptor [3].

[0]: http://board.flatassembler.net/topic.php?t=6624

[1]: http://board.flatassembler.net/topic.php?t=8876

[2]: http://board.flatassembler.net/topic.php?t=8632

[3]: http://board.flatassembler.net/topic.php?t=8951

We changed the URL from http://flatassembler.net/docs.php?article=manual#2.3.3 to the root page because it looks like this project hasn't had attention on HN before.

A better link might be http://flatassembler.net/docs.php?article=design which helps explain wtf this is and why anyone should care.

The introduction to SSE is not bad http://flatassembler.net/docs.php?article=manual#2.1.15

What would really get me excited is an assembler that does more intelligent things — automatic register allocation at a minimum. I don't know why in 2015 I still have to manually track which register I'm using for what. This gets particularly problematic with MMX/SSE/AVX registers. I think the x86 assembly world would benefit from learning about features found in DSP assemblers (I'm thinking of TI C6000 tools in particular).

Your wish has been granted. It's called C and it's been around since the 70's.

I will take an educated guess that none of the responders to my comment (and I'll just post this one reply to all) have written significant amounts of assembly code, especially using MMX/SSE/AVX, and so will respectfully disagree with all of the responses.


* Writing assembly in C is an exercise in frustration: just try to see anything past the syntax. The same goes for intrinsics, with the added bonus of having to check whether the code you got is actually what you wanted.

* No, you do not want to manually keep track of all 32 of your AVX-512 registers, trust me (for x86+SSE I used printed register allocation tables to help me keep track of what is where at which stage). Heck, even on a measly ARM Cortex-M, managing your R0-R12 (and not all instructions operate on all of them) can get annoying. The idea that you can keep registers in your head stems from the days when we had AX-DX to work with.

"Assembly" doesn't have to mean doing everything manually (and macros, yay!), really. There is a middle ground, where you get a reasonably intelligent assembler which does a lot of the manual grunt work for you. There are also more intelligent assemblers which, for example, take your linear assembly and convert it into VLIW, allocating processing units. Or reorder instructions around branches.

Assembly language should be thought of as a spectrum of tools.

This is a serious question: what's the point of programming in assembly if you don't want to control the registers? And what do you do if the assembler does a poor job of register allocation? If you hand optimize it, you're back in normal assembly; if you ignore it, you might as well not check in the first place.

Once you sacrifice the registers, assembly just becomes a non-portable programming language without any libraries. Unless you build it out with macros and capitalize on it being a programmable programming language, but then why not go with Forth or some kind of Lisp?

What does no-register assembly gain you that higher level languages don't already offer?

What do you mean automatic register allocation? Your example doesn't really make sense to me.

mov 5, eax

mov eax, ebx

It seems like a prerequisite to know what you're doing with the registers themselves. I have only taken a basic systems class that taught y86, so my knowledge of building anything production-worthy in assembly is nil. This just caught my eye and I thought I'd ask!

Something like

mov 5, a

mov a, b

Where `a` gets `eax` and `b` gets `ebx` during register allocation, and the lifetime analysis of each variable makes sure that every live variable has a register.
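To make that concrete, here is a minimal sketch in Python of lifetime analysis followed by register assignment over straight-line code. The tuple-based instruction format, the helper names (`live_ranges`, `allocate`, `rewrite`), and the two-register pool are all hypothetical and chosen for illustration; a real allocator would also have to handle branches and spill to the stack when it runs out of registers.

```python
def live_ranges(program):
    """Map each variable to (first_index, last_index) where it appears."""
    ranges = {}
    for i, (_op, *operands) in enumerate(program):
        for v in operands:
            if isinstance(v, str):      # ints are immediates, strs are variables
                first, _ = ranges.get(v, (i, i))
                ranges[v] = (first, i)
    return ranges


def allocate(program, registers):
    """Assign a register to each variable in order of first use, reusing a
    register once its variable's last use has passed. Raises if the pool
    runs out (spilling is not implemented in this sketch)."""
    ranges = live_ranges(program)
    free = list(registers)
    active = []                         # (end, var) pairs currently holding a register
    assignment = {}
    for var, (start, end) in sorted(ranges.items(), key=lambda kv: kv[1][0]):
        # Release registers whose variable died before this one starts.
        for item in [a for a in active if a[0] < start]:
            active.remove(item)
            free.append(assignment[item[1]])
        if not free:
            raise RuntimeError(f"out of registers at {var!r}")
        assignment[var] = free.pop(0)
        active.append((end, var))
    return assignment


def rewrite(program, assignment):
    """Substitute physical registers for variable names."""
    return [
        (op, *[assignment.get(x, x) if isinstance(x, str) else x for x in operands])
        for op, *operands in program
    ]


# The two-instruction example (source operand first), plus a third variable
# that can reuse eax because `a` is dead by the time `c` is defined.
program = [
    ("mov", 5, "a"),
    ("mov", "a", "b"),
    ("mov", "b", "c"),
]
regs = allocate(program, ["eax", "ebx"])
allocated = rewrite(program, regs)
```

Here `a` and `b` overlap at the second instruction, so they get different registers, while `c` takes over `eax` once `a` is no longer used.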

BASM (Borland's assembler, built into Delphi and BCB) does this. You just freely reference variables defined outside your 'asm' block and the compiler determines how to map them to registers, stack offsets, etc.

I used CodeWarrior for PowerPC a few years ago and it did something similar. You did have to declare register variables with `register`, though. (Being RISC, all instructions took register operands exclusively, so the distinction between memory and register was fairly important.)

This worked really well - the compiler would sort out spills and reloads at the start and end of asm blocks, so you could generally just write one or two instructions per asm block, then have an assert, then on to the next asm block. Made debugging amazingly easy, and I never had to care about the ABI.

For day-to-day programming it was certainly streets ahead of the nonsense gcc foists on you.

There is an assembler that does more intelligent things - https://github.com/Maratyszcza/PeachPy

fasm has pretty powerful macros that could do something like this, though at that point, why are you using an assembler directly?

I've been looking for something like the FAT12 bootloader from the examples section for some time.

Not really related to the main post, but still a big help!

The word "optimize" is overused here. Fasm makes multiple passes because it allows forward references, and its rich macro system makes it hard to guess everything in 1-2 passes (it also removes a lot of formalism from the user). What is actually optimized: jmp width, i.e. whether a jump gets the short or the near encoding.
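A minimal sketch in Python of why forward references force repeated passes, and what choosing the jmp width amounts to. This is not fasm's actual algorithm: the item format and the `assemble_sizes` helper are hypothetical, and the sizes assume 32-bit x86, where a short `jmp` (rel8) takes 2 bytes and a near `jmp` (rel32) takes 5. Every jump starts out short and is widened only when its displacement does not fit in a signed byte; widening one jump moves the labels after it, so the layout has to be redone until nothing changes.

```python
SHORT, NEAR = 2, 5   # bytes: jmp rel8 vs. jmp rel32 (32-bit x86)


def assemble_sizes(items):
    """items: ("label", name) | ("jmp", target) | ("bytes", n).
    Returns (size of each jmp in source order, number of passes)."""
    jmp_size = {i: SHORT for i, it in enumerate(items) if it[0] == "jmp"}
    passes = 0
    while True:
        passes += 1
        # Lay out the code with the current guesses to get label addresses.
        addr, labels, starts = 0, {}, {}
        for i, (kind, arg) in enumerate(items):
            starts[i] = addr
            if kind == "label":
                labels[arg] = addr
            elif kind == "jmp":
                addr += jmp_size[i]
            else:
                addr += arg
        # Widen short jumps whose displacement does not fit in a signed byte.
        changed = False
        for i, size in list(jmp_size.items()):
            if size == SHORT:
                disp = labels[items[i][1]] - (starts[i] + SHORT)
                if not -128 <= disp <= 127:
                    jmp_size[i] = NEAR
                    changed = True
        if not changed:
            return [jmp_size[i] for i in sorted(jmp_size)], passes


# Widening the second jump pushes L1 out of reach of the first one,
# so a further pass is needed before the layout settles.
items = [
    ("jmp", "L1"),
    ("jmp", "L2"),
    ("bytes", 124),
    ("label", "L1"),
    ("bytes", 10),
    ("label", "L2"),
]
sizes, n_passes = assemble_sizes(items)
```

Since jumps only ever grow in this sketch, the loop is guaranteed to terminate.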
