
Ask HN: More Ergonomic Assembly Language? - rectang
It has been many decades since the AT&T and Intel syntaxes were introduced, and we've learned a lot about programming language design since then.

For example, consider how AT&T assembler uses parentheses to dereference memory locations: `(%eax)`. Such a usage of parentheses is unlike that of most programming languages, and unlike how parentheses are used in mathematical notation. You could say it "violates the principle of least surprise".

Another problem is width specification: typing e.g. `DWORD` (in all caps) a la Intel to specify width is verbose and laborious, but appending a letter to the instruction name a la AT&T makes those names harder for the human eye to parse.

Then there's the inscrutable naming of registers and instructions. Ideally we'd want them to follow "Huffman Coding" naming principles, where high-value short, clear names are assigned to the most commonly used elements and rarer elements get somewhat longer, more explicit names. Unfortunately this is a problem for hardware manufacturers, but let's dream for a moment that they're listening.

What would a more ergonomic assembly language look like?
======
nsajko
First, these projects may interest you: they tried to provide a somewhat
unified assembly syntax across ISAs, with more regular naming of registers,
etc.:
[https://9p.io/sys/doc/asm.html](https://9p.io/sys/doc/asm.html)
[https://tip.golang.org/doc/asm](https://tip.golang.org/doc/asm)

Now, are you proposing to change the "ordinary" assembly (with simple
one-to-one mappings between assembly and machine instructions) or rather
something like this:
[https://en.wikipedia.org/wiki/High-level_assembler](https://en.wikipedia.org/wiki/High-level_assembler)?
I think it is important in this kind of discussion to spell out the uses we
have for assembly languages. The main uses I know about are:

1) Documenting an ISA or compiler or similar.

2) Reading machine code as disassembly. (For debugging, profiling, etc.)

3) Rewriting hot spots from within a high-level language so that they run
more efficiently.

4) Using specific instructions that are not accessible via compiler
intrinsics. This is mainly for kernel or embedded work.

I think that for all four uses one would want an "ordinary" low-level assembly
(including the Plan 9 or Go assembly languages). Honestly, I have never used a
high-level assembly and have no idea why somebody would use one, except
perhaps for "demos", so that one could brag of having written the whole thing
in "assembly". In any case, you should clarify which use of assembly language
you have in mind for this discussion.

> It has been many decades since the AT&T and Intel syntaxes were introduced,
> and we've learned a lot about programming language design since then.

I don't think programming language design applies very much to assembly
language, at least the usual, low-level kind. Assembly definitely is a
programming language, but only a trivial case of one: the assembler does not
have to bother with types or with translating high-level constructs into
lower-level forms. Assembly is much more "descriptive" of the end result
(machine code) than high-level programming languages are. Another point is
that assembly is, as a formal language, much simpler than high-level
programming languages, so it is already easy to understand and there is
necessarily little room for improvement.

> Such a usage of parentheses is unlike that of most programming languages,
> and unlike how parentheses are used in mathematical notation. You could say
> it "violates the principle of least surprise".

The "principle of least surprise" sucks. It stifles innovation for little
gain. But my last point also applies here: I think it does not matter that the
principle of least surprise is violated, because assembly will be simple
anyway.

> Another problem is width specification: typing e.g. `DWORD` (in all caps) a
> la Intel to specify width is verbose and laborious, but appending a letter
> to the instruction name a la AT&T makes those names harder for the human eye
> to parse.

I'm fine with the AT&T syntax, but maybe it would be better to specify width
with numeric suffixes (the operand size in bytes) instead of letter suffixes,
e.g.: MOV1, MOV2, MOV4, MOV8 instead of MOVB, MOVW, MOVL, MOVQ.
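
To make the renaming concrete, here is a minimal Python sketch of the
letter-to-number mapping. It is only an illustration: the helper name
`numeric_suffix` and the small `BASE_MNEMONICS` table are hypothetical, and a
real assembler would need a complete mnemonic table to tell a width suffix
apart from a name that merely ends in b, w, l or q (such as "call").

```python
# Byte widths of the x86 AT&T operand-size suffixes.
WIDTH_BYTES = {"b": 1, "w": 2, "l": 4, "q": 8}

# Assumption: a tiny, hand-picked set of base mnemonics that accept a width
# suffix. This is illustrative only, not a complete instruction table.
BASE_MNEMONICS = {"mov", "add", "sub", "cmp", "xor", "push", "pop"}


def numeric_suffix(mnemonic: str) -> str:
    """Return e.g. 'MOV4' for 'movl'; leave unrecognized names unchanged."""
    lower = mnemonic.lower()
    base, suffix = lower[:-1], lower[-1:]
    if base in BASE_MNEMONICS and suffix in WIDTH_BYTES:
        return f"{base.upper()}{WIDTH_BYTES[suffix]}"
    return mnemonic


if __name__ == "__main__":
    for name in ("movb", "movw", "movl", "movq", "call"):
        print(name, "->", numeric_suffix(name))
```

The lookup against a known set of base names is the important part: stripping
the last letter blindly would turn "call" into a 4-byte "CAL".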

> Then there's the inscrutable naming of registers and instructions. Ideally
> we'd want them to follow "Huffman Coding" naming principles, where high-
> value short, clear names are assigned to the most commonly used elements and
> rarer elements get somewhat longer, more explicit names.

Regarding registers: we don't need that. Registers come from small finite
sets, so it is better to give them regular, short names, at least with the
current ISAs.

Regarding instructions: that's kind of what we already have.

