
New inline assembly syntax available in Rust nightly - dagenix
https://blog.rust-lang.org/inside-rust/2020/06/08/new-inline-asm.html
======
dathinab
I recommend reading the RFC. It explains pretty well how it works and it's
readable even for people with only very minimal asm experience:

[https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md](https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md)

------
Teknoman117
To be honest, I still like the explicit clobber syntax of rusty_asm.

Last year I had some fun getting a Rust program to run on some old 386 DOS SBC
I had hanging around. It was terrible in terms of code density (basically a
bunch of 32-bit instructions prefixed with the "execute 32-bit op in real
mode" prefix).

This was a wrapper around the DOS string print function (where s is a &str).

    
    
            rusty_asm! {
                let ptr: in("{dx}") = s.as_ptr();
                let len: in("{cx}") = s.len();
                clobber("ah");
                clobber("al");
                clobber("bx");
                asm("volatile", "intel") {
                    "mov ah, 0x40
                    xor bx, bx
                    int 21h"
                }
            }

~~~
JoshTriplett
In the new syntax, a direct translation of that would look something like
this:

    
    
        asm!("
            mov ah, 0x40
            xor bx, bx
            int 21h
            ",
            in("dx") s.as_ptr(),
            in("cx") s.len(),
            out("ax") _,
            out("bx") _,
        )
    

However, I'd personally write that like this instead:

    
    
        asm!(
            "int 21h",
            inout("ax") 0x4000 => _,
            inout("bx") 0 => _,
            inout("cx") s.len() => _,
            inout("dx") s.as_ptr() => _,
        )
    

That lets the compiler handle filling in ax and bx, and also avoids trusting
that the interrupt routine leaves ax/bx/cx/dx untouched. (I might also write
clobbers for every other general-purpose register, too. Take a look at Linux
arch/x86/boot/bioscall.S and the "Glove box" for BIOS calls, due to
misbehaving BIOSes that clobber registers they shouldn't.)

~~~
josephcsible
I agree that passing `ax` and `bx` as inputs is better than setting them in
the inline asm, but I don't like the idea of setting registers that shouldn't
get changed as outputs "just in case".

~~~
JoshTriplett
It's not a theoretical consideration. It's quite common for BIOS and firmware
code to clobber registers that it isn't supposed to; see the code I mentioned
in the Linux kernel, which arose through real-world bugs that were incredibly
hard to debug.

~~~
josephcsible
But where do you draw the line? What if it clobbers the stack pointer?

~~~
JoshTriplett
Then it won't be able to return at all, and people will notice more quickly.

------
amluto
Rust folks, thank you so much for making this far, far better than GCC’s asm
syntax for C. Also, thank you for using Intel x86 syntax instead of AT&T.

It would be delightful if GCC were to adopt something similar after this
stabilizes in Rust.

~~~
JoshTriplett
> Also, thank you for using Intel x86 syntax instead of AT&T.

Thank you for helping to validate that decision; this is the kind of feedback
we needed.

(And note that you can still choose to use AT&T syntax, it just isn't the
default.)

~~~
jdright
Yes, thank you for the Intel syntax. It is by far the most common syntax
you'll find in anything somewhat recent (1990+), and by far the most
straightforward to learn, since it's clean. I did learn both in the 90s, but
I've always struggled with AT&T's symbols and other differences, which made it
hard to switch when learning from different source materials.

~~~
monadic2
FWIW I mostly see AT&T when compiling against C codebases, to a) avoid
requiring a secondary assembler while supporting legacy binutils, and b)
because inline assembly in C is AT&T in practice, at least outside of Windows
(no clue what the C ecosystem is like there). However, most resources
exploring “assembly” do so in a context where it makes sense to use Intel
syntax and work with a “dedicated” assembler.

That said, this is a good decision because C compilers seem to be the major
holdouts at this point—binutils has had .intel_syntax for a long time now,
it’s just not supported inline.

~~~
asveikau
> at least outside of windows (no clue what the c ecosystem is like there).

Microsoft's assembler uses Intel syntax, and their inline assembly also uses
Intel syntax.

Their inline assembly is much less explicit about inputs, outputs, and
clobbers, and "just" has you mix C symbols and labels with assembly
instructions, which I imagine is a pain to get right.

However Microsoft has de-emphasized inline assembly and doesn't allow it on
amd64 or ARM. For things that need specific instructions you need to either
use intrinsics or put assembly in a different object file.

~~~
saagarjha
> However Microsoft has de-emphasized inline assembly and doesn't allow it on
> amd64 or ARM.

That's just sad :(

~~~
asveikau
I suspect if they had a situation like GCC where the programmer had to
annotate the side effects of every assembly snippet, it would have been easier
for them to add it elsewhere.

But the way they had it, they presented a seamless blend of assembly
instructions and any C identifier, in or out, and I guess the compiler would
need to parse out any side effects and cope with them. My guess is they looked
at porting all that to ia64 [which they supported until Server 2008 R2], amd64
and ARM and balked.

No idea if they ever had it on some of the old architectures they supported in
NT4 days or on CE (alpha, mips, ppc).

~~~
monocasa
I've used it with SH4 on Windows CE.

------
WalterBright
The inline assembler syntax used in D is designed to match the syntax used in
the Intel assembler reference manuals. It's not that the Intel syntax is that
great (it isn't), but the mental gymnastics of having to swap the operands
makes blood run out of my ears, and causes me to make huge numbers of
mistakes.

------
jfkebwjsbx
For anybody who knows about the RFC and other implementations in C compilers:
why are "string literal parameters" (quoted strings) always used?

Wouldn't it be nice to have a way to embed other languages without needing
those quotes everywhere?

I guess it is meant to avoid complicating the syntax/grammar by reusing the
existing macro one?

~~~
Sharlin
For maximum generality and compatibility, likely. Also note that this is
”just” a macro, not first-class language syntax, so any non-quoted arguments
would have to be valid Rust token trees.
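As a rough illustration of that constraint, any unquoted input would have to survive Rust's lexer (a hypothetical counting macro, nothing from the RFC):

```rust
// Sketch: a recursive macro that consumes arbitrary token trees. Input
// like `mov ax, bx` works because it lexes as Rust tokens with balanced
// brackets, but much real-world asm syntax would not.
macro_rules! count_tokens {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tokens!($($tail)*) };
}
```

Here `count_tokens!(mov ax, bx)` expands to 4, since the comma is a token of its own.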

~~~
est31
> any non-quoted arguments would have to be valid Rust token trees.

Token trees allow a very wide set of syntaxes. rust-c for example allows you
to embed inline C code as token tree via a macro:
[https://github.com/lemonrock/rust-c](https://github.com/lemonrock/rust-c)

------
livre
I may be the only one, but I feel like this (and the old) asm syntax is too
complex and not very user-friendly. It looks heavily inspired by GCC. I'm
sorry for the upcoming German (you can ignore it and just read the examples),
but I feel like a smarter compiler could have a friendlier asm syntax, similar
to the one used by FreeBASIC[1].

I hope you can see why I prefer that. Even though the FB compiler is not very
smart and just pushes and pops all important regs to and from the stack, its
asm syntax is a joy to write. I don't have to specify clobber registers, in,
out; variables and functions are in scope by default. Also, the asm code is
not a string and doesn't need to be quoted; it doesn't look like something
foreign at all.

I know that something like that is currently not possible in Rust but I'd love
to see it some day.

[1] [https://www.freebasic-portal.de/befehlsreferenz/inline-assembler-434.html](https://www.freebasic-portal.de/befehlsreferenz/inline-assembler-434.html)

~~~
fluffything
> I don't have to specify clobber registers,

So how does the compiler know that these are clobbered then ?

~~~
masklinn
I expect the entire assembly is encoded into the language.

~~~
int_19h
That doesn't really help - the compiler doesn't know what could be clobbered
by any given (sys)call.

------
Bedon292
I am super new to Rust and maybe not the best place to ask, but why would you
want to use inline assembly in Rust at all? Does that not invalidate a lot of
the safety features built into the language?

~~~
woodruffw
Rust has lots of unsafe features; they all require explicit `unsafe {}`
blocks.

As the post mentions, inline assembly comes in handy in a number of low level
contexts (e.g. dealing with memory-mapped devices, working on
microcontrollers, using processor features that aren't exposed by the kernel
or standard library).
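For a concrete taste of the last case: reading the x86 time-stamp counter has no standard-library wrapper, but takes only a couple of lines of inline asm. A sketch in the new syntax (written against the `std::arch::asm` path; x86_64 only):

```rust
// Sketch: read the x86_64 time-stamp counter with the new asm! syntax.
// `rdtsc` returns the 64-bit counter split across edx:eax, so we name
// both registers as outputs instead of listing them as clobbers.
#[cfg(target_arch = "x86_64")]
fn rdtsc() -> u64 {
    use std::arch::asm;
    let lo: u32;
    let hi: u32;
    unsafe {
        asm!("rdtsc", out("eax") lo, out("edx") hi, options(nomem, nostack));
    }
    ((hi as u64) << 32) | (lo as u64)
}
```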

~~~
gizmo385
> using processor features that aren't exposed by the kernel or standard
> library).

As someone who doesn't do any low-level work, could you give an example of
these kinds of features? I'd like to read into some of them :)

~~~
neutronicus
Some processors implement "Instruction Level Parallelism" where you can do
arithmetic on many values simultaneously with special assembly instructions.

The most widespread cases of this (SSE, AVX2) came from Intel trying to get
lock-in in the high-performance computing market. So it took a while for AMD
to sell chips that implemented these instructions and even Intel's catalog
doesn't uniformly offer them.

It's also tough to get the compiler to emit these instructions where you want
them, even if it knows they're available (unsurprisingly, since you're asking
it to auto-parallelize a computation), so in the high-performance space a lot
of people just have to resort to inline assembly.
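(These days inline asm isn't the only escape hatch: the vendor intrinsics in `std::arch` expose these instructions directly. A sketch with SSE, which is part of the x86_64 baseline:)

```rust
// Sketch: adding four f32 lanes with one SSE instruction via
// intrinsics, the usual alternative to inline asm for SIMD work.
#[cfg(target_arch = "x86_64")]
fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    use std::arch::x86_64::{_mm_add_ps, _mm_loadu_ps, _mm_storeu_ps};
    let mut out = [0.0f32; 4];
    // Calling these is sound here because SSE is always available on x86_64.
    unsafe {
        let sum = _mm_add_ps(_mm_loadu_ps(a.as_ptr()), _mm_loadu_ps(b.as_ptr()));
        _mm_storeu_ps(out.as_mut_ptr(), sum);
    }
    out
}
```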

~~~
jlokier
Just a minor terminology point. Instruction Level Parallelism (ILP) means
something else.

What you are describing, with special instructions, is called Single
Instruction Multiple Data (SIMD), or simply vector instructions.

ILP means the execution of multiple separate instructions in parallel per
clock cycle, through techniques like superscalar, out-of-order and speculative
execution. ILP is sometimes used as a measure: how many instructions per cycle
can be issued. It does not require special instructions; it's just the
hardware being clever about running existing instructions faster.

~~~
neutronicus
Ah, we used that wrong in my old group then.

We talked about Instruction Level Parallelism vs. Thread Level Parallelism vs.
Node Level Parallelism since generally if you are working on a problem with
some data dependency you will really have to think about each separately to
get the best performance.

------
open-paren
I'm new to Rust, so there may be precedent that I don't know, and I'm kinda
spoiled because I'm a front-end dev mainly, where the rule is "don't break the
web", but doesn't changing the syntax of a macro like that break backwards
compatibility?

~~~
jcranmer
asm! has been an unstable feature for years, which means a) it's available
only in nightly builds, and b) there is no guarantee that it will remain
backwards-compatible.

Actually, with regards to asm!, it's been known for a long time that it was
unlikely to be stabilized as-is (being originally a very thin wrapper around
LLVM's inline asm functionality). Probably nearly everyone who's been relying
on inline asm to make their code work has already been following the effort to
replace asm! with a better version.

~~~
twic
> Probably nearly everyone who's been relying on inline asm to make their code
> work has already been following the effort to replace asm! with a better
> version

Except for my colleague who basically says "nah, they'll never be able to move
off the LLVM syntax, the code i'm writing now will be supported forever".

Fortunately, we only have a small amount of assembly, so if we ever do want to
upgrade to a version of the compiler which has dropped support, it won't be a
lot of work.

~~~
dathinab
Support will likely not be dropped anytime soon, maybe never. You just need to
rename it to `llvm_asm!`. Though with `asm!` likely to be stabilized while
`llvm_asm!` never will be, it might be worthwhile to switch to the new asm to
get off nightly in the future ;=)

------
gonational
I don’t know much about the topic, but I am curious: does this open Rust up to
being less “safe” (in the Rust sense of safety)?

C is of course higher level than assembly and I understand that C is very much
not “safe”, so it makes me wonder whether this gives library developers
incentives to add code that has a higher likelihood of creating security bugs
while optimizing their code.

Is there any reason to believe this could be the case, or am I just
misunderstanding how all of this fits together?

~~~
andolanra
This does not make Rust less safe. The capability to use assembly is already
there: either by using the existing assembly syntax, by linking to C code, or
by linking to assembly routines directly. In order to use this, you need to
use the _unsafe_ keyword in order to indicate to Rust, "I am aware that the
guarantees that Rust provides to me are not applicable when using this
feature." This is true of all the existing ways of interfacing with non-Rust
or assembly code, and it does not change with this new syntax.

------
artursapek
If I wanted to learn assembly language in 2020, what's the best book or
resource people would recommend?

~~~
ethagnawl
[https://famicom.party](https://famicom.party) features the best, most
accessible 6502 assembly crash course I've come across.

It does strike me as weird that there aren't any popular resources that sit
between something as high-level as the linked e-book and something as low-
level as "buy some hardware and figure it out".

~~~
mywittyname
I suspect this has to do with the Intel architecture not being very
beginner-friendly. When I tried learning x86 assembly, I found it incredibly
overwhelming.

There are simulators for teaching assembly, but I don't think those are nearly
as fulfilling as hacking on real hardware. And since it's difficult to hack on
your computer's hardware, you're left picking up something simpler.

~~~
JoshTriplett
In my opinion, it's less that x86 is any more or less user-friendly, and more
about having an environment that gives you the "raw" bare-metal feeling that
you could once have from DOS. You can write a few instructions, write to video
memory, and the result is _right there_.

You can still get that feeling on x86. For instance, try writing some code
directly in the MBR of a disk that writes some bytes to the serial port on
0x3f8 with outb, and load it up in KVM.

------
m0zg
Is there anything like
[https://github.com/herumi/xbyak](https://github.com/herumi/xbyak) on the Rust
side? I like this approach a lot for non-trivial high-performance work, since
it lets you tweak your assembly for particular hardware at runtime.

------
mhh__
I don't understand (well, I do actually, but it's not that much work) why so
few languages do what D does and have inline assembly as a fully supported (no
quotes) syntactic construct within the language.

~~~
cowboysauce
> I do actually but it's not that much work

It apparently is, given that D only supports x86 assembly, doesn't support all
instructions and has idiosyncrasies like requiring prefixes to be written as
separate instructions.

~~~
mhh__
D only has fewer than 20 people working on it at a given time. Some languages
probably have at least a hundred times that number.

~~~
saagarjha
The number of people working on the asm! for Rust on the more obscure
architectures is likely less than one.

------
JoeAltmaier
Always tricky. I avoid inline assembler, except in cases where the compiler
cannot generate some special instruction. This can occur in drivers and kernel
code, where there's no other language feature to manipulate processor
architecture (special registers or machine state manipulation).

It can be fun to try to hand-optimize code in regular applications, but I
can't help feeling there's more juice in teaching the compiler to recognize
the case and generate better code instead. Few know how to go down that road,
though.

~~~
saagarjha
Teaching compilers works in most cases, but not all of them. Sometimes the
language just does not have ways of expressing things to the compiler that you
would like it to know. (For example, how would you explain to the compiler
that reading within the same page will not trigger a fault? Such a guarantee
is often used in vectorized strlen functions, for example.)
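(The strlen trick rests on simple address arithmetic: an aligned 16-byte load can never cross a page boundary, so over-reading past the terminator is safe. A sketch of the invariant, with an illustrative helper name:)

```rust
// Sketch: why vectorized strlen may over-read. A 16-byte-aligned load
// lies entirely within one 4096-byte page (16 divides 4096), so if any
// byte of the string is in that chunk, the whole chunk is readable.
const PAGE_SIZE: usize = 4096;

fn aligned_chunk_stays_in_page(addr: usize) -> bool {
    let start = addr & !15; // round down to a 16-byte boundary
    start / PAGE_SIZE == (start + 15) / PAGE_SIZE
}
```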

------
whatshisface
Hopefully this will help us get better numerical libraries for scientific
computing.

~~~
Q6T46nT668w6i3m
I hope so too but I’m skeptical. In my opinion, Rust still needs ergonomic
multi-dimensional index expressions and standardized multi-dimension array
traits. I’d love to see (or join) a scientific working group to help with
this.

~~~
devit
You can use a[(i, j, k)], and also a[i][j][k] for some data layouts.

As for multidimensional array traits, you can use Index<(usize, usize, usize),
Output = f32>.
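A minimal sketch of that bound in action (the `Grid3` type and `trace3` helper are illustrative, not from any particular crate):

```rust
use std::ops::Index;

// Illustrative 3-D array with flat row-major storage.
struct Grid3 {
    data: Vec<f32>,
    dims: (usize, usize, usize), // (ni, nj, nk)
}

impl Index<(usize, usize, usize)> for Grid3 {
    type Output = f32;
    fn index(&self, (i, j, k): (usize, usize, usize)) -> &f32 {
        let (_ni, nj, nk) = self.dims;
        &self.data[(i * nj + j) * nk + k]
    }
}

// Generic code can then accept any array type with this Index impl.
fn trace3<A: Index<(usize, usize, usize), Output = f32>>(a: &A, n: usize) -> f32 {
    (0..n).map(|i| a[(i, i, i)]).sum()
}
```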

~~~
kzrdude
you can use the syntax `a[[i, j, k]]` too, just for another example.

------
MintelIE
Is Rust ever going to go for standardization? Seems like the language has been
in a state of flux with heavy changes all the time. When's it going to settle
down a bit?

~~~
sedatk
As long as it's backwards compatible, why should that be a problem?

~~~
pjmlp
Use of Rust on domains that require compilers for ISO and ECMA certified
languages.

------
xiaodai
Bro. First, unsafe and now assembly. How is Rust a safe language?

~~~
nitsky
There are two languages, “rust” and “unsafe rust”. All code in unsafe blocks
is not safe.

~~~
cesarb
> All code in unsafe blocks is not safe.

That's a sadly common misconception. The unsafe blocks only enable a few extra
language features, like dereferencing raw pointers and calling unsafe
functions; if none of these extra language features are used, the code is
completely safe. That is, if you have a block of code ({ X }) and just add the
unsafe keyword (unsafe { X }), it works exactly the same and is still safe.
This also shows that there are not "two languages"; what you called "rust" is
a subset of what you called "unsafe rust", not a different language.
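That equivalence is easy to check; the compiler even warns that the `unsafe` is unnecessary (a trivial sketch):

```rust
// Sketch: wrapping safe code in `unsafe` changes nothing about its
// behavior. The block only permits extra operations (raw pointer
// derefs, unsafe calls); ordinary safe code inside stays fully checked.
#[allow(unused_unsafe)]
fn double_both_ways(x: u32) -> (u32, u32) {
    let plain = { x * 2 };
    let wrapped = unsafe { x * 2 }; // still borrow-checked and type-checked
    (plain, wrapped)
}
```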

~~~
nitsky
The first paragraph of the chapter on unsafe rust in TRPL reads:

> However, Rust has a second language hidden inside it that doesn’t enforce
> these memory safety guarantees: it’s called unsafe Rust and works just like
> regular Rust, but gives us extra superpowers.

I disagree that “{ X }” and “unsafe { X }” are the same. The former is
provably safe if a bug free rust compiler accepts it. The latter leaves the
burden of proof on the programmer.

------
mkchoi212
This looks a lot better than GCC’s asm. However, I can’t think of a good
reason why one should use asm in their code instead of letting the compiler
write the far superior asm for you.

~~~
palmtree3000
It's not always superior!

For example, I've been working on a big integer library. Addition is a bit
annoying, because you need to add 3 numbers (the prior carry bit, and the two
digits you're adding) and get out a new carry bit and a resulting digit. This
is a bit cumbersome:

    
    
            let (res, carry1) = target_digit.overflowing_add(carry as u64);
            let (res, carry2) = res.overflowing_add(other_digit);
            *target_digit = res;
            carry = carry1 || carry2;
    

The resulting assembly is a fairly literal translation. We perform the
addition using an add instruction, and extract the carry flag into a register
using setb.

    
    
            addq    (%rdi,%rsi,8), %rcx
            setb    %r10b
            addq    (%rdx,%rsi,8), %rcx
            setb    %al
            movq    %rcx, (%rdi,%rsi,8)
            orb     %r10b, %al
            movzbl  %al, %eax
    

But there's a dedicated instruction for this, adc. adc adds two operands and
the carry flag, while itself setting a carry flag. Manually unrolling the loop
a bit, I wrote this assembly:

    
    
                    shlb $8, {carry}
                    adcq 0x00({y0}), {x0}
                    adcq 0x08({y0}), {x1}
                    adcq 0x10({y0}), {x2}
                    adcq 0x18({y0}), {x3}
                    adcq 0x20({y0}), {x4}
                    adcq 0x28({y0}), {x5}
                    setb {carry}
    

And got a 3x speedup.
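(For reference, a sketch of the portable two-step pattern above as a complete limb loop; the names are illustrative:)

```rust
// Sketch: limb-wise big-integer addition using two overflowing_add
// steps per limb; this is the portable code the adc version replaces.
fn add_assign_limbs(target: &mut [u64], other: &[u64]) -> bool {
    let mut carry = false;
    for (t, &o) in target.iter_mut().zip(other.iter()) {
        let (res, c1) = t.overflowing_add(carry as u64);
        let (res, c2) = res.overflowing_add(o);
        *t = res;
        carry = c1 || c2;
    }
    carry
}
```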

~~~
jeffdavis
The article here:

[https://news.ycombinator.com/item?id=23351007](https://news.ycombinator.com/item?id=23351007)

specifically recommends against the carry variants of addition, because the
instructions are still dependent on each other and don't pipeline well. In
other words, it's using the same algorithm, just buried in a single
instruction, and that doesn't necessarily make it faster.

Have you considered using a strategy similar to what the article suggests? I
think the HN comments also had some additional suggestions.

~~~
adwn
> _that doesn 't necessarily make it faster_

Did you miss the "And got a 3x speedup" part of the post you were replying to?
Actual benchmarks of real code always trump theoretical deliberations.

~~~
jeffdavis
Yikes. I was just trying to link to a relevant technique that the author might
find helpful.

A big integer library may have many use cases; a benchmark only shows one data
point. It's possible that by deferring carry work across more operations he'd
see an even bigger improvement.

------
pjmlp
Sadly it is yet another language that follows the UNIX tradition of not having
a proper assembly parser, instead forcing us to play with strings.

I guess that leaves PC-based languages to have actual inline assembly parsers
or intrinsics.

~~~
Someone1234
If you want the outer language to be cross platform (e.g. x64, ARM64, etc),
then you'd either need a parser for every potential platform (even ones you
aren't currently using!) OR you just settle for strings.

Don't get me wrong: Parsed assembly is superior because you can have certain
parser guarantees/checks. But it will definitely limit the scope of your
language or at least slow its adoption to new platforms.

~~~
pjmlp
Productivity of language users vs compiler developers.

~~~
alharith
There are infinitely more language users. I'd rather throw a bone to the
compiler developers.

~~~
AnIdiotOnTheNet
Then that means there is infinitely more productivity to be gained catering to
the language users.

