As other comments have noted, the asm statement needs to have its input/output registers specified to ensure the compiler doesn't erase the "unused" values.
Sorry folks! Note also this only works on Linux. On BSDs for example, even if you change the magic number, BSDs may clobber all the call-clobbered registers. So with those OSes it's usually simplest to write an assembly stub like this:
> This C program doesn’t use any C standard library functions.
This is only half true. While the code doesn't call any stdlib functions, it still relies on the C stdlib and runtime in order to get called and to exit properly.
I'm somewhat perplexed why the author did it with the runtime instead of building with -ffreestanding, given that he doesn't really depend on any of its features (except maybe the automatic exit code handling).
Thanks for making me extremely sentimental for the hundreds of Turbo Pascal projects I did back in the day - this particular example highlights the elegance and clarity of the language, which we still seem to resist in our modern tooling.
Modern C is neither "low-level" nor "high-level". It's defined for an abstract machine where signed integers can't overflow, null pointers can't be dereferenced, etc. And unless you follow all the rules, and add proper annotations for things like inline assembly, the compiler is free to do anything to your code.
The one advantage to this approach is that modern compilers can turn megabytes of auto-generated crap produced by string substitution macros into halfway decent machine language.
(And I freely admit that specifically Turbo Pascal produced really bad code, worse even than C compilers at the time, but the syntax is oh so much nicer IMHO)
I believe that MSVC inline asm allows referencing variables in the asm as it can parse and understand the asm (at least before they got rid of inline asm completely for 64 bit code).
AFAIK GCC does not attempt to parse the asm by design, as it is meant to be used for code that the compiler might not understand, so you have to describe input, outputs and side effects with annotations.
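For what it's worth, here is a minimal sketch of what "describe inputs, outputs and side effects" looks like for the write syscall. It assumes Linux on x86-64 and GCC or Clang; `raw_write` is a hypothetical helper name, not something from the article.

```c
#include <stddef.h>

/* Hypothetical helper (not from the article): write(2) via a raw
 * Linux x86-64 syscall, with every register described to the compiler. */
static long raw_write(int fd, const void *buf, size_t len)
{
    long ret;
    __asm__ volatile(
        "syscall"
        : "=a"(ret)                /* output: rax holds the result or -errno */
        : "a"(1L),                 /* rax = 1, the write syscall number on x86-64 Linux */
          "D"((long)fd),           /* rdi = file descriptor */
          "S"(buf),                /* rsi = buffer address */
          "d"(len)                 /* rdx = byte count */
        : "rcx", "r11", "memory"); /* syscall overwrites rcx and r11; the kernel reads buf */
    return ret;
}
```

Because the compiler is told which registers hold which values, it can't reuse rax for the string address before the syscall runs, which is the failure described elsewhere in this thread. Calling `raw_write(1, "hi\n", 3)` should print the string and return 3.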
I think it's elegant because the distinction between Pascal and Assembly is made using the Pascal asm .. end; keywords, and in that block one can also access the Pascal variables without much fuss involving the assembler.
I find that really nice to read and to look at, whereas the examples given in the original article are prone to syntax overload, what with all the intermixing - for example, the variable declarations having what 'look' like attributes - but are really assembly instructions, emitted.
I guess one would have had to have enjoyed writing Turbo Pascal code, though, to see this particular aesthetic. A lot of folks do, some don't ..
The code biffs rax when it loads the string address, so the system call number is lost, and the code ends up not printing anything. Moving the string assignment to be the very first line in main fixes it.
BTW, Clang 14 with no optimization accepts the code without issue but compiles it without using any of the registers; it just stores the values to memory locations and runs the syscall opcode. With O1 optimization or higher, it optimizes away everything except the syscall opcode.
No idea why a newer version produces worse code in this case (though of course, this way of doing inline assembly isn't "correct" anyway, so nasal demons may result)
Never seen inline assembly written quite like that, is this actually correct code? I'm concerned that normally register annotation is just a hint, and that the assembly blocks are not marked volatile - and that the compiler may therefore be free to rewrite this code in many breaking ways.
Edit: Ah, a basic asm block is implicitly volatile. I'm still a little concerned the compiler could get clever and decide the register variables are unused and optimize them out.
I think that named register variables (a GCC extension) are meant to be live in asm blocks by design, so they shouldn't be optimized away.
Still I would use extended asm.
edit: from the docs: "The only supported use for [Specifying Registers for Local Variables] is to specify registers for input and output operands when calling Extended asm".
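A sketch of that supported use, again assuming Linux on x86-64 with GCC or Clang: r10 has no constraint letter of its own, so a register variable pins it, and the variable is then passed to extended asm as an ordinary "r" operand. `raw_syscall4` is a made-up name for illustration.

```c
/* Hypothetical helper: a four-argument raw syscall on Linux x86-64. */
static long raw_syscall4(long nr, long a1, long a2, long a3, long a4)
{
    long ret;
    /* Pin the fourth argument to r10; this only has a guaranteed effect
     * because the variable is used as an extended asm operand below. */
    register long r10 __asm__("r10") = a4;
    __asm__ volatile(
        "syscall"
        : "=a"(ret)                /* rax returns the result or -errno */
        : "a"(nr),                 /* rax = syscall number */
          "D"(a1),                 /* rdi = first argument */
          "S"(a2),                 /* rsi = second argument */
          "d"(a3),                 /* rdx = third argument */
          "r"(r10)                 /* fourth argument, already pinned to r10 */
        : "rcx", "r11", "memory"); /* syscall overwrites rcx and r11 */
    return ret;
}
```

For example, `raw_syscall4(39, 0, 0, 0, 0)` issues getpid (number 39 on x86-64 Linux), which ignores the extra arguments.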
It's not UB, it's documented behaviour of a vendor extension.
It's not UB because it's defined as outside the scope of the language standard. The vendor (in this case, GCC) does document how to use its inline assembly extension in quite a lot of detail, including how to use clobber lists to prevent exactly the kind of thing these failures demonstrate.
GCC says that register extensions are not supported with basic inline asm. If you do it anyway and it doesn't work, that's not undefined behaviour; it's behaving as documented. Once you've ventured into vendorland you're outside the realm of undefined behaviour to start with, but following the vendor's documentation on how to use the vendor's extension is the minimum requirement for meeting your expectations that the feature will work.
I don't think this is exactly correct. Undefined behavior means a very specific thing - that the program could do literally anything. But I think that's not quite the situation in this case. Rather, I would suspect that these specified-register variables are only guaranteed to be effective with extended asm constraints.
For basic asm I would assume then that the register contents cannot be relied on to contain the value of the variable, but as long as you don't rely on it, then you are in the clear.
Then again it's hard to be sure about these matters with C.
> For basic asm I would assume then that the register contents cannot be relied on to contain the value of the variable, but as long as you don't rely on it, then you are in the clear.
That's the crux. The example invokes 'syscall', which obviously relies on specific register content, from basic asm.
Working example: https://john-millikin.com/unix-syscalls#linux-x86-64-gnu-c
Adapted to use main():
Test with: