Hacker News
Hello world in C inline assembly (2018) (jameshfisher.com)
163 points by aragonite 3 months ago | 40 comments



As other comments have noted, the asm statement needs to have its input/output registers specified to ensure the compiler doesn't erase the "unused" values.

Working example: https://john-millikin.com/unix-syscalls#linux-x86-64-gnu-c

Adapted to use main():

  static const int STDOUT = 1;
  static const int SYSCALL_WRITE = 1;
  static const char message[] = "Hello, world!\n";
  static const int message_len = sizeof(message) - 1;  /* exclude the trailing NUL */

  int main() {
   register int         rax __asm__ ("rax") = SYSCALL_WRITE;
   register int         rdi __asm__ ("rdi") = STDOUT;
   register const char *rsi __asm__ ("rsi") = message;
   register int         rdx __asm__ ("rdx") = message_len;
   __asm__ __volatile__ ("syscall"
    : "+r" (rax)
    : "r" (rax), "r" (rdi), "r" (rsi), "r" (rdx)
    : "rcx", "r11");
   return 0;
  }
Test with:

  $ gcc -o hello hello.c
  $ ./hello
  Hello, world!


Or just

  int main(void) {
    asm volatile("syscall" : : "a"(1), "d"(13), "D"(1), "S"("hello world!\n"));
    return 0;
  }
Though the clobber list is a weak spot; I don't know exactly what it should contain in this case.


You want:

    long ax;
    asm volatile("syscall" : "=a"(ax) : "0"(1), "D"(1), "S"("hello world!\n"), "d"(13));
You can also say:

    long ax = 1;
    asm volatile("syscall" : "+a"(ax) : "D"(1), "S"("hello world!\n"), "d"(13));
https://justine.lol/dox/rmsiface.txt


Got some sleep and took a second look. You actually want:

    long ax = 1;
    asm volatile("syscall" : "+a"(ax) : "D"(1), "S"("hello world!\n"), "d"(13) : "rcx", "r11");
Sorry folks! Note also that this only works on Linux. On BSDs, for example, even if you change the magic number, the kernel may clobber all the call-clobbered registers. So with those OSes it's usually simplest to write an assembly stub like this:

    my_write:
      mov $4,%eax
      syscall
      ret


I don't suppose you know the syscall clobber list for aarch64 Linux? I can't find it documented anywhere and I'm not sure how to dig it out of the kernel.

A sibling comment pointed at https://chromium.googlesource.com/linux-syscall-support/+/re... which suggests nothing is clobbered apart from the arguments used by a given call, which is possible but seems unlikely.


I think that's accurate. Example code:

    static privileged int GetPid(void) {
      int res;
    #ifdef __x86_64__
      asm volatile("syscall"
                   : "=a"(res)
                   : "0"(__NR_linux_getpid)
                   : "rcx", "r11", "memory");
    #elif defined(__aarch64__)
      register long res_x0 asm("x0");
      asm volatile("mov\tx8,%1\n\t"
                   "svc\t0"
                   : "=r"(res_x0)
                   : "i"(__NR_linux_getpid)
                   : "x8", "memory");
      res = res_x0;
    #endif
      return res;
    }
You should be fine.
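For comparison, here is a minimal sketch of a write wrapper that follows the same pattern on both architectures. The name `raw_write` is made up for illustration, and the aarch64 branch assumes, as discussed above, that the kernel preserves everything except x0. The x86-64 clobbers are the ones given earlier in the thread.

```c
#include <stddef.h>
#include <sys/syscall.h> /* SYS_write: only the number, no libc call */

/* Hypothetical helper (not from the thread): write(2) via a raw Linux
 * syscall. Returns the byte count, or a negative errno value on failure. */
long raw_write(int fd, const void *buf, size_t len) {
#if defined(__x86_64__)
  long ret;
  /* rax carries the syscall number in and the result out ("0" ties them). */
  __asm__ __volatile__("syscall"
                       : "=a"(ret)
                       : "0"((long)SYS_write), "D"((long)fd), "S"(buf), "d"(len)
                       : "rcx", "r11", "memory");
  return ret;
#elif defined(__aarch64__)
  /* Assumption: only x0 changes across svc, so "memory" is the sole clobber. */
  register long x8 __asm__("x8") = SYS_write;
  register long x0 __asm__("x0") = fd;
  register const void *x1 __asm__("x1") = buf;
  register size_t x2 __asm__("x2") = len;
  __asm__ __volatile__("svc 0"
                       : "+r"(x0)
                       : "r"(x8), "r"(x1), "r"(x2)
                       : "memory");
  return x0;
#else
#error "this sketch covers x86-64 and aarch64 Linux only"
#endif
}
```

Calling `raw_write(1, "hi\n", 3)` from main() should print the text. The "memory" clobber is there because the kernel reads the buffer behind the compiler's back.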


> This C program doesn’t use any C standard library functions.

This is only half true. While the code doesn't call any stdlib functions, it still relies on the C stdlib and runtime in order to get called and to exit properly.

I'm somewhat perplexed why the author built it with the runtime, given that he doesn't really depend on its features (except maybe the automatic exit code handling), instead of building with -ffreestanding.


You have to add some extra assembly before main if you don't use the C runtime. You have to write _start, the actual entry point that the CRT usually provides. https://github.com/fsmv/dfre/blob/master/code/linux32_start....

This is for -nostdlib, not -ffreestanding.


You can usually get away without the initial part. As long as you call the exit syscall yourself, it should work.
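A rough sketch of what that looks like on x86-64 Linux, assuming a build along the lines of `gcc -nostdlib -static -DNOSTDLIB_DEMO hello.c`. The names `raw_syscall3` and `NOSTDLIB_DEMO` are invented for this example; the macro only exists so the helper can also be compiled into an ordinary program without a second `_start`.

```c
#include <sys/syscall.h> /* SYS_write, SYS_exit: constants only */

#if !defined(__x86_64__) || !defined(__linux__)
#error "this sketch assumes x86-64 Linux"
#endif

/* Hypothetical helper: a raw Linux x86-64 syscall with three arguments. */
long raw_syscall3(long nr, long a, long b, long c) {
  long ret;
  __asm__ __volatile__("syscall"
                       : "=a"(ret)
                       : "0"(nr), "D"(a), "S"(b), "d"(c)
                       : "rcx", "r11", "memory");
  return ret;
}

#ifdef NOSTDLIB_DEMO
/* Entry point normally supplied by the C runtime. Nothing called us, so
 * there is nowhere to return to: we must finish with the exit syscall.
 * force_align_arg_pointer realigns the stack, which at process entry is
 * not laid out the way a normally called function would expect. */
__attribute__((force_align_arg_pointer)) void _start(void) {
  static const char msg[] = "Hello, world!\n";
  raw_syscall3(SYS_write, 1, (long)msg, sizeof(msg) - 1);
  raw_syscall3(SYS_exit, 0, 0, 0);
  for (;;) {
  } /* not reached: exit does not return */
}
#endif
```

Without the exit call, `_start` would run off the end of the function and most likely crash, which is the part the C runtime normally handles for you.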


"This C program doesn’t explicitly use any C standard library functions." doesn't sound as cool, though.


If you ever feel the need to do this in production, use linux_syscall_support.h (LSS) https://chromium.googlesource.com/linux-syscall-support

No need to remember syscall numbers or calling conventions, or the correct way to annotate your __asm__ directives, and it's even cross-architecture.


Actually more readable than the AT&T syntax :)

But does this work on both GCC and Clang, and is it safe from being optimized away? edit: the answer is no

Turbo Pascal had an integrated assembler that could use symbols (and even complex types) defined anywhere in the program, like this:

    procedure HelloWorld; assembler;
    const Message: String = 'Hello, world!'^M^J;  {Msg+CR+LF}
    asm
        mov  ah,$40  {DOS system call number for write}
        mov  bx,1    {standard output}
        xor  ch,ch   {clear high byte of length}
        mov  cl,Message.byte[0]
        mov  dx,offset Message+1
        int  $21
    end;


Not only Turbo Pascal; this saner approach to inline assembly was quite common among PC compilers, regardless of the programming language.


Inline assembly also supports symbolic names, although native symbols cannot be accessed directly, only in a somewhat awkward way.

https://stackoverflow.com/questions/32131950/assembler-templ...
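As a small illustration of the symbolic-name syntax in GCC and Clang extended asm, the sketch below adds two integers. The function name `add_asm` and the operand names `sum` and `addend` are made up for this example. The C variables are still not visible to the assembler directly; each one has to be bound to a named operand first.

```c
/* Hypothetical example: add two ints using named asm operands. */
int add_asm(int a, int b) {
  int sum = a;
#if defined(__x86_64__) || defined(__i386__)
  /* AT&T syntax: "addl src, dst" computes dst += src and sets the flags. */
  __asm__("addl %[addend], %[sum]"
          : [sum] "+r"(sum)
          : [addend] "r"(b)
          : "cc");
#elif defined(__aarch64__)
  /* The w modifier selects the 32-bit view of each register. */
  __asm__("add %w[sum], %w[sum], %w[addend]"
          : [sum] "+r"(sum)
          : [addend] "r"(b));
#else
  sum += b; /* plain C fallback on other targets */
#endif
  return sum;
}
```

The bracketed names replace the positional `%0`, `%1` references, which helps once a template has more than a couple of operands.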


Thanks for making me extremely sentimental for the hundreds of Turbo Pascal projects I did back in the day - this particular example highlights the elegance and clarity of the language, which we still seem to resist in our modern tooling.


I don't really see what's "elegant" about the code, could you elaborate? (This isn't a jab at GP. I'm just curious about what I'm not seeing.)


You might want to compare it to the "proper" version of the inline asm code, from this comment: https://news.ycombinator.com/item?id=40703314

Modern C is neither "low-level" nor "high-level". It's defined for an abstract machine where integers can't overflow, null pointers can't be dereferenced, etc. And unless you follow all the rules, and add proper annotations for things like inline assembly, the compiler is free to do anything to your code.

The one advantage to this approach is that modern compilers can turn megabytes of auto-generated crap produced by string substitution macros into halfway decent machine language.

(And I freely admit that specifically Turbo Pascal produced really bad code, worse even than C compilers at the time, but the syntax is oh so much nicer IMHO)


I believe that MSVC inline asm allows referencing variables in the asm as it can parse and understand the asm (at least before they got rid of inline asm completely for 64 bit code).

AFAIK GCC does not attempt to parse the asm by design, as it is meant to be used for code that the compiler might not understand, so you have to describe input, outputs and side effects with annotations.


That isn't proper C, rather GCC and clang dialects of inline Assembly.


I think it's elegant because the distinction between Pascal and Assembly is made using the Pascal asm .. end; keywords, and in that block one can also access the Pascal variables without much fuss involving the assembler.

I find that really nice to read and to look at, whereas the examples given in the original article are prone to syntax overload, what with all the intermixing - for example, the variable declarations having what 'look' like attributes - but are really assembly instructions, emitted.

I guess one would have had to have enjoyed writing Turbo Pascal code, though, to see this particular aesthetic. A lot of folks do, some don't ..


When I compile it with GCC 12, this machine code results:

    1129:       f3 0f 1e fa             endbr64 
    112d:       55                      push   rbp
    112e:       48 89 e5                mov    rbp,rsp
    1131:       b8 01 00 00 00          mov    eax,0x1
    1136:       bf 01 00 00 00          mov    edi,0x1
    113b:       48 8d 05 c2 0e 00 00    lea    rax,[rip+0xec2]        # 2004 <_IO_stdin_used+0x4>
    1142:       48 89 c6                mov    rsi,rax
    1145:       ba 0f 00 00 00          mov    edx,0xf
    114a:       0f 05                   syscall 
    114c:       b8 00 00 00 00          mov    eax,0x0
    1151:       5d                      pop    rbp
    1152:       c3                      ret    
Can you spot the error?

. . . . . .

The code biffs rax when it loads the string address, so the system call number is lost, and the code ends up not printing anything. Moving the string assignment to be the very first line in main fixes it.

BTW, Clang 14 with no optimization accepts the code without issue but compiles it without using any of the registers; it just stores the values to memory locations and runs the syscall opcode. With -O1 or higher, it optimizes away everything except the syscall opcode.


The exact same thing happens with GCC 12 with 32-bit MIPS.

  #include <asm/unistd.h>
   
  char msg[] = "hello, world!\n";
   
  int main(void)
  {
      register int syscall_no asm("v0") = __NR_write;
      register int arg1       asm("a0") = 1;
      register char *arg2     asm("a1") = msg;
      register int arg3       asm("a2") = sizeof(msg) - 1;
   
      asm("syscall");
   
      return 0;
  }

  root@OpenWrt:~# objdump --disassemble=main
  ...
  00400580 <main>:
    400580: 27bdfff8  addiu sp,sp,-8
    400584: afbe0004  sw s8,4(sp)
    400588: 03a0f025  move s8,sp
    40058c: 24020fa4  li v0,4004
    400590: 24040001  li a0,1
    400594: 3c020041  lui v0,0x41
    400598: 24450650  addiu a1,v0,1616
    40059c: 2406000e  li a2,14
    4005a0: 0000000c  syscall


With an older version, it works (as long as there is no optimization at least, with -O2 all the register init code disappears):

$ gcc -v

... gcc version 10.2.1 20210110 (Debian 10.2.1-6)

    0000000000001125 <main>:
        1125: 55                    push   %rbp
        1126: 48 89 e5              mov    %rsp,%rbp
        1129: b8 01 00 00 00        mov    $0x1,%eax
        112e: bf 01 00 00 00        mov    $0x1,%edi
        1133: 48 8d 35 ca 0e 00 00  lea    0xeca(%rip),%rsi        # 2004 <_IO_stdin_used+0x4>
        113a: ba 0e 00 00 00        mov    $0xe,%edx
        113f: 0f 05                 syscall 
No idea why a newer version produces worse code in this case (though of course, this way of doing inline assembly isn't "correct" anyway, so nasal demons may result)


Never seen inline assembly written quite like that, is this actually correct code? I'm concerned that normally register annotation is just a hint, and that the assembly blocks are not marked volatile - and that the compiler may therefore be free to rewrite this code in many breaking ways.

Edit: Ah, basic asm blocks are implicitly volatile. I'm still a little concerned the compiler could get clever and decide the register variables are unused and optimize them out.


Tried it with GCC, and without any optimization it does print the message. With "-O2" however, we get this:

    Disassembly of section .text:
    
    0000000000001040 <main>:
        1040: 0f 05                 syscall 
        1042: 31 c0                 xor    %eax,%eax
        1044: c3                    retq   
Everything except the syscall instruction has been optimized away!


Now that's incredibly cursed. Could do basically anything and swallows the error too!


I think that named register variables (a GCC extension) are meant to be live in asm blocks by design, so they shouldn't be optimized away.

Still I would use extended asm.

edit: from the docs: "The only supported use for [Specifying Registers for Local Variables] is to specify registers for input and output operands when calling Extended asm".

So the example is UB.
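To make the supported use concrete: on x86-64 Linux the fourth syscall argument goes in r10, which has no constraint letter of its own, so a register variable passed as an extended asm operand is the usual way to reach it. A minimal sketch, with the invented name `raw_syscall4`:

```c
#include <sys/syscall.h> /* SYS_* numbers only */

#if !defined(__x86_64__) || !defined(__linux__)
#error "this sketch assumes x86-64 Linux"
#endif

/* Hypothetical helper: a raw Linux x86-64 syscall with four arguments. */
long raw_syscall4(long nr, long a, long b, long c, long d) {
  /* The register variable only pins the operand to r10. It is listed as
   * an input below, so the compiler knows the asm statement reads it. */
  register long r10 __asm__("r10") = d;
  long ret;
  __asm__ __volatile__("syscall"
                       : "=a"(ret)
                       : "0"(nr), "D"(a), "S"(b), "d"(c), "r"(r10)
                       : "rcx", "r11", "memory");
  return ret;
}
```

Because every register is named in the operand lists, the values cannot be dropped or overwritten the way they are in the basic asm version.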


It's not UB, it's documented behaviour of a vendor extension.

It's not UB because it's defined as outside the scope of the language standard. The vendor (in this case, GCC) does document how to use its inline assembly extension in quite a lot of detail, including how to use clobber lists to prevent exactly the kind of thing these failures demonstrate.


GCC says that register extensions are not supported with basic asm. What happens if you do it anyway is not documented. Ergo UB.


GCC says that register extensions are not supported with basic inline asm. If you do it anyway and it doesn't work, that's not undefined behaviour; it's behaving as documented. Once you've ventured into vendorland you're outside the realm of undefined behaviour to start with, but following the vendor's documentation on how to use the vendor's extension is the minimum requirement for expecting the feature to work.


> it's behaving as documented.

What do you think the documented behaviour is? None is documented, so it is undefined. The only defined behaviour is with extended asm.

Remember in C and C++ everything is UB unless defined in the standard or by an implementation.

If it were documented as giving a hard error, you would be right. But it is not.


I don't think this is exactly correct. Undefined behavior means a very specific thing - that the program could do literally anything. But I think that's not quite the situation in this case. Rather, I would suspect that these specified-register variables are only guaranteed to be effective with extended asm constraints.

For basic asm I would assume then that the register contents cannot be relied on to contain the value of the variable, but as long as you don't rely on it, then you are in the clear.

Then again it's hard to be sure about these matters with C.


> For basic asm I would assume then that the register contents cannot be relied on to contain the value of the variable, but as long as you don't rely on it, then you are in the clear.

That's the crux. The example invokes 'syscall', which obviously relies on specific register content, from basic asm.


The `return 0;` is optional for main() in C, so the function body could be made to consist solely of inline assembly.


Is anyone aware of a similar example, for ARM assembly on macOS?


    int
    main(void) {
        register const char *msg asm("x1") = "hello, world!\n";
        asm (
            "mov w0, #1\n"
            "mov w2, #14\n"
            "mov w16, #4\n"
            "svc #128\n"
            :
            : "r" (msg)
            : "x0", "x2", "x16", "cc", "memory"  /* registers the template overwrites */
        );
    }


How does one compile this?

EDIT: my bad, my source had a typo - it's as easy as you'd think:

  $ cc hello.c -o hello
  $ ./hello


thanks!


Not inline, but this was linked in a comment on HN a few days ago

https://github.com/below/HelloSilicon


Try this with Visual Studio and x64. Microsoft!!



