jmillikin
As other comments have noted, the asm statement needs to have its input/output registers specified to ensure the compiler doesn't erase the "unused" values.

Working example: https://john-millikin.com/unix-syscalls#linux-x86-64-gnu-c

Adapted to use main():

  static const int STDOUT = 1;
  static const int SYSCALL_WRITE = 1;
  static const char message[] = "Hello, world!\n";
  static const int message_len = sizeof(message) - 1;  /* exclude the trailing NUL */

  int main() {
   register int         rax __asm__ ("rax") = SYSCALL_WRITE;
   register int         rdi __asm__ ("rdi") = STDOUT;
   register const char *rsi __asm__ ("rsi") = message;
   register int         rdx __asm__ ("rdx") = message_len;
   __asm__ __volatile__ ("syscall"
    : "+r" (rax)                        /* rax: syscall number in, return value out */
    : "r" (rdi), "r" (rsi), "r" (rdx)
    : "rcx", "r11", "memory");          /* the kernel reads the buffer behind rsi */
   return 0;
  }
Test with:

  $ gcc -o hello hello.c
  $ ./hello
  Hello, world!
sp1rit
> This C program doesn’t use any C standard library functions.

This is only half true. While the code doesn't call any stdlib functions, it still relies on the C standard library and runtime to get called and to exit properly.

I'm somewhat perplexed that the author kept the runtime instead of building with -ffreestanding, given that the code doesn't really depend on any of its features (except maybe the automatic exit code handling).
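
To make the point concrete, here is a minimal sketch of what dropping the runtime could look like on Linux x86-64. The names fs_write, fs_exit and the FREESTANDING_DEMO macro are made up for illustration and are not from the article. Without the C runtime nothing calls main() or exits for you, so the program supplies its own _start and its own exit syscall.

```c
/* Sketch only: Linux x86-64, GCC or Clang.
   fs_write, fs_exit and FREESTANDING_DEMO are hypothetical names. */

/* write(2): syscall number 1. Returns the byte count or a negative errno. */
long fs_write(long fd, const void *buf, unsigned long len) {
    long ret;
    __asm__ __volatile__ ("syscall"
        : "=a" (ret)
        : "a" (1L), "D" (fd), "S" (buf), "d" (len)
        : "rcx", "r11", "memory");
    return ret;
}

/* exit(2): syscall number 60. Never returns. */
__attribute__((noreturn)) void fs_exit(long code) {
    __asm__ __volatile__ ("syscall"
        :
        : "a" (60L), "D" (code)
        : "rcx", "r11", "memory");
    __builtin_unreachable();
}

#ifdef FREESTANDING_DEMO
/* Entry point used when no C runtime is linked in. */
void _start(void) {
    static const char msg[] = "Hello, world!\n";
    fs_write(1, msg, sizeof(msg) - 1);  /* -1 drops the trailing NUL */
    fs_exit(0);
}
#endif
```

Something like `gcc -DFREESTANDING_DEMO -ffreestanding -nostdlib -static -o hello hello.c` should build it. Note that -ffreestanding on its own does not remove the startup files; -nostdlib is what does that. A C function used as _start also doesn't get the stack alignment the ABI promises to normal functions, which is harmless for code this small but worth knowing before growing it.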

Retr0id
If you ever feel the need to do this in production, use linux_syscall_support.h (LSS) https://chromium.googlesource.com/linux-syscall-support

No need to remember syscall numbers or calling conventions, or the correct way to annotate your __asm__ directives, and it's even cross-architecture.
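
This is not LSS (its own API isn't shown here). As a rough comparison, assuming you are fine with linking against libc, glibc's generic syscall() wrapper gives a similar "no hand-written asm, no memorized numbers" experience. write_via_libc is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>  /* SYS_write and the other syscall numbers */
#include <unistd.h>       /* syscall() */

/* Issues write(2) through libc's generic syscall() wrapper.
   On failure it returns -1 and sets errno, unlike the raw instruction,
   which returns a negative errno value in rax. */
long write_via_libc(int fd, const void *buf, unsigned long len) {
    return syscall(SYS_write, fd, buf, len);
}
```

The trade-off is that this pulls libc back in, which is exactly what LSS is meant to avoid.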

rep_lodsb
Actually more readable than the AT&T syntax :)

But does this work on both GCC and Clang, and is it safe from being optimized away? Edit: the answer is no.

Turbo Pascal had an integrated assembler that could use symbols (and even complex types) defined anywhere in the program, like this:

    procedure HelloWorld; assembler;
    const Message: String = 'Hello, world!'^M^J;  {Msg+CR+LF}
    asm
        mov  ah,$40  {DOS system call number for write}
        mov  bx,1    {standard output}
        xor  ch,ch   {clear high byte of length}
        mov  cl,Message.byte[0]
        mov  dx,offset Message+1
        int  $21
    end;
msla
When I compile it with GCC 12, this machine code results:

    1129:       f3 0f 1e fa             endbr64 
    112d:       55                      push   rbp
    112e:       48 89 e5                mov    rbp,rsp
    1131:       b8 01 00 00 00          mov    eax,0x1
    1136:       bf 01 00 00 00          mov    edi,0x1
    113b:       48 8d 05 c2 0e 00 00    lea    rax,[rip+0xec2]        # 2004 <_IO_stdin_used+0x4>
    1142:       48 89 c6                mov    rsi,rax
    1145:       ba 0f 00 00 00          mov    edx,0xf
    114a:       0f 05                   syscall 
    114c:       b8 00 00 00 00          mov    eax,0x0
    1151:       5d                      pop    rbp
    1152:       c3                      ret    
Can you spot the error?

. . . . . .

The code biffs rax when it loads the string address, so the system call number is lost, and the code ends up not printing anything. Moving the string assignment to be the very first line in main fixes it.

BTW, Clang 14 with no optimization accepts the code without issue but compiles it without using any of the registers; it just stores the values to memory locations and runs the syscall opcode. With -O1 or higher, it optimizes away everything except the syscall opcode.
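
Both failures come from the compiler not knowing which registers the syscall instruction reads. A minimal sketch of one way to tell it, assuming Linux x86-64 and GCC or Clang, is to pin each value to its register through extended-asm constraint letters. raw_syscall3 and raw_write are hypothetical names, not from the article.

```c
/* Sketch only: Linux x86-64, GCC or Clang.
   raw_syscall3 and raw_write are hypothetical helper names. */

/* Constraint letters: "a" = rax, "D" = rdi, "S" = rsi, "d" = rdx.
   Because the values are declared as operands, the compiler has to load
   them right before the instruction and cannot drop or reorder them. */
static inline long raw_syscall3(long nr, long a1, long a2, long a3) {
    long ret;
    __asm__ __volatile__ ("syscall"
        : "=a" (ret)                       /* result comes back in rax */
        : "a" (nr), "D" (a1), "S" (a2), "d" (a3)
        : "rcx", "r11", "memory");         /* syscall overwrites rcx and r11 */
    return ret;
}

/* write(2) is syscall number 1 on x86-64.
   Returns the byte count, or a negative errno value on failure. */
long raw_write(int fd, const void *buf, unsigned long len) {
    return raw_syscall3(1, fd, (long)buf, (long)len);
}
```

Calling `raw_write(1, message, sizeof(message) - 1)` from main() would then replace the hand-placed register loads, and the "memory" clobber tells the compiler the kernel may read the buffer.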

im3w1l
I've never seen inline assembly written quite like that. Is this actually correct code? I'm concerned that a register annotation is normally just a hint and that the assembly blocks are not marked volatile, so the compiler may be free to rewrite this code in many breaking ways.

Edit: Ah, a basic asm block is implicitly volatile. I'm still a little concerned the compiler could get clever and decide the register variables are unused and optimize them out.

layer8
The `return 0;` is optional for main() in C (since C99), so the function body could be made to consist solely of inline assembly.
onderweg
Is anyone aware of a similar example, for ARM assembly on macOS?
badrabbit
Try this with Visual Studio on x64, where MSVC doesn't support inline assembly at all. Microsoft!!