This is only half true. While the code doesn't call any stdlib functions, it still relies on the C standard library and runtime in order to get called and to exit properly.
I'm somewhat perplexed why the author built it with the runtime instead of with -ffreestanding, given that he doesn't really depend on any of its features (except maybe the automatic exit code handling).
No need to remember syscall numbers or calling conventions, or the correct way to annotate your __asm__ directives, and it's even cross-architecture.
But does this work on both GCC and Clang, and is it safe from being optimized away? Edit: the answer is no.
Turbo Pascal had an integrated assembler that could use symbols (and even complex types) defined anywhere in the program, like this:
procedure HelloWorld; assembler;
const Message: String = 'Hello, world!'^M^J; {Msg+CR+LF}
asm
mov ah,$40 {DOS system call number for write}
mov bx,1 {standard output}
xor ch,ch {clear high byte of length}
mov cl,Message.byte[0]
mov dx,offset Message+1
int $21
end;
1129: f3 0f 1e fa endbr64
112d: 55 push rbp
112e: 48 89 e5 mov rbp,rsp
1131: b8 01 00 00 00 mov eax,0x1
1136: bf 01 00 00 00 mov edi,0x1
113b: 48 8d 05 c2 0e 00 00 lea rax,[rip+0xec2] # 2004 <_IO_stdin_used+0x4>
1142: 48 89 c6 mov rsi,rax
1145: ba 0f 00 00 00 mov edx,0xf
114a: 0f 05 syscall
114c: b8 00 00 00 00 mov eax,0x0
1151: 5d pop rbp
1152: c3 ret
Can you spot the error?
The code clobbers rax when it loads the string address: the lea at 113b overwrites the system call number that the mov at 1131 put in eax. The system call number is lost, and the code ends up not printing anything. Moving the string assignment to be the very first line in main fixes it.
BTW, Clang 14 with no optimization accepts the code without issue but compiles it without using any of the registers; it just stores the values to memory locations and runs the syscall opcode. At -O1 or higher, it optimizes away everything except the syscall opcode.
Edit: Ah, a basic asm block is implicitly volatile. I'm still a little concerned the compiler could get clever, decide the register variables are unused, and optimize them out.
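For what it's worth, one way around that concern is to stop relying on register variables plus a basic asm block and instead use extended asm, where the registers are spelled out as operands and clobbers. GCC only documents local register variables as reliable when they are used as operands to extended asm, and Clang accepts the same constraint syntax. The following is a minimal sketch, assuming Linux on x86-64; the name raw_write is made up for illustration and is not from the article or the linked page.

```c
#include <stddef.h>

/* Hypothetical helper (name is an assumption, not from the thread):
 * invokes write(2) directly through the Linux x86-64 syscall instruction. */
static long raw_write(int fd, const void *buf, size_t len)
{
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)               /* rax out: result, or -errno on failure */
        : "0"(1L),                /* rax in: 1 is the write syscall number on x86-64 */
          "D"((long)fd),          /* rdi: file descriptor */
          "S"(buf),               /* rsi: buffer address */
          "d"(len)                /* rdx: byte count */
        : "rcx", "r11", "memory"  /* the syscall instruction overwrites rcx and r11 */
    );
    return ret;
}
```

Because every value is an operand of the asm statement, the compiler has to load rax, rdi, rsi and rdx before the syscall instruction and can't reuse rax for the string address in between. Calling raw_write(1, "Hello, world!\n", 14) from main should print the greeting and return 14; note that a raw syscall reports errors as a negative errno value rather than by setting errno.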
Working example: https://john-millikin.com/unix-syscalls#linux-x86-64-gnu-c
Adapted to use main():
Test with: