The `jns` is the jump that implements the if (x < 0) statement, and it is taken when the number is >= 0 (i.e., non-negative). gcc organizes it like this because it involves only a single jump. However, if you expect the non-negative case to be the common one, this will limit the throughput of the code. It would be better to compile it like this:
.L4:
mov edx, DWORD PTR [rdi]
test edx, edx
js .L10
.L3:
add rdi, 4
cmp rdi, rax
jne .L4
.L1:
ret
.L10:
mov DWORD PTR [rdi], 0
jmp .L3
This now involves two jumps when the number is negative: it has to jump out to the extra bit of code outside the main body of the function at .L10, and then jump back. The common case is now "fall through", though, so this could execute at close to one iteration per cycle, twice as fast as the other version.
You can get gcc to produce the second code by changing the condition to __builtin_expect(x < 0, 0):
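A minimal sketch of what that looks like in C, assuming the loop simply replaces negative array elements with zero. The name zero_negatives and its signature are illustrative and not taken from the original source:

```c
#include <stddef.h>

/* Hypothetical reconstruction of the loop under discussion:
   overwrite every negative element of a[0..n) with 0. */
void zero_negatives(int *a, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Second argument 0 tells the compiler that the condition
           (a[i] < 0) is expected to be false most of the time. */
        if (__builtin_expect(a[i] < 0, 0))
            a[i] = 0;
    }
}
```

With optimization enabled, the hint typically moves the store out of line as in the second listing, though the exact output depends on the compiler version and flags.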
All good compilers do this today. They use heuristics to guess the likely outcome of branches, but you can provide them specific information with profile guided optimization, or with manual branch hints (the __builtin_expect I used in the example above is one such hint).
Similarly, you can annotate certain functions or paths as "hot" or "cold", and compilers even understand that functions like abort() are cold (they can be called at most once, after all) and hence paths leading up to them are also cold.
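As a rough illustration of the cold annotation, here is a sketch using the GCC/Clang function attribute `cold`. The helper names fail and checked_div are made up for this example:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error helper. The cold attribute marks it as unlikely
   to be called, so branches leading to it are treated as unlikely too. */
__attribute__((cold, noreturn))
static void fail(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    abort();
}

/* Hypothetical caller: the den == 0 path ends in a cold call, so the
   compiler can lay out the division as the fall-through path. */
int checked_div(int num, int den)
{
    if (den == 0)
        fail("division by zero");
    return num / den;
}
```

Even without the attribute, the call to abort() inside fail would usually be enough for the compiler to treat that path as cold. The explicit annotation just makes the intent visible at the call boundary.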
JIT compilers likewise use the branching behavior observed at runtime to organize branches so that fall-through is the common case.
It wasn't clear to me what "organize the likely path instead as untaken" means.