
Linux System Call Table - thevivekpandey
http://thevivekpandey.github.io/posts/2017-09-25-linux-system-calls.html
======
zokier
So the issues noticed so far:

* Missing syscalls

* Wrong syscall numbers

* Wrong calling convention

* Links to source are to wrong version

Does the table actually get anything right? I mean, this is a pretty spectacular
cascade of failures.

~~~
Skunkleton
At least for x86, you can get this same information fairly easily directly
from the source. The table is located at
arch/x86/entry/syscalls/syscall_64.tbl, from there you can grep for the
function with git grep. For example, git grep 'SYSCALL_DEFINE.*read'.

~~~
tomsthumb
why bother with git grep vs. just vanilla grep. i could see the use if you're
working with an older binary, but you didn't mention.

~~~
Skunkleton
If you ran something like grep -r 'SYSCALL_DEFINE.*read' from the top level of
the Linux source, it would search not just the source code but also all of the
build artifacts left over from compiling the kernel. Basically, git grep is
faster in this case because it restricts the search to files that are checked
in. You could achieve a similar effect with standard tools like this:
find -type f -regex '.*\.[hc]' | xargs grep 'SYSCALL_DEFINE.*read'

~~~
leni536

        grep -r --include='*.[hc]' 'SYSCALL_DEFINE.*read'

~~~
Skunkleton
Nice. I hadn't used --include before.

------
akrasuski1
Actually, the syscall numbers are wrong! This reference seems better:
[http://blog.rchapman.org/posts/Linux_System_Call_Table_for_x...](http://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/)

Consider this simple C program:

        #define _GNU_SOURCE
        #include <unistd.h>

        int main(void) {
            syscall(276);  /* strace shows which syscall this number maps to */
            return 0;
        }

Strace'ing it shows the syscall used is tee, just as the reference I linked
shows, and not pwritev as in OP's table.

~~~
voltagex_
It's odd. The owner of this Git repo has Issues turned off so I can't post a
question/issue, and it appears to have been auto-generated:
[https://github.com/thevivekpandey/syscalls-table-64bit](https://github.com/thevivekpandey/syscalls-table-64bit)
is a "fork" of
[https://github.com/paolostivanin/syscalls-table-64bit](https://github.com/paolostivanin/syscalls-table-64bit)

------
webreac
If man pages were up to date, this would be the index of chapter 2. I
discovered Unix with Sun in the 90s and I am very nostalgic for the quality of
the man pages. At that time, man pages were complete and up to date. My latest
frustration was with the -m option of the df command. Chapter 2 should be
updated each time a new version of the kernel is installed.

~~~
mjw1007
It's very strange that adding/updating documentation isn't treated as a basic
requirement for a patch that adds to or modifies Linux's public interfaces.

~~~
milcron
For the BSDs, incorrect or missing man pages are considered a serious bug.

------
dmix
Nice, I'm curious how it maps the system call to the source code line number
dynamically? (Edit: seems like ctags +
[http://elixir.free-electrons.com/linux/latest/source](http://elixir.free-electrons.com/linux/latest/source)
[1]) It supports every kernel version.

The linked source code browser seems like a useful way to check the history of
system calls for research...

[1] [https://github.com/thevivekpandey/syscalls-table-64bit/blob/...](https://github.com/thevivekpandey/syscalls-table-64bit/blob/master/gen_syscalls.py)

------
akrasuski1
That's all cool and everything, but the registers are wrong... Not only are
they 32-bit (eax vs. rax), but their order is wrong too: the first argument
in the x86-64 ABI goes in rdi, for example.

~~~
khedoros1
The registers look correct for the i386 ABI. eax for the system call number,
then ebx, ecx, edx, esi, edi, ebp for the next 6 arguments.

I skimmed a couple files in the code. And it seems like it might be parsing
this information out of some other sources, and maybe getting confused about
the info it's grabbing?

[https://github.com/thevivekpandey/syscalls-table-64bit](https://github.com/thevivekpandey/syscalls-table-64bit)

------
jared0x90
There are tables in the kernel git repo if you want a good reference for their
values; however, the register definitions aren't provided.

x86:
[https://github.com/torvalds/linux/blob/master/arch/x86/entry...](https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_32.tbl)

x64:
[https://github.com/torvalds/linux/blob/master/arch/x86/entry...](https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_64.tbl)

------
sigjuice
This man page describes the syscall ABI for all architectures.
[http://man7.org/linux/man-pages/man2/syscall.2.html](http://man7.org/linux/man-pages/man2/syscall.2.html)

~~~
caf
...and the syscalls(2) man page lists them:
[http://man7.org/linux/man-pages/man2/syscalls.2.html](http://man7.org/linux/man-pages/man2/syscalls.2.html)

------
kwoff
I put out a syscall table back in the day for Linux 2.2 (up to %eax 190).
Someone copied it (I'm glad.):
[https://www.cs.utexas.edu/~bismith/test/syscalls/syscalls32....](https://www.cs.utexas.edu/~bismith/test/syscalls/syscalls32.html)
They didn't attribute it to me, but I remember a professor did for his class.
There were better tables after that I admit, though I liked my version because
it linked into the source code.

------
amluto
I'm a bit puzzled. The code is at "syscalls-table-64bit", yet the regs are
eax, etc. This makes very little sense.

In any event, I think the args should just be labeled arg0..arg5.

~~~
LukeShu
In several cases, the order of the args for a syscall varies between
architectures--writing a general "arg0" doesn't make a lot of sense.

That said, I don't know what's up with it using 32-bit register names.

------
known
Latest complete list is at
[http://elixir.free-electrons.com/linux/latest/source/include...](http://elixir.free-electrons.com/linux/latest/source/include/linux/syscalls.h)

------
kahlonel
This is very handy with the asm registers mapped to arguments. Thanks!

------
fbourque
Nice work. The table has been generated for 4.10, and hence the links to the
source code files should also have this kernel version in the URL path for
direct access.

------
language
Ah, this is neat! Would be nice to have a script for this that you could just
point at a local copy of the source tree too!

------
throwaway613834
What is the use case for this? Is it for someone trying to write their own
syscall wrappers?

~~~
Manozco
You might need that when you want to reimplement Linux. The Joyent team did
that on their OS (derived from Solaris) so that users can run Linux binaries on
a Solaris kernel (so they have dtrace, zfs, mdb, ...). Bryan Cantrill gave a
bunch of talks on that (one here:
[https://youtu.be/TrfD3pC0VSs](https://youtu.be/TrfD3pC0VSs)).

The idea behind it is that Linux is only a list of syscalls: if you are able to
reimplement them, you reimplement Linux, and you don't need anything else. On the
contrary if you want to reimplement a BSD you need to reimplement their libc
(and perhaps some other libraries)

~~~
int_19h
> On the contrary if you want to reimplement a BSD you need to reimplement
> their libc (and perhaps some other libraries)

To clarify, what you're saying is that in BSD land, the syscall API is not
considered stable, but libc is?

~~~
Manozco
I'm not ultra familiar with the topic, so if someone wants to correct me, please
do, but:

- Linux has always been described as just a kernel, which translates to just
a syscall table. Whether this table is stable or not is not relevant
here.

- The *BSDs, on the other hand, ship a kernel plus a lot of
libraries/binaries; if you want to simulate a BSD system, you have to expose
those libraries/binaries.

It's not so much a technical difference, it's more of a different approach to
OS development (kernel space vs kernel/user space).

~~~
int_19h
Thing is, if syscalls in BSD are considered stable the way they are in Linux,
then you could just ship your own kernel with BSD's libc. But if they consider
it an internal API between kernel and libc, and apps are only ever supposed to
depend on libc, then of course that doesn't work.

So stability of syscall API is the de facto differentiating factor here. It
sounds like Microsoft couldn't do "Windows Subsystem for BSD" the way it did
WSL, for example.

------
smegel
Do system calls put their return value on the calling thread's stack or in a
register?

~~~
a3f
AFAIK, a process isn't required to have a stack.

------
rhinoceraptor
Someone should put together a list of which ones are irredeemably broken (and
as such, humanity is stuck with a broken ABI in perpetuity), e.g. epoll.

------
spilk
What about non-x86/64 platforms?

------
guhcampos
The mere fact that we are debating the correctness of this table confirms
that the quality of the documentation of the OS we base our entire
civilization upon is pretty poor.

------
eatonphil
This is great! It would be even more useful to have this for Mac OSX too. A
lot of the projects I do end up being on both Mac and Linux. It's always a
pain to find the corresponding number for the system call on Mac.

~~~
legulere
System calls have no stability guarantee on macOS. You should use libc
instead. In general, the use of syscalls directly is fairly limited.

Edit: for instance, Go broke once on macOS Sierra, when Apple changed the
gettimeofday system call:
[https://github.com/golang/go/issues/16570](https://github.com/golang/go/issues/16570)

~~~
zzzcpan
Linux doesn't guarantee syscall stability either. Just make sure your wrappers
can use a syscall table chosen at runtime, depending on which kernel you are
running.

~~~
wahern
Yes it does, at least in the sense that syscalls which become officially
public will never be removed from Linus' tree except in rare circumstances
(i.e. proof that nobody is using it), nor will the arguments change. This is
Linus' famous "never break user space" ABI mantra. While distributions may
deprecate and remove them (e.g. sysctl(2)) they certainly won't be assigned
new IDs. A table won't help in such cases.

~~~
caf
Exactly. This is why, for example, the original 'mmap' system call entry point
on x86 still exists, even though it is overwhelmingly likely that every
program on your machine is actually going to use the 'mmap2' entry point.

