In fact, come to think of it, Linux is the only OS where syscalls are the official public userspace API, is it not? On all other platforms, they're an implementation detail behind the system libraries.
For instance, on Windows the syscall API is hidden and changes from release to release, but KERNEL32.DLL persists as the stable system API that you're supposed to call.
Linux desperately needs to get past the notion that libc is the only gateway to the kernel, and instead start supplying a standardized "system/kernel" user-level wrapper library. One which wraps the syscalls nicely but doesn't try to add additional language-specific functionality like printf or memcpy.
Are you sure about that? Linux doesn't have that notion: it keeps the syscall ABI stable, and anyone can use it directly, as Go does. There's no need to go through a blessed syscall library, unlike on other systems, where such libraries are the only gateway to the kernel.
But I don't see why it's a problem, either. What's the actual benefit of Go invoking syscalls directly on Linux? That it doesn't depend on glibc? But that is only an advantage because glibc is not guaranteed to be there on Linux, the way e.g. kernel32 is on Windows. If it were, there'd be no reason to not use it.
When using syscalls directly, it does not matter. The kernel doesn't care how you laid out your userspace stack.
Except when it does: https://github.com/golang/go/issues/20427
A lot of the hardening patches come with disclaimers that some software might break, and as shown by this bug, for good reason.
I'd rather blame the compiler patches for this silly behaviour (and maybe the kernel for not documenting and limiting how much stack VDSO can use)
Yes - but it's an advantage because glibc is a compatibility nightmare. One of Go's really nice features is that you can compile a binary and it will run pretty much anywhere. That would be pretty much impossible if it linked with glibc (even if you ignore the fact that glibc might not be present).
I implemented support for this, for ELF, when I wrote the Solaris port. Other people have done it for PE and Mach-O. It is a fallacy to think that you need access to a shared library in order to link with it. That's only true for C toolchains that don't know better. For Go we have our own toolchain and it doesn't have such a restriction.
This is one of the points lost to the "why didn't you just use LLVM?" crowds. Our own toolchain allows us flexibility that simply doesn't exist with traditional toolchains.
I hope they're planning to fix that...
I found this out when trying to copy a go program I had compiled on alpine to an Ubuntu machine and got the “file not found” error from the linker. :(
Try it yourself: run ldd on a recent go binary.
Also the trickiness of having efficient process fork/exec based on vfork: http://ewontfix.com/7/
and the considerations going into Go: https://go-review.googlesource.com/c/go/+/46173/
Although I never checked, so I could be completely wrong, I would expect folks shipping just kernels, like seL4, to ship a stable kernel ABI rather than a libc.
Yes and this is by design, out of necessity, taught by experience. Commercial customers back in the day paid lots and lots of money, so solutions had to be found and they had to work.
With that being said, I disagree with your characterization of “some header”–anything that’s in Apple’s headers is public API, whether it has a fancy page on developer.apple.com or not. Apple has a very clear definition of what they consider to be “private”, and anything in /usr/include isn’t it.
#endif /* __APPLE_API_PRIVATE */
Maybe make it a module so space constrained systems can leave it out.
The kernel patch process could keep everything nicely in sync, and native build processes would easily find the right source code. Cross compiling would require you to find and copy the sources yourself, though.
But then you run into the problem of how willing anyone is to commit to ABI stability in a non-POSIX API. I assume there's some commitment to that at the moment, but how much also depends on *libc abstracting them? Libcs always seem (to me) to be approximately kernel-version specific.
Linus has stated as much, and I think a few other maintainers agree. If you don't find the problem until years later, chances are too few people care.
Even when you want to call some INT 21h service which doesn't have a C function, it is more common to use interrupt functions like int86() or intdos() than to use inline assembly. (Or something like __dpmi_int from 32-bit code, such as DJGPP)
So, I don't think MS-DOS is as different from Linux/Unix as you think. High-level language code on MS-DOS (whether in C or Pascal or BASIC or whatever) usually doesn't directly invoke INT 21h, it goes through higher-level libraries / wrapper functions. Only software written in assembly tends to invoke INT 21h directly.
Or anybody that wasn't satisfied with the high-level libraries. We just looked up the functionality we needed in Ralf Brown's list.
 http://www.ctyme.com/rbrown.htm (INT 21h http://www.ctyme.com/intr/int-21.htm )
Apple platforms manage to support the idea of a “deployment target” and binary compatibility; the new symbols are weak-linked. Broken old behavior is preserved with linked on-or-after checks.
Not sure what makes it so difficult for glibc.
What are the technical reasons that glibc cannot adhere to the Linux dogma, "Don't break userspace?"
Since glibc does not adhere to that dogma, why the decades-long reluctance to add certain syscall wrappers? If they screw up and make a bad interface just modify it and bump the version number.
I just waded through the lwn cross-purpose-writing-festival comments and did not see them answered.
I don't know if they have something like that officially, but in practice they do follow it. Programs linked against an older version of glibc continue working with a newer version, in large part thanks to symbol versioning, which keeps the old versions of an interface available to old binaries while new binaries get the new functionality.
> If they screw up and make a bad interface just modify it and bump the version number.
Bumping the glibc version number would mean recompiling everything (a program can't have two versions of glibc at the same time, so all libraries a program links to would have to be recompiled); we had that in the libc5 to libc6 transition last century. And since they won't bump the version number, it means they will have to keep the bad interface forever, even if it's just visible to binaries compiled against an older glibc.
For a recent example in which they actually went ahead and removed a bad interface: https://lwn.net/Articles/673724/ and https://sourceware.org/bugzilla/show_bug.cgi?id=19473 -- and according to the latter, they did it in a way which still kept existing binaries working.
If that's true then I don't understand ldarby's comment on the article:
> The common problem that I suspect Cyberax is actually moaning about is if software uses other calls like memcpy() which on centos 7 gets a version of GLIBC_2.14:
> readelf -a foo | grep memcpy
> 000000601020 000300000007 R_X86_64_JUMP_SLO 0000000000000000 memcpy@GLIBC_2.14 + 0
> 3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND memcpy@GLIBC_2.14 (3)
> 55: 0000000000000000 0 FUNC GLOBAL DEFAULT UND memcpy@@GLIBC_2.14
> and this doesn't work on centos 6:
> ldd ./foo
> ./foo: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by ./foo)
Just to be clear, my original question is why glibc technically cannot follow the same exact development model of the Linux kernel for retaining backward compatibility.
The example above is actually a great example of bending over backwards to keep compatibility with broken userspace. Some programs incorrectly called memcpy with overlapping inputs, and an optimized version of memcpy started breaking these programs. Instead of just letting them break, the older symbol was kept with a slower implementation which accepts overlapping inputs, while new programs get the faster implementation at the memcpy@GLIBC_2.14 symbol.
On some platforms, new instructions came out and allowed the glibc maintainers to write a faster version of memcpy(3) that still satisfied its documented interface, however it did not retain the undocumented behavior of allowing overlapping memory ranges.
So without symbol versioning we are given two options: either 1. all be stuck with a slow memcpy(3) forever; or 2. break glibc users' code.
Neither of those options was great. So the glibc maintainers decided to write a new function, let's call it "memcpy_fast()". But how do you get everyone to use it? Symbol versioning is the answer here. At link time, a directive tells the toolchain that the current implementation of memcpy(3) is "memcpy_fast()", and that's the versioned symbol that gets embedded into the executable's symbol table. If your code wasn't compiled against a glibc where this was available, you'd be using an older implementation. This gets you the best of both worlds: 1. existing binaries continue to work (without the upgraded code path); 2. newly produced binaries use the upgraded code path, and in theory are tested to ensure that they are working.
This does prevent executables compiled against a newer glibc from running against an older one... but, so what? Linux ALSO doesn't guarantee that newer call semantics are available on older systems. The solution here is to either specifically indicate that you want the older symbol, or compile against the lowest version of everything you wish to support. glibc is far from your biggest problem here, given the ABI (in)stability of many other libraries.
1. Use macros: side effects.
2. Just expose a new, unversioned symbol: nobody will use it, you'll have to document it, and it'll be a platform-specific call. If people do use it, then their binaries can't be used on older platforms (just like with symbol versioning).
The symbol is referred to as memcpy@@GLIBC_2.14.
The end result is that the users of GNU/Linux will always draw the short end of the stick.
But AFAICT the glibc dogma is based on the premise that it would be impossible for a large, complex project to have backward compatibility without making regular changes to the extant interfaces that it provides. Given that premise glibc devs seem to have some process for figuring out what "correctness" means for time=now and then noodle around with their interface to reflect that correctness in the next version of the lib. Thus symbol versioning is employed.
At the same time, Linux is a large, complex project with backward compatibility which does significantly less noodling around with the extant interface. AFAICT the process consists mainly of a) devs breaking the extant interface for correctness, b) a user submitting a bug, and c) the lead dev surrounding the declarative sentence "We don't break userspace" with imperative sentences containing curses and then rejecting the change.
I've read where Linus and others have tried to defend their choice and argued that the glibc dev process is worse. Regardless of the persuasiveness of that argument, I've read it and am familiar with it.
I am not familiar with the glibc argument as to why they require regular interface changes, nor an acknowledgement that a closely connected large complex project gets by without that. I don't see anything on the glibc FAQ about it-- only a question about symbol versioning where the answer assumes that the interface must change.
Yeah, well, tell that to the engineers of HP-UX, IRIX, and Solaris, because all of those managed to produce libcs which were backward compatible. Sun Microsystems even legally warranted Solaris, and therefore libc; they were that paranoid about backwards compatibility.
That's not the issue. The issue is that glibc is developed by people who are not and never were system engineers and instead of learning from the masters, asking them how to do it correctly, or sticking with BSD when its situation was dire, they just decided to re-invent the wheel.
One does not simply re-invent glibc from first principles, especially if one does not have the requisite insights and experience, which they didn't and still don't; if they haven't acquired them by now, they most likely never will. GNU developers are a lost cause. Just look at how long it took them to "discover" versioned interfaces with linker map files, something Solaris system engineers had been using since the early '90s, and everything becomes crystal clear, if one knows the red flags. That's one red flag right there: "late in phase and unlikely to ever catch up".
Take a look at the system call table:
#define __NR_oldstat 18
#define __NR_oldfstat 28
#define __NR_oldolduname 59
#define __NR_oldlstat 84
#define __NR_olduname 109
#define __NR_dup3 330
#define __NR_pipe2 331
#define __NR_preadv2 378
#define __NR_pwritev2 379
The main difference is that, instead of defining a new symbol, a new system call number is defined. The effect is similar: a program using the new "stat" system call (106) won't work on an older kernel which doesn't have it, while on the opposite direction it still works (new kernels still understand the old system call).
One thing the kernel developers do nowadays to reduce the API churn is to add a flags argument to every new system call (for instance, the "dup3" above is the same as "dup2", but with a flags argument). Even then, if you try to use a flag which the current kernel doesn't know, it won't work (the kernel developers learned the hard way that you can't ignore unknown flags, since programs will pass them and then break on newer kernels).
And that's without considering the "escape hatches" of ioctl() and fcntl(), or the virtual filesystems like /proc and /sys, which are also part of the Linux kernel API. So yes, the Linux kernel does see regular interface changes.
Here are two different types of backward compatibility:
1. will old binary work with the new version?
2. will old code build and run correctly with the new version?
So when I talk about extant interface changes, I'm speculating that old code that leverages the public Linux interface is more likely to work and work correctly vs. old code that leverages the public glibc interface.
For example: suppose foodev built a Linux driver for a very popular piece of hardware in 2003, abandoned it, and in 2018 there are problems getting it to run correctly. Are those problems more likely due to Linux public interface churn or glibc public interface churn?
About leveraging one interface or the other: when you use the glibc interface, you are also using the Linux interface behind it, so a change in either can affect your program. On the other hand, if you are using the Linux interface directly instead of going through the C library, chances are you are doing something unusual, which increases the risk of it breaking by accident. And there are some things which exist only on the glibc interface, like nameserver lookups (getaddrinfo), user database lookups (getpwnam/getpwuid), and many more.
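For example, name resolution is purely a C-library service; a minimal sketch (the host name is arbitrary):

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

int main(void) {
    /* There is no getaddrinfo system call: the C library implements it
     * on top of NSS (/etc/nsswitch.conf), /etc/hosts, and its resolver. */
    struct addrinfo hints, *res = NULL;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    int rc = getaddrinfo("localhost", NULL, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return 1;
    }
    puts("resolved localhost");
    freeaddrinfo(res);
    return 0;
}
```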
This is unthinkable on Solaris / illumos kernels because of DDI / DDK interfaces: I can take a driver from 1993 for Solaris 2.5.1 and modload(1M) it into the latest nightly illumos build and I'm guaranteed that it will work.
Whenever a person has to roll their own handler, it will almost always undergo less testing and auditing. The article points out gettid, which for at least 10 years required rolling your own wrapper, and the comment section for the article points out that the getpid call had caching that was bugged for a long time.
Having no glibc implementation of a syscall reduces its usage and the total number of people knowledgeable about that function, so it would be a perfect place to look for bugs and security issues.
In the nearly reverse case, a poor implementation of a glibc wrapper might do something that allows an attacker to take advantage. The same applies where the glibc and syscall functionality differ: an aspect present only in the syscall might be undertested.
Plan 9 has very few syscalls compared to Linux and the BSDs.
Inferno doesn't have system calls.
And: do you really want to spend the rest of your professional career wrangling with a shoddy product, or do you want to actually do professional, cutting-edge IT?
I can't write for you, but I did not graduate computer science at the top of my class so that I could spend the next several decades working with / on the shittiest, amateur knock-off copy of UNIX when I could run the real thing for free & cheap. That's not what I studied at a university and got a degree for. How about you, what's it gonna be, shitty Linux for the next 20-30 years or the real computer science with SmartOS or FreeBSD?
>"In such cases, user-space developers must fall back on syscall() to access that functionality, an approach that is both non-portable and error-prone."
I understand about portability but can someone elaborate on why using syscall() is inherently error-prone?
That being said, you can still at least always call syscall(2).
That's just generic kernel ABI: syscalls can have at most 6 arguments: https://elixir.bootlin.com/linux/latest/source/include/asm-g...