
The Unix C library API can only be reliably used from C - jandeboevrie
https://utcc.utoronto.ca/~cks/space/blog/unix/CLibraryAPIRequiresC
======
Too
This is a fairly common problem when trying to interface other languages with
any C API, not just errno. The FFI-library of your language can usually call
straight into C-functions but then it turns out half of the C API is
implemented in macros, which can be neither called nor converted
automatically, so you have to re-implement them all in the new language.

Funny because usually the reason for having a C API is to provide universal
compatibility to be called from any language. If you are doing this, please
avoid macros.
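
A minimal sketch of what that could look like, assuming a made-up library
prefix (`MYLIB_` / `mylib_` are hypothetical names, not from any real API):
keep the macro for C callers if you must, but also export an ordinary function
with the same behaviour, so an FFI has a real symbol to bind to.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical header macros: convenient from C, but they leave no symbol
   in the compiled library, so an FFI has nothing to call. */
#define MYLIB_LOWORD(x) ((uint16_t)((uint32_t)(x) & 0xFFFFu))
#define MYLIB_HIWORD(x) ((uint16_t)(((uint32_t)(x) >> 16) & 0xFFFFu))

/* Ordinary exported functions with the same behaviour. These end up in the
   library's symbol table and can be bound from any language. */
uint16_t mylib_loword(uint32_t x) { return MYLIB_LOWORD(x); }
uint16_t mylib_hiword(uint32_t x) { return MYLIB_HIWORD(x); }

/* Small self-check that the function and macro forms agree. */
void mylib_selfcheck(void) {
    assert(mylib_loword(0x12345678u) == MYLIB_LOWORD(0x12345678u));
    assert(mylib_hiword(0x12345678u) == MYLIB_HIWORD(0x12345678u));
}
```

The cost is one extra function call for non-C callers, which is usually
negligible next to the FFI overhead itself.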

~~~
arnsholt
I ran into that at work a few years back. I was tasked with integrating
OpenSSL into the commercial Smalltalk used for the project (last updated in
1999, for more or less good reasons). A priori, not an unreasonable task as
the Smalltalk had a pretty good FFI mechanism. Except of course the
neverending cavalcade of functions documented in the OpenSSL docs that were
actually preprocessor macros. In the end I wrote a Python script that parsed
the C headers and generated FFI entry points for normal functions and
Smalltalk code for preprocessor macros that weren't simple literals. The
result was a small mountain of code, but it made all of OpenSSL available to
our application, which was great.

~~~
wruza
You’re still lucky it wasn’t DirectDraw, or WinAPI in general. I had to
reinvent entire preprocessors and investigate implicit compiler defaults to
be able to wrap that, and still there were LOWORD, MAKE_XYZ, etc., which
were official arbitrary code-in-a-header interfaces. The preprocessor is the
scourge of an API.

~~~
jstimpfle
The main problem with C macros is their usage when a regular function would do
just fine. That goes for a lot of the cruft in the Windows headers.

Having said that, while I object to most ideas that "C is missing", I do think
that better constant expressions and expression macros (functions on the
syntactic-expressions level) would be a usability improvement in APIs compared
to (token stream) preprocessor macros. Even though in the presence of the
preprocessor, constants and expression macros would be somewhat redundant.

One advantage of expression macros is that you do macro expansion _after_
parsing to an AST. You get a valid AST parse with macro expansion if and only
if you get a valid parse _without_ expansion. Expression macros are regular
AST elements, and are scoped just like regular functions or variables. They
don't have a weird textual scoping scheme with conditional compilation or
un-/redefinability. Getting rid of that should make it easier to translate
them to other languages.
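
A small C sketch of the difference, using hypothetical names (`SQUARE_TOKENS`,
`square_expr`). C has no expression macros, so an inline function stands in
for one here; the point is only that the token-stream version substitutes text
before parsing, while the other receives an already-parsed expression.

```c
#include <assert.h>

/* Token-stream macro: the argument is pasted in as text, so operator
   precedence from the call site leaks into the expansion. */
#define SQUARE_TOKENS(x) x * x

/* Stand-in for an expression-level macro: the argument is a complete
   expression, evaluated before the body sees it. */
static inline int square_expr(int x) { return x * x; }

void square_selfcheck(void) {
    /* Expands to 1 + 2 * 1 + 2, which is 5, not 9. */
    assert(SQUARE_TOKENS(1 + 2) == 5);
    assert(square_expr(1 + 2) == 9);
}
```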

Another advantage is that they allow you to actually know which constructs in
an interface are part of the API. An expression macro is always supposed to be
used directly by the consumer of the API, or at least indirectly through
another macro that is. That particular problem of the C preprocessor could in
fact be solved if headers cleaned up after themselves, but in practice, the
preprocessor is just too messy so nobody does that.

~~~
youdontknowtho
Isn't this what rustlang did?

I have to admit that I'm coming around on rust. I still think that their
fanboy marketing is obnoxious...but I need to get over my penchant for finding
reasons to be angry.

~~~
steveklabnik
We have two forms of macros: one that’s basically fancy pattern matching, and
another that lets you manipulate an arbitrary series of tokens.

------
rkangel
The issue here is that you shouldn't need to interface with C, unless you need
to talk to some C Code. You absolutely shouldn't have to interface with C to
talk to the kernel.

I come at this from a Linux perspective, but to me the user/kernel interface
(sys calls etc.) should be the well defined abstraction boundary. The C
library is a useful api making that easy to use, but to dictate that people
can only make sys calls through C is silly and ends up with these sort of
problems.

~~~
amluto
There is quite a bit of variation here. Linux and, AIUI, quite a few
microkernels, have a defined ABI at the user/kernel boundary. BIOS, a bunch of
hypervisors, and some RISC-V M-mode interfaces are similar, albeit with “user”
and “kernel” replaced with other execution modes.

OpenBSD and, very notably, Windows instead have a defined library interface.
Windows has somewhat stable “system calls”, but only through NTDLL.dll. The
real stable interface is kernel32, etc.

There are arguments in favor of both approaches. One significant, if rarely
appreciated benefit of the user/kernel approach is that it enables user
programs that don’t use the C ABI at all. Go on Linux is like this, and Go and
the vDSO have interoperability issues because the vDSO is a bit FFI-ish in Go,
and Go’s FFI is not very fast.

~~~
the_why_of_y
A stable syscall ABI is a Linux peculiarity and was never a part of the UNIX
tradition.

Here's a talk where Bryan Cantrill describes the tightly coupled Solaris
kernel/libc combination as a "welded unit", and mocks the Linux "libc
ecosystem" and the complexity of ENOSYS handling in glibc.

[https://www.youtube.com/watch?v=TrfD3pC0VSs#t=29m10](https://www.youtube.com/watch?v=TrfD3pC0VSs#t=29m10)

~~~
amluto
Mock it all you like. Old static binaries, old containers, and old chroots
continue to work on new kernels. This is a big deal.

------
GlitchMr
A function internally called by the `errno` macro is part of an ABI, and
changing its name will break all already compiled applications for no real
reason. Most operating systems provide a stable libc ABI (exception: OpenBSD).
So it's a rather safe thing to depend on.

~~~
ainar-g
> (exceptions: FreeBSD, OpenBSD)

These are some pretty significant exceptions though, especially in the
networking-related fields. I don't want the programmes on my router to
suddenly not boot after a system upgrade. Especially when I don't have the
source code or if the compilers haven't caught up yet.

------
zozbot234
Back in the day, other languages had their own bindings to the POSIX/UNIX
APIs. There are standardized bindings for FORTRAN and Ada; not sure if others
exist.

------
skissane
The complaint is that the underlying function which returns the thread-local
errno pointer isn’t a public API, isn’t standardised, and its name varies from
OS to OS (__errno or __error or __errno_location).

This specific case would be resolved if some standard (ISO C, POSIX, etc)
defined a public API to do it, e.g. errno_r

(Of course, that doesn’t fix any other cases where some public functions are
really macros to private functions, but this appears to be the most
significant such case.)
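
Until something like that exists, the usual workaround is a tiny C shim. A
minimal sketch, with made-up names (`my_errno_get`, `my_errno_set` and
`my_errno_location` are hypothetical, not part of any standard): because the
shim is compiled as C, the `errno` macro expands to whatever the platform's
libc uses internally, and the other language only ever binds to the shim.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical glue functions. The errno macro is expanded here, by the C
   compiler, so callers never need to know the platform's private name. */
int my_errno_get(void) { return errno; }
void my_errno_set(int value) { errno = value; }

/* errno is a modifiable lvalue of type int, so taking its address is valid.
   The pointer refers to the calling thread's errno. */
int *my_errno_location(void) { return &errno; }

/* Small self-check that all three views agree. */
void my_errno_selfcheck(void) {
    my_errno_set(ENOENT);
    assert(my_errno_get() == ENOENT);
    assert(*my_errno_location() == ENOENT);
    my_errno_set(0);
}
```

The caller has to read the value right after the failing call, before
anything else on the same thread touches errno.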

------
miohtama
The whole C interface is clunky because C does not support multiple return
values. More advanced languages should not be forced to deal with it.

~~~
rwmj
You can return structs from C functions (note, I don't mean pointers, I mean
whole structs). You can pass in structs too. It's not commonly done and is a
bit clunky compared to returning a tuple, but it is possible.

~~~
chewxy
What's the difference? My understanding is as follows: A struct is a tuple.
It's just some structured memory. Where a tuple would be indexed numerically,
a struct is indexed with names of the fields (equiv in Python to NamedTuple)

~~~
jacquesm
Tuples can pull their contents from different memory locations whereas structs
are laid out contiguously in memory (modulo padding if enabled). Of course you
could store pointers in the structs you return but that would defeat the whole
'single return' concept, then you may as well return a pointer to a struct
which is ugly because it will never be thread safe and is a particularly nice
pitfall regarding ownership of that struct.

So either return the whole struct as values or, alternatively - and this is
the way most library functions do it - call the function with a pointer to an
empty struct which the function then fills. That leaves the return value for
error status.
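
A minimal sketch of both styles, with hypothetical names (`struct divmod`,
`divmod_by_value`, `divmod_out`) and EINVAL chosen arbitrarily as the error
code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical two-field result, standing in for "multiple return values". */
struct divmod { int quot; int rem; };

/* Style 1: return the whole struct by value.
   There is no room for an error status, so the caller must ensure b != 0. */
struct divmod divmod_by_value(int a, int b) {
    struct divmod r = { a / b, a % b };
    return r;
}

/* Style 2: the caller passes a pointer to a struct to fill in, and the
   return value carries the error status (0 on success). */
int divmod_out(int a, int b, struct divmod *out) {
    if (out == NULL || b == 0)
        return EINVAL;
    out->quot = a / b;
    out->rem = a % b;
    return 0;
}

/* Small self-check that both styles produce the same result. */
void divmod_selfcheck(void) {
    struct divmod v = divmod_by_value(17, 5);
    struct divmod o;
    assert(divmod_out(17, 5, &o) == 0);
    assert(v.quot == o.quot && v.rem == o.rem);
}
```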

~~~
layoutIfNeeded
> So either return the whole struct as values or, alternatively - and this is
> the way most library functions do it - call the function with a pointer to
> an empty struct which the function then fills.

Mind you, under the hood (on Windows x64, at least) all your structs will be
returned through pointers to caller-allocated memory, unless they fit into 8
bytes.

“To return a user-defined type by value in RAX, it must have a length of 1, 2,
4, 8, 16, 32, or 64 bits. [...] Otherwise, the caller assumes the
responsibility of allocating memory and passing a pointer for the return value
as the first argument.”

[https://docs.microsoft.com/en-us/cpp/build/x64-calling-conve...](https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#return-values)

~~~
jacquesm
Yes, but that's literally an implementation detail and this could be done in
many different ways. The idea here is to gain some amount of expressiveness,
if that is the goal then implementation details should be of lesser
importance.

Moving structures around 'by value' rather than 'by reference' is wasteful in
my view, but if you have to then you should no longer worry about the
underlying mechanism until optimization time rolls around.

------
xvilka
I think it is time to create a new POSIX standard, with better portability
and safety/security. Decades of experience can help to design a better API.

~~~
pjmlp
It will never happen, because POSIX is literally the UNIX part of C libraries
that did not land as part of the ISO C standard library.

------
touisteur
One of the worst offenders is statvfs with its use of __USE_FILE_OFFSET64...

------
Eikon
> There is no useful errno variable to load in your own language's runtime
> after a call to, say, the open() function

FWIW, in Rust there is std::io::Error::last_os_error().

libc::perror works too.

~~~
x3ro
The point of the article is that there's glue code needed for this, I think.

It seems like Rust also needs this, either through linker magic or extern C:

[https://github.com/rust-lang/rust/search?utf8=%E2%9C%93&q=er...](https://github.com/rust-lang/rust/search?utf8=%E2%9C%93&q=errno&type=)

