C Isn't a Programming Language Anymore (gankra.github.io)
379 points by jasonpeacock on March 16, 2022 | 326 comments



C's most infamous parsing difficulty is:

    A * B;
Is it A multiplied by B, or B declared as a pointer to A? It can't be resolved without a symbol table. But there's another way to resolve it that works very well: just treat it as a declaration. The reason is simple: the multiply has no purpose, so nobody would write that. C doesn't have metaprogramming, so it won't be generating such code as an edge case (unless you're using the preprocessor for metaprogramming, in which case you deserve what you get).
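A minimal sketch of the ambiguity (the names are arbitrary, just for illustration):

    typedef int A;          /* A names a type...                              */
    void f(void) {
        A * B;              /* ...so this declares B as a pointer to int      */
    }

    int C1, D1;             /* C1 and D1 are ordinary variables...            */
    void g(void) {
        C1 * D1;            /* ...so this is a multiplication whose result
                               is discarded                                   */
    }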

But there's a worse problem:

    (A) - B
Is it A minus B, or negative B cast to type A? There's just no way to know without a symbol table. One might think: who would write code with a vacuous set of parentheses around an identifier? It turns out they don't, but they write macros that parenthesize their arguments, and the preprocessed result has those vacuous parentheses.
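A small sketch of how that happens (the ID macro here is hypothetical):

    #define ID(x) (x)       /* hypothetical macro that parenthesizes its argument */

    int demo(int a, int b) {
        /* Expands to (a) - b. Here a is a variable, so it's a subtraction;
           if a had named a type, the same tokens would be a cast of -b. */
        return ID(a) - b;
    }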

D resolves both issues with:

1. if it parses like a declaration, it's a declaration

2. a cast expression is preceded by the keyword `cast`

and D is easy to parse without a symbol table.


I don't see why resolution without a "symbol table" is a big deal. Knowing whether A is a type or not resolves these fairly easily. And in well written code, it should generally be obvious whether A is a type or not so it should not be a readability problem either.


Because a quite different AST is built for the second case depending on whether A is a type or not. And you can't tell whether A is a type or not without a symbol table.


I think the two of you are talking past each other. You're saying, "yes, you need a symbol table", and he's saying, "yes, but a symbol table isn't very hard to do".

Personally, I wonder how many of these things would go away if the pointer marker were a suffix operator with a different character (maybe the @ sign), and if casts looked like function calls.

   A B@;   # B is a pointer to type A
   A(-B);  # A is either a function or a type for casting
           # Same AST regardless


As for symbol tables being hard to do, notice that C does not allow forward references. Supporting forward references while relying on the symbol table to drive the parse winds up with unresolvable problems.


I'm not sure I understand your point. I don't think we're disagreeing about anything.

C needs forward declarations for some things, and it needs a symbol table to resolve some parts of the grammar. All I was saying was that I think you could resolve both of them with minor changes. (I see that D has a "cast" keyword, and that's obviously one way to do it.)


> A B@; # B is a pointer to type A

IIRC that's how Pascal did it, using a caret ^ for denoting a pointer.


Kind of.

A pointer to type A is:

     var B: ^A
The parser knows it is a declaration because of the var, and it knows ^A is the type because of the colon. That the marker is a caret does not really matter here.


Yeah, now if only Pascal had used curly braces, all could be right in the world :-)


I don't know about Pascal, but AT&T allowed anyone to write a C compiler without needing a license. (C++, too, I know because I asked AT&T's lawyers.)


Arh.... But then it wouldn't be Pascal ><


Yes! As he said - all would be right with the world =)


It was more about the AT&T monopoly, I believe :)


It's a problem because it means your editor will have a very hard time parsing your file without analyzing a lot of other files first.

The grammar of your file depends on previously declared symbols. But which symbols have been declared depends on the headers you #include. But which headers you include depends on the -I options you give to your compiler. Except there's no standardized way to express what -I options your project uses, and they might change depending on your build profile.

Any modern text editor can give you good syntax highlighting for a Rust file or a Go file basically as soon as it opens. When it opens a C/C++ file, it has to do a lot of guessing.

(This is not conjecture, by the way. I tried to integrate clang's implementation of the Language Server Protocol in Atom for my end-of-studies project, and it was not fun.)


It’s annoying when parsing C. The situation is even worse in C++, since * can be overloaded and have side effects!


You're right, and C++ is hopeless to parse without a symbol table in other ways. Later versions of C++ added keywords to disambiguate, but of course there's legacy code.


Because pretty much the whole science of parsing is built for context-free languages. Yes, of course, you can ignore science when you do your thing, but please don't call yourself an engineer, then ;).


And as an engineer you can slap on a symbol table and it works like a context free language after that.


Mainly because most parser generators don't have a feature for disambiguation based on the application providing hints based on symbol table lookups.


It seems like you maybe misread their comment as referring to human parsing of code, rather than writing a parser program? I can’t make much sense of this otherwise.


This is because there would be extra memory involved, and that could make it hard to deploy in low-memory environments. Consider the historical background of C, from the late '70s to the '90s, and you'll see why.

But thanks to Moore's Law, hedged by Wirth's Law, you have significantly more powerful hardware yet not much faster software, because we started to deploy languages worse than C: C++. C++ with templates is Turing-complete, which means every template expansion is potentially undecidable (it could run forever), something most compilers turn a blind eye to by imposing a recursion-depth limit. Templates alone, even setting the recursion-depth issue aside, consume far more memory than C, and that's why we complain that C++ compilers are memory hogs while C has (relatively) low memory overhead. It's just that times are different and memory use isn't as noticeable anymore.


The problem has nothing to do with extra memory, it’s just that C parsers are more hacky than others due to the C language not having a proper context-free grammar.

Also, I frankly don’t get why templates are “bad” according to you. They allow for proper metaprogramming not relying on ugly hacks like memory layout conventions, etc.


I'm not saying templates are bad; maybe my wording was ambiguous, but I do find template metaprogramming too complicated to begin with. Yes, I'm fascinated by many things TMP made possible (such as std::tuple, std::bitset, Boost.ASIO, Boost.Hana), but also because of this, to this date I still can't commit to my idea of creating my own C++ compiler.


It’s both.


You need those symbols tables anyway though


> But there's another way that works very well: it's a declaration. The reason is simple. The multiply has no purpose, and so nobody would write that.

But there's another way that works very well: it's a multiplication. The reason is simple. A declaration of a variable that isn’t used has no purpose, and so nobody would write that.

As I think you know, C doesn’t handle this case by guessing that it must be a declaration. Its lexer looks in the tables that its parser creates to check whether a type called ‘A’ is in scope (https://stackoverflow.com/questions/41331871/how-c-c-parser-...)
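A rough sketch of that lookup, the classic "lexer hack" (the names are made up, not from any particular compiler):

    #include <string.h>

    enum token_kind { TOKEN_IDENTIFIER, TOKEN_TYPE_NAME };

    /* Stand-in for a real symbol-table lookup; pretend only "A" was typedef'd. */
    static int is_typedef_name(const char *name) {
        return strcmp(name, "A") == 0;
    }

    /* The lexer asks the parser's symbol table whether an identifier currently
       names a type and returns a different token kind, so the grammar itself
       can stay unambiguous. */
    enum token_kind classify_identifier(const char *name) {
        return is_typedef_name(name) ? TOKEN_TYPE_NAME : TOKEN_IDENTIFIER;
    }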


> The multiply has no purpose, and so nobody would write that.

A fellow named Bjarne Stroustrup fixed that bug, though. In the plus plus dialect of C, A * B could reboot your system, without any #define macros for A or B.


> My problem is that C was elevated to a role of prestige and power, its reign so absolute and eternal that it has completely distorted the way we speak to each other. Rust and Swift cannot simply speak their native and comfortable tongues – they must instead wrap themselves in a grotesque simulacra of C’s skin and make their flesh undulate in the same ways it does.

Lol at this. Love this eloquent style, and pretty much agree. However, C's reign is not absolute. Virgil compiles to tiny native binaries and runs in user space on three different platforms without a lick of C code, and runs on Wasm and the JVM to boot. I've invested 12+ years of my life to bootstrap it from nothing just to show it can be done and that we can at least have a completely different userspace environment. No C ABI considerations over here. It can be as romantic as just you and the kernel in Virgil land. Heh.


But how would you link a library written in Rust to a Virgil program? Or vice versa?

That's the real problem the author is ranting about. If you've solved that, too, then I think the article's author isn't the only Rust team member who'd like to talk to you.


> But how would you link a library written in Rust to a Virgil program? Or vice versa?

Sounds like that's Rust's problem, or maybe Virgil's? Why is it C's problem?

If Rust is going to "replace C" for systems work, as the more grandiose claims would have it, it's going to have to replace all of C.

It's not like writing a POSIX-like system is that hard. It's a lot of work, sure, but it's been done multiple times, by unpaid volunteers at that.

This complaint basically boils down to "We can't/won't/haven't reimplemented this low-level stuff in our whizzy new language, and that's somehow C's fault".


Your argument reads like: you shouldn't need to run Rust programs unless _the entire operating system_ is first rewritten in Rust. I doubt you really mean that, though...


No. More like "You shouldn't whine about the C infrastructure unless you're planning to replace it with something better."


C is only used for FFI because it provides a well-defined, stable interface for function calls that can be easily implemented by another language.


The article argues against every word of this statement.


The article is wrong.


C definitely doesn't provide a well-defined stable interface. Specific ABI definitions might, but there are 176 of those, and implementing all of those in another language is decidedly not what I'd call "easy".


C provides an API, not an ABI.

The compiler provides a mapping from the stable, (relatively) portable, well-defined API -> the unstable, non-portable, ill-defined ABI of the host system.


It's been done multiple times, by small teams of unpaid volunteers.

"Easy"? Depends on your definition of "easy". Doable? Absolutely. But complaining won't get it done.


In most cases, I'd think those small teams support like a max of 5 ABIs, out of the 176.

Of course it's "possible". Of course complaining about it won't get it done. But I can easily envision an alternate universe with the same variety of OSes and CPU architectures where the effort required for this is ten times smaller, and I don't see it as utterly pointless to ponder the viability of that, if only just for fun (which the swearing in the article hints to me it is).

I, as an author of a small language implementation, would really prefer to not need to implement more than one C FFI ABI, much less 176. I'd meet somewhere in the middle, but 176? nah.


> If Rust is going to "replace C" for systems work, as the more grandiose claims would have it, it's going to have to replace all of C.

... and people complain all the time about Rust fanatics talking about rewriting everything in Rust.


Yeah, they do talk about that a lot. They talk a lot about a lot of things.

They just don't ever do any of them.


And they say romance is dead lol


Any way to build a GUI app?

Definitely looks like a cool project. Can you write an OS in it, for microcontrollers?


Hmm, very interesting question. I think you could interface with X windows via IPC, but you'd have to build that IPC layer, which ultimately boils down to pipes/sockets. So that route would be doable, but probably a steep climb.

Another option would be to compile to Wasm and then import GUI-related functions that are then implemented in the Wasm host environment, e.g. JavaScript.

As for microcontrollers, that was, in fact, Virgil I, circa 2006. :) My prototype compiler back then generated C code that you could then compile with avr-gcc. Nowadays, I don't have a C backend or a native backend for microcontrollers, so it'd require a new backend and a new exe format. Or maybe, again, try to compile Wasm to AVR.


Are you planning to do an arm64-darwin port for Virgil?


Yes, this would be very useful. I have it listed here [1] and am hoping to get either a student or a volunteer to do some work there.

[1] https://github.com/titzer/student-projects


This article is hard to parse because it doesn't present any alternatives that the author considers better.

- You have to speak the C ABI to talk to the OS -- yes, if you want an OS to support more than one language, or you want multiple languages within a single process, you need a common ABI of some sort. What is the alternative that doesn't involve a common ABI?

- C has too many ABIs -- yes, there are a lot of different kinds of hardware, and sometimes multiple software ecosystems evolved in parallel on the same hardware (eg. Windows and Linux). What is the alternative?

- ABI changes are hard to make in a non-breaking way -- yes they are. It's fundamentally a hard problem. You could compile everything from source every time, but the open-source world seems to have decided that there isn't enough time or CPU power to "live at head" and build from source every time, like Google does: https://abseil.io/about/philosophy#we-recommend-that-you-cho... (Disclosure: I work at Google).

- "You Can’t Actually Parse A C Header" -- this is, IMO, the most actionable objection that the article raises. You could imagine a subset of C that does away with macros, typedefs, etc. in order to be easier to parse, and in doing so forms a more accessible ABI specification language. That seems eminently doable. But parsing is only part of the overall battle: actually implementing the ABI for 176 triples seems like the bigger problem. Especially when you add the ABI-impacting function attributes mentioned in the tweet.

As a C fan, it's hard to see the language itself blamed for what are (in my view) just fundamentally difficult problems.


I don't think the author blames it on C. C didn't choose to become the ABI of 99% of the world. And yet, it is.

> You have to speak the C ABI to talk to the OS [...] What is your alternative that doesn't involve a common ABI?

Emphasis is on the _C_ ABI. The alternative is something that at least tries to be designed as a universal ABI.

> there are a lot of different kinds of hardware, and sometimes multiple software ecosystems evolved in parallel on the same hardware

If there was an explicit ABI standard though, evolving ecosystems would have to go through the standardization and thus you just couldn't get conflicting ABIs for the same thing.

If such standardization could cut the number of ABIs in half, it'd still be a huge win. And I think it'd be possible to go much further than that. Furthermore, it may be possible to split orthogonal things apart, such that instead of e.g. 50 targets, you have 5 OS targets and 10 arch targets, each specifying a separate part of the ABI. (current ABIs obviously do this to some extent, but not in any specified & structured way)

> As a C fan, it's hard to see the language itself blamed for what are (in my view) just fundamentally difficult problems.

I also like C. I agree these are hard problems. But I don't think they're problems that C should be dealing with. But we're pretty much just stuck with C.


> If there was an explicit ABI standard though, evolving ecosystems would have to go through the standardization and thus you just couldn't get conflicting ABIs for the same thing.

But there are lots of reasons why projects/companies might want to create their own ABIs, or change existing ones. You seem to be suggesting that if an ABI standards body existed, that all vendors would be following a minimal and unified set of ABI standards. But the very examples given in the article contradict this vision. Arm64 does have an ABI standard (https://developer.arm.com/architectures/system-architectures...), but Apple chooses to deviate from it anyway with the aarch64-apple-* set of targets: https://developer.apple.com/documentation/xcode/writing-arm6...

I don't think this has anything to do with C per se. Whether it was the "C ABI" or the "Standard ABI (totally separate from C)", vendors would have incentives to do their own thing. Especially since this stuff all sits so close to the hardware, and the ABI has a significant impact on performance.


I'm not saying all would follow it, but at least they'd have to try to, or risk being poorly supported (unless they're already at the top of the food chain, but then they'd already just be the ones dictating new ABI standards anyway). The goal isn't to eliminate all variance, but to constrain the amount to which that is done, and have separated parts such that orthogonal problems can be dealt with separately.

I'd be surprised if a ton of applications in the wild aren't violating Apple's register x18 requirement, especially given that, as far as I can tell from searching, applications can just freely use it on macOS+Apple Silicon.

In my mind, an ABI standard would have three parts: object layout on the heap, foreign function calling convention, and background requirements/standards (though I'm possibly missing some important part). The Apple changes would only affect the latter (assuming function calling has at least one volatile non-argument register, which could be reused for that). And object heap layout has pretty much no business ever being changed. (Importantly, the first two would be only for cross-language communication; your language itself could do structures & calls however it likes.)

edit: it seems that the ARM ABI itself names r18 "The Platform Register", allowing ABIs to change its meaning, and recommends programs to not use it if at all possible, so Apple's restriction isn't even that unreasonable.


> I'd be surprised if a ton of applications in the wild aren't violating Apple's register x18 requirement

macOS outright zeroed x18 on each context switch for a bit, and it's zeroed on each context switch as part of the Meltdown mitigation for older SoCs in iOS.

On Windows, x18 is used as the thread environment block pointer.

On Android, x18 is not accessible to third party apps as part of the ABI. It is reserved for the shadow call stack.

(tldr: on most OSes x18 is explicitly reserved by the platform)


see my edit - the ARMv8 ABI even specifies x18 to be possibly platform-defined. So those aren't unreasonable, and I'm probably overestimating how many things misuse it. But apparently that list includes GNU MP[0], and it's easy to view this problem as stemming from there not being a concrete ABI. (though, restricting the use of x18 on platforms that don't need it reserved could be pretty wasteful)

[0]: https://gmplib.org/repo/gmp/rev/5f32dbc41afc


> Emphasis is on the _C_ ABI. The alternative is something that at least tries to be designed as a universal ABI.

That's not an either-or. E.g. WinRT is an ABI that is specifically designed as cross-language, but it can be reduced to the C ABI (e.g. vtables are basically structs of function pointers etc).
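Roughly what "reduced to the C ABI" means here; a toy sketch, not the real COM/WinRT definitions:

    /* Illustrative only: an interface as a struct whose first field points to
       a table of function pointers. */
    typedef struct IWidgetVtbl {
        long (*AddRef)(void *self);
        long (*Release)(void *self);
        int  (*Draw)(void *self, int x, int y);
    } IWidgetVtbl;

    typedef struct IWidget {
        const IWidgetVtbl *vtbl;
    } IWidget;

    /* Any language that can make a C call through a function pointer can dispatch on this. */
    int draw_widget(IWidget *w, int x, int y) {
        return w->vtbl->Draw(w, x, y);
    }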


Sorry, but "we need a new universal ABI standard" asks for https://xkcd.com/927/


I mean, yes, right now it has ±0 chance of becoming a thing. Still doesn't make the status quo any better, and doesn't at all change that such a universal ABI probably could be much better.


It's definitely a risk, but also the wide variety of ABIs are not currently competing, except in a few niches. There are no existing comprehensive standards I could support, as far as I'm aware.


> "You Can’t Actually Parse A C Header" -- this is, IMO, the most actionable objection that the article raises.

You don't need to parse C headers in order to use the same ABI. You can read DWARF info which is designed to convey type definitions, function/method prototypes etc. in a machine-readable way. It's already there, no need for a separate "IDL" standard at all.


> You could imagine a subset of C that does away with macros, typedefs, etc. in order to be easier to parse, and in doing so forms a more accessible ABI specification language. That seems eminently doable. But parsing is only part of the overall battle: actually implementing the ABI for 176 triples seems like the bigger problem. Especially when you add the ABI-impacting function attributes mentioned in the tweet.

A more reasonable option would be to have a better-defined format to describe the ABI (let's call it the Dependable Wanted Abi Recall Format) and have the C compiler output that DWARF for each header. Then other languages that want to make use of that ABI only need to parse the intermediate format and don't need to care about all the ugly details of the source language.


> You could imagine a subset of C that does away with macros, typedefs, etc. in order to be easier to parse, and in doing so forms a more accessible ABI specification language.

A start? https://github.com/cil-project/cil


> This article is hard to parse because it doesn't present any alternatives that the author considers better.

Much of the content posted on HN nowadays is like this.


My Google SoC project was writing a c++ bindings generator for Common Lisp: https://lwn.net/Articles/147676/ (In my head this was ten years ago, but apparently it’s nearing 17. Shit I’m getting old.)

C isn’t ideal, but it’s actually not so bad and it could be worse (it could be C++). Yes parsing C is non-trivial, but that’s true of every language. These days you have libclang and a dozen other decent C parsers. Back in my day we had to use a hacked up version of GCC (and we liked it).

Also, C doesn’t have a standard ABI, but every real world platform defines a C ABI. And it’s pretty simple. Meanwhile trying to handle all of the cases of C++ vtables took up weeks of my life (and I ended up shipping without fully supporting multiple inheritance, which is stupid anyway).

The bigger problem for writing FFIs is, in my opinion, memory management. That’s where it gets really hard to paper over for the binding user that you’re talking to C.


> Yes parsing C is non-trivial, but that’s true of every language.

Have you heard of our lord and savior LISP?


You can’t even tokenize Common Lisp without being able to execute Common Lisp (read macros). :D


Probably?

> My Google SoC project was writing a c++ bindings generator for Common Lisp


> Yes parsing C is non-trivial, but that’s true of every language.

Not sure about this. C is ambiguous without a symbol table, and the preprocessor is necessary for populating that. Parsing against all of the correct headers (e.g. libc, kernel headers) with correct macro values is tough.

Most languages' concrete syntax treats type references unambiguously (e.g. separating them from values with `:`), which is just much easier.


SoC = Summer of Code, not System on Chip for anyone else who was confused at first :)


I thought they meant Security Operations Center


The O would be capitalized if that were so. ;)


This is a strange complaint that seems to reduce to C not having a well defined ABI.

Of course it doesn't. C implementations do. This isn't really any different than most other languages, but feels different because C doesn't have a blessed implementation that all other implementations must interact with.

That's a strength. It means C is found on esoteric microcontrollers as well as powerful modern desktops. That wouldn't work as well as it could if the ABI were uniform on all targets and implementations.

And yes, it can act as a protocol. It's the simplest way to access host ABI communication without understanding it.


Many of these implementations are needlessly weird and underspecified. C happens to work despite these quirks due to the sheer number of hours everyone puts in to work around C's critical but vague notion of an ABI.

The fact that the biggest compilers don't quite agree on the ABI for the most popular desktop CPU is not a success of C's portability or flexibility, just a historical quirk caused by the C language moving slower than evolution of the hardware and needs of operating systems.


It's not a quirk of C; it's a failure of the vendors and their partners to appropriately coordinate. This outcome happened _despite_ the best efforts of toolchain developers to provide harmony.


The C standard working group is the avenue for vendors/partners to coordinate. Whenever the standard settles on "implementation defined" that's a confirmation that the vendors failed to reach an agreement (often because the standardization effort happened too late, and they've already implemented diverging solutions, and nobody wants to break their userbase).


An operating system has an ABI. Programs compiled from C source use the OS's ABI. There is no "C ABI".

The very premise of TFA is fundamentally incorrect, and there is no way to reach a valid conclusion from an incorrect premise.


I think your argument kinda agrees with TFA, as it's stating that C's nature as the de facto ABI for all interoperability on all operating systems is in conflict with it trying to, at the same time, still be a programming language. It would be very nice to have a non-C ABI.


> C's nature as the de facto ABI... It would be very nice to have a non-C ABI.

> An operating system has an ABI. Programs compiled from C source use the OS's ABI. There is no "C ABI".

I'm having difficulty getting these two ideas to reconcile.


On many platforms, such as FreeBSD and macOS, the application-exposed ABI for tasks such as "open a file" or "validate this x.509 TLS certificate" mandates that you link to a shared library and call into it using the C calling convention. (libc and Security.framework, respectively.)

Linux is unique in that you can hand-write assembly to perform a system call and have it work across OS upgrades. FreeBSD does not have stable syscall numbers: open() must be a libc call.
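A minimal sketch of that Linux-only trick (x86-64, GCC/Clang inline asm); it only works because Linux promises stable syscall numbers, which is exactly what the other OSes don't:

    /* write(2) invoked directly via the syscall instruction, bypassing libc. */
    static long raw_write(int fd, const void *buf, unsigned long len) {
        long ret;
        __asm__ volatile (
            "syscall"
            : "=a"(ret)                    /* return value comes back in rax */
            : "a"(1),                      /* __NR_write is 1 on x86-64 */
              "D"(fd), "S"(buf), "d"(len)  /* args in rdi, rsi, rdx */
            : "rcx", "r11", "memory"       /* clobbered by the syscall instruction */
        );
        return ret;
    }

    int main(void) {
        const char msg[] = "hello from a raw syscall\n";
        raw_write(1, msg, sizeof msg - 1);
        return 0;
    }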


The application-exposed ABI for those tasks mandates that you execute a few assembly instructions to make a syscall (and perhaps also do some work that the libc normally wraps around the actual syscall for you). ABIs deal with machine instructions, that's why the "binary" is in there.

The OS-provided API for the tasks are libc and Security.framework. Do you have any suggestions for a better API?


The complaint seems to be less that C as a whole has a lot of ABI implementations, and more that those individual implementations themselves are also not very well-defined, even on the biggest platforms.

That is, in principle the strength of supporting a wide variety of platforms has absolutely nothing whatsoever to do with the quality of implementation on those platforms.


Linux and Windows have had rock solid ABIs for decades now. Just about any system-appropriate 32bit binary will run on either, if the necessary system libraries are present.


> Linux and Windows have had rock solid ABIs for decades now

Except they don't. The OS doesn't have an ABI at all; the specific compiler does. That's why there are unique triplets on Windows for whether you're compiling for MSVC or GNU, as in x86_64-pc-windows-gnu vs. x86_64-pc-windows-msvc. Both are x86_64 Windows, yet they are different ABI targets.

And even on MSVC Windows alone there's not even a single ABI - there's __stdcall and __fastcall, which change the ABI on a per-function basis, and compiler options that change the default convention ( https://docs.microsoft.com/en-us/cpp/build/reference/gd-gr-g... ). So to figure out how to invoke Windows' "rock solid" ABI you need to know not only the header & compiler used, but also the compiler parameters.
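For example, on 32-bit x86 MSVC all three of these are legal and get different calling conventions (the keywords are Microsoft extensions; x64 accepts and ignores them):

    int __cdecl    add_c(int a, int b)  { return a + b; }  /* caller cleans the stack      */
    int __stdcall  add_s(int a, int b)  { return a + b; }  /* callee cleans the stack      */
    int __fastcall add_f(int a, int b)  { return a + b; }  /* first two args in ECX/EDX    */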

Similarly, again per the article, clang & gcc on x64 Ubuntu 20.04 can't actually call each other reliably. They don't agree on a few things, like __int128 which is actually documented explicitly by the AMD64 SysV ABI!


Again, this really doesn't seem related to the article. Nobody is saying those ABIs are making breaking changes, but that they are poorly documented and full of edge cases that make it difficult to reliably produce one of those system-appropriate binaries, short of just giving up and pulling in the platform's entire C toolchain.


Outside of only the most trivial microcontrollers, you're going to find edge cases in proportion to the complexity of the system. Software complexity is hard.

I'm not sure what's the goal in complaining about that. Just venting?


Just because it's hard doesn't mean it's already been done in the best possible way. Finding and documenting flaws (or "venting") is a great step towards improving.

Seriously though what on earth is your point here? This is an incredibly thorough article on an interesting, difficult subject, with a lot of room for different technical choices. Do you just not like the tone?


The author failed to prove that the current way isn't sufficiently good, or even that C is particularly bad in its role. They've only maligned its use and failed to appropriately consider why it's used.


Esoteric microcontrollers don't need an ABI. ABIs are for static and dynamic linking, and in the microcontroller world, there isn't a whole lot of linking going on. It's very common to statically compile everything into one single flash image, with libraries being in source form. E.g. this is the case in the Arduino ecosystem.


I don’t think there is anything wrong with the single compilation unit approach, but every embedded project I’ve been on in the last 10 years has used static linking, which doesn’t preclude ending up with a single flash image. 10 years is about the last time I worked on a non-ARM-based project, so that could be a factor (the PIC compiler I was familiar with at that time did not have a linker, and I’m not sure if the AVR compiler I used did).


They still have an internal ABI used when static linking; aligning structures, laying out data sections, function call behavior, and so forth. C's flexibility/ambivalence in this regard is why it's preferred on those architectures.


One could consider static linking in this scenario just an optimization for incremental builds with separate compilation of source files; it's not really used to, e.g. link a library that has no source. In that sense, it's just an implementation detail that's leaked from a build system that notionally recompiles from source but has unwisely exposed the units of its caching system.

Microcontroller programs are also pretty small; it's not difficult to load the whole program into memory on a workstation and compile the whole thing together. This is how Virgil (circa 2006) worked.


You still have to make choices that may conflict with languages that strictly define their internal binary representations.

And I definitely have received .o files from vendors.


> This is a strange complaint that seems to reduce to C not having a well defined ABI.

The complaint isn't that C lacks a well defined ABI, it's that e.g. x86_64-unknown-linux-gnu potentially lacks a stable ABI.


x86_64 linux has long had a stable ABI. Any given target for C is as stable as the host developer wishes it to be.


This is less about stability and more about clarity. The most unchanging and unbroken ABI in the world can still be a huge mess of undocumented special cases only really accessible via the combination of the platform's C compiler and its headers.


This. The intention for C was never to have an immutable, well defined super-portable ABI so that all the people who like to feel superior for not using C can paper over their C use in comfort. It just so happened that C is so simple that it is some 80% along the way to an ABI between systems.


I'd argue it's more like only 40% of the way to a proper ABI; sizeof(long) varying on the same CPU architecture is just horrible. Sure, it wasn't meant to be an ABI, but, well, it is. And we're stuck with it. And oh how nice would it be if we weren't.


> sizeof(long) varying on the same CPU architecture is just horrible.

On the other hand, `sizeof (int8_t)` is the same size no matter which architecture you are running on.

If you, as the programmer, are depending on the bit-width of an integer type, C has got you covered mostly.
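A quick sketch of the difference, runnable anywhere <stdint.h> exists:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        printf("sizeof(int8_t)  = %zu\n", sizeof(int8_t));   /* 1 wherever int8_t exists */
        printf("sizeof(int64_t) = %zu\n", sizeof(int64_t));  /* always 8 */
        printf("sizeof(long)    = %zu\n", sizeof(long));     /* 8 on LP64 Linux, 4 on LLP64 Windows */
        return 0;
    }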


I mean, if sizeof(int8_t) varied, no one would even try to use C as a lingua franca. I, as the programmer, would never expose unknown-width integer types, but other things still will, and I'll be forced to interact with them. Allowing such things to even be present in an ABI is just horrible.


> On the other hand, `sizeof (int8_t)` is the same size no matter which architecture you are running on.

EDIT: See account42's reply below.

ORIGINAL COMMENT: No, it's not. CHAR_BIT is not required to be 8 and sizeof (int8_t) is not required to be 1. Hell, int8_t is not even required to be present in <stdint.h>, but int_least8_t is.


CHAR_BIT is required to be at least 8, int8_t is required to be exactly 8 bits if present, sizeof(int8_t) is required to return an integer size where sizeof(char) is 1. Hence, if int8_t is present then CHAR_BIT must be 8 and sizeof(int8_t) == 1.
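You can even check that reasoning at compile time (assuming C11 _Static_assert):

    #include <limits.h>
    #include <stdint.h>

    _Static_assert(CHAR_BIT >= 8, "C requires at least 8-bit bytes");
    #ifdef INT8_MAX   /* defined exactly when int8_t is provided */
    _Static_assert(CHAR_BIT == 8 && sizeof(int8_t) == 1,
                   "if int8_t exists, bytes are 8 bits and sizeof(int8_t) is 1");
    #endif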


To add a teeny bit to this, POSIX specifies CHAR_BIT must be 8, but C defines it as >= 8.


Huh, I guess that's still not hard-coded. But a non-8-bit-byte system probably runs next to nothing to interface with anyway, and would definitely need a completely separate ABI in any scenario.


This feels exactly like why Microsoft (and others) during the 90s started to define a well-defined subset of C, with some fancy IDL stuff, and a C-like ABI where all the C stuff needed could also be generated from the IDL, as standard somewhat-object-oriented interfaces like COM and other DCE RPC-likes.

Of course, the POSIX-likes never adopted this, whereas Microsoft nowadays has some fourth generation of this IDL stuff (WinMD) that they are also slowly porting all the old C API definitions to (see win32metadata, also used for defining stuff like the Win32 package for Rust).

Also, of course, this all has its own issues too, for one COM's definition of reference counting is a bit picky, and there were a lot of advanced 'implicit RPC' features that were also more inherent footguns, but at least it doesn't involve what is ranted about here.. mostly.


If you reach a bit over into IPC/RPC, rather than just linking, the options with decent IDLs and cross-language code generators start to increase quickly. For example D-Bus, or Android's Binder. Some even use gRPC or Thrift between processes on the same machine. All of these are of course operating one or two abstraction levels up, but they are still good to consider.


Agreed. This seems like a rant from someone who perhaps never used anything but Linux and Darwin (macOS).

Windows went to substantial lengths to support non-C languages. It’s not obvious that the result was much of an improvement.


From the blog post:

> Case Study: MINIDUMP_HANDLE_DATA

I think the author is well aware of Windows.


GNOME developed GObject Introspection and it works well for GNOME.


It really sucks for cross compilation.


I'm not clear on why this is a C problem specifically, when it sounds like the author is really bothered by the lack of standardization in ABIs. Why would other languages not have this problem? At the assembly level, you still need to know which parameters go in which register and how the stack is laid out in memory. Rewriting the Linux kernel in Rust wouldn't change the way that software interrupts work, so how would it give you a Rust-native system call ABI? Am I missing something?


It's not a problem with C itself. But it's a problem that C, which has a loosely defined ABI, is actually used as the standard ABI of the OS. And the OS is not only about system calls, but also about system libraries (window manager, ...) and interoperability between languages.


As noted very clearly in the article, the problem with using underspecified versions of C as the ABI is _also_ a problem for C the language, as it limits the possible evolution of C.


ABIs are only a problem in environments that use dynamic shared objects. But the reason why that's the case isn't obvious. So when in doubt everyone just scapegoats C. I suppose the strongest language can take the punches.


I guess the author would also complain that English is the primary form of communication online because of all the edge cases it has


A lot of the comments here are saying that C not having a well-defined ABI is acceptable/good. For people writing C, yes, that may very well be a good thing. But as the article states, pretty much every single language must interface with C to be usable, and for them it means either calling out to clang/gcc/tcc, or going through the mess of transliterating the C ABI of each target to your own language. Not fun.

I work on an implementation of a high-level language (implemented in C) that probably less than 100 people have used, and a C FFI has been asked for plenty of times. You can't get around it.

Would something other than C being the ABI be better? Who knows. But the current situation just sucks.

edit: I'd like to explicitly note that I like C as a language. But it still makes for a bad ABI, because it wasn't meant to be one, and barely even works as one.


What you're complaining about, as has been pointed out by several people elsewhere, is that actual real world platforms have different C ABIs. This isn't C's problem, it is a problem that C exposes.

The C language doesn't define system calls. It doesn't define a lot of things that were either (a) created in C and presented via a C ABI (b) created in something even less universal and presented via a C ABI. Both (a) and (b) were done by people and projects who are not the C language and do not control the C language.


If I were wrangling with FFI issues across multiple platforms and battling gratuitous incompatibilities, bad documentation, and bugs, I would be ranting as much as or more than the author.

The issue is that there isn't really an alternative. Obviously a cross-platform ABI is never going to exist (different endianness, alignment, register usage, and stack conventions just considering the CPU; OSes themselves add more complexity). We could have an IDL to describe the ABI in detail, but then:

- you need all OS vendors to be on board, which is not going to happen.

- even if they did you can be sure it will be forked in a myriad of dialects and non conforming implementations.

- even if everybody plays ball, bugs will still happen.

The next best thing is for language designers to come up with community-maintained IDLs and tooling to interface with various languages instead of waiting for platform vendors to provide them. This is a realistic solution that can work, but at this point you might as well just accept that C fulfils this role already, even if it is far from ideal. Just embrace libclang and hold your nose.

edit: there is also the option of targeting a single virtualized platform like the JVM or CLR which is great, but not really appropriate for a system language.


Not sure what the point of this rant is. C is an old language. I remember switching from Motorola assembler on my Amiga to ANSI C on my new PC and complaining how high-level it was. But it was so much easier and faster to write more complicated software with it. So it became popular and operating systems were written in it. Now we have many even higher level languages that abstract a lot of the real computing that is required for CPUs to work, and people complain every time they get exposed to the low-level stuff. Well, sorry, but on the very low level computers are actually quite complicated. Not everything can be abstracted and hidden from your eyes. As CPUs become more powerful and our systems have more RAM, we can be more wasteful with how we use these resources. So more and more of the low-level stuff can be hidden. But there will always be value in programming in C in ways that take full advantage of the hardware (especially for simple, battery powered devices).


> Well, sorry, but on the very low level computers are actually quite complicated.

The author is clearly and abundantly aware of this, and it isn't her complaint. Did you read the article?


lol.


Interesting rant. I found it fairly humorous but I am not sure if the author meant it that way.

As I read it, the author's thesis is that all programming languages have to talk to the operating system and the operating system is written in C so the operating system uses C calling conventions which leaks C's "ugliness" into the implementation or expression of their beautiful language.

I kind of think of this as the "I like computers but don't really understand computation" fallacy. It is fundamental lack of understanding about the nature of computer architectures and what they can and cannot do vis-a-vis how you might express that in a programming language.

One of my professors in college was fond of saying that "All programming languages are just syntactic sugar around machine code." Which is fundamentally true, and tries to capture that at the end of the day what ever your language "says" has to be expressed in machine code to actually do what it does.

You will spend a lot of time in this space writing a compiler, and code generation is an art all of its own.

But you can side step, a bit, by not writing a compiler, and instead writing an interpreter. The series of articles that were posted here gave a good intro, and while you still have to do the "naughty bit" where you write code in some compilable language that can pretend to be a computer of some different form, you can make everything look like your language.

I always encourage people who are "learning computers" to actually write a compiler (there are some good starting points for that, but online courseware from MIT and other sources can get you the lecture material too). Doing that helps broaden one's perspective on what language designers and implementers are up against, given that pretty much every computer works the "same" way (Von Neumann or Harvard architecture-wise).


Putting aside the personal jab, which the other comments have rightfully called you out for (really, this is why people detest “the orange site”), you’ve just fundamentally misunderstood the point of the blog post.

C is kind of a horrible way of specifying machine interfaces, because it tries to lift things into being portable when the interface is anything but. The ABI is what ties C to the specific machine interface that it’s exposing, and in doing so dooms the ABI to never change. The article has many, many examples: a simple one is that it’s stupid to expose a 32-bit value as an “int” because that means “int” needs to be 32-bit forever. In this sense providing a C API is bad because the interpretation of the API into an ABI depends on your tooling, and sometimes even that doesn’t agree, with fun™ results. This very much isn’t a request for “please give Rust a nice interface to the kernel” but more a “there ought to be something better than having to run libclang just to talk to the kernel”.


So what do you suggest instead? Express APIs in yet another language (perhaps one with defined-length numeric types) other than the ones people actually use? An IDL? Who's going to do that work, from the definitions themselves to the bazillion language-specific bindings? I'm not saying that you should have to have an answer before you can criticize, but every solution to this particular problem sucks in some way. Some things are just hard. Jumping on a bandwagon to tear down something that already exists, ignoring or trivializing any problems it does solve or that alternatives would need to, is another much-despised Orange Site behavior. Nobody ever turns out to have a real silver bullet in their back pocket.


Well, you can tell why the orange site put it on the homepage, right ;) But to respond in seriousness: yes, the blog post exists mostly to highlight a problem, and it doesn’t have an easy solution. But there are a couple things you can do to help: one is to define a sort of “restricted C” that is easy to parse and designed to be ABI stable, which I think Microsoft actually tries to do. You could use some sort of IDL, yes: web browsers do this for their “system interface”, and as ‘olliej mentions so does Fuchsia. But the main point is to shed some light on the issue and start a discussion, because until now people have mostly been OK with defining everything using C, and it’s been causing problems that nobody’s really thought about trying to solve, and I think the post does a pretty decent job of that.


Fuchsia/zircon uses IDL to define the system API interface and it works for them :D


Ok, to be fair, Fuchsia could define their system API using Protobuf and it would work for them


:p

This was more an observation that you can specify a system ABI in a way that isn't as fragile as C's. At least it isn't XML plists :D


It's less that the OS is written in C, and more that literally every single interface to literally anything is either written in C, or isn't meant to be used by multiple languages. C is the de facto standard for interacting with anything not written in your language. Window management, I/O, graphics, everything has a C interface. There aren't even any alternatives (besides just sending a list of bytes through a pipe or something, which no one wants to do for obvious reasons).


> C is the de facto standard for interacting with anything not written in your language.

Not necessarily. C is how your language interacts with the OS if you use the OS's C library. But not all languages do. Go doesn't.

The only thing you have to do to interact with the OS is make system calls. You can do that without going through the C library. It might be a PITA to do it if you're not working on Go for Google, but if that's the real problem, that's what the author of this article should have complained about. (It's actually less of a PITA on Linux because the syscall interface is implemented using software interrupts. Windows is more of a PITA because the OS goes out of its way to make its syscall interface look like a C interface.)


> The only thing you have to do to interact with the OS is make system calls. You can do that without going through the C library.

If by "the OS" you mean "Linux". On Windows, calling syscalls is explicitly unsupported and Microsoft will break you mercilessly by changing syscall IDs between versions. On macOS and iOS, libSystem.dylib is the only supported way to call syscalls; syscalls are SPI and Apple will break you if you try to call them directly. BSDs likewise discourage you from calling syscalls.

The "syscalls are stable API" principle that Linux is famous for is a weird Linux-specific thing; other OS's don't do it.


Not since Windows Server 2022 and Windows 11.

To enable the same kind of container image portability between kernel versions that Linux has, Microsoft has decided to stabilise the kernel syscall ABI.


I'd like to read more about this. Do you have additional resources that you can link to?


> C is how your language interacts with the OS if you use the OS's C library. But not all languages do. Go doesn't.

...on Linux. Pretty much everywhere else, libc is the way to interact with the kernel; even Go goes through libc on ex. Darwin and OpenBSD, because Linux presenting a stable kernel ABI is actually pretty unusual.


>.. libc is the way to interact with the kernel..

On UNIX clones, written in C.

Win32 isn't libc, nor are the mainframe ones (language environments), or Android (Java) / ChromeOS (browser APIs).


> Win32 isn't libc,

I'm less familiar with the Microsoft ecosystem, but I'm pretty sure it is; Microsoft named their libc MSVCRT.DLL, but it's still the canonical system ABI, and NT, AIUI, actively shuffles system calls between builds to prevent people even trying to use them directly.

The rest, and your general point, are fair, though; there's nothing preventing the creation of a system that doesn't work like this.


No, msvcrt.dll isn't the system ABI. It's officially not even part of the OS and you aren't supposed to rely on it existing. Kernel32.dll is the primary entry point for OS things, and it's distinctly not a libc.


MSVCRT is short for "Microsoft Visual C (++) Run Time".

It is not an API to the OS.


> Linux presenting a stable kernel ABI is actually pretty unusual.

Did you mean "portable"? Because the ABI to the Linux kernel system calls is very stable. Literally the 32-bit syscall interface has not changed (except by addition) in 30 years.

The darwin kernel is also pretty stable, except Apple moves so fast from one architecture to the next that they drop support for old ISAs pretty fast. I predict they'll drop support for x86 altogether in ~5 years.


> Did you mean "portable"? Because the ABI to the Linux kernel system calls is very stable.

I think the GP meant that the Linux syscall ABI is indeed very stable and always has been, as you say, but that the syscall ABI of other OSs is not. So Linux is "unusual" in that sense, not in the sense that its syscall ABI is stable now but wasn't in the past.


Sorry, yes, I misunderstood the comment.

AFAICT, SunOS (Solaris), BSD, and Darwin have stable kernel ABIs for the core (UNIX) system calls. I've never used the Mach syscalls on Darwin, but I can imagine those changing more often, because Apple does that. I would imagine that Windows supports 32-bit syscalls still pretty well...but I stopped using Windows in 2002, so tbh, not sure.


Darwin has always explicitly not had a stable kernel ABI (https://developer.apple.com/library/archive/qa/qa1118/_index...). Go tried to bypass the system libraries on macOS and it resulted in things breaking, so now they do use system libraries (https://github.com/golang/go/issues/17490).


I have a very different understanding of the situation; the best description of the situation I can find is https://utcc.utoronto.ca/~cks/space/blog/programming/Go116Op... (discussed at https://news.ycombinator.com/item?id=25997506 ) - from that article,

> The official API for Illumos and Solaris system calls requires you to use their C library, and OpenBSD wants you to do this as well for security reasons (for OpenBSD system call origin verification). Go has used the C library on Solaris and Illumos for a long time, but through Go 1.15 it made direct system calls on OpenBSD and so current released versions of OpenBSD had a special exemption from their system call origin verification because of it.


read is unlikely to change anytime soon on macOS but Apple can and does change its numbering for lesser-used but still very important system calls all the time.


They also change the syscall abi without touching the numbering. gettimeofday(2) is a pretty famous one there.

Then there’s also the Linux-specific issue of the vDSO, which is not kernel code and to which the article applies in full (see: “Debugging an evil Go runtime bug”).


As sibling comment notes, yes Linux has a stable ABI, and amongst all operating systems that is incredibly rare.


Even Go gave up and switched to using libc for OpenBSD[1]. You can use syscalls on Linux because Linux, lacking a blessed libc implementation, guarantees stability of syscalls as an interface. The BSDs, afaik, do not, so circumventing libc is liable to render your binary forwards-incompatible with future kernel versions.

[1] https://go.dev/doc/go1.16#openbsd


> The BSDs, afaik, do not,

That isn't just about the lack of guaranteed stability; as far as I understand, it is considered a security feature. System calls that are not made through libc may be actively blocked, so an attacker can't just use an exploit to inject a system call into a process: the call has to pass through the libc wrapper, and in combination with ASLR that won't be trivial.


A better way of accomplishing this would have been to randomize the syscall numbers on a per-process basis, and map a "syscall number translation table" into the process when loading the executable.


Again, there's more than the OS kernel that a programming language will need to interact with. You don't go through syscalls to make an X11 window or read keyboard/mouse inputs or draw graphics. There are thousands of utilities that are meant to be usable cross-language, but in actuality use the C ABI (well, at least one of the 176 C ABIs there are).


>> C is the de facto standard for interacting with anything not written in your language.

> Not necessarily. C is how your language interacts with the OS if you use the OS's C library. But not all languages do. Go doesn't.

Go doesn't interact with anything written in a different language? I find this really hard to believe - it would result in Go not having a GUI library, not being able to use PostgreSQL, not being able to start external processes (like shells), etc.

I'm pretty sure that Go can do all those things, hence Go does interact with libraries written in a different language.


> Go doesn't interact with anything written in a different language?

That's not what I said. I said Go doesn't interact with the OS using the OS's C library. But, as others have pointed out, that's only true on Linux.


> but more that literally every single interface to literally anything is either written in C, or isn't meant to be used by multiple languages. C is the de facto standard for interacting with anything not written in your language.

Even that is not the real complaint, since the alternative to this (N different FFI-style APIs) is madness.

The heart of the complaint is "great, we have effectively a single interface to almost everything, which is probably a good idea, except that the interface sucks".


I don't disagree; I see an expectation on the part of driver/API providers that you can write (probably in C) an adapter layer that can talk "glorious language" on the input, and then do the naughty bits and talk to the C API on the output.

My position is that this is less of a burden than writing something that goes from the "glorious language" into machine code.

Understanding why that is less of a burden is gained by writing a compiler where you go right from "expressing what you want" to "machine code that does that".


> My position is that this is less of a burden than writing something that goes from the "glorious language" into machine code.

It's only less of a burden if you're guaranteed to have a C compiler hanging around any time the C ABI breaks. If you aren't guaranteed to have a C compiler hanging around any time the C ABI breaks, then you are probably just as well off generating C-ABI compatible machine code from an ABI description written in "glorious language" and updating it when the C ABI breaks.

Non self-hosting languages distributed as source code can just lean on the C compiler for everything.

It's also quite feasible to target Linux without worrying about the C ABI, as the Linux system call interface is famously stable, but that is not true of any BSD.


> My position is that this is less of a burden than writing something that goes from the "glorious language" into machine code.

I disagree. Machine code has a small stable surface area. Compiling to a bootable image for a single-tasking machine is easier than compiling to a well-behaved userspace unix program that operates the way users expect.


I mean, a C interface is better than a machine code interface. The problem with that is that, unless you include a C compiler (or at least a thing that understands the 176 C ABIs) with your language, you can't use it. (not that machine code would be better at that, but I'm sure that if people actually intentionally attempted to make a global cross-platform ABI, they'd easily make something much better than C)


> a C interface is better than a machine code interface

The last thing I want when I write code in assembly is to have to call a C library just to invoke a system function.


You'd obviously prefer calling into things made in your own language than some other. But, from any other language you'd much prefer a C interface over having to explicitly describe which registers & stack slots to put things in and read from.


I did this once trying to mitigate Go's FFI performance cost. It wasn't great but it wasn't awful either. If I had to do it again I'd generate the assembly trampolines.



There are a couple solid summaries in that thread based on my experience going through that exercise - thanks for the link

In case anybody wants to go down the rabbit hole of shit you should definitely not use in production (but we did anyway), this was my starting point: https://words.filippo.io/rustgo/

I did not, however, `no_std` or avoid heap allocation on the Rust side. Everything worked great, including running a multithreaded tokio runtime.

Still do not recommend in prod ;)


I think that's beside his point, which I take to be: replace C with an alternative and people will complain of similar problems, since all abstractions are leaky to some extent. Or in other words, since you can never be immune to this "leaky" problem, you will inevitably have similar kinds of problems despite using an alternative. Which is why you'll find every programming language has critics.

Also, keep in mind having alternatives isn't necessarily better, as it introduces a problem itself: it complicates things (the article even mentions a problem of this sort). Is having 100 programming languages and dozens of OSes, all doing mostly the same sorts of things in different ways, simple? Might the programming world be simpler if there were one programming language/OS/way of doing something, rather than hundreds of alternatives? (This is just food for thought to demonstrate the point, not something I'm particularly advocating.)


> Window management, I/O, graphics, everything has a C interface.

You say that like it's a bad thing. The situation has gotten much worse for third-party languages these days, with frameworks being written in JavaScript, Swift, etc., which are hard to bind to without fundamental impacts on your language design.


Yes, it is possible to make worse ABIs than C's. That's not a surprise. Doesn't mean C's is automagically the best possible. Something that, at the very least, doesn't have varying sizes for the same type between architectures/OSes, and precisely defines structure layout/padding, would be vastly better. (not saying that's a realistic thing to get at this point, but doesn't change the fact that it'd be much nicer.)


I’m not saying it’s the best thing, but I’m not sure you can get much better. A lot of things defined by the C ABI are things you’d have to do anyway if you’re compiling to native code (structure alignment, where’s the stack, what’s caller-save vs. callee-save, where do you find the return value, etc.). Targeting the C ABI’s set of choices isn’t that different from targeting a different set of choices.

And there are gratuitous differences in C ABIs, for sure. But often there are good reasons. x86 had no register arguments because it had few registers; same reason it had no thread pointer. RISC architectures had weird limitations on unaligned access, etc., affecting struct layout.

Maybe I’m being dense. What does a good alternative look like? CORBA?


If I could design my own cross-language ABI, I'd make these differences from C:

* Use fixed-size types instead of the 5 types for 4 integer sizes (is long 32-bit or 64-bit?) [see the sketch after this list]

* Allow multiple return values [so you could use multiple return registers, especially on non-register-starved arches]

* Allow unwind ABI (C has no way to permit unwinding!)

* Dedicated types for pointer + size for arrays

* Dedicated string type (i.e., distinguish between &[u8] and &str in Rust terms)

* Something that allows for vtables would be nice

* Something that encodes ownership (and eventual deallocation) is also potentially valuable
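
For the fixed-size, array, and string items above, here's a minimal sketch of what the declarations could look like, written in C syntax purely for familiarity (every name here is made up for illustration; multiple return values and unwinding obviously can't be expressed in C at all, which is rather the point):

    #include <stdint.h>

    /* hypothetical ABI-level "slice": pointer and element count travel together */
    typedef struct {
        const uint8_t *ptr;
        uint64_t       len;
    } abi_bytes;

    /* hypothetical string type: same shape, but with the added contract that
       the bytes are valid UTF-8 (the &[u8] vs &str distinction) */
    typedef struct {
        const uint8_t *ptr;
        uint64_t       len;
    } abi_str;

    /* fixed-width integers only: no "long", no "int", so the same signature
       means the same thing on every platform */
    abi_bytes read_all(int32_t fd);              /* hypothetical functions */
    int32_t   write_str(int32_t fd, abi_str s);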


Between architectures you'd indeed need quite a few differences, but within a single architecture it'd be nice if everything more or less agreed.

Specifically, I think it'd be very reasonable & possible to have an ABI that has various floats & integers (of specified width, of course), plus pointers & structures, with consistent conservative padding on all architectures and precisely one way any given structure is laid out in memory (ok, maybe two, with little-endian & big-endian integers; but I think that's literally all the variation there is, and big-endian is nearly dead too).

Not the most efficient thing on all architectures, but for language inter-communication it shouldn't be too bad to lose a couple nanoseconds here and there, and I'd guess it'd outweigh the alternative of needing more generic (and thus, less optimized) code handling each individually.
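
As a concrete illustration of the "one layout everywhere" idea, here is a sketch assuming the conservative rule is natural alignment (each field aligned to its own size, the struct padded to the largest field alignment), which is what most C ABIs already do but don't all guarantee (some 32-bit ABIs, for instance, only align 64-bit integers to 4 bytes):

    #include <stdint.h>

    struct msg {
        uint8_t  tag;    /* offset 0 */
                         /* 7 bytes of padding */
        uint64_t id;     /* offset 8 */
        uint32_t len;    /* offset 16 */
                         /* 4 bytes of tail padding */
    };                   /* sizeof == 24, alignment == 8, on every target using this rule */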


The big issue with C is that it doesn't specify the size of its integer types.


The various C ABIs definitely do, though. And you generally know which C ABI you need to interact with (the __int128_t example from the article just seems like a bug -- luckily use of that type is rare).
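
For example, the language leaves `long` loose but each ABI pins it down; a tiny check (prints 8 under the LP64 ABIs used by Linux and macOS on x86-64, and 4 under LLP64 on 64-bit Windows):

    #include <stdio.h>

    int main(void) {
        /* the C standard only requires long to be at least 32 bits;
           the platform ABI is what fixes the actual size */
        printf("sizeof(long) = %zu\n", sizeof(long));
        return 0;
    }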


You should think of C as a front-end for the computing platform you are targeting. The platform has an ABI, which you need to use to interact with the system components. So because every third-party has to speak that ABI to interface with the platform, they might as well use that ABI to interface with each other too. You can of course build your own little world within a platform, with all your binaries speaking a different ABI to each other. And that might have some advantages, but most people find those advantages outweighed by just using the same ABI as the platform.


Where is the C interface to Symbolics Lisp machines…?

There are lots of alternatives; MSR wrote an OS in C#.


Genera definitely had a C (and I think C++) interface.


Written on top of Lisp.


A lisp machine, by definition, runs lisp. The ABI problem only exists in environments where multiple programming languages exist.

I doubt much non-.NET software runs on that C# OS either.

If you want multiple programming languages to be able to sanely run within a single OS (and do more than just pure computation), though, you have the ABI problem, and this is what the whole discussion is about.


.NET was created as a Common Language Runtime.

When it was announced it supported about 27 languages.

From Microsoft themselves, J#, C#, VB.NET, and Managed C++ (later replaced by C++/CLI).

It was WebAssembly before its time (among many others), with tons of features that WebAssembly is yet to support.


The Symbolics lisp machines had C and Fortran compilers


Implemented in Lisp.


> the "I like computers but don't really understand computation" fallacy

> I always encourage people who are "learning computers"

Buddy, the author is a major contributor to the Rust project


As well as a former engineer on the Swift team at Apple. Aria is more qualified to talk about compilers than 99% of HN.


I think it's been a few years since Gankra has been active, but she was the original mastermind of std::collections as well as the original author of the Rustonomicon, if anyone wants her credentials.


She’s still on the website under lang-docs, at least: https://www.rust-lang.org/governance/teams/lang#lang-docs%20...


> Phantomderp and I have both recently been very aligned on a particular subject: being extremely angry about C ABIs and trying to fix them.

Damn beginners!


does that preclude one from making bad points in a rant?


no, but even though this is written in a rant-ish style it isn't really one, given that the post is full of specific examples discussing all the things that are wrong with C as an interface.

This is probably the most technical post I've seen on HN in a while, and accusing someone who goes fairly deep into the implementation details of compilers and OS libs of 'not understanding computation' is somewhat bizarre.


I think it precludes one from “not really understanding computation”.


> I always encourage people who are "learning computers" to actually write a compiler

Does the Rust compiler count? The author has been a long time contributor.

> I kind of think of this as the "I like computers but don't really understand computation" fallacy.

You couldn't be more wrong about the author


I only made it about 80% through because, yes, it was a rant. I'm pretty sure the author knew what they were talking about. The issues discussed were those of someone who knows how computers work with a great deal more depth than many (if not the vast majority) of developers. The point that programming languages are just syntactic sugar around machine code isn't really relevant here since, for better or for worse, C headers are the standard way to codify access to ABIs. A deep understanding may be fine if you aren't accessing many ABIs, but it doesn't scale well. Using C headers does scale well, but the author is pointing out there are many things about C that make this difficult.


The most difficult part of C-headers-as-interfaces is the use of macros as part of the ABI.


Agreed. That part is just mind-bogglingly awful, and fairly unique to C/C++. AFAICT all the other problems are common across most alternatives, many of which are actually worse.


> "I like computers but don't really understand computation"

You know you could make your point without being condescending and trying to insult the author.


The author invites insult by being insulting themselves, not to mention comically overweening.

They are free to write a new os in rust with superior abis if they want to show us all how it's done.

I'm sure it will be in use by the entire world in 50 years the way c is today, and no one from that time will write a blog post like this one because he of course will get it right.


There’s a difference between making (valid) complaints about the state of things and presuming that you (or anyone!) could fix them single-handedly


No software is going to be perfect. But new software can at least try and improve upon the old software.

Do you really think that Unix and C are the absolute pinnacle that our field has to offer, we shouldn't bother trying anything else?

(Not a unix or C hater, just a bit of a tangent about how it's cool that people make new things even if they won't be perfect).


>> Do you really think that Unix and C are the absolute pinnacle that our field has to offer, we shouldn't bother trying anything else?

Unix and C are not the pinnacle of ease of use, but rather the survivors of the past 50 years of programming language and operating system evolution. So many of their parents, relatives, competitors, and even children have died and yet they remain.

That means that they are the most adaptive and the most successful in a Darwinian sense, but not necessarily the most ergonomic or user-friendly.


That would be more compelling if their success was more directly coupled with their technical design.

Unix is not technically incompetent (certainly given the needs of the times it was designed in), but its real success now is largely based on it already being an incumbent, and being open enough for projects to work from.

Original Unix is dead, it lives on through copy-cats, and they copied it mostly because an experienced user-base makes adoption easier.

In other words, the only time Unix was primarily selected for technical reasons was very short-lived. Now its technical benefits are little more relevant than the technical benefits of a qwerty keyboard.


>> That would be more compelling if their success was more directly coupled with their technical design.

>> Unix is not technically incompetent (certainly given the needs of the times it was designed in), but its real success now is largely based on it already being an incumbent, and being open enough for projects to work from.

I disagree.

The technical design of Unix, and of C as the implementation language, were major factors in its success and versatility.

Eric Raymond describes the design choices and culture of Unix that helped make it successful:

* Rule of Modularity: Write simple parts connected by clean interfaces.

* Rule of Clarity: Clarity is better than cleverness.

* Rule of Composition: Design programs to be connected to other programs.

* Rule of Separation: Separate policy from mechanism; separate interfaces from engines.

* Rule of Simplicity: Design for simplicity; add complexity only where you must.

* Rule of Parsimony: Write a big program only when it is clear by demonstration that nothing else will do.

* Rule of Transparency: Design for visibility to make inspection and debugging easier.

* Rule of Robustness: Robustness is the child of transparency and simplicity.

* Rule of Representation: Fold knowledge into data so program logic can be stupid and robust.

* Rule of Least Surprise: In interface design, always do the least surprising thing.

* Rule of Silence: When a program has nothing surprising to say, it should say nothing.

* Rule of Repair: When you must fail, fail noisily and as soon as possible.

* Rule of Economy: Programmer time is expensive; conserve it in preference to machine time.

* Rule of Generation: Avoid hand-hacking; write programs to write programs when you can.

* Rule of Optimization: Prototype before polishing. Get it working before you optimize it.

* Rule of Diversity: Distrust all claims for “one true way”.

* Rule of Extensibility: Design for the future, because it will be here sooner than you think.

Source: http://www.catb.org/~esr/writings/taoup/html/

It was not openness alone that contributed to the success of Unix. There were other open systems, but they did not survive.


What's the part where you disagree?

Yes, those things did contribute. Originally.

But once Unix won, all it needed to stick around was momentum.


It not being possible to improve the situation doesn't mean that the current situation is good. Is the article's rant gonna achieve much? Most likely, nope. I'm fairly sure the author recognizes that.


> I always encourage people who are "learning computers" to actually write a compiler (there are some good starting points for that but online courseware from MIT and other sources can get you the lecture material too). Doing that helps broaden one's perspective of what language designers and implementers are up against with regards to pretty much every computer working the "same" way (Von Neumann or Harvard architecture wise)

I suggest looking up the author's CV.


Total foot in mouth, here... hope you are having a good day, otherwise!


>I always encourage people who are "learning computers" to actually write a compiler (there are some good starting points for that but online courseware from MIT and other sources can get you the lecture material too). Doing that helps broaden one's perspective of what language designers and implementers are up against with regards to pretty much every computer working the "same" way (Von Neumann or Harvard architecture wise)

Isn't it dependent on what IR you use?


I literally made it to the third sentence. How is using something other than C going to fix ABI? Isn't it literally the same problem? Do other languages super mangle their symbols so there's no breakages anymore?

Memory safety etc. is a complaint about C, but worrying about ABI breakage is literally what you'll always get when you do anything other than assembly[0]. The author laments that Rust and Swift must speak to C, but that isn't because of a K&R-controlled cabal, although it might have been the initial reason. Today the reason everything must talk to C is that operating systems are written in C and expose their API and ABI in C. Write an OS in Rust or Swift (lol), get mass adoption, and then you won't have to worry about interfacing with C anymore.

They do eventually touch on that, but as long as OSes are in C, you need C. There unfortunately is no alternative.

[0] I mean technically, you have "ABI breaks" in ASM too; it's just that the program goes its merry way, zombie-like, until a seg fault happens or worse.


Please read the post, this is not the “ooh we need to mangle stuff and this sucks” blog post you think it is.


A lot of smoke here, but very little light.

What this article fails to capture is the fundamental question that is faced any time a new architecture is encountered: should 'int' be sized (number of bits) according to its original (or most recent) de-facto definition? Or should it be sized according to the natural register size of the architecture's general-purpose registers?

A lot of us went through this back in the mid-2000s when AMD64 (x86_64) came out. It took both Microsoft and the Linux crowd (just to name two communities) time to come up with their respective (and incompatible) translations of types. The top answer to this StackOverflow question summarizes this well:

https://stackoverflow.com/questions/384502/what-is-the-bit-s...

In the gaming industry, we came up with our own typedefs until the standards caught up. Things like int_8, int_64, etc. I say only a fool would ever use something as pretentious a concept as 'intmax_t' in normal code (which isn't a translation table of typedefs based on platform). "max" according to whom? That is not the way.
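
Roughly this kind of thing, a sketch of the per-platform typedef headers that floated around before <stdint.h> was everywhere (the exact names and compiler checks varied from studio to studio):

    /* pin the sizes down yourself, per compiler */
    #if defined(_MSC_VER)
        typedef signed   __int8     int_8;
        typedef unsigned __int8     uint_8;
        typedef signed   __int64    int_64;
        typedef unsigned __int64    uint_64;
    #else
        typedef signed   char       int_8;
        typedef unsigned char       uint_8;
        typedef signed   long long  int_64;
        typedef unsigned long long  uint_64;
    #endif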


None of this is C. All of this is the OS.

Who specifies how you talk to the OS, and how native applications talk to each other? The OS does. And it does, in fact, differ across different OSes (with different calling conventions). The only reason C comes into this picture is because it runs on all the platforms these other languages do (and many more), so you can write an adapter between the language and C, and not have to worry about supporting 10 million different calling conventions, because some C compiler author has done that for you.

So the article is half right; this isn't a programming language. But C sure is.

(And that's not to mention the fact that to some extent the ABI and calling convention is determined more by the CPU architecture than the OS, much less the language!)


C is an obsolete [1] programming language that must still be used for interoperability between languages despite not being particularly good at that.

[1] Yes it is. It is lacking many important features. And people claiming they'd rather not have these features are like an author writing a book with notepad.exe because they think modern word processors are too complicated.


Notepad is absolutely not obsolete. It's far better than a heavy word processor for quickly pasting something into or jotting down a note, because it's fast, has hardly any extra interface to get in the way, and has very, very little bloat. This is definitely something that authors claim as well: George R.R. Martin writes on MS-DOS because modern word processors have too many bloated features and correct his spelling.


There are tools in between a heavy word processor and a very dumb text editor. Can you even undo more than one character now? I mean, sure, for quick little edits, Ok. But don't tell me you'd write a book in notepad.

And spell checking is a good thing.


I don't think we have the same definition of "obsolete" (C is definitely not "no longer produced or used"). There is absolutely no requirement to use C and you could implement it directly in Assembly if you chose, or in many other languages (most of which pre-date C).


C descriptions of the OS interface API...

Ok - I get that. But, there is a system call interface. int 0x80, syscall.

Now, these can be wrapped, and the result exposed in a completely different way. But that is the definition. If it's "C" on one side and "C" on the other... well, ok then! I thought Linux had vDSOs to allow direct "ABI" calls for performance reasons.... utilizing that will force a certain "C-ish" look. In turn, that can be wrapped. None of this changes quickly.

Heck. CP/M-80 had "CALL 5" with registers a certain way. Wasn't "C" by any stretch!

Because the C (POSIX, mostly) "API" is stable and available, we tend to use it. Wasn't always the case -- after all, FORTRAN I/O was all the rage back in the 60s (cf SNOBOL4).

If (whatever) programming system wants to avail itself of the C infrastructure, it is certainly free to do so. Stop the endless whinging about C! Why C? It is the only language in its class that works from Z80 to my Thinkpad.


On some platforms (Darwin, among others), the syscall interface is defined not to be stable. Syscalls must go through Libsystem stubs -- and Libsystem is written in C.

Go made the mistake of assuming the Darwin syscall interface was stable, and an OS update broke binaries.


A lot of the outrage against C seems to be based on the assumption that it gained its prominence either in a vacuum or in a modern context. Oh, young ones, such is not the case. C is a creation of its time as is everything else like Linux and TCP/IP and SCSI. All of their warts seem obvious in 20/22 hindsight, but they were either not obvious at the time or were outweighed by other contemporaneous needs/constraints (usually lack of compute or other power). Bashing C without acknowledging its context and place in computing history only shows an author's shallow ignorance.

On a slightly different topic, few of the problems mentioned in the OP really have much to do with C. I still remember when the interfaces to the original Mac OS (the stuff in Inside Mac, before it got UNIXified) were expressed in Pascal terms. I even vaguely remember MTS interfaces expressed in 360-assembly terms. They were absolutely no better. There is an art to making ABIs and APIs and SPIs and network wire formats and on-disk formats future proof, but it has little or nothing to do with language. The problem is that a bunch of such interfaces exist out there - because they need to exist in concrete form to be useful - that weren't defined with that level of care. It's not much more than coincidence that C happened to be a dominant language in many of those times and domains.

An ABI based on a more "modern" language that involved more detailed types or (even worse) garbage collection would be far far worse for anyone using any other language. C has plenty of problems, but as a notation for expressing an ABI (much like Algol is still sometimes used as a notation for algorithms) it's really not bad.


I think I would go a little further than that. There was a point in the past where machine architectures were so alien that even getting your favourite language bootstrapped on them was a major challenge.

The inflection point came when "everything became a PDP-11/VAX" which at least made formerly expensive features affordable and set off a wave of innovation that required some kind of standard (especially the Motorola 68000 and the Intel 386).

Now we are so much further along, everything is starting to look like Unix in one form or another. To me, it wouldn't have mattered if the operating systems had converged on NT or VMS; the underlying demand was for Unix-like systems and that is where we ended up, to the point where Microsoft has embraced it with WSL and all the big, proprietary OSs of the past are dead or irrelevant.

Sure, the ABI problem is real, but 40 years ago this outcome wasn't guaranteed and it's turned out to be absolutely magic in terms of the proliferation of new languages. From my perspective, everything "having to speak C" at a low level is a feature, not a bug. It might not be a perfect model, but it works and a lot of people are familiar with it.

That de-facto standardization is what makes modern computing possible.


Praising C without acknowledging the 10 years of systems programming language and OS research done outside Bell Labs only shows an author's shallow ignorance of systems programming history.

Had UNIX been sold at the same price as the competition instead of on free-beer tapes, we would probably be using some BLISS, Mesa, or PL/I variant instead.


First off, nobody's praising C. Second, those other developments - which I quite likely know better than you do so stop trying to show off or make an implied appeal to authority - are out of scope wrt an article that's specifically about C. I think it would be great if the world had converged on something like Mesa instead of C, but that didn't happen and it's not clear that it would have prevented misguided rants like the OP from being written. Whataboutism is not the same as understanding/respecting essential context.


Not showing off, just replying in the same tone as your comment. I am even reusing your writing style.

I am perfectly in scope, given the historical mess that has driven us into this state.

That is the only reason we are even discussing C to begin with.


What we need is an ABI-level IDL [1] to specify library interfaces, that every programming language could use for imports and would translate to implicit declarations, both for calls to the interface and/or for implementing the interface. Currently, C header files poorly serve that purpose, and at the same time, that use impedes the evolution of C itself. Such an IDL wouldn't necessarily solve C's own compatibility issues with its evolution, but it could dramatically improve the FFI situation if widely adopted.

[1] https://en.wikipedia.org/wiki/Interface_description_language


Personally I'm inclined to think that an IDL is the Right Way to specify an interface. OTOH, that has been tried, many times (ntauthority describes one series of attempts by MS but there have been others). Every one has succumbed to secondary complexity and/or simple lack of adoption. One common issue I've seen has been that the definitions end up feeling pretty non-idiomatic for most languages because they only support a minimal common subset of features. Would exceptions be more idiomatic? Too bad. How about union types? Too bad. What about ownership and memory lifetimes? Oops, alien again. In the end it's no easier to work with than the C definitions, plus generated code often creates problems with build systems and IDEs, so people just go with what already 99% works and deal with its warts as best they can.


It would need tighter integration with programming languages and compilers, e.g. you should be able to write something like `#include "libfoo.idl"` (similar to the ease with which you can write `extern "C"` in C++, or `__declspec(…)` in MS C, etc.) and not need an extra code-generation step.

Regarding features, it’s clear that you’d need to find some sort of common ground, though not necessarily just the lowest common denominator.


"there are languages people complain about, and languages nobody use" (quote from the creator of C++)

It's easy to complain about C or C++, but I don't see that many languages that compile to machine code and offer compilers for many platforms.

I agree that C is somehow the new "assembly".

C is the glue used for building operating system, so it's not surprising you need to use it in many places.

I'm sorry but I don't see any better language that is as simple to learn for students as C, and fixes some problems of C. Most new languages are often difficult to read, introduce a lot of un-needed sophisticated features, and are not that much used because they're too niche.

A good example is Rust. Sure, the safety is an awesome feature, but Ada already did it somehow, and it's not what developers need.

There is a reason the world talks English and not German or Japanese. It's because English is just much, much easier to learn. Nobody cares if the language makes sense.

It's odd because HTML and javascript are so much more ambiguous and cause a lot of pain, yet I hear more complaints about C or C++.


> It's odd because HTML and javascript are so much more ambiguous and cause a lot of pain, yet I hear more complaints about C or C++.

HTML implementations (and to an extent JS) are very forgiving - you could have thousands of warnings and errors raised by a linter, but the page will still render mostly fine. In this regard C would be closer to XHTML, which would fail to render unless it was "well formed" and which, well, nobody uses nowadays.


No offense but this analogy is plain wrong. C is definitely like HTML, not XHTML, in that respect.


...You've just discovered why interfaces like COM and Cocoa are a good idea, yes. Welcome to 1999. Join me in cursing the proliferation of POSIX and FFIs. JOIN ME.


Win32 API is another example of something that has been pretty stable… But then, it is basically C. (Just like COM is basically C++.)


COM was meant to be language-agnostic, so I don't see how it's basically C++...


COM can also be seen as a mapping of C++ APIs (with some extras like interface versioning) onto C APIs, mainly as workaround for C++'s missing ABI stability (so that C++ APIs can be exposed via DLLs in a (C++) compiler-agnostic way).


Interfaces are made to match the C++ vtable. Using them from C, for example, is somewhat of a hassle. (Not to say anything about IDL…)


Not exactly. They use a different calling convention from usual C++ classes (for systems where 'this' pointers are normally passed in a special place) to match something C-ish, and there are a few other restrictions on things (overloads, inheritance) that otherwise differ across C++ ABIs and wouldn't map well to other, non-C++ languages.

The C wrappers generated by MIDL are also fairly similar to raw C++, compare (where pUnk is an IUnk *):

    pUnk->Something(&bA);
with...

    IUnk_Something(pUnk, &bA);
which expands to something like this:

    pUnk->lpVtbl->Something(pUnk, &bA);
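
For anyone who hasn't peeked under the macros, here is a rough sketch of what the C view of an interface boils down to (names simplified and hypothetical; real headers are generated by MIDL and add the stdcall convention on 32-bit Windows):

    /* the object is a pointer to a table of function pointers, and every
       function takes the object itself as its explicit first argument */
    struct IUnk;

    typedef struct IUnkVtbl {
        long          (*QueryInterface)(struct IUnk *self, const void *riid, void **out);
        unsigned long (*AddRef)(struct IUnk *self);
        unsigned long (*Release)(struct IUnk *self);
        long          (*Something)(struct IUnk *self, int *pbA);
    } IUnkVtbl;

    typedef struct IUnk {
        const IUnkVtbl *lpVtbl;   /* the lpVtbl that the C macro dereferences */
    } IUnk;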


The Win32 API has been frozen in the Windows XP timeframe, so no wonder it is stable; all major new APIs since the Longhorn failure are based on COM.


Most of the problems exposed here aren't C related, but OS or architecture related. An ABI where function parameters are passed in the A, B, and C registers and the return value is stored in the Z register has nothing to do with any language.


Except that C defines the way nearly everything everywhere passes function parameters over registers. Not C's fault, but still a thing affecting everyone that doesn't write C, as C doesn't even try to be a good ABI, because it wasn't meant to be.


What does this mean? C defines the mechanism(s) for passing arguments to a function? That's news to me. The parent's point was that the binary mechanisms are OS- and architecture-dependent. (And compiler-/build-dependent too; e.g., when compilers are designed to pass arguments via registers rather than on a stack.) As other commenters have tried to point out, the diversity of target platforms precludes a single, one-size-fits-all ABI -- for C or for any other language.

How does a language "try to be a good ABI"? An example of such a language would help.


C itself doesn't, but the OS deciding how it interfaces with C quickly results in all C programs for that OS doing the same thing, and, since C is the lingua franca of programming languages, in every single program with a cross-language API.

An ABI doesn't need to be a language. In fact, I'd say that makes it worse, as now you'd have a potentially conflicting goal, wanting to make it nicer for the language itself, possibly at the cost of making for a worse ABI.

But one could do well with not having varying width integers (looking especially at you, "long"), having a defined structure layout, and one (primary) calling convention per architecture (it not being the absolute best if goals change would be fine, as it'd only be used to talk between different languages, and that usually doesn't happen with such a frequency that a couple nanoseconds hurt). Sure, you'd still end up with some variance, but it'd be a lot easier to work with, and it'd be pretty hard to reach the count of 176 ABIs of C.

edit: ..and just not defining things that aren't actually related to ABI (or may need to change), i.e. intmax_t


> Except that C defines the way nearly everything everywhere passes function parameters over registers.

I don't think that that is correct. Sure, your C implementation defines which params go into which register in a certain way, but other C implementations define the same thing in a different way.


Ok, but the way this is exposed is via C so you really have to figure out what it is going to do to use it.


I think this falls into the category of articles where the author makes an eye-catching assertion in the title to pull you in and then totally fails to justify it other than by ranting about some personal gripe he has.


She, and the assertion is fairly well backed (snarkily, unlike your comment…)


I feel like there has been a lot of energy put into creating system programming languages that fix the shortcoming of C and C++. But it seems to me that the solution is to just realize that we rely on C too much in code bases, to talk to OSes, and between programming languages to ever get rid of it. So instead of trying to replace C in all of those cases, just make a better C. Simplify the language to the point where parsing it is almost trivial. Barring all of the above, why not just create a new standard where people agree that

    A * B;

and

    (A) - B

can only mean one thing. Sure this wouldn't do anything for the old code, but at least where new C code was required we wouldn't continue dealing with all of its complexity.

I feel like this is the same problem with CSVs. It is trivial to parse if people follow the standard but most people don’t. But if you create a parser that only accepts the standard and refuses to parse non standard CSVs then you never have to deal with any of the BS edge cases that come up. And if you show where in the code/CSV the ambiguity is, people might actually fix their code to get it to parse correctly instead of leaving it ambiguous


I have an old lapel button: C combines the power of assembly language with the flexibility of assembly language. Humorous, yes, but still true - it's a high-level, platform-independent assembly language that (used well) allows for creating all sorts of things great and small. I found it the perfect back-end to a Python front end, where the known problems of C could be minimized via control being done at the interpreter level and the compute-heavy work being done in C.


But C is neither as powerful as nor as flexible as assembly language.


I would also add then that assembly language is not as powerful or flexible as VHDL.

C is a high-level programming language not unlike, say, Fortran, and its power and flexibility is in expressing calculations and concepts rather than in representing the architecture of a particular processor or in how well-optimized the generated machine code is; compilers are very good at both, and, frankly, they should be expected to do a better job than a human could in a reasonable time - especially when code is still in flux or needs to be refactored.


Vectorized numeric kernels still often need to be written by hand, for one example of where modern compilers just aren’t good enough to make writing assembly obsolete.


Indeed! I recently hacked up a considerable amount of assembly and used a lot of tricks that would not be possible unless I used copious gotos in C and also non-standard tail-call optimization. Some tricks (like writing inline/out-of-line codepaths) quite literally are not possible in C.


> Now C isn’t just a programming language, it’s a protocol.

But the image above that is calling conventions for functions on a particular processor. What does that have to do with C?

> So actually semantically parsing a C header is a horrible nightmare

I'm confused. It seems to me the right way to go would be to improve C such that it's more precisely specified, easier to parse, etc. Genuine question: why is that so hard? (I suspect the reasons are more institutional than technical.)

Variable declarations are syntactically weird in C, for example. And newer languages seem to have better ways (that are actually context-free). So why couldn't C be evolved toward that?

> int foo(int x, int y)

becomes

> foo(x: i32, y: i32) -> i32

and so on.

> You don’t see England trying to improve itself, do you?

I guess that's a saying I'm not familiar with. Seems to me England has improved a lot over the years.

Go easy on me, geniuses.


> It seems to me the right way to go would be to improve C such that it's more precisely specified, easier to parse, etc. Genuine question: why is that so hard?

It's not hard to specify a language that is all those things compared to C, but then it is, ipso facto, not C.


I don't get it. At one point C changed its syntax for function arguments (significantly, I'd say), was that new language ipso facto not C?


C compilers still support old-style function argument syntax. They have to, for backwards compatibility. Plenty of actively-developed software being compiled on modern versions of the C standard still has some older functions whose signatures are written in K&R style.
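
For anyone who hasn't seen the old style, here is the same trivial function written both ways (illustrative only):

    /* K&R-style definition: parameter types are declared between the
       parameter list and the body */
    int add_old(a, b)
        int a;
        int b;
    {
        return a + b;
    }

    /* the prototype style that replaced it */
    int add_new(int a, int b)
    {
        return a + b;
    }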


Ok so K&R style is going to be removed (good!) as others have pointed out. The compiler could provide a tool to automatically update the function signatures, making it relatively easy to update legacy code.

Also, it seems to me that anyone who wants the best for C would want to get rid of the context-sensitive parsing. So that would include those on the standards committee. Here's hoping they can continue improving the language over time, albeit slowly.


So deprecate that. Start popping up warnings. Surely C has deprecated something in its lifetime, right?


It has already been deprecated for decades, but I doubt they will ever remove it. Compatibility with millions of lines of pre-existing code is simply more important than the somewhat niche concern that it makes C harder to parse.

And we’re talking about something from the premodern time of C. Changing how declarations are written now is a complete non-starter; there would be absolutely zero appetite for that, and even if the standards committee did it, everyone would just continue using compiler flags for the last version of the standard before the change happened, so compilers and parsers would have to continue supporting it anyway.

> Surely C has deprecated something

Again, deprecated yes, but the only thing I can actually think of them removing outright is the gets function, which had way more potential to cause harm than these parsing issues.

I think you are just massively overestimating the extent to which the C committee is willing to break things. The language is managed and standardized in a totally different way from things like Python.


Well, it so happens that it has so far been decided that C23 will remove K&R function declarations! (the "int fn(arg1, arg2) int arg1, arg2; { /*code*/ }" kind) So it's not utterly impossible for old things to get removed (though who knows if that'll change).

But C syntax really isn't the biggest problem in this. Even if you decide to write a whole new C parser for your new language, you'd still need to support the 176 ABI triples (or not work on many systems) and detect & choose which one the user's system uses. It wouldn't be hard to reduce that number by a lot if anyone had actually tried to.


> Well, it so happens that C23 has so far been decided to remove K&R function declarations!

Interesting! I stand corrected, then.


Why wouldn’t it be hard to reduce that number? Architectures do differ after all.


Architectures do differ, but there's really not a good reason to have dozens of ABIs per architecture. But that's what we have - 15 for aarch64, 9 for armv7, 27 for x86-64, etc. Most things in each set most likely do match, but there's no specification of how they do, so it's hard to consider them in a way other than just a dozen different ABIs that might as well be for different architectures.


C23 removes K&R declaration lists.


I disagree with the author. C is an amazing language but its "lack" of an ABI is, in some ways, a byproduct of the language itself. C essentially sits half a layer above assembly and is meant to give the programmer fine-grained control over the processor and memory without requiring intimate knowledge of x86/ARM.

This is why pure C projects can turn into unmanageable behemoths. This is also why the best way to use C is to use it to implement core algorithms and data structures which can then be called by higher-level languages. Numpy/Scipy did this perfectly and their use is now ubiquitous within the Python community.

Most software engineers I know who have a background in EE love C, simply because it maps very well to what a processor actually does during execution.


> C is an amazing language

[citation needed]

> the best way to use C is to use it to implement core algorithms and data structures

is this a joke. you literally cannot write containers in C unless you commit to heap-allocating everything and storing it as void*

> Most software engineers I know who have a background in EE love C, simply because it maps very well to what a processor actually does during execution.

lol. no it absolutely does not. i have a B.S. CpE and have actually built simple processors. the C execution model has nothing to do with how silicon operates, and modern silicon in particular goes to absurd lengths to put up a façade that c programs can use to pretend they're still on a pdp-11 while the processor goes and does other things.

easy example: here's a memory address. what happens when you try to read from it


>> Most software engineers I know who have a background in EE love C, simply because it maps very well to what a processor actually does during execution.

>lol. no it absolutely does not. i have a B.S. CpE and have actually built simple processors. the C execution model has nothing to do with how silicon operates, and modern silicon in particular goes to absurd lengths to put up a façade that c programs can use to pretend they're still on a pdp-11 while the processor goes and does other things.

I'm an EE who's been writing bare-metal firmware for over ten years, and who's helped develop memory subsystems for microcontrollers. What you're saying is certainly true for PC CPUs, but the C execution model works just fine for a Cortex-M or other low-end CPU. No "absurd lengths" are needed; there's a clear relationship between:

  MOV R2, #0x400
  LDR R1, [R2, #20]    @ b[5] of a uint32_t array is byte offset 20
and:

  volatile uint32_t *b = (volatile uint32_t *)0x400;
  uint32_t a = b[5];
>easy example: here's a memory address. what happens when you try to read from it

The CPU puts the address on the address lines and sets some control signals appropriately, and the SRAM returns the value at that address on the data lines. (Simplifying, obviously; I'm not going to go dig up an AHB spec.)

Cache? What cache? SRAM reads are single-cycle. The flash memory probably has some caching, but that's in the flash subsystem, not the CPU. And a lot of your most important reads and writes will be to memory-mapped registers, which had better not be cached!

No, C does not express the details of the instruction pipeline or complex memory subsystems directly in the language. Neither does assembly. C also does not cover every CPU instruction -- that wouldn't be portable at all. That's what inline assembly and compiler intrinsics are for.

C strikes a balance between portability and closeness to the hardware while remaining a small language. It does this very well, which is why it has historically been so popular, and still is for some purposes. Not all software is CPU-limited data processing on a 64-bit server.


Also the "PDP-11 facade" is needed so that the thing can be programmed in assembly language without the board support people doing bring-up tearing their hair out. A sane assembly language that you can read instruction by instruction to understand the abstract effect is necessary for more than serving as a C compiler target.


> [citation needed]

Runs just about every computer from the tiniest microcontroller to the largest supercomputer. Has been doing so for 50 years, despite a constant parade of miracle languages that were going to replace it Real Soon Now.

When Miracle Language of the Day actually does what C does, across the same variety of hardware, I will be the first to congratulate it and its designers.

But I don't think that's going to happen any time soon.


It runs on all hardware because hardware manufacturers support it, which they essentially must do because that's what's expected. It's a self-fulfilling prophecy (and arguably a vicious cycle).


Or, it’s a virtuous cycle of people recognizing a valuable tool and giving back to its ecosystem by developing for it (even if that’s not intended–it’s a second-order effect).

It can also be seen on another axis, that of organic growth versus what each new miracle language tries to be, which is a centrally planned and all-encompassing solution, which isn’t possible without the entire rest of the industry just stopping and waiting for it all to be made to work.

C already works. So people work with it.


It's "organic" because, as this article is pointing out, creating alternatives is really difficult due to this exact lock-in, both at the OS level and at the hardware-vendor level.

You shouldn't need a "miracle language" or even an "all-encompassing solution" to have a chance to break free of this.


> creating alternatives is really difficult due to this exact lock-in,

That is nonsense.

As I noted below, low-level OS internals have been reimplemented numerous times, by small teams of volunteers at that.

If you want to "break free", buckle down and do the work.


Sure. You can re-implement everything starting with the Kernel, as long as you don't have to interface with any of the C microcode on the hardware itself. And, yeah, people are doing this, for instance with Redox OS.

But if you actually want to program something usable in conjunction with existing software, such as Linux, you need to use the C ABI. There is no alternative.


> is this a joke. you literally cannot write containers in C unless you commit to heap-allocating everything and storing it as void*

Who cares, as long as the interface is type-safe?


> [citation needed]

Good joke, yeah. The whole "can you give me a source for that?" thing is getting quite old.

Oh, what? You were serious?!


> Most software engineers I know who have a background in EE love C, simply because it maps very well to what a processor actually does during execution.

I agree with this. One thing to note though: C maps onto small CPUs pretty well. It doesn't map directly onto the x86_64 execution model at all.

Today's fast CPUs are very different than the abstract machine that C represents. They run all kinds of things out-of-order, do speculative execution, run instructions in parallel, engage in branch prediction, etc. If anything the C execution model is almost a like a little VM that gets mapped onto the core that's running it.


Most of this is not really C-specific, really. The ABI issues are likely to arise in any systems programming language, and in any lingua franca between languages.

These things come up in C first because C is the lingua franca.


Imagine if your language had to speak several other languages instead of variations of C implementations to work on different systems. Now that would be even more chaos potentially.


Ah, the `intmax_t` problem is a fun one. Perhaps it should never have been introduced.


Yes, for the most part intmax_t (and similar guarantees) is considered a mistake.
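
The nature of the intmax_t mistake, with one hypothetical header line: once a function using the type is exported from a shared library, its current width is baked into every compiled caller, so the platform can never widen it without an ABI break:

    #include <stdint.h>

    /* hypothetical function exported by a platform library */
    intmax_t parse_biggest(const char *s);

    /* every existing binary calling this passes and expects (say) a 64-bit
       value; if intmax_t later became 128-bit, those old binaries would be
       silently wrong, so in practice the type can never grow */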

In C++, the next mistake is going to be the hardware_{constructive,destructive}_interference_size constants, which have serious ABI implications. I think the current GCC position is that they are not part of the ABI and are subject to change (but also controllable from the command line).


> My problem is that C was elevated to a role of prestige and power, its reign so absolute and eternal that it has completely distorted the way we speak to each other.

I submit that this Tower of Babel truth is older than C and a reflection of the humanity that created the tool, not the tool itself.

Minimize the entropy and be at peace with existential imperfection, say I.


Funny you should mention entropy. C is indeed a high-entropy programming language, and one indeed should prefer a language with fewer ways of doing or expressing the same thing.


WebAssembly (wasm) will save us all! https://hacks.mozilla.org/2019/08/webassembly-interface-type...

Who knows, in the future maybe wasm will be the common ABI between languages.


I sympathize with the author's pain and understand what they are striving for, though I also don't have a clear-cut solution in mind. A lot of blood, sweat, and tears are expended for compatibility amongst things that for the most part are not that important or interesting, because they are a trivial data transformation: the choice of calling convention, endianness, data layout. Imagine if an engineer's design could only work if a specific thread handedness was specified for each screw. It feels like a problem that could be solved with something better, but there's a "leaky" aspect that has to be addressed. Sometimes a uintX is a number with ordinal and/or arithmetic properties, and sometimes it's a bit field where every bit has a semantic meaning.


In the same vein, my favorite troll on this topic is "The C language is purely functional" (2009) from Haskell programmer Conal Elliott's blog: http://conal.net/blog/posts/the-c-language-is-purely-functio....


> Rust and Swift cannot simply speak their native and comfortable tongues – they must instead wrap themselves in a grotesque simulacra of C’s skin and make their flesh undulate in the same ways it does.

Sure they can, you just need to build the mechanism that allows them to.


>hear that everything on Linux is “just a file”, so let’s open a file on Linux!

That man page is for glibc. You instead want to look at man syscalls and man syscall to look up how to interface with the kernel. Linux does not require programs to use any amount of C.


Okay, here's the Linux syscalls(2) man page: https://man7.org/linux/man-pages/man2/syscalls.2.html

The second sentence is:

> System calls are generally not invoked directly, but rather via wrapper functions in glibc (or perhaps some other library).

Okay, so we can't avoid C that way, but what's the "invoked directly" way? Well, it's syscall(2): https://man7.org/linux/man-pages/man2/syscall.2.html

This man page actually reiterates what the other one says: you shouldn't be making syscalls directly! But in any case, what is `syscall()`?

...it's a C library function (with the syscall numbers themselves, SYS_*, provided as C macros).

So yeah, Linux does require programs to use C. You've just been isolated from it.


It's true that the man pages assume C and the calling conventions are based on the C ABI, but it is possible to code the syscalls directly in assembly. In particular Go and Zig have such "bare-metal" syscalls.


You still have to follow the C ABI when interfacing with C. That's the exact problem being called out in the post.

Zig solves interoperability by incorporating an entire copy of LLVM. Go does do bare metal syscalls, but as mentioned elsewhere in these comments, this has caused breakage on Mac when the kernel was updated, because this interface isn't stable.


>So yeah, Linux does require programs to use C. You've just been isolated from it.

No, the syscall man page tells you how to invoke a syscall. For x86_64 you put the syscall number into rax and then use the syscall instruction (syscall is the name of an actual instruction). That man page includes more information, like which registers arguments should be in and where to find the return value.
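
A minimal sketch of that register protocol, expressed here as C inline assembly purely for concreteness (x86-64 Linux only, GCC/Clang syntax; write(2) is syscall number 1 on that platform):

    /* invoke write(fd, buf, len) directly, without libc: syscall number in
       rax, arguments in rdi/rsi/rdx, result comes back in rax */
    long raw_write(int fd, const void *buf, unsigned long len) {
        long ret;
        __asm__ volatile (
            "syscall"
            : "=a"(ret)
            : "a"(1L), "D"((long)fd), "S"(buf), "d"(len)
            : "rcx", "r11", "memory"   /* the syscall instruction clobbers rcx and r11 */
        );
        return ret;
    }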


I think that “generally” is targeting average programmers.

Programming language authors, writing their own stdglibbabbyscript, are exempt from that warning? That's of course a big task nobody wants to do, so many of them just piggyback on glibc with thin wrappers.


Well, yeah. Hence the rest of my comment. And if you don't go through glibc, then you still must follow the C ABI rules (since that's the only thing the kernel understands), and you are at risk of having your calls break when the kernel is updated (it's already been mentioned elsewhere in these comments that this actually happened to Go on Mac).


Ok, but what of other OSes?


Well, I can tell it's a programmer and not an engineer who works at hardware levels. All the complaints are "1st world" or "high level" programmer/computer problems.

If you've ever designed hardware and had to bootstrap it up to a "high level" situation - then these are nonsense concerns when you are merely trying to get a new hardware design up and running. You can always bootstrap to a language that already has all these things. The author obviously has never done anything like that. Not based in any reality where C matters.


C is still a programming language, even if you are frustrated with its ecosystem/standard.


Everyone here is talking about ambiguous symbols, but anecdotally speaking I've maybe run into this problem once or twice in roughly 10 years.

Hardly an argument, in my opinion, and it's annoying to see parroted on every thread about C's syntax.


It’s not a problem for the typical C programmer, it’s a problem for the person writing anything that parses C. And it’s not the only issue.


I understand that. I'm more asking who's running into problems parsing C these days? A symbol table is stupid easy to implement.


Any programmer can use any programming language that he wants so long as it is C.

Henry Ford


The D compiler has a full C compiler in it for parsing headers.

The compiler also does the ABIs for several targets, even more if you include the LLVM and GCC backends.

C is a shit old language but we have basically tamed it.


It is just the simple fact that the userland API of all current operating systems is C. A programming language needs to support at least the C ABI for this purpose.


Ol' Bappyscript can just invoke open() from assembly code; no need to write a little C shim. That's usually how I end up doing system calls anyway.


The question is how do you call open() from assembly code? On which register do you pass which arguments? Since open() is defined in C, you basically have to speak the C ABI, and therefore C.

And open was just an (easy) example. Now do that for xcb_create_window and all the other functions of all the system library you want to interact with.


> Since open() is defined in C, you basically have to speak the C ABI, and therefore C.

.. but what does open() call? Do you think that's air you're breathing?

(The GP is implying you can just call directly into the kernel via a syscall,.. and all you need to worry about is your OS's syscall interface. You don't necessarily need the C ABI; unless of course your OS's kernel uses it for syscalls, at which point: sorry-not-sorry.)


Well, I know that's not really the point, but for xcb_create you can just speak the X11 protocol over the wire. No need to call any xlib function.


How should struct foo { int baz; float goo; }; be laid out? What is it about the C ABI that is fundamentally broken? Why shouldn’t args be passed in registers where possible?

What’s the problem with the ABI?
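
For the record, here is what the mainstream 32- and 64-bit C ABIs do with that particular struct (natural alignment, offsets shown as comments), which is part of the point: the simple cases are fine, and the complaints are about the places where the many ABIs quietly disagree:

    struct foo {
        int   baz;   /* offset 0, 4 bytes */
        float goo;   /* offset 4, 4 bytes */
    };               /* sizeof(struct foo) == 8, alignment 4 */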


I prefer to use Nim. C is the default backend and abstracts a lot of pain away.


[flagged]


I don't see the bad faith or ill intent here (except maybe from you...). Nobody is pretending not to want to use Rust or any other language, that's explicitly spelled out in the article.

The reality the article laments is a bit more specific than just OSes being written in C. It's that the interfaces those OSes provide are ostensibly specified in terms of C, but also (because this is outside of the C standard's purview) in terms of a wide range of underspecified, underdocumented, and inconsistent C ABIs. Fortran hits this just as much as any other language- C itself hits it, and the article demonstrates this with multiple examples.

I'm not sure where you get the idea that anybody here lacks the required insight to work in this space. Because the author sees problems with the status quo, she must not know assembly or how hardware works? That's clearly absurd, look at the contents of the article (and the author's past experience if that doesn't convince you). Perhaps it is you who is lacking some context?


[flagged]


So this is obvious bait, but the author's argument is the same regardless of which language you want to use. (Like I said above, it even applies to C itself.) Rust happens to be one she has actually worked on, but its hodge podge of features and package manager are unrelated to the ABI issue, which it's pretty clear she does understand.


It isn't any kind of bait. And libfortran makes direct syscalls, making the author's argument flawed and lack of insight obvious, at least to me. The syscall interface is not C, if you look at the generated assembler code, the only artefact remotely related to C is pushing the arguments on the stack before making a jsr or a call instruction (and some languages, like for instance C, have the reg directive, which puts the arguments into processor registers, making the system call nothing more than a jsr, call, or trap instruction). That somebody actually believes it's C tells me that they don't know what's happening under the hood; it would certainly not be prudent to take what someone like that says for granted. It's the blind leading the blind again, as is often the case here. She did not grasp that the system calls are only documented in terms of C for convenience and that the generated assembler code has nothing to do with the documentation convenience. She would have known that and would therefore not have made the argument if she understood how the underlying hardware works and how to program it in assembler...

If she had had the requisite insight, she would, for instance, have known about programming the Commodore Amiga or the Atari ST (among other platforms): the APIs there are documented not only in terms of C (and explicitly for convenience), but primarily in terms of assembler. The Amiga System Programmer's Guide is an excellent example.

So again: this is all about arguing for Rust, in bad faith, due to a lack of breadth and depth of insight. Toxic for our industry at large. If she (or anyone else) is going to argue such subject matter, she (or anyone else) needs far more knowledge, experience and insight than she currently possesses...


Absolutely bait, and I'm falling for it. The author knows that you can make syscalls from assembly, directly in terms of the ABI those syscalls use, and trying to frame this as some kind of misunderstanding is absurd. You are willfully misreading the article.

Not to mention, your continued obsession with syscalls is itself missing the point:

For one thing, this is about more than syscalls. Many interfaces (e.g. userspace shared libraries, VDSO) are documented and implemented purely in terms of C and the ABI the platform uses for C, not just as a convenience.

For another, many platforms actually do specify their kernel interface purely in terms of C and their C ABI, and do not have a supported way to make syscalls that does not go through such an interface- some kernels will refuse any syscalls that do not come in through their libc; others change their syscall ABI every release and ship wrapper shared libraries for userspace to use. That some platforms do support direct syscalls does not change this. (This also means that libfortran making direct syscalls will break on these platforms, as Go did.)

Your behavior here is what is toxic for our industry. It is quite clear that the article is describing a real issue (over-reliance on "C," which is, as typically used, a shorthand for "C plus the platform's C ABI," which is itself poorly specified and poorly followed) but you are twisting it into some weird axe-grinding vendetta against Rust, and completely making things up about the author in the process.


What you claim is technically impossible: there is no way to stop someone coding in assembler from making a kernel system call.

I am absolutely arguing against Rust because it's a horrible language, and even if it weren't, it's not as ubiquitous as C or Fortran. Rust is absolutely toxic for the industry, but especially so are the people who promote it.

If you make syscalls against a backwards-compatible kernel API, nothing will break. One just needs to know how.


> there is no way to stop someone coding in assembler from making a kernel system call.

Take a look at OpenBSD, then: if you make a syscall from any language at all, but it's not from within libc, the kernel will simply kill the process: https://lwn.net/Articles/806776/

This forced Go to switch from making direct syscalls to making them indirectly via libc: https://utcc.utoronto.ca/~cks/space/blog/programming/Go116Op...

I don't really care about your opinion of Rust, but if anything can make people who behave like you this upset, I'm all for it.
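To contrast with the raw-syscall sketch earlier in the thread: the route OpenBSD (and several other platforms) will accept is to go through libc's own wrapper, which is also the portable one:

    #include <unistd.h>

    int main(void) {
        /* libc's write() issues the actual syscall instruction from inside
           libc; on OpenBSD a syscall issued from outside an approved region
           (normally libc) gets the process killed, per the LWN article above. */
        write(1, "hello\n", 6);
        return 0;
    }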


Not just Rust, but every language will need to interface with C sooner or later. And not just to the OS, but to the window manager, graphics APIs, libraries (ffmpeg etc). So, unless you think that C is the perfect language for doing literally everything, this still is a problem.

Assembly won't help you at all, because the interface to anything other than syscalls (which are stable pretty much only on Linux, and Linux isn't the only operating system) is the C ABI for the specific OS and architecture combination. Yes, C works pretty much everywhere, but the result is that every language needs to support C, and that's a mess.


Fortran doesn't need libc to make syscalls.

(And Linux was the very last kernel to get stable syscalls: HP-UX, IRIX, Solaris and illumos have all had stable syscalls for several decades.)


Does Fortran use syscalls on BSD, macOS, Android, iOS, and Windows too?

But, again, the OS isn't even the main problem. There are many orders of magnitude more libraries that use the C ABI as their interface than what OSes provide through syscalls.


Yes, of course it does: libfortran implements the same or similar things libc does; in order to do that, it has to make system calls.


I find the notion of “how many do we need” an insult to human creativity. Also, simply asserting that those making new languages want to “carve out a name for themselves” is highly presumptuous.


When one person's creativity causes others a lot of work and unnecessary problems, then we have a problem, and a huge one at that. This is exactly the case here.


How does the existence of another programming language cause a lot of work and unnecessary problems? Like, the existence of, say, Zig, which I haven’t used, hasn’t seemed to cause anyone any problems.


All it takes is for someone to "get creative" and write a mission critical piece of software in their newly invented language, and then one gets stuck deciphering that mess when things go awry, which they always do.

Can you not see the consequences, or have you never had that happen to you?


People are talking about C, but I think the real issue is organisational structure, capitalism, and lack of governance.

Universities are too academic and companies are too selfish or busy.

Writing native modules for Java, Node or anything else, you run into these shenanigans, which are very costly to the world.

I really do think in 2021 that G, MS, Nvidia, Intel etc. should get together and create something simple and clear for ABIs and, more importantly, get everyone on the program.

That, and JavaScript 2, which dumps a lot of its weirdness and comes with a comprehensive standard lib.


Both C and C++ are unfixable. You have to start a new language.


If it isn't, then neither is any other programming language.


These C-hating articles are getting out of hand on HN. I see at least one every week on the front page. If you don't like C, STFU, pick up whatever FOTM BS you like, GTFO of here, and leave those of us "normies" alone who have to get actual stuff done that absolutely requires C.


I don't think you read the article


OK, I seem to have wandered into the wrong shitfest.

First: an ABI has nothing to do with C; it is an OS/hardware calling convention: how the stack is used and how a procedure's parameters are passed. That is the machine-code level.

Second: redefinition of hardware types is beyond bullshit. I can't even comment on this.

Third: yes, Every. Single. Architecture will have its own ABI. This is how the Linux kernel is organised (as a direct consequence of the above), and it has nothing to do with C. Again, it is the machine-code level.

The rest is not worth commenting on.

I missed who wrote the article, so I am OK with being flagged in this discussion. This is the new normal.


Wow. I knew that some folks really didn't want to be told that their pet language wasn't the end-all-be-all, and I know people hold strong opinions about [insert language here], but the comments in this thread are somehow even more revealing than the usual fare of comments insisting that [insert language here] isn't used for anything despite its use in shipping (low-powered) embedded devices, end-user consumer-facing products, kernels, webservers, frontends, bootloaders, memory allocators, etc, etc.

But to go on and assert that it "has" to be this way, or that it's this way strictly out of necessity... I don't know how to describe it politely without saying "Stockholm syndrome". What is it about C, or about a certain subset of C fans, that results in them asserting that 'it has to be this way' or tossing derision at those who strive for better? Do some of you realize that striving is born out of living in an ecosystem where we don't have to deal with this bullshit, and thus don't see it as "normal" the same way you do?

It's okay. I've watched the exact same type of excitement face derision on countless technologies in the past. Oh, how wrong I've been to bet on [insert cluster technology here] or [insert Linux service manager here]; surely those would never take off, because hand-maintained bash scripts are fine, right? Just like an undefined, fragile ABI is fine, right?

It kind of makes me think of some long screed I read last night from someone making all sorts of claims about Linux on the Desktop while insisting that they use Xorg. Meanwhile, under Sway/Wayland, I am driving 3 monitors at different resolutions and refresh rates, and they all perform perfectly on testufo.com.

Your trauma doesn't make your choices better. Your supposed growth in the face of now-unnecessary pain doesn't make you smarter. I have contributed code that runs on thousands of people's desktops every day, in both C and [insert language here], and I cannot believe this is still a fucking conversation. I mean, I'm thankful for one side of it, but sad that in yet another facet of life there are folks insisting that the shit-we-have is the best possible. Sad.


Nah, it's a programming language; the author just doesn't like it.

You don't have to write FFI for Linux. Use your great new language to write an OS, or target one of the many OSes not written in C.

Let's face the facts: your new language probably solves some pretty theoretical problems, not the problems people actually have, like: hey, I need to lay out these structs exactly so the hardware will work, or parse a packet off the wire in a reasonable amount of time.
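That use case is real enough, and plain C handles it tersely; a minimal sketch (the header layout and field names here are invented for illustration), assuming GCC/Clang attributes and big-endian fields on the wire:

    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>   /* ntohs, ntohl */

    /* A made-up wire header, laid out exactly as the bytes arrive:
       fixed-width fields, no padding. */
    struct __attribute__((packed)) wire_hdr {
        uint8_t  version;
        uint8_t  flags;
        uint16_t payload_len;   /* big-endian on the wire */
        uint32_t stream_id;     /* big-endian on the wire */
    };

    /* Parse a header out of a received buffer of at least 8 bytes. */
    static struct wire_hdr parse_hdr(const unsigned char *buf) {
        struct wire_hdr h;
        memcpy(&h, buf, sizeof h);          /* avoids misaligned access */
        h.payload_len = ntohs(h.payload_len);
        h.stream_id   = ntohl(h.stream_id);
        return h;
    }

Whether other languages can express the same thing just as directly is, of course, part of what this thread is arguing about.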


idk man, C is catching a lot of hate lately, but I honestly don't see why. C is a fine programming language; in fact, outside of maybe the Lisp, logic programming, and concatenative language families, C pretty much is the only programming language. Every other language really just wants to be C while fixing a lot of problems that aren't actual language problems but rather programmer problems. If we were all perfect human beings, C would be suitable for any task and would be flawless. But programmers make mistakes and often attempt to code in languages without actually understanding the language (a good example of this is not understanding memory models), and tons of modern features would not exist if you removed C as the motivating negative factor they were a response to.

And sure, because of the way things shook out, programming languages have to speak C in order to do something useful as fast as possible, but if you carry this sort of argument to its logical end, it amounts to the same thing as saying "I would love programming if only I didn't have to compile to machine code." ASM has the same problem: architectures have different instruction sets. Or further, this is like complaining about the fact that you need to code against a particular fundamental design scheme like von Neumann. This is just the nature of computing and of different ideas and markets yielding different products.

Also, this article makes the mistake of criticizing C on the grounds that it's now a protocol and poorly designed as a protocol, while simultaneously admitting that it wasn't designed to be a protocol, so it seems a bit unfair to call these "bad design decisions" on those grounds. Furthermore, it's a lot easier to make these sweeping criticisms than to effectively solve what is not only a massive technical but also a social problem.


>a lot of problems that aren’t actual language problems but rather programmer problems

This is a really naive take.

In aerospace design we have the concept of "Design for Human Factors" or "Human Factors Engineering" which is basically an acceptance that the pilot or crewmember is a fallible, flawed "system".

That is to say in aerospace we recognize that human beings have limits. They have finite breadths of attention, and they panic in a crisis, and they make mistakes when fatigued, and they're not capable of performing repeated tasks 100% consistently.

That's why for example we make the "fuel cut off" switch a different size and color than the intercom switch. That's why we make the engine fire extinguisher handle very prominent and easy to grab, etc.

For how many decades have software folks been writing use-after-frees and off-by-ones?

When is Software as a field going to get over its collective hubris and admit that C is poorly designed from a Human Factors standpoint? A tool that requires unattainably- or unsustainably-high human performance is a shitty tool; get a better one.


That’s an enlightening counterpoint and I agree. I think, when common human factors are known and understood, we should design against them.

I also would not recommend writing any large-scale modern software in C unless absolutely every piece of the codebase is performance critical, which is highly unlikely, and in which case you can probably use C++ for at least a few more safety features. Is C still a good choice for simple CLIs and small programs that don't do anything drastic? I'd argue it is.

I was mostly using hyperbole to try and point out that it's easy to judge our forebears for problems that we didn't even know were problems yet. Saying "C is a poorly designed language" usually amounts to saying "C is a poorly designed language from today's perspective", which is a bit disingenuous considering we wouldn't even know about these problems without C. It's kind of hard to be omniscient and design against things you didn't even know were problems yet. Furthermore, for all the flak it gets, I think C got a lot of things right. It's easy to forget that programming languages copy more from C than they distance themselves from it; really the only major wins are giving programmers less control over memory, easier numeric types, easier string manipulation, and generics, and Go proved that even the last enhancement isn't really necessary.

Sorry for playing the role of C apologist in this thread, but I feel a lot of these broad complaints about C are sort of immature insofar as they seem to eschew all historical context. I guess I’m just trying to point out that I see a lot more genius in C’s design than I see foolishness, but some people would have you think the C designers were complete idiots with the way they disparage the language. There’s a palpable lack of humility in such complaints.


>Saying “C is a poorly designed language” usually amounts to saying “C is a poorly designed language from today’s perspective” which is a bit disingenuous considering we wouldn’t even know about these problems without C.

I mean okay, but whether or not we have historical perspective doesn't change the fact that as we now understand it, C is poorly designed. If it's poorly designed, then it's poorly designed and we should stop using it.

There was a time when we "lacked historical perspective" and we built the De Havilland Comet, too. People died. Turns out that square windows cause stress concentrations so we started using better techniques. Nobody was arguing "yeah but for simple CLIs and small programs that don't do anything drastic, square windows are fine". Square windows suck so we make rounded corners now.

C sucks, we should use something that sucks less.


So many expletives. Perhaps a calmer, longer-term view might change the world.

How?

Take Linux. Currently written in C. Yay! Write a module for it in something else. Good. Perhaps you've heard of Rust? Ok, great. Write some more modules in that. Write even more. Even more. Now there's just a lot of Rust talking C ABI. Hmmm. If only... oh wait. Let's change that. Time passes. Boom! The whole thing is Rust. No C ABI target at all other than for C applications, which is how things should be.

This is one way to proceed and it may have already begun.

https://www.zdnet.com/article/rust-in-the-linux-kernel-why-i...

In reality, C is a classic and has weathered many storms, fads and changes in fashion. Even with other languages in the Linux kernel, it will probably stick around for a lot longer.


And what about other languages than C and Rust? There still needs to be an "interface lingua franca" between OS internal modules too, not just between user code and "system DLLs".


What about them? As you say, you still need a calling convention. I'm not disagreeing with that.

And I chose Rust at random because it's an available systems language. Why not something else? You don't actually need C at the bottom layer. It won because OSes are most commonly written in C, so it's encountered everywhere and is therefore popular, usually because it helps get the desired result. That isn't nothing, either.

In days of olde there was the Pascal calling convention. Why? The OS was written to that convention, because it was written in that language. Surprise! I'm not even suggesting these are bad things.

Linux for me is a kernel. The choice of C is purely an implementation detail based on pragmatic decisions to suit its purpose. I am merely suggesting that the calling convention isn’t locked at all layers for all time. Yes you need a common convention. But it needn't be C and it doesn't need to be C all the way up or down either.

I'll leave it there. Seems that discussing this kind of thing isn't wanted.

Side note: I wonder what would have happened if I'd chosen Lua for my example. Would people have gotten hung up on the fact that it's commonly implemented in C? Would they have locked onto that and not realised that it needn't actually be written in C at all? If you wish to explore that, swap Rust for Lua in my previous comment. There are ways of using Lua as a calling convention without even referencing C. If you want a more realistic example, swap in Dart or, more likely, D. Everything I've said still applies. And yes, you can have multiple calling conventions, or even a simpler one. It's just a convention for how to pass parameters, amongst other things. That's basically it.
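To make the "just a convention for how to pass parameters" point concrete, here is a sketch assuming GCC or Clang on x86-64, where you can ask for two different calling conventions in the same translation unit:

    #include <stdio.h>

    /* Default on x86-64 Linux/BSD: System V convention (args in rdi, rsi, ...). */
    static long add_sysv(long a, long b) { return a + b; }

    /* Same source, Microsoft x64 convention (args in rcx, rdx, ...). */
    __attribute__((ms_abi))
    static long add_ms(long a, long b) { return a + b; }

    int main(void) {
        /* Identical at the source level; the ABI only decides which
           registers and stack slots carry a and b underneath. */
        printf("%ld %ld\n", add_sysv(2, 3), add_ms(2, 3));
        return 0;
    }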



