C is the glue interface that connects all the different languages together. I wouldn't want to see anything new added to the language that complicates this lingua franca.
I liked C in the 80s, 90s and 2000s. It was a neat and simple language, but that changed as compilers got more aggressive with their optimizations and exposed just how complicated and unintuitive the spec actually is (and the impossibility of avoiding UB). C was a good learning experience for language committees (as was C++), but it's time to cut our losses and move on.
I also liked C++ in the 90s, but soon grew to despise it. When they tried to revamp it and make it safer to use, I was hopeful, but that's now gone. It's too complicated, the error messaging sucks, and there are too many different ways to do the same thing (with ever changing best practices). In a way, CMake and C++ are two peas in a pod.
This is why I now pin my hopes on rust and zig. I want to write low level code, but not in C or C++. I want a build system that's not Makefiles, CMake, or (shudder) gradle.
> C is the glue interface that connects all the different languages together. I wouldn't want to see anything new added to the language that complicates this lingua franca.
Indeed, this is one of the two things where C is still king, the other being running on weird hardware with specialized compilers.
That said, C is not a great glue language:
- Zero terminated strings are only used in C so every other language needs to do a copy to pass a string to C, even C++ (for string_view at least).
- A lack of fat pointers means that buffers need to pass their pointer and length separately, which is awkward, or use a custom struct which other languages will know nothing about.
- It has no standardized error handling mechanism to help with writing automatic wrappers on the other language endpoints.
- It has no modules or namespacing.
So while C will continue to be the glue language of the programming world, I wish it was just a bit better than what it is, would make language interoperability a lot nicer...
> That said, C is not a great glue language: - Zero terminated strings are only used in C so every other language needs to do a copy to pass a string to C, even C++ (for string_view at least).
I don't think that's what he's talking about. No matter what glue you had, you need to pass NUL terminated strings to C code that uses NUL terminated strings.
The glue interface does not require any such thing though.
> A lack of fat pointers means that buffers need to pass their pointer and length separately, which is awkward, or use a custom struct which other languages will know nothing about.
Again passing things between two languages will always necessitate that. One language might have a different fat pointer implementation than another.
And the nice thing about glue code is that it can be easily automatically generated calling wrappers so "awkward" does not come into it - you change one language's data types into another's.
> - It has no standardized error handling mechanism to help with writing automatic wrappers on the other language endpoints. - It has no modules or namespacing.
I don't see how this matters for the interface glue. You can define the error handling with it how you like. You put it in your language's namespaces and modules.
Your comment is assuming that language A knows about language B. That's one case, but not the usual one.
Rather, languages like Rust/Zig/C++ and even C# and Java let you expose a C API such that other languages can understand the shared library as if it was written originally in C, because C is the only thing every other language talks.
By necessity this involves using standard types available in C otherwise the other languages have nothing to go on, they'd just have a bunch of opaque types and no way to create them.
If C had better builtin constructs, these "extern C" APIs could be much better semantically, such that automatically generating wrappers to this C API would result in safer, more efficient, and more ergonomic code in the host language.
> Your comment is assuming that language A knows about language B. That's one case, but not the usual one.
> Rather, languages like Rust/Zig/C++ and even C# and Java let you expose a C API such that other languages can understand the shared library as if it was written originally in C, because C is the only thing every other language talks.
The C interface glue is not the problem here. You can write the library with a completely different language and using fat pointers, and simply adjust the calling convention at the C interface, and it can be called by a different language that uses fat pointers and everything can use fat pointers.
The actual problem is you can't just say that your interface glue is going to have all these wonderful features and that therefore everything will magically work and use them.
> By necessity this involves using standard types available in C otherwise the other languages have nothing to go on, they'd just have a bunch of opaque types and no way to create them.
Using C interface glue does not require you to even write a line of C code let alone C types if your compiler knew how to call the APIs.
> If C had better builtin constructs, these "extern C" APIs could be much better semantically, such that automatically generating wrappers to this C API would result in safer, more efficient, and more ergonomic code in the host language.
Again there's no reason your language can not wrap those and expose your desired higher level semantics or features to your code on the other side of the wrapper layer.
I'm sorry, maybe I'm not making myself clear and we're talking in circles. Yes the C interface is the problem.
Let's say I'm developing an API in Rust that I want everyone to be able to use. I wrap this API in some C functions. I use fat pointers (slices), strings with length, etc., all custom struct types. I choose to signal errors in some my_err* parameter.
Then someone developing in Java wants to use my library. If they use an automatic wrapper generator (which knows nothing of my API), all the semantic information on slices, strings and errors is lost. They need to make an additional wrapper on top of the stuff the automatic wrapper generator did.
If C had better builtin types, the slices exposed by the "extern C" API in Rust would just be standard C slices. The strings just standard C strings, the errors just standard C errors. The semantic information would be preserved at the C level, and therefore the automatic wrapper generator could add code to automatically translate slices, strings and errors into the appropriate types in the host language.
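To make the scenario concrete, here is a rough sketch of what such a hand-rolled C surface might look like. Every name here (my_slice, my_str, my_err, my_sum) is made up for illustration; the point is only that a binding generator sees three unrelated structs and an int return, with nothing marking them as "slice", "string" or "error".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types: every library invents its own versions of these,
   so a binding generator only sees three unrelated structs. */
typedef struct { const uint8_t *ptr; size_t len; } my_slice; /* "fat pointer" */
typedef struct { const char *ptr; size_t len; } my_str;      /* not NUL-terminated */
typedef struct { int code; my_str message; } my_err;

/* Sums the bytes in `data`. Returns 0 on success; on failure returns -1
   and fills in *err. Nothing in the signature says that `err` is an
   error channel; that is only a convention of this sketch. */
int my_sum(my_slice data, uint64_t *out, my_err *err) {
    assert(err != NULL);
    if (data.ptr == NULL || out == NULL) {
        static const char msg[] = "null argument";
        err->code = 1;
        err->message.ptr = msg;
        err->message.len = sizeof msg - 1;
        return -1;
    }
    uint64_t total = 0;
    for (size_t i = 0; i < data.len; i++) total += data.ptr[i];
    *out = total;
    err->code = 0;
    err->message.ptr = NULL;
    err->message.len = 0;
    return 0;
}
```

A Java or C# wrapper generator can map these structs field by field, but it has no way to know that my_str should become a native string or that a nonzero return should become an exception.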
You seem entirely focused on the fact that the C ABI can be used to implement stuff like COM or SWIG for language interoperability. True, but not really relevant.
Since you seem to have vastly narrowed the scope of your complaint down to "it doesn't generate all wrapper code automatically for all features", I'll take that as you conceding that the interface does not prevent these features from being implemented across it. Great.
And if that's the biggest remaining problem with it, it would be possible to address by augmenting the interface glue with some extra metadata, in a C comment or something nice and simple, that describes higher level constructs without the base being tied to or requiring some specific implementation of them. Just write up a spec and we're done.
I agree with you, you can certainly do it, but whatever you do is not the standard that every language talks, because that standard is C :).
I can add all the metadata comments I want to my C header saying that the int I return from foo() is an error code, but I'm not going to get Java, C#, Haskell, COBOL, and so on to agree on that new metadata, especially not when each of them would probably come up with their own solution.
That C became a standard glue that everyone agrees on was a miracle. I can't think of many other standards that pulled that off. It's not happening again. Hence any improvement to C, and C specifically, benefits everyone. Adding extra crap outside the standard does not accomplish the same purpose.
I've actually thought of defining such a thing before, but I'd just be reimplementing SWIG, and of course the problem with SWIG is that they have to make all the infrastructure to support the different languages, it isn't the language developers themselves developing the interface (pinvoke, jni, etc.)
> Your comment is assuming that language A knows about language B. That's one case, but not the usual one.
You are free to expose an API where strings are not null terminated, etc. Anyone in any language is going to have to work their data into the API you provide so if it is alternatively complicated to make an intermediary language happy that's not really helping.
This brings up a vague idea I have had. Redis is sort of a data structure server that can be used across languages. Could we not have an open source data structure and algorithm library, where languages, instead of implementing their core types and algorithms in their own library, add them to (or reuse existing ones from) this library? Then your language becomes just a mapping of your language's semantics over these existing structures & functions.
Unfortunately this doesn't work for managed or interpreted languages with their own runtimes, where you want the types to run on their VM.
An example would be JIT-compiled C# data structures, no way to implement those on top of a shared library since specific instantiations have to be produced and compiled at runtime.
Such a data structure and algorithm library as a springboard would still be incredibly useful, but then you wonder why it's not called "the C standard library" :)
I wonder though. I know a lot of performance tricks are used, such as in the Python API: judicious use of inlining, putting some common functions in header files, etc. Depending on how far down the rabbit hole you want to go, could this library not also include code generation functionality for JITs? Or even at compile time, is there anything stopping a library from generating code that could be included in your VM runtime as just an opaque function that can be called?
At that point you're effectively creating your own VM, and then you may ask why doesn't someone simply use the JVM or the CLR. Once you reach that degree of complexity, there are simply too many tradeoffs.
I certainly won't stop you from trying to develop such a thing, could be a nice way to get languages bootstrapped faster. Just keep in mind JIT compilation is forbidden in iOS, so e.g. C# needs an entirely different compilation strategy for that platform.
You can, but then you cannot automatically generate a wrapper that can convert language B strings to language A strings disguised as an opaque type C, because there is no way for the wrapper generator to know it is a string type, unless you wrote the wrapper generator yourself for your API specifically.
Such annotations exist in specific compilers (e.g. clang has things like _Nullable and _Nonnull), they are just not part of the standard (unfortunately). But those annotations are visible in Clang's ast-dump, so they can be used to create more correct language bindings.
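For anyone who hasn't seen them, a small sketch of how those Clang qualifiers can sit in an API while staying buildable elsewhere. _Nonnull and _Nullable are the real Clang spellings; the MY_NONNULL / MY_NULLABLE macros and label_copy are hypothetical names for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* _Nonnull and _Nullable are Clang extensions, not standard C. On other
   compilers the (hypothetical) MY_* macros expand to nothing. */
#if defined(__clang__)
#  define MY_NONNULL  _Nonnull
#  define MY_NULLABLE _Nullable
#else
#  define MY_NONNULL
#  define MY_NULLABLE
#endif

/* Copies `name` (len bytes, may be NULL) into `out`, falling back to
   "anon". Always NUL-terminates; returns the number of bytes written,
   excluding the terminator. */
size_t label_copy(char *MY_NONNULL out, size_t cap,
                  const char *MY_NULLABLE name, size_t len) {
    assert(cap > 0);
    if (name == NULL) { name = "anon"; len = 4; }
    if (len > cap - 1) len = cap - 1;   /* truncate to fit */
    memcpy(out, name, len);
    out[len] = '\0';
    return len;
}
```

A binding generator reading the Clang AST could map the nullable parameter to an optional type in the host language and the non-null one to a plain reference.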
Yes, I'd love it if they got standardized. Having them implemented in a C compiler already helps quite a bit for more "standardized" adoption over some ad-hoc approach.
I certainly don't want C to be getting exceptions or templates or other heavyweight constructs, just some QOL stuff that allows for more semantic information to be shared.
I agree with you. However, I think the real value of the language is that, despite flaws, it provides a reasonably consistent platform in this "glue" context. It is better to have one imperfect language than a dozen perfect ones.
I think it would be beneficial to create a better interoperable ABI description (one, not a dozen). Not in a form of a language header, but a data file that can be easily consumed by tools of any non-C language.
1. C headers are exceptionally annoying to correctly parse and interpret. You basically have to have a full C compiler, which is a PITA for a non-C tool. It's not just complexity of the C syntax, but dependency on system headers, implicit platform-specific type sizes and struct layouts, and an arbitrary web of ifdefs which may need to be generated by an arbitrary build system. Everything here resists automation, and if you get any details wrong, you won't know until you see memory corruption.
2. The common baseline between most programming languages has outgrown what C can express. Namespaces and other name mangling, unwinding, non-nullable types, slices, ownership/destructors, thread-safety annotations, etc. C++ has some of these features, but it solves none of the above problems, and only adds more complexity and its own compiler-specific behaviors.
It has been attempted at least at the OS level. COM and the much improved WinRT are one example (WinRT relies heavily on metadata), the Objective-C runtime on Apple platforms is another (which instead accepts lots of dynamism).
The problem is that OS vendors are the only ones with enough sway to force people to use these things, and even then just barely (the Rust WinRT support is developed by Microsoft themselves).
The only entity with enough sway over every language in existence when it comes to supporting an ABI is, ironically, the C standards committee, because every language already talks C and needs to continue to do so.
I know "upvote" comments are frowned upon, but I just wanted to say I most certainly agree. I'm glad we ended up with such a language, even with all its flaws.
> Zero terminated strings are only used in C so every other language needs to do a copy to pass a string to C, even C++ (for string_view at least).
This is one of the most frequent complaints and I find it ridiculous. C has string literals that encode zero-terminated strings, but you don't have to rely on these zero terminators. There are a few awkward "string functions" in libc, most of which you should just ignore. Besides maybe the printf() family (which you probably wouldn't want to use for interoperability), there isn't anything that requires zero-terminated strings.
This language gives you memory to work with, and it's up to you how you want to achieve your goal.
> A lack of fat pointers means that buffers need to pass their pointer and length separately, which is awkward, or use a custom struct which other languages will know nothing about.
It could be considered a security problem that sizes have to be passed separately, but what I consider awkward is actually the slices approach: In general the count field in a "slice" is redundant data - think about parallel arrays, where with slices you need to synchronize the count information across all slices. Passing separate length is the correct approach from a normalization standpoint.
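The parallel-arrays argument might look like this in code. This is only an illustrative sketch; the particles / f32_slice types are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* One count shared by three parallel arrays: the length is stored once. */
typedef struct {
    size_t count;
    const float *x;
    const float *y;
    const float *mass;
} particles;

/* Slice-style alternative: each array carries its own length, and all
   three lengths must be kept in agreement by hand. */
typedef struct { const float *ptr; size_t len; } f32_slice;
typedef struct { f32_slice x, y, mass; } particles_sliced;

float total_mass(const particles *p) {
    assert(p != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < p->count; i++) sum += p->mass[i];
    return sum;
}

/* The extra invariant the slice layout has to check. */
int slices_consistent(const particles_sliced *p) {
    return p->x.len == p->y.len && p->y.len == p->mass.len;
}
```

The first layout cannot express mismatched lengths at all, which is the normalization point; the second makes each array self-describing, which is what a binding generator would want.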
> It has no standardized error handling mechanism to help with writing automatic wrappers on the other language endpoints.
Similar to the point about slices, not including any notion of "errors" is the correct way of doing it. In general you don't want errors in a system, only data that you handle in an appropriate way. Also, returning what people usually consider errors is often a bad idea, as such error information should better be stored in a persistent handle or context struct.
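One way to read "store error information in a persistent context struct" is a sticky status field that callers inspect once at the end. A minimal sketch, with ctx and its functions being hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

typedef enum { CTX_OK = 0, CTX_OVERFLOW = 1 } ctx_status;

typedef struct {
    ctx_status status;   /* sticky: the first failure is kept */
    int values[4];
    size_t used;
} ctx;

void ctx_init(ctx *c) {
    assert(c != NULL);
    c->status = CTX_OK;
    c->used = 0;
}

/* Becomes a no-op once the context has failed, so callers can chain
   operations and check c->status a single time afterwards. */
void ctx_push(ctx *c, int v) {
    assert(c != NULL);
    if (c->status != CTX_OK) return;
    if (c->used == sizeof c->values / sizeof c->values[0]) {
        c->status = CTX_OVERFLOW;
        return;
    }
    c->values[c->used++] = v;
}
```

This is roughly the pattern stdio uses with ferror(): individual calls do not need to be checked, the stream remembers that something went wrong.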
> It has no modules or namespacing.
Not a problem, since there are names that you can already use for disambiguation. Namespaces only add more ways to identify the same object, which leads to ambiguity and/or hard to read and hard to grep code. (Namespaces encourage the use of not-fully-qualified identifiers - actually this is the only thing they really add, and apart from some metaprogramming tricks you can play I don't think it's a great addition).
Because a C "string" is a char* by convention, any time a library exposes string parameters that's what they use. This includes libraries not written in C! I can't even think of a counter-example. Using an own string type adds too much friction, particularly because it prevents automatic binding generation. With errors such a standard doesn't even exist so you get a mess.
And yes, as a library developer you need to signal errors somehow, and whichever solution you use, the binding generator doesn't know about it. Rust has a nice error propagation mechanism which lets me attempt a set of operations but abort early when an error happens, but the binding generator can't use it.
Namespaces are a similar problem, they force library devs to try and come up with unique but short prefixes. Had C had namespaces the "short" constraint wouldn't be a problem, pretty much eliminating the chance of collision.
Am I right in assuming you're an embedded developer? Typically when I have this discussion with people the points that you make come from embedded devs. They're not reflective of the needs of library developers.
> This is one of the most frequent complaints and I find it ridiculous. C has string literals that encode zero-terminated strings, but you don't have to rely on these zero terminators.
Just a quick question. Is there somewhat up-to-date guide, list, book, whatever of good best practices for C, or maybe small libs for simple string handling, and common pitfalls (checking for over/underflows etc.)?
I don't know any in particular, but a few tips here.
I would say rule 1 is to not use strings unless needed :-)
My rule 2 would be to not use a library because that would probably add too much complexity (especially with regards to integrating memory allocation).
There are different valid ways to represent strings, but a basic approach is of course to use arrays of bytes, i.e. pointer + length (struct String { char *buffer; int length; }). An important consideration is the choice of allocation scheme for the byte buffer. I'd recommend using statically allocated strings (like char buffer[32];) where possible, and looking into memory arenas. Don't make "resizable" strings unless absolutely needed (with resizable strings you might run into dangling reference problems more easily, and you will probably need a "capacity" field in addition to pointer + length). Most dynamically-sized use cases do not need resizing; you can conveniently cover them with a separate string builder (which can be implemented using a large statically allocated storage or using a resized-as-needed storage). Once the string is assembled, the string builder can create the final immutable string by allocating, for example, from a memory arena.
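A compact sketch of that scheme, under a few assumptions of mine: the Arena, StringBuilder and function names are invented, the builder uses a fixed 64-byte storage, the arena is a plain bump allocator with no alignment handling (fine for bytes), and length is a size_t rather than the int mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { const char *buffer; size_t length; } String;      /* immutable view */
typedef struct { char *base; size_t cap; size_t used; } Arena;     /* bump allocator */
typedef struct { char storage[64]; size_t length; } StringBuilder; /* fixed capacity */

/* Hands out the next n bytes of the arena, or NULL if it is full. */
void *arena_alloc(Arena *a, size_t n) {
    assert(a != NULL);
    if (n > a->cap - a->used) return NULL;
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

/* Appends as much as fits; returns the number of bytes appended. */
size_t sb_append(StringBuilder *sb, const char *src, size_t n) {
    size_t room = sizeof sb->storage - sb->length;
    if (n > room) n = room;
    memcpy(sb->storage + sb->length, src, n);
    sb->length += n;
    return n;
}

/* Copies the assembled bytes into the arena. On failure the result has
   a NULL buffer and length 0. */
String sb_finish(const StringBuilder *sb, Arena *a) {
    String s = { NULL, 0 };
    char *dst = arena_alloc(a, sb->length);
    if (dst == NULL) return s;
    memcpy(dst, sb->storage, sb->length);
    s.buffer = dst;
    s.length = sb->length;
    return s;
}
```

Nothing here is NUL-terminated, and the final String never changes size, so there is no capacity field and no reallocation to invalidate pointers.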
> C is the glue interface that connects all the different languages together.
That's one of the issues in most UNIX systems. Windows sidestepped it with COM (and a much improved WinRT as a successor, with a .NET-y object model). On Apple systems, Obj-C is pretty much usable from other languages too, being the standard ABI to glue it all.
My takeaway: bring one of those two options properly to Linux, which is needed hard by now, to not fall to the lowest-common denominator.
(C++ isn't a solution because of a dubiously stable ABI on quite some platforms)
Erm, isn't the whole point of COM that it is also just a "C glue interface"? Not how one would typically design C APIs, but COM APIs are C APIs nonetheless.
But it's higher level. It lets you expose interfaces and defines memory management and threading vs using raw C functions where you have to manually encode all of that for every other language.
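For readers who haven't seen one, the shape of a COM-style interface expressed in plain C is roughly a struct whose first member points at a table of function pointers. The sketch below is not the real COM headers; counter and its vtable are hypothetical, and unlike real COM the object lives in caller-owned storage and is not freed when the count reaches zero.

```c
#include <assert.h>
#include <stddef.h>

typedef struct counter counter;

/* Table of function pointers: the binary layout other languages bind to. */
typedef struct {
    unsigned (*add_ref)(counter *self);
    unsigned (*release)(counter *self);
    int      (*next)(counter *self);
} counter_vtbl;

struct counter {
    const counter_vtbl *vtbl;  /* first member, as in COM-style objects */
    unsigned refs;
    int value;
};

static unsigned counter_add_ref(counter *c) { return ++c->refs; }
static unsigned counter_release(counter *c) { assert(c->refs > 0); return --c->refs; }
static int counter_next(counter *c) { return ++c->value; }

static const counter_vtbl counter_methods = {
    counter_add_ref, counter_release, counter_next
};

/* Sets up an object in caller-provided storage with one reference. */
void counter_init(counter *c) {
    assert(c != NULL);
    c->vtbl = &counter_methods;
    c->refs = 1;
    c->value = 0;
}
```

A call looks like c.vtbl->next(&c). The lifetime and dispatch conventions are fixed once by the layout instead of being re-documented for every function, which is the "higher level" part.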
D-Bus is for IPC not an ABI. If there was something similar it'd be gobject, but i don't think any language other than Vala speaks it "natively" (and considering how much Gtk cares about backwards binary compatibility, i don't think it'd be a good idea to target it anyway).
It can run in-proc and fits the scenarios being given on previous comment.
I remember the Bonobo, DCOP and KPart days, so I know pretty well what D-Bus is capable of.
In micro-kernels the IPC infrastructure is the userspace ABI, as it is the only way to call into OS services, beyond the little glue library to make it possible to do so.
DBus is an IPC protocol; it is the first thing the specification[0] mentions. There isn't any way it would make sense for it to be used in-process, nor does the spec provide for such functionality. Some higher level library could potentially provide an API for exposing and consuming (object) interfaces that can be accessed either in-process or out-of-process, though only the latter would really be using DBus and the former would bypass it (and both would really be usable only by users of that library, not anything using DBus).
DCOP is/was also an IPC protocol, though much simpler than DBus, as it was essentially about passing arbitrary data between processes with some IDs attached (the higher level API did pretend that they were objects and method calls but the -still public- lower level API worked in terms of string identifiers and data streams), which makes it unsuitable for a cross-language ABI. KPart however (which is still being developed and part of KDE) is sort of a related idea to COM, in that it is something similar (in concept) to ActiveX, so it could handle some uses (though it lacks the genericness of COM), except that being C++ based also makes it impractical for a cross-language ABI - which is the entire reason COM was brought up (and why IPC approaches like DBus and DCOP aren't relevant to what was discussed).
COM nowadays runs mostly out of process unless you like crashing the whole application, hence the reference to nano-COM for in-proc APIs that are basically OO ABI.
That is why any modern Windows has plenty of COM Surrogate processes running at any given time.
As for D-Bus, I stand corrected. I thought it had such an optimization in place, just as other mechanisms offer.
Indeed, and rust, zig, fortran, and julia don't have the problems with c that c++ does, so is it a problem with c or with c++ that the author is actually talking about?
The biggest problem is that there is still a significant faction within the C++ committee that wants closer relations between C and C++, or even a merging (if such a thing were possible). "C++ is a better C" is a mantra that can still be heard to this day within those hallowed halls.
If C++ would just treat C as an interface like everyone else does, this problem would shrink considerably.
That wasn't possible anymore from the moment on when C++ decided to not treat C as a proper subset (like Objective-C did), but instead fork its C subset into a slightly different language.
If C++ had just treated C as an interface from the start, it would have been much cleaner, and nobody would have used it.
Humans are stupid, short-sighted creatures, and programmers are no exception. The only way to get C programmers to use C++ was to make C-with-classes a drop-in replacement for C. Replace "cc" by "CC" in your build system, and you magically get new features, without breaking old code (at least not too much). Sure it crippled C++ from the start.
If C++ had stopped at binary compatibility (either with an FFI like Rust or Zig, or by deciding on a subset that can talk to C interfaces), it would have forced users to use two compilers, as well as learn an entire new language. Few people would have been willing to put in the work, Stroustrup knew this, and he wanted his language to be useful now.
Add to that c++ is infectious, both in a code base, because once you add c++ you can't go back, and in dependency chains. A c++ library that depends on c libraries often only exports c++ interfaces so the things that would build on them must be c++. Where julia might call into a rust library with a fortran and zig dependency.
If only. For all the "moar strong types" goodness C++ claimed, in practice there was very little. The same could have been achieved with stricter warnings, that C compilers now have. With TypeScript at least you have the option to go from full dynamic typing auto cast madness to meaningful compile time checks.
That being said, if I had to write a serious web app today (that is, something that absolutely requires client-side scripting), I would consider using JavaScript as a compilation target only, alongside WebAssembly.
Will wasm ever get direct access to the dom so that isn't needed? I keep wanting to jump into wasm, but I don't really want to learn any more javascript than I already know, which is just enough to play codespells and screeps. And I certainly don't want to learn a framework which seems to follow the madden nfl release schedule.
Ideally I wouldn't learn a single thing about JavaScript, and go straight to WebAssembly. Possibly chose a language that already has a compiler for both so I don't even have to learn the standard.
And no framework either, I'd do my own things with as few dependencies as I can possibly manage. No point tying myself to some protean monstrosity that breaks code every 6 months and is abandoned 5 years later. (OK, not a web dev, so I don't know about that last one. It's just what I glean from the web's reputation.)
Goodness, I'm running a zero-warning policy with -Wall -Wextra all the time. They have removed tons of bugs in my code before I even touch the first sanitizer, they're my first line of defence.
The comparison is flawed, because Javascript is a subset of Typescript, but C is not a subset of C++. A better analogy would be that C++ is to C what Dart is to Javascript.
Also, even now, the intersection between the two languages is large enough to be useful. I believe the entire Lua VM has been written in that subset, and so has my cryptographic library.
Though you'd have to be careful, it's still possible today to write useful C code that also compiles as C++.
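As a small illustration of what "being careful" means in that common subset, here is a sketch (dup_bytes is a made-up function) that should compile as either C or C++: it casts the result of malloc, avoids C++ keywords as identifiers, and guards the linkage.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Returns a heap copy of `len` bytes from `src`, or NULL on allocation
   failure. The caller frees the result with free(). */
unsigned char *dup_bytes(const unsigned char *src, size_t len) {
    assert(src != NULL);
    /* The cast is redundant in C but required in C++, which has no
       implicit conversion from void* to other object pointer types. */
    unsigned char *copy = (unsigned char *)malloc(len ? len : 1);
    if (copy == NULL) return NULL;
    memcpy(copy, src, len);
    return copy;
}

#ifdef __cplusplus
}
#endif
```

Other things to watch for in the shared subset include identifiers like class or new, the type of character constants, and implicit conversions to and from enums.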
Honestly, this is a C++ problem and the article should be named "The Problem with C++". Like you said, other languages do not have these problems to the same extent.
C++'s killer feature for adoption, having been born in the UNIX world with copy-paste compatibility with C, has now grown into its biggest problem.
So in many cases, although a better solution exists, many devs will code C+ unless forced to do otherwise.
I'm still a proponent of C as portable assembly, with any optimization that relies on UB resulting in a warning message, so that it improves the source code of the program rather than just the one-time artifact the compiler produces.
I'm curious, are you working on such a C compiler?
For the two decades I've used C, I've not had a compiler that resembles "portable assembly." The unoptimized code they generate is extremely naive and far removed from anything an assembly programmer would write, easily resulting in many times more instructions than a simple & straightforward assembly implementation. As one might expect, the performance of such code is atrocious.
With gcc and clang, optimizations are absolutely mandatory if you want compilers to generate anything resembling good assembly. And it is still an uphill fight to avoid silly code. Just last week I scratched my head because gcc insisted on recreating a constant that is already there in a register, in that very same register it is recreating it in [1]. You could force it to keep the value in a register by creating a top level variable and use the asm("register") extension, but that just resulted in gcc making copies of that value into other registers.
Also worth pointing out that there's a lot of stuff in assembly that you cannot express in C directly. You want a rotate instead of two shifts and OR? You absolutely have to have an optimizing compiler, because rotate does not exist in C. Want to shift and test carry? No, that is not possible in C. Compiler extensions give you access to some things (e.g. popcnt) but the vast majority of assembly is only achievable indirectly by assuming the compiler can optimize your code and figure out what instruction you want. Also those extensions hardly make it portable..
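For reference, this is the kind of thing meant by "two shifts and OR": a rotate written so that it has no undefined behavior for any shift count, which optimizing compilers commonly recognize and turn into a single rotate instruction (that last part depends on the compiler and flags, so treat it as an expectation, not a guarantee). The carry example shows the workaround for "shift and test carry": the carry has to be computed explicitly. Function names are mine.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left by n bits. Masking both counts keeps every shift in the
   range 0..31, so n == 0 and n >= 32 are well-defined. */
uint32_t rotl32(uint32_t x, unsigned n) {
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Shift left by one and report the bit that fell off the top. C has no
   carry flag, so the carry is extracted before shifting. */
uint32_t shl1_carry(uint32_t x, unsigned *carry_out) {
    assert(carry_out != 0);
    *carry_out = (unsigned)(x >> 31);
    return x << 1;
}
```

Whether the second function compiles down to a shift followed by a flag test is entirely up to the optimizer, which is the point being made above.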
I also think the vast majority of C code I see looks very different from assembly; it is written under the assumption of an optimizing compiler, which converts idiomatic C to somewhat idiomatic assembly.
C cannot express quite a few useful things that assembly can. Things like tail calls, stack management, non-flat address space just cannot be expressed.
I'm hopeful for Rust. I'm a C++ developer. I use Python all the time as well when I want something to just work, and fast.
But by god I love C. I use C for most of my library interfaces so that I can easily make C++, Python, etc versions in more pleasing versions for those languages. It's much, much easier to work from C and to another language than it is to cross language boundaries laterally.
That probably won't change, so I'll always use C at least a little bit when building tools.
In that respect, C is still winning. It's upstream of everything, easy to compile, and has libraries for everything.
The Zig creator is very conscious of this, talks a lot about rejecting many proposals and ideas to keep the language simple, not being an ass-hole about it, just to avoid what happened to C++.
And then the question is whether it will grow fast enough, or get overshadowed by a language that’s a bit more like a weed, growing fast in whatever direction it can, and whether it will then die because of lack of sunlight.
It’s tricky to balance features, backward compatibility and ‘cleanliness’ in a (language, standard library)
Computer science is less than a century old. There's no reason to think we know enough to design perfect languages yet. But with an additional 30 years of hindsight, we can make somewhat better languages now, and we should.
We already had 20 years of hindsight that C's authors decided to ignore. Had UNIX not been free beer for all practical purposes, history would have taken another path regarding C's adoption.
It would have sufficed that the language had proper bounds checking, string and array types (that doesn't preclude pointers to everywhere in the universe), strong namespaced enumerations, required &array[0] instead of pointer decay, and proper sized allocation.
All lessons from what was being done in JOVIAL, ESPOL, NEWP, and PL/I's various offspring during the 20 years that preceded C.
There's absolutely no guarantee that these problems will be avoided in rust and zig; we only have the benefit of hindsight to avoid the problems we have now, and a recorded history of the kinds of things that tend to go wrong in committees (factions, infighting, BigCo interference, intrigue, egos, bike shedding, etc).
It's not a particular operation that is the problem, it's writing a whole real-life program with no UB that is the problem.
Even so, one example in C++ where it's almost impossible to avoid UB is treating a piece of memory as both a struct and raw bytes with guarantees of no copying. This comes up a lot in network packet processing.
Well, in practice memcpy and std::bit_cast will not normally copy, if you're careful with how you use the variables and so on, so it's not completely impossible to do this.
There are also compilers (g++ for example, if I understand correctly) that define type punning through unions to work like in C (allowing you to read and write through different members of the union, which the C++ standard prohibits).
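A sketch of the memcpy route for the packet case, written in C. The header layout is hypothetical and deliberately made only of single bytes, so in practice it has no padding and no endianness dependence; a real protocol header would need more care on both counts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-byte wire header made only of single bytes. */
typedef struct {
    uint8_t version;
    uint8_t type;
    uint8_t len_be[2];   /* payload length, big-endian on the wire */
} pkt_header;

/* Returns 0 and fills *out, or -1 if the buffer is too short. */
int pkt_parse(const uint8_t *buf, size_t n, pkt_header *out) {
    assert(buf != NULL && out != NULL);
    if (n < sizeof *out) return -1;
    /* memcpy sidesteps the aliasing and alignment questions that
       casting buf to pkt_header* would raise. */
    memcpy(out, buf, sizeof *out);
    return 0;
}

/* Assembles the length from its two bytes, independent of host order. */
unsigned pkt_len(const pkt_header *h) {
    return ((unsigned)h->len_be[0] << 8) | h->len_be[1];
}
```

Optimizing compilers typically turn a small fixed-size memcpy like this into plain loads, but as said above that is something to verify in the generated code, not something the standard promises.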
I bet they use UB in more than one place. It's just some instance of UB that compilers aren't exploiting with crazy interpretations that lead to faster non-working programs.
The people saying that it's technically possible to create that code without UB say that with a certainty that isn't viable. You don't need to just read the standard from beginning to end to determine if your code has UB, you need to also remember it all. Some part of it not applying to your case doesn't stop the UB on that part from appearing in your code, and there isn't a usable list of UB you can quickly check against (the closest that exists is the set of UB that compilers are using in their crazy extrapolations, which is not the same thing).
There is just no way to know that you avoided it all.
Not offhand, no. It's been more than 5 years since I last did any serious C dev work.
But for an entertaining read, there are some fun stories floating around about Linux vs GCC with the kernel devs resisting compiler optimisations, and even holding back the "supported compiler versions" over it :)
> > There Are No C Conferences. Maybe that’s why the C++ committee is now over 10 times the size of the C committee.
I have this controversial thought: C people understood one thing before everyone else (including the creators of C++, Rust, and any other language), and it is this:
“Perfection is Achieved Not When There Is Nothing More to Add, But When There Is Nothing Left to Take Away” -- Antoine de Saint-Exupery.
Super-bloated languages that keep adding features every month/year are far from perfection.
Now bring it on!
Note: I love both C and C++. I prefer C because it's simpler (I write firmware for a living). Sometimes I use C++ with simple classes and I might throw in a template or two if I'm having a good day. Of course my classes are for my use only: don't try to copy/move them!! I won't chase every little nitpick of deleting and overriding every operator/constructor!! And the furthest I might go is implementing a Singleton.
Hey, do you mind pointing me to some learning material regarding firmware? I have a totally unrelated background (I'm a chemist) although I picked up some coding during my PhD (the usual "science stack": Python/shell scripts). After that I learnt C and fell in love with its simplicity and power and firmware/drivers always seemed like magic to me so I'd like to understand them a little bit better.
Not the GP, but to me the only difference between firmware and software is that firmware is often (but not always) "burned" right into a piece of hardware, which is usually a chip that has a microcontroller of some sort inside it. The PC BIOS is also called firmware sometimes - because it is "burned" into the PC itself.
A device driver, on the other hand, can be implemented in software or firmware, or, usually these days, as a combination of both. Drivers indeed are magic: since they are usually designed to interact with, and to control, physical systems, they must work in real time, and writing a driver requires understanding the system's behavior in minute detail.
Let's see what (and how it) lands in C2x. Of course, as any other language, C might be taken over in the future by a new generation and be bloated, too. But so far, it is contained.
In any case, nobody can deny the amount of bloat in C++ has no parallel.
> Naturally at this point one could ask why not just use C++ instead
I might do, in a limited and simpler form, as I said before. Of course I am not developing something like SKIA; just firmware in a limited and constrained environment (not even using std).
As I like to put my privates at the bottom of the definition of a class, having to use the class keyword and then having to put a public access specifier immediately after is fairly annoying :).
I stick to lean and simple C89 with benign bits of c99/c11. With nice tables of functions and clever recursive pre-processor naming, you can scale that easily to large modular applications and keep everything very clean.
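One possible reading of "tables of functions and pre-processor naming", as a rough sketch only: the poster doesn't show their scheme, so the `NS` macros and the `geom` module below are assumptions made up for illustration. It uses nothing beyond C89 features.

```c
/* Two-step paste so MODULE is expanded before ## glues the tokens. */
#define NS_CAT_(a, b) a##_##b
#define NS_CAT(a, b)  NS_CAT_(a, b)

#define MODULE geom                  /* hypothetical module name */
#define NS(name) NS_CAT(MODULE, name)

static int NS(area)(int w, int h)      { return w * h; }        /* geom_area */
static int NS(perimeter)(int w, int h) { return 2 * (w + h); }  /* geom_perimeter */

/* The module's public face: one table of function pointers. */
struct NS(ops) {                     /* struct geom_ops */
    int (*area)(int w, int h);
    int (*perimeter)(int w, int h);
};

static const struct NS(ops) geom = { NS(area), NS(perimeter) };
```

Callers then write `geom.area(3, 4)`, and a second module can reuse the same source pattern with a different `MODULE` without name clashes.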
The absolute and unquestionable truth: a C++ compiler is abysmally more complex and harder to write than a C (C89 with benign bits of C99/C11) compiler, and this is mechanically true due to the difference in syntax complexity. There are many shmol and cute alternative C compilers out there (they are working better with each day that goes by, and some do more than enough optimization), and I would like to use them. That is not possible with C++: its complexity shuts the door hard on any reasonable effort to write a "working" alternative.
That said, personally, I have high hopes (maybe too high) if RISC-V is successful: I see everything in assembly (not even C anymore), with high-level scripting languages like Python/JavaScript/Lua/etc. on top. I am writing more and more assembly these days (mostly x86_64 though; desktop RISC-V is only starting to ramp up in performance), and the main pitfall seems to be abuse of the assembler's macro language, which can be averted if you are explicitly aware of this bias.
Can you expand a little on the pitfalls you see concerning the 'abuse of the macro language'? I have an undeniable pipe dream that my hobby language can be essentially a well-defined and explicit thin cover for assembly language primitives. I think your thoughts about the downsides of 'higher-level' idioms/practices when using assemblers would be very on point for me to consider.
I use FASM, and FASM has probably one of the most powerful macro languages out there. FASM actually implements _only_ this macro language and _not_ the x86_64 assembler itself. The x86_64 assembler is actually written using this macro language, as is the support for the ELF, COFF, etc. binary formats. Some people have implemented an ARM assembler with this macro language (and I am thinking about RISC-V). Its author likes to refer to this macro language as an "assembler writing toolkit", which looks more than fair to me.
As a practical example, this macro language has full support for modular and hierarchical namespaces. I found myself lost dividing my code paths into bazillions of namespaces, way too many. Taking it to excess, I also started to write my "own" language using this macro language: I was not writing assembly anymore, only macros, some with hardly any assembly in them.
This is actually a bias that shows up with any computer language: the tendency of devs to "maximize" their use of the syntax, which grows more or less exponentially with the complexity of that syntax, until you reach a point where a Rube Goldberg machine looks efficient and sane (yes, this is severe irony).
If I'm writing C++, I am doing it because I need nanosecond latency (and the delicious language libraries). If I only need microsecond latency, I would use Golang. If millisecond is fine, it'd be Python in a hot minute.
If I only need (as is 99% of the time the case) a smidge of code to run nanosecond-fast, I'd write a C extension and call it from a higher-level language that does the orchestration code. The C extension would by definition be simple code, so there is no real problem with C as such - it's ideally suited for that role.
Only in the rare case that a large program needs to be nanosecond-fast would I even consider C++ (and I'd push really hard to learn if that really needed to be true!), and then I'd write code like it was 2003 - smart pointers and POD classes, but that's it. The new libraries are wonderful for saving code and getting good fast implementation of algos!
>Only in the rare case that a large program needs to be nanosecond-fast would I even consider C++ (and I'd push really hard to learn if that really needed to be true!)
This is mostly true for web development where bottlenecks just exist in IO and beyond. For gaming this is not the case. The entire industry needs nanosecond latency so engines must be written in languages that are nanosecond-fast.
Yes, that's true. I've actually been part of 3 published games, all in C++ -- all written with the ethos I mention above (POD classes and basically write like it's the year 2003). Bitcoin is the same way, written in C++ -- unfortunately that codebase went hogwild with new features just to do it. Codebases like zcash are absolutely nightmarish for no reason at all (given what they are doing is so trivial)
As a long-time C aficionado, part-time wanna-be language-lawyer, and just general supporter and fan, the fact that there is no C conference was interesting. I guess I knew that, because if there was such a conference I would have known, at least I would like to think that. On the other hand I've never been to a programming language conference (probably self-fulfilling by being primarily a C programmer at heart) so it's not "a thing" for me.
Anyway, some things I think would make for interesting content at such a conference, should one decide to exist:
Modern C. A practical, down-to-earth comparison would be useful between C the way (I feel) many think it "should" be written, i.e. C89-style with lots of worry that the compiler may be broken, and something more modern/sensible.
Standard directions. Some kind of survey-style talk trying to both summarize known interests covered by the committee now, and also perhaps polling the audience for input about where C needs to go.
Compiler architect's thoughts. Get the major compiler writers in a panel, have them talk about how various language (mis)features affect their ability to write good compilers, and what changes would help with that, if any.
Code sight-seeing. Just having a few knowledgeable folks navigate some well-known codebases on a projector with commentary/analysis would be very interesting I think, both to see how others have solved things, and perhaps spot problems/possible improvements. Must be done respectfully of course.
Now I almost feel as if I miss that conference. :/ More ideas, anyone?
> As a long-time C aficionado, part-time wanna-be language-lawyer, and just general supporter and fan, the fact that there is no C conference was interesting.
There doesn't need to be a conference for everything under the sun. I don't see a screwdriver conference popping up around me anytime soon.
I wonder if the reason for there being no C language conference is: It's generally accepted that C is "dead" or should become dead. In fact at C++ conferences you will see many anti-C talks:
C++ is a semantically overloaded language. It is very far from "C with classes" at this point. In C, every construct has a direct mental mapping to its rough representation as machine code, and its corresponding side effects. C++ adds a "meta stage" after which the machine code might be the same, but it takes a lot of time for the programmer to figure out what is really happening.
I dislike C++ for the same reason I dislike all languages that try to be everything. You can use the language for years and there are still parts of it that you don't understand properly. C on the other hand has such a minimal and clear syntax that you can learn it pretty fast and feel confident about the language.
While true, most C coders are fine with the idea of only introducing complexity upon demand.
No, what I mean is these subtle things that are impossible to get right: stupid integer promotion rules, needlessly complex declarations, messy stdlib functions, undefined behaviour as a feature, etc.
And then there are languages that have a fraction of the syntax of C and are much more consistent throughout, which can actually claim to be minimalistic.
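Two small illustrations of the integer promotion and conversion rules complained about above. The function names are made up; the commented results assume the usual platform where int is 32-bit two's complement.

```c
#include <stdint.h>

/* Usual arithmetic conversions: the int operand is converted to unsigned,
 * so -1 turns into UINT_MAX and the comparison is false. */
int minus_one_less_than_one_unsigned(void)
{
    int      s = -1;
    unsigned u = 1u;
    return s < u;   /* 0, not 1 */
}

/* Integer promotion: b is promoted to int before ~ is applied, so the
 * result is an int with all the high bits set, not a byte-sized value. */
int complement_of_byte(uint8_t b)
{
    return ~b;      /* ~0xFF is -256 here, not 0 */
}
```

So a check like `if (~b == 0)` never fires for any `uint8_t` value unless the result is cast back to `uint8_t` first.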
As far as I can tell, the definition of C++ is mostly driven by compiler folks (and wannabes), motivated to add features for their own career benefit and generally doing so in the way that's most convenient for them (e.g. std::move as a pseudo-function instead of as real syntax that the already-insane parser would have to handle). Since only the in-group's interests are considered, the much more numerous regular C++ users are ill served.
For all its faults, there's much less churn in the definition of C and what change does occur seems to be driven by clearly demonstrated needs out in the real world. C might seem stagnant, outdated, incomplete, or unsafe - all fair criticisms - but at least a competent C programmer's hard-won knowledge of how to overcome these deficiencies doesn't get turned on its head every few years as a new language version comes out. Even decades-old C (at least since the K&R vs. ANSI function-declaration change) is still readable and scarcely less idiomatic than something written last week.
So, referring to the OP's title, which language has the bigger problem?
I don't think they are wannabes. Getting a feature accepted to C++ is hard. That's why most people push their features into Boost (well, MVP style). The C++ committee has respected people with years of experience.
Because my takeaway from the article is this: C++ piled loads and loads and then some more loads of stuff onto a slim, easy, comprehensible language, and is now a gigantic pile of complexity that has precious little to do with its roots any more... and every attempt to solve its problems includes piling on more complexity.
C++ is an overloaded language whose complexity seems to be growing with each revision. Heck, I remember Bjarne Stroustrup even semi-lamenting it in one of the interviews.
I don't understand why this is a C problem; C's syntax and definitions have been relatively stable for the last 3 decades. IMHO the author has gripes with C++'s 'universality' but is just piling them on C.
Because it isn't a C problem. C has been basically stable for decades.
C++ is free to overload itself by piling ever more stuff onto itself, but any problems this language has, are not problems of C, or problems C needs to fix.
As a (now, primarily) C developer who still uses plenty of C++ in mixed-language projects when needed: C++ should only provide as much C compatibility as is needed for consuming headers with C-API declarations, and nothing more.
Microsoft's 2012 advice [1] to compile C code with a C++ compiler was misguided (at the time this was just an excuse to not support the latest C standards in MSVC). Writing C code in the so-called "common C/C++ subset" is a massive PITA for a C coder, because this common subset is a nonstandard dialect of C which will forever be stuck in the mid-90s.
Most programming languages are really good. People who talk ill of them just don't know them well, and also probably never lived while there were no high-level languages. And honestly, people are creatures of trends; we don't use the most popular languages because they're flawless.
The only major problem with C is its community never took off with a popular library packager and linter. Every other language that seems so cool today would be just as painful as C if they didn't have so many contributed modules and tweaks. Port those to C and people would fall in love with it all over again.
The C package manager was and still is (largely) any UNIX/Linux distribution and the source packages for them. That’s how common C is. Linters are up to the user. There are many, but older UNIX systems that I’ve used generally included a `lint` command. If you really wanted to, you could consider UNIX/Linux operating systems the “IDE” and package manager for C.
> [C++] also had type inference very early on, but the developers of the mid-80s were not quite ready for that and Bjarne Stroustrup was pressured into removing auto, until it was added back to C++11.
Does anybody have background information on this claim? Was there actually a working compiler with it back then or was it just some proposal paper at the time?
Not "was", "is"! auto is the opposite of static in a way. Since it's the default method of storage for local variables, it's very rare to specify it explicitly. C++ just took this C keyword and made it mean something completely different.
I hope they don't, at the very least we can afford to choose a new keyword. Even if auto was infrequently used, it's no good to silently change the meaning of existing valid programs.
Would it change the meaning of existing valid programs? C's "type inference" algorithm is to assume that everything is an int by default. So
auto x = 7;
is equivalent to
auto int x = 7;
which is equivalent to
int x = 7;
(since "auto" is the default storage class). If the C standard folks do things right, then the only change will be to allow some previously invalid code like the following:
auto str = "foo";
Such code will compile on a C compiler (hopefully with a warning!) but won't have a defined behavior. I guess a more problematic case would be
auto x = 9999999999999L;
Where the value will be truncated in C but will be inferred as a long in C++ (assuming typical sizes for int and long int). Again, though, I am not sure if truncation of signed constants has a defined behavior in the standard.
auto pi = 3.14;
double result = 2 * pi;
printf("the circumference of the unit circle is %g\n", result);
When interpreted as a C89 program, this computes the value 6. Compiled with a C++ compiler, it computes 6.28. This is precisely because both languages have an auto keyword, but it means different things in each.
This situation is perhaps not so likely to occur in real usage, but I think it could and should have been avoided.
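To make the two readings explicit without depending on implicit int (which newer compilers reject), here is a sketch with the types spelled out. The function names are made up for the example.

```c
/* The C89 reading: `auto` is only a storage class, and the missing
 * type defaults to int. Written out in full here. */
double circumference_c89_reading(void)
{
    auto int pi = 3.14;       /* 3.14 is converted to int, so pi == 3 */
    double result = 2 * pi;   /* integer multiply, then converted */
    return result;            /* 6 */
}

/* The type a C++11 compiler would infer for the same initializer. */
double circumference_inferred_reading(void)
{
    double pi = 3.14;
    return 2 * pi;            /* 6.28 */
}
```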
Naturally at this point one could ask why not just use C++ instead, well that is what happens when a language group is so firmly against moving upwards, but still wants the features on their language.
Naturally missing from the list are any kind of improvements to string or arrays alternatives, as usual.
In classic "get off my lawn" style, I thought C++'s divergence from C when they abandoned cfront was where it all went wrong.
At least with cfront you had some idea of what was getting generated.
My favourite C++ compiler was Symantec 7 on Windows. It was a great, affordable tool for making MFC applications and had a neat GUI builder too. I still have the CD of it but it refuses to install on Windows 11, sadly.
Nowadays I just use C or C#, C++ just seems like a hard to use version of C# and I have little use for the extra performance it might have.
The problem with C is undefined behaviour. Not mentioned in this post.
Specifically the somewhat recent idea that code which does something naughty is best deleted. This makes updating a compiler a hazardous activity for approximately all 'C' code. Also makes refactors somewhat risky when they send code down different paths in the compiler.
I wonder if this judgement and execution result from collaboration with C++.
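The classic instance is the signed-overflow check. A sketch follows, with the risky form shown only in a comment and `checked_increment` being a name made up for the example:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* A check written in terms of the overflow itself, e.g.
 *
 *     if (x + 1 < x) { handle overflow }
 *
 * relies on signed overflow, which is undefined. An optimizer may assume
 * it never happens and delete the branch. The version below tests the
 * precondition instead, so no overflow ever occurs. */
int checked_increment(int x, int *out)
{
    assert(out != NULL);
    if (x == INT_MAX)
        return -1;      /* incrementing would overflow */
    *out = x + 1;
    return 0;
}
```

Both forms look equivalent on paper, which is the point: nothing in the source tells you that one of them can silently disappear after a compiler upgrade.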
I write a lot of code in a C++ codebase and I'd say C++ is horrific. Not just because of its roots in C, no I think it is actually a worse language than C. Notable detriments in the language include references, classes, a lot of template stuff (some of it is ok, but it makes separate compilation almost impossible).
References are kind of redundant when you already have pointers, and they make the performance characteristic of the code you read less clear (am I passing down a pointer, or am I copying the whole thing?).
C structs can already do most of what C++ classes can do. Including inheritance and virtual methods, when you really really need them. As for encapsulation, pointer to implementation provide a more stable, harder to cheat interface than private members. A must for stable ABIs.
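For readers who haven't seen it, a minimal sketch of the struct-based "inheritance plus virtual method" pattern. All type and function names are made up for the example, and the pointer-to-implementation half is not shown.

```c
#include <assert.h>
#include <stddef.h>

struct shape;

/* Table of "virtual" functions shared by all objects of one concrete type. */
struct shape_vtable {
    int (*area)(const struct shape *self);
};

/* "Base class": every concrete shape embeds this as its first member. */
struct shape {
    const struct shape_vtable *vt;
};

struct rect   { struct shape base; int w, h; };
struct square { struct shape base; int side; };

/* A pointer to a struct also points to its first member, so casting
 * back from the base to the concrete type is well defined here. */
static int rect_area(const struct shape *self)
{
    const struct rect *r = (const struct rect *)self;
    return r->w * r->h;
}

static int square_area(const struct shape *self)
{
    const struct square *q = (const struct square *)self;
    return q->side * q->side;
}

static const struct shape_vtable rect_vtable   = { rect_area };
static const struct shape_vtable square_vtable = { square_area };

void rect_init(struct rect *r, int w, int h)
{
    assert(r != NULL);
    r->base.vt = &rect_vtable;
    r->w = w;
    r->h = h;
}

void square_init(struct square *q, int side)
{
    assert(q != NULL);
    q->base.vt = &square_vtable;
    q->side = side;
}

/* Dynamic dispatch through the table. */
int shape_area(const struct shape *s)
{
    assert(s != NULL);
    return s->vt->area(s);
}
```

Callers hold `const struct shape *` pointers and call `shape_area` without knowing the concrete type. The function pointers live in one static table per type, so each object carries a single pointer and nothing has to be re-assigned per instance beyond that.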
C does lack generics, and I do often miss them, but templates really aren't a good solution. I don't know what a good solution looks like, I haven't looked at Zig or Rust yet.
Another thing I miss in C is a namespace system, especially when you ship libraries: the lack of one makes every name longer, and you still have annoying name clashes from time to time.
> References are kind of redundant when you already have pointers,
No. They're not. References must point to valid objects (i.e., cannot be nullptr and cannot be random addresses in memory), and you cannot do arithmetic on them. They are much safer to use than pointers, so actually, references make pointers redundant.
> C structs can already do most of what C++ classes can do. Including inheritance and virtual methods, when you really really need them.
Yes, references have nice constraints that make them harder to misuse than pointers. The gap however is smaller than it first looks: dangling references are still a thing, and one does not simply do pointer arithmetic by accident. Don't get me wrong, I do like references enough that my C++ code is littered with them. I'm just saying they don't bring that much to the table. They're nice, but I never really miss them when I write C.
Yes, writing your own virtual table in C is mighty cumbersome. On the other hand, I need those about once every 3 years, even in C++.
To me, C++'s roots in C being a problem doesn't mean the blame for those problems is necessarily inherited from C. The confusion about 'auto' elsewhere here demonstrates that.
C++ is superficially similar to C in a lot of ways that don't actually translate well across the boundary between the two.
So some C things get misattributed to C++ and many C++ things get misattributed to C, and disentangling these is nearly impossible if you aren't intimately familiar with both of them.
Being intimately familiar with either is a detriment to maintaining unmuddled working knowledge of the other.
Of course it will be a circle jerk about how C is generally better because it is more simple. Then a bunch of people mentioning we should abandon C++ in favor of rust.
C is generally better because it doesn't treat you with kid gloves, if you don't know what you are doing you will get fucked in the ass. That said, any competent engineer would know what tool to use to solve a given problem. You don't use a sledge hammer to fix a watch.
C did it already in 1979, yet people keep ignoring it.
> Although the first edition of K&R described most of the rules that brought C's type structure to its present form, many programs written in the older, more relaxed style persisted, and so did compilers that tolerated it. To encourage people to pay more attention to the official language rules, to detect legal but suspicious constructions, and to help find interface mismatches undetectable with simple mechanisms for separate compilation, Steve Johnson adapted his pcc compiler to produce lint [Johnson 79b], which scanned a set of files and remarked on dubious constructions.
We've been here before with other aspects of human behaviour.
If you really don't want people to do it, make it impossible and then they can't. You can't write an integer overflow in WUFFS because that doesn't compile. You can't use your Google.com WebAuthn credentials to sign into fakegoogle.example, even if you really, really want to, even if you're 100% sure this is a good idea, it can't be done. Any time you stop short of this, you mustn't really want people to stop doing it and so they won't.
The next strongest defence, still not used often enough, is to make it very annoying to do things that are a bad idea. Default deny for example, the compiler obliges you to go in and explicitly allow the obviously bad thing you've done each and every single time you do it. A DLR train can be driven in fully manual mode instead of being automated, but it goes annoyingly slowly if you do that.
Linting is far below the visibility of even compiler warnings, most of your target audience will never see the message. It's not even "Do not look into laser with remaining eye" but more like a "Caution: Eye hazard" warning written inside the never-opened instruction manual.
Agreed, the only way to enforce something is to make it part of the type system, and that is how we land on C folks complaining about straitjacket programming languages.
By “reinvent everything,” I believe you mean to say that you have to make your own linked-lists to have adjustable strings, and you have to create structs with function pointers to have objects…
The simplicity is either in the language and tooling around the language, or it is in the code you read and write in that language. C chose to be simple in the language itself and not in the code.
The single greatest praise that I will give to C is that the entirety of the language can be held in the mind of the programmer using it, with zero need for reference. This is impossible in most other languages.
> By “reinvent everything,” I believe you mean to say that you have to make your own linked-lists to have adjustable strings, and you have to create structs with function pointers to have objects…
Spot on.
> C chose to be simple in the language itself and not in the code.
But it markets itself as simple, so people, even newcomers, think that C is simple in every regard. Few people acknowledge this fact.
> The single greatest praise that I will give to C is that the entirety of the language can be held in the mind of the programmer using it, with zero need for reference. This is impossible in most other languages.
In other languages, which parts are you unable to hold in your mind?
The only place where I need a reference in C++ is for STD library features. But if we measure it, that takes less time than reinventing them in C.
What I think a lot of detractors miss is that this reinvention need not be done by every single C programmer. In any significantly large and/or old C codebase, there will be local data structures, functions, macros, and idioms to solve the kinds of problems specific to that domain in a way suitable for that domain. For example, in my own domain of servers for distributed filesystems and similar things, the codebases I worked on had sophisticated ways of dealing with memory lifetimes across "discontinuous" control flows because of queuing, RPC, and so on. Those ways worked. I saw similar things in kernels, databases, etc. When I started working on a C++ codebase for yet another distributed storage system, every programmer seemed to be fighting to make C++/STL do the right thing in a memory-lifetime milieu that the language and library designers had apparently never contemplated. Worse, each programmer seemed to be solving those problems in different ways, leading to a proliferation of approaches which sometimes weren't even compatible with each other. That code had far more copies and memory leaks than the C code I'd worked on previously, despite the previous code having been worked on mostly by programmers with a significantly lower skill/experience level. Ironically, there was more wheel-reinvention in the C++ code than in C.
I'm not saying C is the right choice for every project. C++ isn't either. C++ might be a pretty good language for doing the sorts of mid-level things that the people on the C++ language committee like to do (surprise surprise) but there are many broad domains at both higher and lower semantic levels for which it can be a poor fit. For a lot of low-level stuff a language that lets people define their own domain-specific abstractions will beat an "every bell and whistle and all wrong for what we're doing" language every time.
Why would you put function pointers in structs? The function still needs to be handed the struct when called (no 'this'), and every time you instantiate the struct you need to set the function pointers to the correct thing again.
It gains you approximately nothing, as far as I can tell.
For me, C is perfect and just works and C++ doesn't. That being said, both their std libs are a big mess. It's absolutely horrible. The problem is the macro preprocessor. It's a horrible feature that should not have existed.
C is not perfect. It's closer to assembly, but you can't do anything useful without reinventing most things that are already done in C++.
C++ STD lib is not a big mess. At least, I have been using it for decades. I don't know if you have really used it, because those who are against the STD lib don't understand the point of abstractions and their utility. They always tend to ask why they would use vector when they can create a "simple" linked list themselves with pointers.