C for All (uwaterloo.ca)
376 points by etrevino on Mar 23, 2018 | 312 comments

There are a lot of features thrown into this language that don't seem worth the learning costs they incur. What are the problems you're really trying to fix? Focus on the things that are really important and impactful, and solve them; don't waste time on quirky features that just make the syntax more alien to C programmers.

* `if` / `case` / `choose` improvements look fine, though not that important.

* Exception handling semantics aren't defined.

* `with` is pointless and adds gratuitous complexity to the language.

* `fallthrough` / `fallthru` / `break` / `continue` are all just aliases for `goto`. It's not obvious to me that we really need them.

* Returnable tuples look very nice.

* Alternative declaration syntax looks like a nightmare. If we were redesigning C from the ground up, a different declaration syntax might be better, but mixing two syntaxes is a terrible, terrible idea.

* References. Why? They only add confusion.

* Can't make head or tail of what `zero_t` and `one_t` are about, or why they would be useful.

* Units (call with backquote): gratuitous syntax, unnecessary and confusing.

* Exponentiation operator: gratuitous and unnecessary.

Yeah, Ping, I agree. It reads like they missed the key lesson of C — in Dennis Ritchie's words, "A language that doesn't have everything is actually easier to program in than some that do." And some of the things they've added vitiate some of C's key advantages — exceptions complicate the control flow, constructors and destructors introduce the execution of hidden code (which can fail), and even static overloading makes it easy to make errors about what operations will be invoked by expressions like "x + y".

An interesting exercise might be to figure out how to do the Golang feature set, or some useful subset of it, in a C-compatible or mostly-C-compatible syntax.

I do like the returnable tuples, though, and the parametric polymorphism is pretty nice.

> Can't make head or tail of what `zero_t` and `one_t` are about, or why they would be useful.

I suspect it's the same problem C++ has/had (C++11 fixed it) with bools (see the safe bool idiom [0]). Basically, treating a type as both an integer (arithmetic object) and a boolean (logical object) at the same time is problematic, especially for a "system" type meant for extending implicit system behavior: it lets me write `if(BoolObject < 70)` when I only meant for `if(BoolObject)` to work (where "BoolObject" is some object evaluating to a bool, and by evaluating I mean coercing/casting).

Here it looks like they approached it by making 0/1 (effectively C's false/true) distinct types and relying on their simpler/more-powerful type system (e.g. because they don't have to worry about C++'s insane object system). Not a terrible idea if they were otherwise actually sticking to their goal of "evolving" C (though most of their features, like exceptions, are radical departures from the language). C++11 solved it by clarifying how implicit explicit casting [sic] of rvalues works in certain keywords (which I strongly doubt anyone can say was the simpler way of solving the problem).

[0] https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Safe_bool

I bike-shedded in this thread about exponentiation. But taking a step back the bigger issue is there are so many poorly justified features thrown in.

I don't have the feeling that the authors appreciate the appeal of C as a simple language that maps closely to hardware features.

This is a big random collection of extensions that piqued some implementor's fancy. There is seemingly no effort at narrowing down to the cleanest or most important ideas. It totally kills the clean, simple aesthetics of the base C language.

Exponentiation operator gratuitous and unnecessary?


It doesn't improve readability. Compare:

   discrim = b² - 4ac;                  // Standard notation
   float discrim = pow(b, 2) - 4*a*c;   // C
   float discrim = b \ 2 - 4*a*c;       // C∀
I would argue that these are presented here in descending order of readability.

Also its typing rules are really complicated; apply it to two integers and magically you are thrown into the floating-point world where you can never be completely certain of anything, but if you use an unsigned exponent then you stay safely in integer-land.

The choice of operator seems very odd to me. Wouldn't `^` be significantly more readable?

^ is already used for bitwise xor.

What about then, like some other languages?

Python uses `**`, but that is ambiguous with

  int a = 1;
  int *b = &a;
  int c = a ** b;
  // Am I casting b to an int (returning its address) and exponentiating it,
  // or am I dereferencing b and multiplying its result with a?

How would this be different from &? It is also a unary operator and && is a different binary operator.

'||' and '&&' are distinct tokens in C as far as I know, i.e. not handled as two consecutive '|'s or '&'s.

So your example would unambiguously be parsed as "a to the b-th power". Whereas the other case would need explicit parens:

   int c = a * (*b);
Similar example for &:

   int a = 1;
   int b = 1 && a; /* 1 LOGICAL_AND a */
   int c = 1 & (&a); /* 1 AND address of a */

Just dropping in to say you are completely correct. It is called "maximum munch" and mandated by the C spec. I recently wrote a toy C compiler and was confused until I learned this.

I suppose ^^ might work, although it's a little odd because, for consistency, it would otherwise be the "logical XOR", a mythical operator that doesn't actually make much sense.

> a mythical operator that doesn't actually make much sense.

Hmm, I'm pretty sure practically every programming language has it. It usually looks like "!=" or "<>".

The even more obscure logical XNOR is usually denoted "==" or "="

Alas, this doesn't deal with the idea of truthiness as a proper logical XOR would, so it is incorrect in many of the most popular languages, including C, where a value that is true is not always equal to another value that is true. This only works in the much more strongly typed languages, or when you force-cast both sides to a boolean with something like !!

Yes! In C its full spelling is "!a != !b".

Not quite..

Perl has logical xor, which is occasionally useful. I usually reach for it when argument checking, where it makes sense to have either this param or that but not both.

> What about then, like some other languages?

I assume there's a double asterisk there, and it's being eaten by the formatter?

Right yes. I forgot to escape, and now it's too late to.

Probably too much ambiguity with pointers

Idiomatic C would say:

  float discrim = b*b - 4*a*c;
Using pow for a small integer power is a no-no: less efficient and less accurate.

I agree that \ is an awkward choice. A Fortran-like double asterisk ** is out because of ambiguity with pointers; single caret ^ is out because it is already reserved for bitwise xor. Maybe double caret ^^ or asterisk-caret *^ could be used? That would read okay:

  double discrim = b^^2 - 4*a*c;
  double discrim = b*^2 - 4*a*c;

> Using pow for a small integer power is a no-no: less efficient and less accurate.

Using pow for a small integer power compiles into the exact same code: https://godbolt.org/g/CjoHdJ

Only for 1 or 2, unless you turn on --fast-math.

  Using pow for a small integer power is a no-no: less efficient and less accurate.
I think you missed the point.

It looks like most people here are so eager to jump on the "this feature is good, this sucks, overall I'm not impressed" bandwagon (with the typically unwarranted strong opinions that programmers always have when it comes to this) that they didn't bother to explore the rest of the website in more detail. Go to the "people" page and you'll see that it's a language implemented by professors, PhDs and master's students from the Programming Languages Group at Waterloo[0][1]. Scroll down and you'll see that a number of these features came from the master's theses of students:

    Glen Ditchfield, 1992
        Thesis title: Contextual Polymorphism
    Thierry Delisle, 2018.
        Thesis title: Concurrency in C∀.
    Rob Schluntz, 2017.
        Thesis title: Resource Management and Tuples in C∀.
    Rodolfo Gabriel Esteves, 2004.
        Thesis title: Cforall, a Study in Evolutionary Design in Programming Languages.
    Richard Bilson, 2003
        Thesis title: Implementing Overloading and Polymorphism in Cforall
    David W. Till, 1989
        Thesis title: Tuples In Imperative Programming Languages.
    Andrew Beach, Spring 2017.
        Line numbering, Exception handling, Virtuals 
So basically, it's a research language, more-or-less developed one student at a time.

[0] https://plg.uwaterloo.ca/~cforall/people

[1] https://plg.uwaterloo.ca/

I'm all for the evolution of C, but this list...

1) has some downright idiotic things (exceptions, operator overloading)

2) has a few reasonable, but mostly inconsequential things (declaration inside if, case ranges)

3) is missing a few real improvements (closures, although it is not clear whether the "nested routines" can be returned)

Agree 100%. Improvements to C would be things like removing "undefined behavior", not adding more syntax sugar. If anything, C's grammar is already too bloated. (I'm looking at you, function pointer type declarations inside anonymous unions inside parameter definition lists.)

> Improvements to C would be things like removing "undefined behavior"

This nonsense again. I don't get this "undefined behavior" cliche. It seems it became fashionable for some people to parrot it like a mantra as a form of signaling. Undefined behavior just refers to something that is not covered by the international standard, and therefore is neither defined nor something that should be relied upon, though an implementation may offer implementation-specific behavior.

When people talk about "removing undefined behavior", they usually mean requiring that compile-time-detectable undefined behaviors be converted into explicit errors.

For example, there are quite a few people who would like to see a C where you can't actually write this:

    // int x, y;
    if(++x < y) { ... }
...because, well, the behavior of integer overflow is undefined in C, so that code could technically do anything, even though it seems perfectly innocent, especially when coming from a checked language.

Of course, you can't do anything in the C standard to require that this code work as-is, because the C standard applies to architectures where mutually-exclusive things happen under integer overflow. But you can always just disallow it completely, and require that people use intrinsics that are explicit about what overflow behavior they expect (where that behavior reduces to plain output on target architectures that follow it, and to a shim on target architectures that don't. You know, like floating-point support, or atomics.)

...because, well, the behavior of integer overflow is undefined in C, so that code could technically do anything

No compiler is going to go out of its way to compile an increment into the machine's usual increment instruction and an additional overflow check that does whatever, just because it can. It's going to compile it into the machine's usual increment instruction and what happens on overflow is what happens naturally.

It's as absurd as claiming that even "x + y" can invoke undefined behaviour, because while the standard allows it, any compiler that compiles such an addition to anything other than the machine's addition instruction (i.e. with implementation-defined effects), to speak nothing of adding the additional(!) checks to deliberately do something else, is clearly not benefiting anyone.

"In theory, there is no difference between theory and practice. In practice, there is." Pure fearmongering, IMHO.

> No compiler is going to go out of its way to compile an increment into the machine's usual increment instruction and an additional overflow check that does whatever, just because it can

What people are actually worried about is when the compiler starts removing - not adding - seemingly unrelated code in a hard-to-reason-about fashion. And compilers absolutely will go out of their way to do this in the name of optimization and performance. A compiler will do this because it got smart enough to prove that the "unrelated" code can't run without first technically invoking undefined behavior, at which point it can jump to the wild conclusion that it must never actually execute (or that it can remove the code even if it does, because it's legal for the compiler to do anything after undefined behavior is invoked - including not executing that code!)

Sometimes the removed code is important security checks, leading to CVEs, hotpatches, etc. - this is not theoretical, and is not remotely new at this point: https://www.grsecurity.net/~spender/exploits/cheddar_bay/exp...

It also makes reporting compiler bugs annoying, as you first have to definitively prove to yourself and the compiler guys that you've actually got a compiler bug, rather than a compiler "feature" of aggressive optimization within the letter of the C++ standard. It's only out of pure stubbornness that https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84658 got reported upstream, I was assuming it was UB in our codebase most of the way down and thus INVALID as a compiler bug...

But perhaps CVEs and expected behavior being borderline indistinguishable from compiler bugs to most C and C++ programmers I know is just "fear mongering" as you say. IMNSHO, it's not \o/

It's not about the compiler doing "extra checks" to deliberately do something different. The real issue is with aggressive optimizations.

In order to get maximum performance, the compiler is allowed to assume that the programmer doesn't invoke undefined behavior. In other words, it can replace code with something that is equivalent in the presence of UB, but does something totally different in the absence of UB. See e.g. https://blog.regehr.org/archives/767 for some examples of how this can go wrong. (My favorite is the third one.)

Modern C compilers do some very tricky things with undefined behavior.

See https://blogs.msdn.microsoft.com/oldnewthing/20140627-00/?p=... for a very detailed example.

See https://blog.regehr.org/archives/1307 for some strict aliasing examples.

Compilers deciding to elide code paths that contain undefined behavior is weird, especially when they choose to silently elide your checks for division by zero or overflow. It's not weird that actually dividing by zero can do anything; it's weird that the mere presence of a possible division by zero allows the compiler to decide that (divisor==0) is false and ignore the check.

> they usually mean requiring that compile-time-detectible undefined behaviors

I don't believe that's the case for plenty of reasons, such as:

- compilers already do that (yeah, it's one of those RTFM things. See for example GCC's undefined behavior sanitizer)

- the standard already specifies exactly what is left undefined, thus it's a compiler-related issue (see point above)

Let's face it: some people mindlessly parrot the "undefined behaviour" mantra just for show.

> compilers already do that ( yeah, it's one of those RTFM things)

I believe 99% of what people care about the C standard doing, re: UB handling, is requiring compilers to make certain behaviours the default, rather than hidden behind different flags that C newbies who don't understand UB (who thus code most of the bad C!) won't ever set.

> - compilers already do that ( yeah, it's one of those RTFM things. See for example GCC's undefined behavior sanitizer)

Well, no, they don't. UBSan works at run-time, because most UB is impossible to catch at compile-time.

Implementation defined behavior is not the same thing as undefined behavior.

Undefined is out of the scope of the language entirely. Using a non-existent index into an array, for example. While you might reasonably expect the program will just look past the end, there is no guarantee it will do so. Optimizing compilers in particular will assume such a thing cannot happen, and can assume a code branch that does something like this is impossible to reach and discard it entirely.

No one will "fix" such an optimization, because it is valid for ASTs that may have reached that form from conforming code, for example code generated by macros, where the offending branches would never actually be called. There's nothing to fix.

You're telling it to do something impossible, and it's assuming it can't happen.

There's also behavior which is undefined for hardware reasons.

An example is what the C standard calls "trap representations": Bit patterns which fit into the space occupied by a specific type, but which will cause a hardware trap (exception, interrupt, what have you) if you actually store them in a variable of that type. The only type which cannot have trap representations is unsigned char. Basically, what it amounts to is this: C compilers don't compile to a runtime, they compile to raw machine code with, perhaps, a standard library. If you do something the hardware doesn't like when your program runs, well, the C compiler is long gone by that point and the C standard makes no guarantees.

More prosaically, storing to a location beyond the end of an array might not cause a segfault. It might corrupt some other array, it might cause a hardware crash, it might even corrupt the program's machine code. Because C is explicitly a language for embedded hardware, with no MMUs, no W^X protection, and no OSes, the C standard can say very little about such things.

You're mixing up "undefined behavior" and "implementation-defined" behavior. Implementation-defined behavior is fine. "Undefined behavior", as the term is used in the C spec, means that literally anything can happen. The compiler is allowed to assume that UB never happens, so if it does the program can produce random results.

See http://en.cppreference.com/w/cpp/language/ub

> You're mixing up "undefined behavior" and "implementation-defined" behavior. Implementation-defined behavior is fine. "Undefined behavior", as the term is used in the C spec, means that literally anything can happen.

Actually I didn't. My point was rather obvious: the whole point of the standards specifying UB is precisely to let implementations define the behavior themselves.

> the whole point of the standards specifying UB is precisely to let implementations define the behavior themselves.

This used to be the case. Signed integer overflow, for instance, is undefined because some CPUs go bananas when you try that. Other platforms performed 2's complement wraparound just fine, and we used to be able to rely on this.

No longer.

See, the standard doesn't say "implementation defined". It doesn't say "undefined on platforms that go bananas, implementation defined otherwise". It says "undefined" period.

Signed integer overflow is undefined on all platforms, even your modern x86-64 CPU. Compiler writers interpreted it as a licence to assume it never happens, to help optimisations. For instance:

  int x = whatever;
  x += small_increment;
  if (x < 0) {   // check for overflow
      abort();   // security shut down
  }
  proceed(x);    // overflow didn't happen, we're safe!
Here's what the compiler thinks:

  int x = whatever;
  x += small_increment;
  if (x < 0) {   // only true if signed overflow -> false
      abort();   // dead code
  }
  proceed(x);
Then the compiler simply deletes your security check:

  int x = whatever;
  x += small_increment;
  proceed(x);
Don't listen to Chandler Carruth, nasal demons are real. Some undefined behaviours can encrypt your whole hard drive, assuming they're exploitable by malicious inputs.

This is the sort of emotionally-charged fearmongering around UB that really makes any discussion pointless. That example is wrong: x is signed and can legitimately be negative, so unless the compiler can prove x >= 0, it simply cannot remove that code.

Now, if you used

    unsigned int x = whatever;
    if(x < 0)
There would be an obvious case for removing that if.

A very simple test case demonstrates that GCC can remove tests in the presence of signed overflow, even in ways that change a program's behavior.

    $ cat undefined.c
    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main() {
        int x = INT_MAX;
        if (x+1 > x) {
            printf("%d > %d\n", x+1, x);
        }
        return 0;
    }
    $ gcc --version
    gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
    Copyright (C) 2017 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc undefined.c && ./a.out
    $ gcc -O3 undefined.c && ./a.out
    -2147483648 > 2147483647

Yes, that example is well-known but different; here, the compiler is assuming that x + 1 will always be greater than x, which is something else entirely than the parent's assertion that x + small_increment will always be positive.

The difference doesn't matter, you would know better if you weren't clinging so hard to your beliefs. Here's the "difference":

  $ cat undefined.c
  #include <limits.h>
  #include <stdio.h>
  #include <stdlib.h>

  int main() {
      int x = INT_MAX;
      if (x+1 < 0) {
          printf("%d < 0\n", x+1);
      }
      return 0;
  }

  $ gcc --version
  gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  Copyright (C) 2015 Free Software Foundation, Inc.
  This is free software; see the source for copying conditions.  There is NO
  warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

  $ gcc undefined.c && ./a.out
  -2147483648 < 0

  $ gcc -O3 undefined.c && ./a.out
The security check is gone all the same.

Your "small_increment" needs to be at least large enough to turn the smallest negative int into a non-negative number, and that makes it unrepresentable as a signed int itself.

There are cases where the compiler removes misguided overflow checks, since "perform UB, then check whether it happened" doesn't actually work, but your example is not such a case.

> "perform UB, then check whether it happened" doesn't actually work

This raises an interesting point, because in many cases, assuming 2's complement and wrapping around, checking for overflow after the fact, as opposed to preventing it from happening in the first place, is actually easier. (And it actually works if you use the `-fwrapv` flag.)

The right thing should be easier to do than the wrong thing. It's a shame this is not the case here.

I was of course assuming that `whatever` was positive, and the compiler knew it. See this sibling thread: https://news.ycombinator.com/item?id=16664546

> let implementations define the behavior themselves

That's literally the definition of implementation-defined behavior.

Undefined behavior really means undefined; in terms of the C language, there are no constraints on behavior. Sure, you might get a result one way on one implementation, but if you rely on that you're technically writing a dialect of C, and need to let the compiler know using flags.

Implementation defined behavior does the same thing for a given implementation. The compiler can produce totally different code every time you compile a program with UB. It can even affect code that is totally unrelated to the UB.

I'm not sure I buy that second part.

Legalistically, I guess it could (the behavior isn't defined, after all), but typically the optimizer makes some valid-only-if-the-code-is-UB-free deductions and things snowball from there....

That's not the point, though. In the case of undefined behavior, a compiler doesn't have to define the behavior or even act consistently (most evident by the behavior of the compiler's optimizer).

This is entirely different than implementation defined in that a conforming compiler has to document the behavior they implement and do it consistently.

I'll just refer you to a comment I wrote last week, which covers a specific case where undefined behavior, and how a compiler chose to handle it, caused a security problem in Cap'n Proto.[1] I think it's about as condensed an explanation as I've seen, while linking to further information.

1: https://news.ycombinator.com/item?id=16596409

I've noticed the same thing. A strongly adversarial relationship between compiler writers and users, justified by religious adherence to The Holy Standard and a complete lack of understanding of how people actually want and expect the language to behave in practice.

Undefined behavior just refers to something that is not covered by the international standard, and therefore doesn't exist nor should be used, but an implementation may offer implementation-specific behavior

Indeed. Even the standard itself agrees; to quote its definition of undefined behaviour (emphasis mine):

"behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message)."

The fact that the standard "imposes no requirements" should not be taken as carte blanche to completely ignore the intent of the programmers and do entirely unreasonable things "just because you can", yet unfortunately quite a few in the compiler/programming language community think that way.

It's why I'm very encouraging of more programmers writing their own compilers and exploring new techniques, to move away from that suffocating and divisive culture. Compilers should be helping users, not acting aggressively against them just because it's allowed by one standards committee.

The thing is that you only see this compiler writer culture among C and C++ devs.

Other language communities that put correctness before performance at any cost, don't share this mentality, including the compiler writers.

> I don't get this "undefined behavior" cliche.

The problem is that you can do everything in portable C that you can do with undefined behavior in C, and often in a more straightforward fashion.[1] The compiler won't tell you that you are doing it wrong; there are many examples, tutorials, and books that encourage you to do the wrong thing. The undefined behavior will work until you switch to a different system. Why allow the wrong thing to continue to happen?

[1] A very good example of this is endianness:



You are conflating undefined with implementation defined. If you use undefined behavior the compiler might simply delete your code since it's not defined.

You and a lot of people in this discussion seem confused the other way. The amount of things in C that are implementation-defined is relatively small and most has to do with character encodings and byte size (storage unit). The kind of behavior people are pointing to is definitely labeled as undefined, and there is a lot of it. Take a look at Harbison and Steele for example.

Or it's something undesirable and you parroting back the definition of UB doesn't add anything.

I think many people overlook that "undefined behavior" mainly limits portability. Invoking undefined behavior won't cause your program to do random things, it just might not do the same thing when you switch to a different compiler. This is still a big problem, but not as big as people make it out to be, in my opinion. I think undefined behavior problems in the standard could be fixed, but compilers would need special flags for backwards compatibility with programs that rely on it.

> Invoking undefined behavior won't cause your program to do random things

According to the spec, it literally can. The compiler is free to replace your entire program with unrelated functionality. It can even do a different thing each time it compiles the program.

There are implementation specific behaviors (# of bits in a char), which are different.

Undefined behavior is undefined even using the same version of the same compiler. If it does produce the same repeatable behavior, consider it luck.

Implementation-defined behavior limits portability in the way you describe, but UB's not the same thing. IB should behave the same way in the same implementation. UB doesn't have to.

> "undefined behavior" mainly limits portability

I had the impression that some things are UB because they can't specify a single behavior that would be efficient across all platforms C targets.

No, that's implementation-defined. Undefined means code that is logically incorrect, but too expensive for the compiler/runtime to check for and handle (reject). A simple example is out-of-bounds array access. It's too expensive to demand the compiler prevent it or guarantee to trigger some kind of signal at runtime. But if you are willing to pay the cost, you can use a language or compiler that is safer and promises to throw an exception, or provides safe statically sized arrays.

> function pointer type declarations inside anonymous unions inside parameter definition lists

Could you please provide a code snippet of this kind? Hard for me to visualize otherwise. Thanks.

Sure, let's do it in pieces:

    typedef int (*my_func_ptr_t)(const void*,const void*);

    typedef union { my_func_ptr_t func_ptr; int* data; } my_union_t;

    void my_func (my_union_t param);

    // Now expanding it inline

    void my_func (union { int (*func_ptr)(const void*,const void*); int* data; } param);
Of course this is a very basic example, but there is hope of making it into an IOCCC entry.

Can you explain why exceptions and operator overloading are "idiotic" things? Are you from the Go school of boilerplate-error-checking-code design, or something?

Exceptions work great in garbage collected languages. Reasoning about exceptional control flow w.r.t. manual memory management is a total nightmare.

> Reasoning about exceptional control flow w.r.t. manual memory management is a total nightmare.

Exceptions make manual memory management easier because a proper exception system has unwind-protect[1]. Exceptions are just movements up the stack - exceptions combine naturally with dynamic scoping for memory allocation (memory regions/pools). This kind of memory management was used in some Lisp systems in the 1980s, and made its way into C++ in the form of RAII. By extending the compiler you can add further memory management conveniences like smart pointers to this scheme.

Now if you want to talk about something that actually makes manual memory management a total nightmare, look at the OP's suggestion for adding closures to C.

[1] http://www.lispworks.com/documentation/HyperSpec/Body/s_unwi...

C does not have RAII-like memory management in any way. Exceptions work beautifully with memory management like that, but if it's not there, you can't just say it should work because memory management should work like that.

So basically you're saying, before adding exceptions, add RAII-like memory management, and then actually add exceptions. I like both features, but am not sure how you'd wedge RAII into C. Any ideas on that?

> C does not have RAII-like memory management in any way.

Yes it does, as language extension on gcc and clang.

It is called cleanup attribute.



That's not C, _the language_, those are compiler features.

I explicitly mentioned they are extensions.

Thanks! I learned something. Didn't know about that.

> C does not have RAII-like memory management in any way.

C does not have memory management in any way period. The C standard library does. How you get to something with dynamic scoping like RAII in C is to use a different library for managing memory. For example Thinlisp[1] and Ravenbrook's Memory Pool System[2] both provide dynamically-scoped region/pool allocation schemes.

[1] https://github.com/vsedach/Thinlisp-1.1

[2] https://www.ravenbrook.com/project/mps/

[On the Cforall team] For what it's worth, one of the features Cforall adds to C is RAII.

The exception implementation isn't done yet (it's waiting on (limited) run-time type information), but it already respects RAII.

I have often wanted c99 with destructors for RAII purposes.

you do not need destructors if you put your stuff on the stack

just putting stuff on the stack in C won't magically call `fclose` or `pthread_mutex_unlock`, unlike destructors

I might have missed this.. but how is Cforall implemented?

A new GCC or LLVM frontend, or is it a transpiles-to-C implementation à la Nim or Vala?

Transpiles-to-(GNU-)C -- it was first written before LLVM, if we were starting the project today it would likely be a Clang fork.

Clang harks back to 2007 and LLVM 2003.. is this a research project that was recently taken back up?

I was curious about the implementation because I've had rough experiences with Vala and Nim's approach. Unlike with "transpiles-to-js" languages, transpiling to C has some tooling gaps (debugging being the big one). I admittedly don't have a ton of experience with either language but I couldn't find a plugin that gave me a step-through debugger for something like CLion or VS Code. You can debug the C output directly but this will turn off newcomers and assumes the C output is clean.

The initial implementation was finished in '03, and we revived the project somewhere around '15, so your guess about a research project that was recently taken back up is correct.

We intend to write a "proper compiler" at some point (probably either a Clang fork or a Cforall front-end on LLVM), but it hasn't been a priority for our limited engineering staff yet. I think we are getting a summer student to work on our debugging story (at least in GDB -- setting it up so it knows how to talk to our threading runtime and demangle our names), and improving our debugging capabilities has been a major focus of our pre-beta-release push.

What I will like is more strict compile time checks. Most C pros have to rely on external tooling for that.

It's maybe not quite what you're looking for, but Cforall's polymorphic functions can eliminate nearly all the unsafety of void-pointer-based polymorphism at little-to-no extra runtime cost (in fact, microbenchmarks in our as-yet-unpublished paper show a speedup over void-pointer-based C in most cases, due to more efficient generic type layout). As an example:

    forall(dtype T | sized(T))
    T* malloc() {  // in our stdlib
        return (T*)malloc(sizeof(T)); // calls libc malloc
    }

    int* i = malloc(); // infers T from return type

Excuse me for my lamerism, but can you tell me what is a polymorphic function?

My idea was that it is better to do as many compile-time checks as possible before you introduce run-time checks. Does that void-pointer protection run faster than code that was checked at compile time? How?

A polymorphic function is one that can operate on different types[1]. You would maybe be familiar with them as template functions in C++, though where C++ compiles different versions of the template functions based on the parameters, we pass extra implicit parameters. The example above translates to something like the following in pure C:

    void* malloc_T(size_t sizeof_T, size_t alignof_T) {
        return malloc(sizeof_T);
    }

    int* i = (int*)malloc_T(sizeof(int), alignof(int));
In this case, since the compiler verifies that int is actually a type with known size (fulfilling `sized(T)`), it can generate all the casts and size parameters above, knowing they're correct.

[1] To anyone inclined to bash my definition of polymorphism, I'm mostly talking about parametric polymorphism here, though Cforall also supports ad-hoc polymorphism (name-overloading). The phrasing I used accounts for both, and I simplified it for pedagogical reasons.

Good point.

Sum types are the new trend.

Exceptions, because in embedded contexts they may not always be a good idea (and C targets such contexts). Overloading, because it is too easy to abuse, and as such it gets abused a lot by those who do not know better. The rest of us are then stuck decoding what the hell "operator +" means when applied to a "serial port driver" object.

> 3) is missing a few real improvements (closures, although it is not clear whether the "nested routines" can be returned)

Ah, I wish Blocks[0] would have made to into the C language as a standard†... Although you can use them with clang already:

    $ clang -fblocks blocks-test.c # Mac OS X
    $ clang -fblocks blocks-test.c -lBlocksRuntime # Linux
Since closures are a poor man's objects, I had some fun with them to fake object-orientedness[1].

† or at least that the copyright dispute between Apple and the FSF for integration into GCC would have been resolved (copyright transferred to the FSF being required in spite of a compatible license).

[0]: https://en.wikipedia.org/wiki/Blocks_%28C_language_extension...

[1]: https://github.com/lloeki/cblocks-clobj/blob/master/main.c#L...

Constructs like closures come at a cost. Function call abstraction and locality means hardware cannot easily prefetch, instruction cache misses, data cache misses, memory copying, basically, a lot of the slowness you see in dynamic languages. The point of C is to map as close to hardware as possible, so unless these constructs are free, better off without them and sticking to what CPUs can actually run at full speed.

Closures are logical abstractions and cost nothing, since they are logical. Naive runtime implementations of closures can of course be a bit slower than native functions, but so can be everything.

Closures cost a lot if we are talking about real closures that capture variables from the scope where they are defined, because you need to save that information somewhere, so you need to allocate an object, with all the complexity associated.

And it can easily get very tricky in a language like C where you don't have garbage collection and you have manual memory management: it's easy to capture things in a closure and then deallocate them. Imagine if a closure captures a struct or an array that is allocated on the stack of a function, for example.

I think we don't need closures in C. The only thing I think we would need is a form of syntax for anonymous functions, which cannot capture anything, of course; it would do most of the things that people use closures for without any performance problems or added runtime complexity.

> so you need to alloc an object

Not always! Rust and C++ closures don't need to allocate in every case. I can speak more definitively about Rust's, but as long as you aren't trying to move them around in certain ways, there's no allocation, even if you close over something.

Consider this sum function, which also adds in an extra factor on each summation:

  pub fn sum(nums: &[i32]) -> i32 {
      let factor = 5;

      nums.iter().fold(0, |a, b| a + b + factor)
  }
The closure here closes over factor. There's zero allocations being done here.

If you want to return a closure, you may need to allocate. Rust will let you know, and the cost will be explicit (with Box). That's where my sibling's comment comes into play.

If a closure cannot be optimized out, i.e. the scope of the closure outlives the scope of the function it captures a variable from, then this closure is equivalent to a heap-allocated struct, which cannot be allocated on the stack either if it outlives its scope. So the cost is still the same.

Couldn't agree more. And I'll add another:

The suggested syntax is ridiculous. What is this punctuation soup?

   void ?{}( S & s, int asize ) with( s ) { // constructor operator
   void ^?{}( S & s ) with( s ) {          // destructor operator
   ^x{};  ^y{};                              // explicit calls to de-initialize

This has been tried many times before, and eventually all these attempts die a lonely death. Why use extensions anyway? If one desired the luxury of modern scripting languages, switch to C++, Rust, Go or one of the other alternatives the article mentions.

Because regardless of how some of us might dislike C and its security-related issues, the truth is that no one is ever going to rewrite UNIX systems in another language, nor the embedded systems where even C++ has issues gaining market share.

So if one finally manages to get a safer C variant that finally wins the hearts of UNIX kernels and embedded devs, it is a win for all, even those that don't care about C on their daily work.

Until it happens, that lower layer all IoT devices and cloud machines will be kept in C, and not all of them will be getting security updates.

I do not question the usefulness of C; I use it in my daily work. What I am saying is that most C developers who use the language day-in-day-out know quite well what they are doing, and don't need yet another non-standard way of writing the code. Safety is a good point, but the initiative doesn't even mention the word, and there is no reason to assume the C-for-All extension targets safety at all.

"What I am saying is that most C developers that use the language day-in-day-out know quite well what they are doing"

I'm going to strongly disagree with that statement.

I might have agreed with you if this was C++. C is a small enough language that most programmers can understand most of it.

Toyota was full of programmers that apparently "understood what they were doing."

Fair point, but changing the language won't change the amount of global state and the associated complexity, nor will it make subsystem supervision work correctly. Changing the language will not prevent unbounded recursion or the associated stack overflows and subsystem failures. Changing the language will not fix mis-analysis of task-switching overhead. And changing the language will not fix manufacturing issues with the PCBs.

As far as I'm aware, one of the very few toolchains that even tries to improve on this over C is Ada/SPARK.

> but changing the language won't change the amount of global state

Global mutable state is marked as Unsafe in Rust.

> nor will it make subsystem supervision work correctly

Erlang is built specifically around this concept.

Perfect is the enemy of good here, throwing out a whole language due to one case doesn't help anyone.

> Global mutable state is marked as Unsafe in Rust.

It's also the simplest way to avoid dynamic allocation and the associated OOM issues. So, short of doing static analysis to bound heap usage at compile time, that makes things worse.

And Toyota already got the static analysis wrong for their stack usage. At least globals will fail to compile if they won't fit.

Are you referring to the "unintended acceleration" scandal from ~10 years ago? If so the NHTSA investigated[0] that and found no flaws in electronics. The problem was essentially people pushing the gas when they thought they were on the brakes. Pedal "misapplication" I think it's called in the report.

[0] https://www.transportation.gov/briefing-room/us-department-t...

> The problem was essentially people pushing the gas when they thought they were on the brakes

Oh dear no. Certainly not.

Read https://users.ece.cmu.edu/~koopman/pubs/koopman14_toyota_ua_...

My favourite part from it is, "Watchdog kicked by a hardware timer service routine".

A watchdog timer is a piece of hardware that decrements a counter every microsecond or similar. The control system's main loop, running on the CPU, "kicks" the watchdog by setting the counter to a value like 1000 each iteration. The result is that if the CPU fails to execute the main loop often enough, the watchdog will "fire". This a) tells you that you have a bug and b) typically reboots the system so it has a chance to recover.

Toyota used a timer service routine to kick the watchdog. This defeats the purpose of the watchdog. The control software can happily get stuck or crash and the watchdog will not notice. The fact that an engineer added this "feature" tells you that the watchdog was firing in development. That should have been addressed by fixing the buggy software, not by disabling the test.

The fact that the disabled watchdog made it into the production release is unforgivable.

In that investigation they seemed to place the blame more on sticky pedals and floor mats, not operator error.

That wasn't the final word, though. I believe this is what the GP was referring to:

> When NASA software engineers evaluated parts of Toyota’s source code during their NHTSA contracted review in 2010, they checked 35 of the MISRA-C rules against the parts of the Toyota source to which they had access and found 7,134 violations. Barr checked the source code against MISRA’s 2004 edition and found 81,514 violations.


> Their descriptions of the incredible complexity of Toyota’s software also explain why NHTSA has reacted the way it has and why NASA never found a flaw it could connect to a Toyota’s engine going to a wide open throttle, ignoring the driver’s commands to stop and not set a diagnostic trouble code. For one, Barr testified, the NASA engineers were time limited, and did not have access to all of the source code. They relied on Toyota’s representations – and in some cases, Toyota misled NASA.


Bad practices can carry over to any language. They’re clearly incompetent at using C, but that doesn’t mean that they won’t be incompetent in another language.

Even the ANSI/ISO C working group acknowledges that “trust the programmer” doesn’t quite work.

This is because trusting the programmer is fundamentally wrong, no matter the programming language. In any good development process the actual coding is the least amount of work - for a reason.

This was the statement I was referring to.


Spirit of C:

a. Trust the programmer.

b. Do not prevent the programmer from doing what needs to be done.

c. Keep the language small and simple.

d. Provide only one way to do an operation.

e. Make it fast, even if it is not guaranteed to be portable.

The C programming language serves a variety of markets including safety-critical systems and secure systems.

While advantageous for system level programming, facets (a) and (b) can be problematic for safety and security.

Consequently, the C11 revision added a new facet:

=> f. Make support for safety and security demonstrable.



Okay, this makes me rethink my previous statement. There are two kinds of trust here: A low level programming language should not impose restrictions on the code that prevent the programmer from doing what needs to be done, even if it looks wrong. This is how I read the 'Spirit of C" that you quoted. And certain applications would be impossible to write without it. But you need a development process to make sure that your system does exactly the right thing. So the quote should read "trust the process" rather than "trust the programmer".

The trouble with a "safer C variant" is that it must remove features, or at least more heavily constrain programs to a safer subset of the language. This makes it not backwards-compatible.

I think the only successful "subset of C" is MISRA.

SaferCPlusPlus[1], for example, is a safe subset of C++ that has compatible safe substitutes for C++'s (and therefore C's) unsafe elements. So migrating existing C/C++ code generally just requires replacing variable declarations, not restructuring the code.

For C programs, one strategy is to provide a set of macros to be used as replacements for unsafe types in variable declarations. These macros will allow you, with a compile-time directive, to switch between using the original unsafe C elements, or the compatible safe substitutes (which are C++ and require a C++ compiler).

The replacement of unsafe C types with the compatible substitute macros can be largely automated, and there is actually a nascent auto-translator[2] in the works. (Well, it's being a bit neglected at the moment :)

Custom conventions using macros to improve code quality are not that uncommon in organized C projects. Right? But this one can (optionally, theoretically) deliver complete memory safety. So you might imagine, for example, a linux distribution providing two build versions, where one is a little slower but memory safe.

[1] shameless plug: https://github.com/duneroadrunner/SaferCPlusPlus

[2] https://github.com/duneroadrunner/SaferCPlusPlus-AutoTransla...

Maybe Frama-C as well.

I remember reading a paper from around 2007 that asserted that most of MISRA did not catch or significantly prevent major bugs in code, indeed it asserted that much of the standard was useless. I am failing to find it now, as I cannot remember what terms I used, and I am not at a library computer and therefore I cannot search behind paywalls beyond abstracts.

Cannot say that this is unexpected, but I was interested to find some papers, presumably these two: Assessing the Value of Coding Standards: An Empirical Study [1], Language subsetting in an industrial context: A comparison of misra c 1998 and misra c 2004 [2]

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=

[2] http://leshatton.org/Documents/MISRA_comp_1105.pdf

Interesting, thanks for the papers.

What makes you think that a safer C variant would win the hearts of UNIX kernels and embedded devs any more than C++ (which started as just a C variant).

Because most UNIX kernel devs are religiously against the idea of ever touching C++, even if their beloved C compilers are implemented in C++.

I highly suspect it's not that Unix C diehards are against the idea of touching C++—it's that they're against the idea of using anything but C. I don't think anything can win over those kernel developers.

It's the complexity of the language vs. the benefits it provides. C has many pitfalls (UB) but it's simple. IMO a language that could migrate C programmers would have dependent types and an effect system with a nice syntax.


It can compile small enough to run on an Arduino: https://github.com/stepcut/idris-blink

Yep you are quite right.

Other OSes not tied to UNIX culture were always more open to reach out for C++, even if constrained to a certain subset.

Using a compiler partially written in C++ is very different from writing and maintaining a kernel (or whatever) in C++.

Maybe. The L4 people did write one of theirs in C++. L4-style kernels are among the fastest in existence. Some even fit in the L1 cache of the CPU.


I can see your point if you're saying compiler use lets them avoid a language they just don't want to use, which they couldn't if using it for an OS.

Other OSes, not impregnated with UNIX culture, have a different view on C++'s use on kernel space.

OS X and its descendants are the only exception.

Which goes back to NeXTSTEP using Objective-C and offering UNIX compatibility only as a path to bring software into the system, battling against SGI and Sun market space.

Even OS X's C++ usage is kind of a historical oddity. OPENSTEP drivers were written in ObjC for an API called DriverKit, but OS X/Darwin/xnu replaced that with the C++ IOKit. The rest of the kernel is still C. I wrote a comment a few years ago explaining why I think that decision was made:


Thanks for the hint.

Simplicity, no STL or templating, no OOP connotations

I don’t necessarily think it would, but if it did, those would all be reasons

[actually on the Cforall team] This is basically our pitch -- the last 30 years of language design features applied to a language that is not only source-compatible with C (like C++), but actually maintains the procedural paradigm of C (unlike C++) -- idiomatic C code should require minimal to no change to make it idiomatic Cforall code, and skilled C programmers should be able to learn the extension features orthogonally, rather than needing to figure out complex interactions between, say, templates and class inheritance. We are working in the same space as C++, and do draw inspiration from C++ where useful, but with the benefit of decades of watching C++ try things to see where the important features are.

There's also some really neat language-level concurrency support; work is ongoing on new features and a shorter summary, but you can see one of our Master's student theses for details: https://uwspace.uwaterloo.ca/handle/10012/12888

The existence of exceptions seems to belie the idea of minimal changes to make idiomatic C code idiomatic Cforall code.

While C does have longjmp and friends, usage of them is hardly idiomatic, so most C code assumes no non-local transfer of control happens when calling functions. Coding with non-local transfer of control and without it requires very different idioms.

It would probably be much smaller than the gigantic C++ standard.

Okay, so are the creators of this safer C promising that it isn't going to grow to the same size as C++? I guess if they're promising that, that makes sense.

C++ was a C variant once.

Because it's not C++, it's C.

It doesn't require them/us to change much, just add these flags and the compiler will warn you about things that are unsafe.

I've always found it quite sad that no one is interested in bettering C enough to push relevant changes through.

Because there is this myth that any good programmer is able to write safe C.

Only newbies make memory corruption errors.

Yet even Dennis acknowledged correct code mattered, and Johnson created lint in 1979!

Largely ignored until clang and its analyzers came into the scene.

counterexample: RedoxOS

Although it's more along the lines of Plan9 - a unix-like system that ignores the bits of POSIX that really suck.

If it doesn’t support 100% POSIX it isn’t UNIX.

Also how would you make memcpy() safe in a POSIX implementation on Redox?

Almost nothing is 100% POSIX. Most Linux distros aren't. Who cares as long as the important bits are there.

Then most linux distributions aren't UNIX, as they do tend to deviate from POSIX in places.

It's GNU/Linux, and GNU = GNU's Not Unix.

So it says Not Unix right on the tin.

To be perfectly fair, if it isn't UNIX it isn't UNIX.

> Also how would you make memcpy() safe

Easy, don't have memcpy().

Which means that the minimal requirement to win over kernel and embedded devs is to integrate well with the rest of the C ecosystem, including the myriad of C compilers, and to be really well suited for low-level work. This excludes pretty much all ideas except meta-languages that produce C code. It might even be necessary to promote the language itself not as a new language, but as a meta-preprocessor for C, to avoid alienating developers. But realistically this is neither feasible nor necessary. There are much more feasible ideas to improve safety than forcing half of the world to learn a lot of new things and change.

Unfortunately not all systems are Solaris running on SPARC, with memory protection enabled.

Something akin to Intel MPX.

They're not going to rewrite any of them in this dialect, either.

You can use C++14 on embedded devices just fine, and people do.

You can, as long as you stay away from templates and lambdas and STL containers; in short, most of the reasons you’d use C++.

STL, RTTI and Exceptions should be avoided on embedded platforms (talking about 8 bit µCs here). I've extensively used both templates and lambdas on 8 bit AVRs (both the Tiny and Mega series); actually, writing templated code for µCs is a great way to avoid overhead stemming from function pointers etc. while still having well maintainable code.

> Because regardless how some of us might dislike C and its security related issues, the truth is that no one is ever going to rewrite UNIX systems in other language

Nor will they in this.

> So if one finally manages to get a safer C variant that finally wins the hearts of UNIX kernels and embedded devs

A safer variant wouldn't be C. What makes C great for OS development is that it is just a step above assembly, and you as a developer are given a tremendous amount of power to do good and evil. C#/Java are programming languages with training wheels, and that's great for application development. But for the low-level coding required for OSes, network stacks, databases, etc., you really have to take the training wheels off.

I suppose you can try to make the C type system more stringent, but then it wouldn't be C. And considering they are aiming for backwards compatibility with existing C and its immense code infrastructure, they will have to keep the "flaws" in Cforall.

Time would be better spent making the libraries/kernel/etc sturdier but if they can pull it off and win the hearts and minds of OS developers, then so be it.

Also, people have been trying to sideline C for decades. Each attempt has only reinforced C's standing and reminded us why C is so essential for OS development. Anyone remember the ill-fated attempt by Sun with their JVM centered JavaOS?

The only way to make C safe without losing performance would be to accompany your C code with a formal proof that it avoids undefined behavior, and use a compiler which refuses to compile the code if the formal proof doesn't validate.

Which would be essentially impossible for any language like C.

KCC is an executable, formal semantics for C that does something like that. Runtime Verification Inc uses it for their bug-hunting tools.



A full-time C# dev here, can confirm. It's an awesome business-logic language, but the hot path eventually gets rewritten in a very C-like style, with all the LINQ, exceptions and allocations thrown out.

The new 7.x features are going to make it easier.

Some of us are still stuck in 3.5

Can you explain why? I assume those c# programs are used for backend stuff on servers, surely it isn't hard to get a 4.6+ runtime installed?


I can switch to 4.6, but don't want the risk of an experimental feature yet.

Right, forgot about Unity, that makes sense :)

Microsoft has already proven twice that languages with training wheels can be used for writing OSes.

Google is using languages with training wheels to write core components of Fuchsia (the TCP/IP stack and file-system tools are written in Go), as well as the new Android GPU debugger (also in Go).

Looks pretty ambitious. My take from skimming:

* switch, if, choose and case extensions look good.

* I can see the justification for labelled break/continue, but looks pretty hairy. Might discourage rethinking and refactoring to something simpler.

* I'm wary of exceptions.

* I don't like the 'with' clauses.

* Weird to add syntax just for mutexes, but they integrate concurrency/coroutines later, so maybe it makes sense.

* Tuples are generally useful, but C11's unnamed structs are generally good enough, ie. instead of [int, char] you can return "struct { int x0; char x1; }" or something.

* New declaration syntax is welcome, but the old syntax probably isn't going away, so I'm not sure it's a good idea.

* Constructors/destructors are good. Syntax looks weird though.

* Overloading is very welcome.

* Not sure about operators, but they have their uses.

* Polymorphism is welcome, though it looks a bit cumbersome, and it should come with a monomorphisation guarantee for C.

* Traits seem like too much for a C-like language. I can see the uses, and the compiler can optimize this well, but they're probably too powerful.

* Coroutines are cool.

* Streams look interesting, but the overloading of | will probably be confusing.

I'm more or less in agreement, but I just thought it was worth adding that the tuples could actually have a lot of merit; I think I'd like to see them (though I'm not sure the syntax is perfect parsing-wise. It might be smart to prefix them, like `tuple [int, char]` or something.).

It seems like anonymous structs fill the void, but a big problem with anonymous structs is that their types are never equal to any other, even if all the members are the exact same. So that means that if you declare the function as returning `struct { int x0; char x1; }` directly, it's actually mostly unusable, because it's impossible to declare a variable with the same type as the return type. Obviously, the fix is to forward declare the `struct` ahead of time in a header file somewhere and then just use that type name, but that gets annoying really fast when you end up with a lot of them. Tuples would allow you to achieve the same thing, but with a less verbose syntax, and they would be considered the same type even without forward declaring them.

> So that means that if you declare the function as returning `struct { int x0; char x1; }` directly, it's actually mostly unusable because it's impossible to actually declare a variable with the same type as the return type.

Are you sure about that? I remember playing with this last year and structural equality seemed to work when returning structures from functions. I was using clang, so it could conceivably have been an extension... (edit: some online C compilers do indeed return an error in this case)

If that's the case, then just make anonymous structs employ structural type equality and you have better tuples.

`gcc` definitely throws an error. It tells you something like "struct <anonymous> was expected but struct <anonymous> was found". It's a pretty fantastic error message /s

> If that's the case, then just make anonymous structs employ structural type equality and you have better tuples.

Yeah, that would work; I'd be fine with that. I don't think it's quite as good as a dedicated syntax though, just because the `struct` syntax is a lot more verbose than a concise tuple syntax could be, and defining `struct`s inline is pretty clumsy.

Yes it's more verbose, but it avoids adding a new primitive to the language for something that is probably not too common. I'm also not a fan of tuples because the fields aren't named. I mean, which field in a return type of [int, int] do I want exactly?

At least anonymous structs would name the fields and so the type serves also as documentation.

GNU C is probably my favorite extension of C. There's a lot of good stuff in there. The vector extensions make it really easy to write platform agnostic SIMD code.


Please don't use GNU C, or any other non-standardized version of C. A huge part of the reason C that is so widespread is because it's a well defined standard implemented by many compilers for many platforms. GNU C is defined by its implementation, which is awful.

> Please don't use GNU C, or any other non-standardized version of C.

Everything standard was once non-standard; if no one uses it, it will never be standardised, and we will be left with a poor status quo. For instance, there wouldn't be int8_t, etc. if people weren't using non-standard macros beforehand. Likewise for atomics, threads, etc.

I disagree with this. A lot of GNU C works on both GCC and Clang, which covers most platforms out there.

Those extensions are useful and allow better portability across architectures. E.g. the SIMD extensions are much better than writing two implementations with NEON and SSE intrinsics.

Please do use GNU C if you're going to use C. The viral nature of the Linux kernel has forced GNU C to be an important de facto standard. Take advantage of that!

Google has taken the effort to make Linux compile with clang, as part of their efforts to wipe gcc from Android toolchain.

There is a LLVM talk about it.

However I do agree with you.

clang and gcc cover most of the systems that matter today and their C extensions definitely make C a safer language.

There is nothing wrong with using GCC extensions.

Also, porting C is not that hard and does not require you to touch internals that much.

Why not? If I don't care about portability because, for example, I'm writing software that is meant to be used only on Linux (it uses Linux-specific libraries or system calls), and I know gcc is the standard there, why shouldn't I use the extensions if they can simplify my code?

I've used nested functions more often than I'd like to admit, because dealing with insane callback-heavy APIs is made a lot easier with them.

I've never really thought to use nested functions for that. I guess I've never really thought of what nested functions were for.

Those of us who had nested functions/procedures and “procedural types” in Pascal (or Algol, but that’s beyond my experience) back in the day have :-)

Then K&R came, and took our nice toys away.

Fortunately, the GNU Pascal compiler needed nested subroutines, so they exposed the feature in their C compiler, as well.

And now even C# has them. :)

On C++ we can fake them with lambdas.

Welcome to the lambda party ;). Passing nested functions as pointers is how lambdas often work behind the scene.

I love using nested functions when I have to write state machine code. It's a hell of a lot better than the old-school way of using macros to do the same thing.

Also, the compare-and-set operations that enable lock-free code are pretty cool.

Considering how hard it is to write truly exception-safe C++ and considering how major C++ code bases don't allow exceptions, adding exceptions to C does not seem like a good idea.

I've always liked the idea of djb's boringcc[0], except with different definitions of undefined based on what users were using C currently with. This would allow people to "upgrade" into boringcc with their current code bases. So with a single invocation of a compiler, you couldn't use more than one set of defined undefined behaviors.

[0]: https://groups.google.com/forum/m/#!msg/boring-crypto/48qa1k...

I would love a gcc optimization level, like -Og which only applies optimizations that don't interfere with debugging information, where all undefined behavior is specified.

Does anyone know if undefined behavior is specified in CompCert? Or does CompCert simply not allow you to write programs with undefined behavior?

Whether exceptions are good or bad depends on what error handling strategy your product needs. For some software, it's better to try and recover no matter what. For others, complete failure is preferable to operating with invalid state. Exceptions can be a blessing or a curse depending on what you need. Having them in your toolbox is certainly an advantage over having no choice.

>Having them in your toolbox is certainly an advantage over having no choice

I disagree. Dependencies, or coworkers, will use them despite your decision not to use them. When a dependency does use them, chances are the documentation is poor or non-existent.

"Recover no matter what" doesn't require exceptions. A common C idiom is to call a function like f(input, *err), where err points to memory where f can write error diagnostic info. Clunky, but I like how it makes the "exceptions" somewhat self-documented in the function signature.

> Considering how hard it is to write truly exception-safe C++

Is "writing truly exception-safe" something that necessary ? for me, the biggest benefit of exceptions is that I can have some code throw from anywhere and display a nice error pop-up message to my user which informs me of what went wrong and revert what was currently happening during the processing of the current event, since the "top-level" event loop is wrapped in a try-catch block. Often enough, the user can then just resume doing whatever he was working on.

> Is "writing truly exception-safe" something that necessary?

If you want your connections cleanly terminated, your temporary files removed, and your database transactions invalidated, yes.

> If you want your connections cleanly terminated, your temporary files removed, and your database transactions invalidated, yes.

Sure, and if you develop in C++ and put these in RAII classes, they will be cleaned up automatically.

Because they're exception safe. But it's also possible to use them in an exception unsafe way.

Well, that's what I don't understand about the OP's

> Considering how hard it is to write truly exception-safe C++

that's the default behaviour in C++ code, how hard can it be?

Exception safety is hardly a "default behavior" of C++, considering such gems[1] as:

    // This is unsafe.
    sink( unique_ptr<widget>{new widget{}},
          unique_ptr<gadget>{new gadget{}} );

    // This is safe.
    sink( make_unique<widget>(), make_unique<gadget>() );
[1] https://herbsutter.com/2013/05/29/gotw-89-solution-smart-poi...

The default nowadays is basic exception safety, where nothing leaks but objects can get put in invalid states. Strong exception safety (rollback semantics) is still pretty hard.

I'm pretty sure that's what is meant by "truly exception-safe".

If your code isn't exception-safe, then carrying on after an exception may crash the program.

Without continued development of the language, C will be unable to cope with the needs of modern programming problems and programmers; as a result, it will fade into disuse.

C11 is pretty nice! C99 is too. One might think that "almost once a decade" is kind of slow for updates, but M$ have enough trouble keeping up with the current schedule. Of course TFA describes a possible direction for C2x, but they could have a more charitable attitude...

> will fade into disuse.

That's one thing I'd like C to do. It was an adequate language in 1970, but in 2018, we have a few better approaches that have a chance to turn into viable alternatives to C.

Of course, I don't see C falling into disuse any time soon. The amount of critical code written in C is enormous, without a way in sight to reasonably replace. So keeping C in a good shape is important, whatever shortcomings the language may have.

Not even that much in 1970, if you look into old papers about the systems programming languages being used at other research institutes besides AT&T.

Indeed. Having started my CS degree work in the early 80s, I got to see “things other than C/C++”.

In retrospect, I often jokingly refer to the 80s as “the x86/PC disaster”, an extinction level event for programming languages where the choices were to run either assembler, or C, on the IBM PC to build software of any size due to the limitations of the hardware.

Rather than trying to improve curly brace languages, it’s time to bury them.

For starters, hiding identifiers after arbitrarily long type expressions, instead of starting a line/block/expression with a name, is an un-fixable PITA. (Sorry, AT&T, Algol had it right)

Requiring a “break” in a case statement is a botch, instead of some kind of “or” / “set” / “range” test.

C is useful as a portable assembler, I guess. Otherwise I don’t really like any of C’s offspring all that much. I sort of like Javascript in spite of being forced to look like Java/C++, but that’s only because I learned some Lisp (alas, not Scheme) back in the day.

Microsoft does not have any trouble keeping up with C; C++ and .NET Native are the future of systems programming on Windows.

C compatibility is kept to the extent of ANSI C++ requirements.

ANSI C++14 requires C99 library compatibility and ANSI C++17 was updated for C11 library compatibility.


Herb Sutter doesn't like C. That may be the reason that the compiler group at Microsoft don't put any effort into updating the C compiler to support newer features.

They do put a great deal of work into the C++ compiler, and seem to be doing a way, way, better job than they were in the late 90s.

That may be the other reason they don't update the C compiler with new features.

The reason is official and has been communicated many times.

C related improvements will only be done to the extent required by ANSI C++, or requests from key customers that might influence roadmap.

Anyone that really deeply wants to keep using C on Windows, and even enjoys using COM directly from C (the main ABI since Windows 7 and UWP core stack), can use any other C compiler.

In fact Microsoft has suggested clang multiple times, and has helped clang devs to make it work better on Windows.

They are improving! Six years is less than fifteen.

This is the wrong way to go about it. The syntax is great, what is needed is updates to reflect modern hardware:

-Vector types. (With arbitrary sizes, not just up to vec4 like on GPUs)

-New operators like clamp, and other intrinsics. (See GLSL and modern instruction sets)

-Min/max: |< >|


-Cross product: ><

-Swizzle: a.

-Qualifiers for warping behavior.

-Standard library with cache management hints.

-Define qualifiers for padding of structs to be defined.

I would also argue you could change some things in the spec to make it easier for compilers to optimize, like making it undefined behavior for a switch to be entered with a value that matches no case.

As for the syntax itself, I could find being able to type multiple break commands in a row to get out of more than one loop useful, but it's not a big thing.

I would probably drastically restrict the power of the pre-processor too.

(If you have too much time on your hands: https://www.youtube.com/watch?v=443UNeGrFoM)

Peter Buhr also teaches CS 343: Concurrent and Parallel Programming [0] at the University of Waterloo in a dialect of C++ that he has been working on [1], called uC++ [2].

[0] https://www.student.cs.uwaterloo.ca/~cs343/

[1] https://github.com/pabuhr/uCPP

[2] https://en.wikipedia.org/wiki/%CE%9CC%2B%2B

[on the Cforall team] One of our Master's students has incorporated the majority of the uC++ features into Cforall as well, with some neat extensions for multi-monitor locking in a way that doesn't introduce synchronization deadlocks not present in user code.

As an alumnus, I firmly believe there's room for a 4th year course in addition to CS343. There's quite a lot in advanced control flow which can be covered.

I've always found it unfortunate that the university has courses from first year all the way to fourth year in Data Structures and Algorithms (all the way up to CS466/CS666) but Control Flow is treated like a second-class citizen.

[also a CS 343 TA] I personally agree with you -- if it were up to me I'd refactor CS 343 into a pair of courses, maybe focusing on high-level concurrency constructs with a follow-up course on building that sort of runtime system.

I personally really hated the use of uC++ and would have loved to do the whole course in some MIPS dialect. I really liked how through most of second year the only language reference I needed fit on one piece of paper (https://www.student.cs.uwaterloo.ca/~cs241/mips/mipsref.pdf). The uC++ language, on the other hand, is not specified anywhere except in the enormous 600-page textbook, and even then it is very far from fully specified (e.g. there was a builtin function called rendezvous-something, where that string literally appeared only once in the book, and it was not defined at that place).

I'd argue that's the point of moving from a 2nd year course to a 3rd year one. You incrementally add more complexity. The ISO C++ spec is a similar heavy tome.

What a waste of tuition money.

IMHO there are better proposals for a "better C" language which fix some of the shortcomings of original C:

- http://c2lang.org/site/

- https://ziglang.org/

- https://nim-lang.org/

As long as the language is small and has good tooling, and (most importantly) can easily interoperate with C libraries (have a look at Nim for a really awesome C integration), it doesn't matter whether it is backward compatible with C.

C itself should stay what it is. A low level and simple language without surprises which is only very slowly and carefully extended. Languages that are developed "by committee" and add lots of new features quickly are usually also quickly ruined.

I've written in the past about C2[1]. Nim is aimed a bit higher-level than a "C replacement" should be. I don't really know much about Zig, but from what I do know, I like it.

1: https://www.reddit.com/r/programming/comments/7ugm8e/c2_c_wi...

The biggest problem with C is not the language but the fact that the C standard library is so anemic. Where's the "Boost" for C?

C is an evolving ISO standard.[1] We are currently at C11.

> The purpose of the project is to engineer modern language features into C in an evolutionary rather than revolutionary way.

It's also the purpose of the standards body. Why not propose these changes to the standard?

[1] - https://www.iso.org/standard/57853.html

Maybe because one needs to buy a seat at the table, by making a formal proposal, pay the travel expenses on their own and get to win the hearts of other members when voting comes to be.

There is still time, right? Lots of standards update efforts have moved in and out of ISO and IETF.

If there's one thing I would do to C it is to replace the normal datatypes with u8, u16, u32, u64 (ad infinitum) and their signed brethren s8, s16, s32, s64, ...

And don't start with the stupid (u)intXX_t.

They aren't stupid, and they are both standard and available on nearly every system going. People absolutely should use stdint.h

I wish more would.

> People absolutely should use stdint.h

Definitely. However, you have to admit the names (uint32_t) could be made less verbose, like "u32".


We are talking about the smallest possible datatypes that build up everything here. They don't have to be named variable_typeFactory_type_signed_t.

You are missing the point. The question was about if I could change something in C.

I use stdint.h (and inttypes.h since fuck printf without it amirite) but u8 ... are just better.

uint8_t is really not very hard to type.

Personally I would much prefer uint16_t over u16. Confusion over type handling is a source of some of the worst bugs. Remember, uint16 isn't the same as 16 bits if you consider endianness.

You could just use macros or typedefs for the (u)intXX_t types.

And therein lies the issue. Everyone does just that, causing needless clashes. If you're writing a library, you can either switch back to the standard type names in your external header file, or you include the typedefs/macros there as well. The former choice ruins some of the point of having these types, and might even mask cross-platform issues. The latter runs the risk of clashing with the user's own typedefs, or with another library writer's typedefs. (And god help you if exactly one library writer chose to do it via macros.)

From https://plg.uwaterloo.ca/~cforall/features:

    Exponentiation Operator

    New binary exponentiation operator '\' (backslash)
    for integral and floating-point types.

    2 \ 8u; // integral result (shifting), 256
   -4 \ 3u; // integral result (multiplication), -64
I hope that’s a documentation error. Otherwise, it seems designed for the “Obfuscated C for all” contest:

   int i = f();
   int j = i \ 3u;
Does what ‘\’ does really depend on the sign of i?

The "shifting" vs "multiplication" is just an optimisation since

  x * 2 == x << 1
for any non-negative x. As far as I'm aware, left-shifting a negative number is undefined, which is presumably why they use multiplication for negatives.

So the mailing list and the other stuff are "internal only"? How is it possible for external individuals to take part?

I'm actually on the Cforall team -- we've been running fairly low-profile for the moment (it wasn't one of us that posted the homepage to HN), but plan on making a beta release of the compiler and stdlib sometime this summer.

If you're interested in working on the project though, more hands are always welcome; I'd suggest contacting our team lead, Peter Buhr, his email is pabuhr AT the university domain Cforall is hosted on.

Thank you, I'm not able to provide a lot of help, but I would like to follow the discussion; perhaps the sensible thing to do at this point is to wait for the first beta, when the mailing list will probably be open to everybody. Thank you for replying.

Out of curiosity, why not do the development in public? Or is there a repo I missed?

Most programming languages do an initial period of internal development before a public release.

When I started on the project ~3 years ago, it took me about 2 weeks to work around the (then current) set of compiler bugs to make a 100-line benchmark program -- naturally a public release at that point would not have been fruitful. Today our compiler generally works, and we're looking forward to making a public release once we get a couple more features stabilized.

I think there is a difference between working on something in public and a public release. I'm always skeptical of the claims that things can't be done transparently because it's not open to input yet. Those things are orthogonal. You don't have to work in secret just to avoid some of the issues with publicity. Granted I know it happens often, I just don't think it needs to be that way.

If you want to work on something in public, you potentially have to write public-facing prose. You have to explain what the project is, where to find stuff, who to contact, documentation... set up a website, wrangle newcomers, commenters, press, etc. For a research project, that's a lot of work to do by people with limited time/budget for something that hasn't even been demonstrated to work.

Basically, day 1 transparency either requires a budget or misplaced priorities.

I specifically addressed this:

> I'm always skeptical of the claims that things can't be done transparently because it's not open to input yet. Those things are orthogonal.

You don't have to explain anything. You don't have to set up a website (the case here already had a website; the code was just hidden). I'm not sure where these requirements come from, but they're not true, and I'd argue you do more harm giving the impression of secret development than you do by not accepting input at an early stage. We all understand the latter, but many of us are wary of the former when someone says they're committed to openness and does the opposite.

I think it is more like "you can only make a first impression once".

Did they bother to ask any C developers what the pain points are? The kernel, qemu, libvirt, coreutils and gnulib developers really use C and POSIX to breaking point, and have some real problems that might be addressed in the language, but I don't see much evidence of those problems being addressed here.

Did they study existing code bases and bugs to find out how applicable these changes are to fixing real problems?

It's a research language:


You can check the various Master's theses at the bottom of the page.

In my opinion, what C is lacking isn't primarily language features. It's:

1) a common, readable style standard that people can agree on;

2) modern, agreed-upon idiomatic ways to write readable and safe code (there are certainly common C idioms, but many exist for historical reasons and not because they're necessarily the best things to do); and

3) a standard library that exemplifies the above two things, provides common functionality needed between projects, and fosters a sense of community.

Efforts like those of the poster don’t really address these things.

When I tell people I like C, the response differs depending on experience level. Less experienced people who haven’t spent time in C++ either will say “Gross, pointers”. People used to C++ will say “what about templates and constructors/destructors”, and “are you going to write your own library for vectors and maps every time?” And people who are experienced (more experienced than me) and like C have generally said that the main things they miss from C++ are constructors and destructors (and mostly destructors), and templates (but only for container types), but that they can live without them.

C is a small language that can be relatively easy to write (once you pick/write a suitable stl-equivalent), and extremely easy to read (and to the extent that you can look at a line and know exactly what’s going on under the hood). IMO it should stay that way.

Take a look at zproject [0] for what I think is a good effort to standardize C style and project structure. Even if you disagree with the particular design choices, the spirit of it is what I think is needed to keep people from just assuming C is for dinosaurs.

[0] https://github.com/zeromq/zproject/

I agree. The proposal adds a bunch of ugly and unnecessary syntax without addressing any of the real issues with C.

I have little experience with C but my feeling is that it's a waste of time to have to deal with memory in my program.

I don't just have to worry about the problem that I'm trying to solve but also about handling the memory properly.

Of course there are scenarios where writing a fast and lightweight app is part of the problem but for many projects that's not the case.

I'm fine with pointer arithmetic and all that but I'm not fine with the constant fear that I did some oversight and my app is going to crash and I'll have to spend time debugging what's going on.

Most programming problems are pretty trivial to solve if you're willing to ignore performance. It behooves us to realize that the main difference between mathematics and computer science is that computer scientists have to account for the several orders of magnitude difference between the runtime costs of different operations. If that weren't an obstacle, we'd basically only need one programming language, and it would look pretty much like standard mathematics.

So I interpret your comment as something like "I don't want to program, I just want to assemble mathematical statements", which is a fine and perfectly legitimate thing to do, but it also seems a bit lazy, impractical, and probably irrelevant, given the subject.

My point was about why invest time in C dealing with memory, with the risk of segfaults, when there are many other languages like JS, Java, and Go that would take care of that task for you.

Also I'm talking about projects (e.g. building a word processor), not programming problems (e.g. invert this binary tree). In my experience, projects are complex systems with many moving parts interacting, and most bugs come from the complexity of the system/problem. With C, I don't just have to think about the problem that I'm solving but also whether I did something wrong when dealing with memory allocation/releasing.

I don't have much experience in C beyond the basics, so I hope to hear from people who love C why they chose it, or whether it's just a constraint inherent to the problem they're dealing with.

I think I would prefer the Indiana Jones approach. "Leave it alone! It belongs in a museum!"

You would be right if OS kernels were not written in C.

Linux is pretty much deployed on a lot of hardware, it is still maintained, and it uses C.

Granted it is not the best language, but it's simple enough to do kernel development.

The same goes for COBOL; it's the best-paying language out there.

The (draft) ISO C11 spec is already nearly 700 pages[1]. How can anyone justify adding something like "choose" (a switch() without fall-throughs) to such a language?

Who are these people[2], and what is their mandate? Did they actually consult with the likes of Linus Torvalds, Greg KH, Mike Pall--those who actually use C every single day on mission-critical projects like OS kernels and virtual machines--and ask them what improvements to C they need?

1) http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf

2) https://plg.uwaterloo.ca/~cforall/people
