C for All (uwaterloo.ca)
376 points by etrevino on Mar 23, 2018 | 312 comments

There are a lot of features thrown into this language that don't seem worth the learning costs they incur. What are the problems you're really trying to fix? Focus on the things that are really important and impactful, and solve them; don't waste time on quirky features that just make the syntax more alien to C programmers.

* `if` / `case` / `choose` improvements look fine, though not that important.

* Exception handling semantics aren't defined.

* `with` is pointless and adds gratuitous complexity to the language.

* `fallthrough` / `fallthru` / `break` / `continue` are all just aliases for `goto`. It's not obvious to me that we really need them.

* Returnable tuples look very nice.

* Alternative declaration syntax looks like a nightmare. If we were redesigning C from the ground up, a different declaration syntax might be better, but mixing two syntaxes is a terrible, terrible idea.

* References. Why? They only add confusion.

* Can't make head or tail of what `zero_t` and `one_t` are about, or why they would be useful.

* Units (call with backquote): gratuitous syntax, unnecessary and confusing.

* Exponentiation operator: gratuitous and unnecessary.

Yeah, Ping, I agree. It reads like they missed the key lesson of C — in Dennis Ritchie's words, "A language that doesn't have everything is actually easier to program in than some that do." And some of the things they've added vitiate some of C's key advantages — exceptions complicate the control flow, constructors and destructors introduce the execution of hidden code (which can fail), and even static overloading makes it easy to make errors about what operations will be invoked by expressions like "x + y".

An interesting exercise might be to figure out how to do the Golang feature set, or some useful subset of it, in a C-compatible or mostly-C-compatible syntax.

I do like the returnable tuples, though, and the parametric polymorphism is pretty nice.

> Can't make head or tail of what `zero_t` and `one_t` are about, or why they would be useful.

I suspect it's the same problem C++ has/had (C++11 fixed it) with bools (see the safe bool idiom [0]). Basically, treating a type as both an integer (arithmetic object) and a boolean (logical object) at the same time is problematic, especially for a "system" type meant for extending implicit system behavior: it lets me write `if(BoolObject < 70)` when I only meant for `if(BoolObject)` to work (where "BoolObject" is some object evaluating to a bool, and by evaluating I mean coercing/casting).

Here it looks like they approached it by making 0/1 (effectively C's false/true) distinct types and relying on their simpler/more-powerful type system (e.g. because they don't have to worry about C++'s insane object system). Not a terrible idea if they were otherwise actually sticking to their goal of "evolving" C (though most of their features, like exceptions, are radical departures from the language). C++11 solved it by clarifying how implicit explicit casting [sic] of rvalues works in certain keywords (which I strongly doubt anyone can say was the simpler way of solving the problem).

[0] https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Safe_bool

I bike-shedded in this thread about exponentiation. But taking a step back the bigger issue is there are so many poorly justified features thrown in.

I don't have the feeling that the authors appreciate the appeal of C as a simple language that maps closely to hardware features.

This is a big random collection of extensions that piqued some implementor's fancy. There is seemingly no effort at narrowing down to the cleanest or most important ideas. It totally kills the clean, simple aesthetics of the base C language.

Exponentiation operator gratuitous and unnecessary?


It doesn't improve readability. Compare:

   discrim = b² - 4ac;                  // Standard notation
   float discrim = pow(b, 2) - 4*a*c;   // C
   float discrim = b \ 2 - 4*a*c;       // C∀
I would argue that these are presented here in descending order of readability.

Also its typing rules are really complicated; apply it to two integers and magically you are thrown into the floating-point world where you can never be completely certain of anything, but if you use an unsigned exponent then you stay safely in integer-land.

The choice of operator seems very odd to me. Wouldn't `^` be significantly more readable?

^ is already used for bitwise xor.

What about then, like some other languages?

Python uses `**`, but that is ambiguous with

  int a = 1;
  int *b = &a;
  int c = a ** b;
  // Am I casting b to an int (returning its address) and exponentiating it,
  // or am I dereferencing b and multiplying its result with a?

How would this be different from &? It is also a unary operator and && is a different binary operator.

'||' and '&&' are distinct tokens in C as far as I know, i.e. not handled as two consecutive '|'s or '&'s.

So your example would unambiguously be parsed as "a to the b-th power". Whereas the other case would need explicit parens:

   int c = a * (*b);
Similar example for &:

   int a = 1;
   int b = 1 && a; /* 1 LOGICAL_AND a */
   int c = 1 & (&a); /* 1 AND address of a */

Just dropping in to say you are completely correct. It is called "maximum munch" and mandated by the C spec. I recently wrote a toy C compiler and was confused until I learned this.

I suppose ^^ might work, although it's a little odd because, for consistency, it would otherwise be the "logical XOR", a mythical operator that doesn't actually make much sense.

> a mythical operator that doesn't actually make much sense.

Hmm, I'm pretty sure practically every programming language has it. It usually looks like "!=" or "<>".

The even more obscure logical XNOR is usually denoted "==" or "="

Alas, this doesn't deal with the idea of truthiness as a proper logical XOR would, so it is incorrect in many of the most popular languages, including C, where a value that is true is not always equal to another value that is true. This only works in the much more strongly typed languages, or when you force-cast both sides to a boolean with something like !!

Yes! In C its full spelling is "!a != !b".

Not quite..

Perl has logical xor, which is occasionally useful. I usually reach for it when argument checking, where it makes sense to have either this param or that but not both.

> What about then, like some other languages?

I assume there's a double asterisk there, and it's being eaten by the formatter?

Right yes. I forgot to escape, and now it's too late to.

Probably too much ambiguity with pointers

Idiomatic C would say:

  float discrim = b*b - 4*a*c;
Using pow for a small integer power is a no-no: less efficient and less accurate.

I agree that \ is an awkward choice. A Fortran-like double asterisk ** is out because of ambiguity with pointers; single caret ^ is out because it is already reserved for bitwise xor. Maybe double caret ^^ or asterisk-caret *^ could be used? That would read okay:

  double discrim = b^^2 - 4*a*c;
  double discrim = b*^2 - 4*a*c;

> Using pow for a small integer power is a no-no: less efficient and less accurate.

Using pow for a small integer power compiles into the exact same code: https://godbolt.org/g/CjoHdJ

Only for 1 or 2, unless you turn on --fast-math.

  Using pow for a small integer power is a no-no: less efficient and less accurate.
I think you missed the point.

It looks like most people here are so eager to jump on the "this feature is good, this sucks, overall I'm not impressed" bandwagon (with the typically unwarranted strong opinions that programmers always have when it comes to this) that they didn't bother to explore the rest of the website in more detail. Go to the "people" page and you'll see that it's a language implemented by professors, PhDs and master's students from the Programming Languages Group at Waterloo[0][1]. Scroll down and you'll see that a number of these features came from the master's theses of students:

    Glen Ditchfield, 1992
        Thesis title: Contextual Polymorphism
    Thierry Delisle, 2018.
        Thesis title: Concurrency in C∀.
    Rob Schluntz, 2017.
        Thesis title: Resource Management and Tuples in C∀.
    Rodolfo Gabriel Esteves, 2004.
        Thesis title: Cforall, a Study in Evolutionary Design in Programming Languages.
    Richard Bilson, 2003
        Thesis title: Implementing Overloading and Polymorphism in Cforall
    David W. Till, 1989
        Thesis title: Tuples In Imperative Programming Languages.
    Andrew Beach, Spring 2017.
        Line numbering, Exception handling, Virtuals 
So basically, it's a research language, more-or-less developed one student at a time.

[0] https://plg.uwaterloo.ca/~cforall/people

[1] https://plg.uwaterloo.ca/

I'm all for the evolution of C, but this list...

1) has some downright idiotic things (exceptions, operator overloading)

2) has a few reasonable, but mostly inconsequential things (declaration inside if, case ranges)

3) is missing a few real improvements (closures, although it is not clear whether the "nested routines" can be returned)

Agree 100%. Improvements to C would be things like removing "undefined behavior", not adding more syntax sugar. If anything, C's grammar is already too bloated. (I'm looking at you, function pointer type declarations inside anonymous unions inside parameter definition lists.)

> Improvements to C would be things like removing "undefined behavior"

This nonsense again. I don't get this "undefined behavior" cliche. It seems it became fashionable for some people to parrot it like a mantra as a form of signaling. Undefined behavior just refers to something that is not covered by the international standard, and therefore is neither defined nor something that should be relied upon, though an implementation may offer implementation-specific behavior.

When people talk about "removing undefined behavior", they usually mean requiring that compile-time-detectable undefined behaviors be converted into explicit errors.

For example, there are quite a few people who would like to see a C where you can't actually write this:

    // int x, y;
    if(++x < y) { ... }
...because, well, the behavior of integer overflow is undefined in C, so that code could technically do anything, even though it seems perfectly innocent, especially when coming from a checked language.

Of course, you can't do anything in the C standard to require that this code work as-is, because the C standard applies to architectures where mutually-exclusive things happen under integer overflow. But you can always just disallow it completely, and require that people use intrinsics that are explicit about what overflow behavior they expect (where that behavior reduces to plain output on target architectures that follow it, and to a shim on target architectures that don't. You know, like floating-point support, or atomics.)

...because, well, the behavior of integer overflow is undefined in C, so that code could technically do anything

No compiler is going to go out of its way to compile an increment into the machine's usual increment instruction and an additional overflow check that does whatever, just because it can. It's going to compile it into the machine's usual increment instruction and what happens on overflow is what happens naturally.

It's as absurd as claiming that even "x + y" can invoke undefined behaviour, because while the standard allows it, any compiler that compiles such an addition to anything other than the machine's addition instruction (i.e. with implementation-defined effects), to speak nothing of adding the additional(!) checks to deliberately do something else, is clearly not benefiting anyone.

"In theory, there is no difference between theory and practice. In practice, there is." Pure fearmongering, IMHO.

> No compiler is going to go out of its way to compile an increment into the machine's usual increment instruction and an additional overflow check that does whatever, just because it can

What people are actually worried about is when the compiler starts removing - not adding - seemingly unrelated code in a hard-to-reason-about fashion. And compilers absolutely will go out of their way to do this in the name of optimization and performance. A compiler will do this because it got smart enough to prove that the "unrelated" code can't run without first technically invoking undefined behavior, at which point it can jump to the wild conclusion that it must never actually execute (or that it can remove the code even if it does, because it's legal for the compiler to do anything after undefined behavior is invoked - including not executing that code!)

Sometimes the removed code is important security checks, leading to CVEs, hotpatches, etc. - this is not theoretical, and is not remotely new at this point: https://www.grsecurity.net/~spender/exploits/cheddar_bay/exp...

It also makes reporting compiler bugs annoying, as you first have to definitively prove to yourself and the compiler guys that you've actually got a compiler bug, rather than a compiler "feature" of aggressive optimization within the letter of the C++ standard. It's only out of pure stubbornness that https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84658 got reported upstream, I was assuming it was UB in our codebase most of the way down and thus INVALID as a compiler bug...

But perhaps CVEs and expected behavior being borderline indistinguishable from compiler bugs to most C and C++ programmers I know is just "fear mongering" as you say. IMNSHO, it's not \o/

It's not about the compiler doing "extra checks" to deliberately do something different. The real issue is with aggressive optimizations.

In order to get maximum performance, the compiler is allowed to assume that the programmer doesn't invoke undefined behavior. In other words, it can replace code with something that is equivalent in the presence of UB, but does something totally different in the absence of UB. See e.g. https://blog.regehr.org/archives/767 for some examples of how this can go wrong. (My favorite is the third one.)

Modern C compilers do some very tricky things with undefined behavior.

See https://blogs.msdn.microsoft.com/oldnewthing/20140627-00/?p=... for a very detailed example.

See https://blog.regehr.org/archives/1307 for some strict aliasing examples.

Compilers deciding to elide code paths that contain undefined behavior is weird, especially when they choose to silently elide your checks for division by zero or overflow. It's not weird that actually dividing by zero can do anything; it's weird that the mere presence of a possible division by zero allows the compiler to decide that (divisor==0) is false and ignore the check.

> they usually mean requiring that compile-time-detectible undefined behaviors

I don't believe that's the case for plenty of reasons, such as:

- compilers already do that (yeah, it's one of those RTFM things. See for example GCC's undefined behavior sanitizer)

- the standard already specifies exactly what is left undefined, thus it's a compiler-related issue (see point above)

Let's face it: some people mindlessly parrot the "undefined behaviour" mantra just for show.

> compilers already do that ( yeah, it's one of those RTFM things)

I believe 99% of what people care about the C standard doing, re: UB handling, is requiring compilers to make certain behaviours the default, rather than hidden behind different flags that C newbies who don't understand UB (who thus code most of the bad C!) won't ever set.

> - compilers already do that ( yeah, it's one of those RTFM things. See for example GCC's undefined behavior sanitizer)

Well, no, they don't. UBSan works at run-time, because most UB is impossible to catch at compile-time.

Implementation defined behavior is not the same thing as undefined behavior.

Undefined is out of the scope of the language entirely. Using a non-existent index into an array, for example. While you might reasonably expect the program will just look past the end, there is no guarantee it will do so. Optimizing compilers in particular will assume such a thing cannot happen, and can assume a code branch that does something like this is impossible to reach and discard it entirely.

No one will "fix" such an optimization, because it is valid for ASTs that may have reached that form from conforming code, for example code generated by macros, where the offending branches would never actually be called. There's nothing to fix.

You're telling it to do something impossible, and it's assuming it can't happen.

There's also behavior which is undefined for hardware reasons.

An example is what the C standard calls "trap representations": Bit patterns which fit into the space occupied by a specific type, but which will cause a hardware trap (exception, interrupt, what have you) if you actually store them in a variable of that type. The only type which cannot have trap representations is unsigned char. Basically, what it amounts to is this: C compilers don't compile to a runtime, they compile to raw machine code with, perhaps, a standard library. If you do something the hardware doesn't like when your program runs, well, the C compiler is long gone by that point and the C standard makes no guarantees.

More prosaically, storing to a location beyond the end of an array might not cause a segfault. It might corrupt some other array, it might cause a hardware crash, it might even corrupt the program's machine code. Because C is explicitly a language for embedded hardware, with no MMUs, no W^X protection, and no OSes, the C standard can say very little about such things.

You're mixing up "undefined behavior" and "implementation-defined" behavior. Implementation-defined behavior is fine. "Undefined behavior", as the term is used in the C spec, means that literally anything can happen. The compiler is allowed to assume that UB never happens, so if it does the program can produce random results.

See http://en.cppreference.com/w/cpp/language/ub

> You're mixing up "undefined behavior" and "implementation-defined" behavior. Implementation-defined behavior is fine. "Undefined behavior", as the term is used in the C spec, means that literally anything can happen.

Actually I didn't. My point was rather obvious: the whole point of the standards specifying UB is precisely to let implementations define the behavior themselves.

> the whole point of the standards specifying UB is precisely to let implementations define the behavior themselves.

This used to be the case. Signed integer overflow, for instance, is undefined because some CPUs go bananas when you try that. Other platforms performed 2's complement wraparound just fine, and we used to be able to rely on this.

No longer.

See, the standard doesn't say "implementation defined". It doesn't say "undefined on platforms that go bananas, implementation defined otherwise". It says "undefined" period.

Signed integer overflow is undefined on all platforms, even your modern x86-64 CPU. Compiler writers interpreted it as a licence to assume it never happens, to help optimisations. For instance:

  int x = whatever;
  x += small_increment;
  if (x < 0) {   // check for overflow
      abort();   // security shut down
  }
  proceed(x);    // overflow didn't happen, we're safe!
Here's what the compiler thinks:

  int x = whatever;
  x += small_increment;
  if (x < 0) {   // only true if signed overflow -> false
      abort();   // dead code
  }
  proceed(x);
Then the compiler simply deletes your security check:

  int x = whatever;
  x += small_increment;
  proceed(x);
Don't listen to Chandler Carruth, nasal demons are real. Some undefined behaviours can encrypt your whole hard drive, assuming they're exploitable by malicious inputs.

This is the sort of emotionally-charged fearmongering around UB that really makes any discussion pointless. That example is wrong: x is signed and can legitimately be negative, so unless the compiler can prove x >= 0, it simply cannot remove that code.

Now, if you used

    unsigned int x = whatever;
    if(x < 0)
There would be an obvious case for removing that if.

A very simple test case demonstrates that GCC can remove tests in the presence of signed overflow, even in ways that change a program's behavior.

    $ cat undefined.c
    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main() {
        int x = INT_MAX;
        if (x+1 > x) {
            printf("%d > %d\n", x+1, x);
        }
        return 0;
    }
    $ gcc --version
    gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
    Copyright (C) 2017 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc undefined.c && ./a.out
    $ gcc -O3 undefined.c && ./a.out
    -2147483648 > 2147483647

Yes, that example is well-known but different; here, the compiler is assuming that x + 1 will always be greater than x, which is something else entirely than the parent's assertion that x + small_increment will always be positive.

The difference doesn't matter, you would know better if you weren't clinging so hard to your beliefs. Here's the "difference":

  $ cat undefined.c
  #include <limits.h>
  #include <stdio.h>
  #include <stdlib.h>

  int main() {
      int x = INT_MAX;
      if (x+1 < 0) {
          printf("%d < 0\n", x+1);
      }
      return 0;
  }

  $ gcc --version
  gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  Copyright (C) 2015 Free Software Foundation, Inc.
  This is free software; see the source for copying conditions.  There is NO
  warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

  $ gcc undefined.c && ./a.out
  -2147483648 < 0

  $ gcc -O3 undefined.c && ./a.out
The security check is gone all the same.

Your "small_increment" needs to be at least large enough to turn the smallest negative int into a non-negative number, and that makes it unrepresentable as a signed int itself.

There are cases where the compiler removes misguided overflow checks, since "perform UB, then check whether it happened" doesn't actually work, but your example is not such a case.

> "perform UB, then check whether it happened" doesn't actually work

This raises an interesting point, because in many cases, assuming 2's complement and wrapping around, checking for overflow after the fact, as opposed to preventing it from happening in the first place, is actually easier. (And it actually works if you use the `-fwrapv` flag.)

The right thing should be easier to do than the wrong thing. It's a shame this is not the case here.

I was of course assuming that `whatever` was positive, and the compiler knew it. See this sibling thread: https://news.ycombinator.com/item?id=16664546

> let implementations define the behavior themselves

That's literally the definition of implementation-defined behavior.

Undefined behavior really means undefined; in terms of the C language, there are no constraints on behavior. Sure, you might get a result one way on one implementation, but if you rely on that you're technically writing a dialect of C, and need to let the compiler know using flags.

Implementation defined behavior does the same thing for a given implementation. The compiler can produce totally different code every time you compile a program with UB. It can even affect code that is totally unrelated to the UB.

I'm not sure I buy that second part.

Legalistically, I guess it could (the behavior isn't defined, after all), but typically the optimizer makes some valid-only-if-the-code-is-UB-free deductions and things snowball from there....

That's not the point, though. In the case of undefined behavior, a compiler doesn't have to define the behavior or even act consistently (most evident by the behavior of the compiler's optimizer).

This is entirely different than implementation defined in that a conforming compiler has to document the behavior they implement and do it consistently.

I'll just refer you to a comment I wrote last week, which covers a specific case where undefined behavior, and how a compiler chose to handle it, caused a security problem in Cap'n Proto.[1] I think it's about as condensed an explanation as I've seen, while linking to further information.

1: https://news.ycombinator.com/item?id=16596409

I've noticed the same thing. A strongly adversarial relationship between compiler writers and users, justified by religious adherence to The Holy Standard and a complete lack of understanding of how people actually want and expect the language to behave in practice.

Undefined behavior just refers to something that is not covered by the international standard, and therefore doesn't exist nor should be used, but an implementation may offer implementation-specific behavior

Indeed. Even the standard itself agrees; to quote its definition of undefined behaviour (emphasis mine):

"behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message)."

The fact that the standard "imposes no requirements" should not be taken as carte blanche to completely ignore the intent of the programmers and do entirely unreasonable things "just because you can", yet unfortunately quite a few in the compiler/programming language community think that way.

It's why I'm very encouraging of more programmers writing their own compilers and exploring new techniques, to move away from that suffocating and divisive culture. Compilers should be helping users, not acting aggressively against them just because it's allowed by one standards committee.

The thing is that you only see this compiler writer culture among C and C++ devs.

Other language communities that put correctness before performance at any cost, don't share this mentality, including the compiler writers.

> I don't get this "undefined behavior" cliche.

The problem is that you can do everything in portable C that you can do with undefined behavior in C, and often in a more straightforward fashion.[1] The compiler won't tell you that you are doing it wrong; there are many examples, tutorials, and books that encourage you to do the wrong thing. The undefined behavior will work until you switch to a different system. Why allow the wrong thing to continue to happen?

[1] A very good example of this is endianness:



You are conflating undefined with implementation defined. If you use undefined behavior the compiler might simply delete your code since it's not defined.

You and a lot of people in this discussion seem confused the other way. The amount of things in C that are implementation-defined is relatively small and most has to do with character encodings and byte size (storage unit). The kind of behavior people are pointing to is definitely labeled as undefined, and there is a lot of it. Take a look at Harbison and Steele for example.

Or it's something undesirable and you parroting back the definition of UB doesn't add anything.

I think many people overlook that "undefined behavior" mainly limits portability. Invoking undefined behavior won't cause your program to do random things, it just might not do the same thing when you switch to a different compiler. This is still a big problem, but not as big as people make it out to be, in my opinion. I think undefined behavior problems in the standard could be fixed, but compilers would need special flags for backwards compatibility with programs that rely on it.

> Invoking undefined behavior won't cause your program to do random things

According to the spec, it literally can. The compiler is free to replace your entire program with unrelated functionality. It can even do a different thing each time it compiles the program.

There are implementation specific behaviors (# of bits in a char), which are different.

Undefined behavior is undefined even using the same version of the same compiler. If it does produce the same repeatable behavior, consider it luck.

Implementation-defined behavior limits portability in the way you describe, but UB's not the same thing. IB should behave the same way in the same implementation. UB doesn't have to.

> "undefined behavior" mainly limits portability

I had the impression that some things are UB because they can't specify a single behavior that would be efficient across all platforms C targets.

No, that's implementation-defined. Undefined means code that is logically incorrect, but too expensive for the compiler/runtime to check for and handle (reject). A simple example is out-of-bounds array access. It's too expensive to demand the compiler prevent it or guarantee to trigger some kind of signal at runtime. But if you are willing to pay the cost, you can use a language or compiler that is safer and promises to throw an exception, or provides safe statically sized arrays.

> function pointer type declarations inside anonymous unions inside parameter definition lists

Could you please provide a code snippet of this kind? Hard for me to visualize otherwise. Thanks.

Sure, let's do it in pieces:

    typedef int (*my_func_ptr_t)(const void*,const void*);

    typedef union { my_func_ptr_t func_ptr; int* data; } my_union_t;

    void my_func (my_union_t param);

    // Now expanding it inline

    void my_func (union { int (*func_ptr)(const void*,const void*); int* data; } param);
Of course this is a very basic example, but there is hope of making it into an IOCCC entry.

Can you explain why exceptions and operator overloading are "idiotic" things? Are you from the Go school of boilerplate-error-checking-code design, or something?

Exceptions work great in garbage collected languages. Reasoning about exceptional control flow w.r.t. manual memory management is a total nightmare.

> Reasoning about exceptional control flow w.r.t. manual memory management is a total nightmare.

Exceptions make manual memory management easier because a proper exception system has unwind-protect[1]. Exceptions are just movements up the stack - exceptions combine naturally with dynamic scoping for memory allocation (memory regions/pools). This kind of memory management was used in some Lisp systems in the 1980s, and made its way into C++ in the form of RAII. By extending the compiler you can add further memory management conveniences like smart pointers to this scheme.

Now if you want to talk about something that actually makes manual memory management a total nightmare, look at the OP's suggestion for adding closures to C.

[1] http://www.lispworks.com/documentation/HyperSpec/Body/s_unwi...

C does not have RAII-like memory management in any way. Exceptions work beautifully with memory management like that, but if it's not there, you can't just say it should work because memory management should work like that.

So basically you're saying, before adding exceptions, add RAII-like memory management, and then actually add exceptions. I like both features, but am not sure how you'd wedge RAII into C. Any ideas on that?

> C does not have RAII-like memory management in any way.

Yes it does, as language extension on gcc and clang.

It is called cleanup attribute.



That's not C, _the language_, those are compiler features.

I explicitly mentioned they are extensions.

Thanks! I learned something. Didn't know about that.

> C does not have RAII-like memory management in any way.

C does not have memory management in any way period. The C standard library does. How you get to something with dynamic scoping like RAII in C is to use a different library for managing memory. For example Thinlisp[1] and Ravenbrook's Memory Pool System[2] both provide dynamically-scoped region/pool allocation schemes.

[1] https://github.com/vsedach/Thinlisp-1.1

[2] https://www.ravenbrook.com/project/mps/

[On the Cforall team] For what it's worth, one of the features Cforall adds to C is RAII.

The exception implementation isn't done yet (it's waiting on (limited) run-time type information), but it already respects RAII.

I have often wanted c99 with destructors for RAII purposes.

you do not need destructors if you put your stuff on the stack

just putting stuff on the stack in C won't magically call `fclose` or `pthread_mutex_unlock`, unlike destructors

I might have missed this.. but how is Cforall implemented?

A new GCC or LLVM frontend, or is it a transpiles-to-C implementation à la Nim or Vala?

Transpiles-to-(GNU-)C -- it was first written before LLVM, if we were starting the project today it would likely be a Clang fork.

Clang harks back to 2007 and LLVM 2003.. is this a research project that was recently taken back up?

I was curious about the implementation because I've had rough experiences with Vala and Nim's approach. Unlike with "transpiles-to-js" languages, transpiling to C has some tooling gaps (debugging being the big one). I admittedly don't have a ton of experience with either language but I couldn't find a plugin that gave me a step-through debugger for something like CLion or VS Code. You can debug the C output directly but this will turn off newcomers and assumes the C output is clean.

The initial implementation was finished in '03, and we revived the project somewhere around '15, so your guess about a research project that was recently taken back up is correct.

We intend to write a "proper compiler" at some point (probably either a Clang fork or a Cforall front-end on LLVM), but it hasn't been a priority for our limited engineering staff yet. I think we are getting a summer student to work on our debugging story (at least in GDB -- setting it up so it knows how to talk to our threading runtime and demangle our names), and improving our debugging capabilities has been a major focus of our pre-beta-release push.

What I will like is more strict compile time checks. Most C pros have to rely on external tooling for that.

It's maybe not quite what you're looking for, but Cforall's polymorphic functions can eliminate nearly all the unsafety of void-pointer-based polymorphism at little-to-no extra runtime cost (in fact, microbenchmarks in our as-yet-unpublished paper show a speedup over void-pointer-based C in most cases, due to more efficient generic type layout). As an example:

    forall(dtype T | sized(T))
    T* malloc() {  // in our stdlib
        return (T*)malloc(sizeof(T)); // calls libc malloc
    }

    int* i = malloc(); // infers T from return type

Excuse me for my lamerism, but can you tell me what is a polymorphic function?

My idea was that it is better to do as many compile-time checks as possible before you introduce run-time checks. Does that void-pointer protection run faster than code that was checked at compile time? How?

A polymorphic function is one that can operate on different types[1]. You would maybe be familiar with them as template functions in C++, though where C++ compiles different versions of the template functions based on the parameters, we pass extra implicit parameters. The example above translates to something like the following in pure C:

    void* malloc_T(size_t sizeof_T, size_t alignof_T) {
        return malloc(sizeof_T);
    }

    int* i = (int*)malloc_T(sizeof(int), alignof(int));
In this case, since the compiler verifies that int is actually a type with known size (fulfilling `sized(T)`), it can generate all the casts and size parameters above, knowing they're correct.

[1] To anyone inclined to bash my definition of polymorphism, I'm mostly talking about parametric polymorphism here, though Cforall also supports ad-hoc polymorphism (name-overloading). The phrasing I used accounts for both, and I simplified it for pedagogical reasons.

Good point.

Sum types are the new trend.

Exceptions, because in embedded contexts they may not always be a good idea (and C targets such contexts). Overloading, because it is too easy to abuse, and as such it gets abused a lot by those who do not know better. The rest of us are then stuck decoding what the hell "operator +" means when applied to a "serial port driver" object.

> 3) is missing a few real improvements (closures, although it is not clear whether the "nested routines" can be returned)

Ah, I wish Blocks[0] would have made to into the C language as a standard†... Although you can use them with clang already:

    $ clang -fblocks blocks-test.c # Mac OS X
    $ clang -fblocks blocks-test.c -lBlocksRuntime # Linux
Since closures are a poor man's objects, I had some fun with them to fake object-orientedness[1].

† or at least that the copyright dispute between Apple and the FSF for integration into GCC would have been resolved (copyright transferred to the FSF being required in spite of a compatible license).

[0]: https://en.wikipedia.org/wiki/Blocks_%28C_language_extension...

[1]: https://github.com/lloeki/cblocks-clobj/blob/master/main.c#L...

Constructs like closures come at a cost. Function call abstraction and locality means hardware cannot easily prefetch, instruction cache misses, data cache misses, memory copying, basically, a lot of the slowness you see in dynamic languages. The point of C is to map as close to hardware as possible, so unless these constructs are free, better off without them and sticking to what CPUs can actually run at full speed.

Closures are logical abstractions and cost nothing, since they are logical. Naive runtime implementations of closures can of course be a bit slower than native functions, but so can be everything.

Closures cost a lot if we are talking about real closures that capture variables from the scope where they are defined, because you need to save that information somewhere, so you need to allocate an object, with all the complexity associated.

And it can easily get very tricky in a language like C where you don't have garbage collection and you have manual memory management: it's easy to capture things in a closure and then deallocate them. Imagine if a closure captures a struct or an array that is allocated on the stack of a function, for example.

I think we don't need closures in C. The only thing I think we would need is a form of syntax for anonymous functions, which cannot capture anything, of course; it would do most of the things that people use closures for without any performance problems or added runtime complexity.

> so you need to alloc an object

Not always! Rust and C++ closures don't need to allocate in every case. I can speak more definitively about Rust's, but as long as you aren't trying to move them around in certain ways, there's no allocation, even if you close over something.

Consider this sum function, which also adds in an extra factor on each summation:

  pub fn sum(nums: &[i32]) -> i32 {
      let factor = 5;

      nums.iter().fold(0, |a, b| a + b + factor)
  }
The closure here closes over factor. There's zero allocations being done here.

If you want to return a closure, you may need to allocate. Rust will let you know, and the cost will be explicit (with Box). That's where my sibling's comment comes into play.

If a closure cannot be optimized out, i.e. the scope of the closure outlives the scope of the function it captures a variable from, then this closure is equivalent to a heap-allocated struct, which cannot be allocated on the stack either if it outlives its scope. So the cost is still the same.

Couldn't agree more. And I'll add another:

The suggested syntax is ridiculous. What is this punctuation soup?

   void ?{}( S & s, int asize ) with( s ) { // constructor operator
   void ^?{}( S & s ) with( s ) {          // destructor operator
   ^x{};  ^y{};                              // explicit calls to de-initialize

This has been tried many times before, and eventually all these attempts die a lonely death. Why use extensions anyway? If one desired the luxury of modern scripting languages, switch to C++, Rust, Go or one of the other alternatives the article mentions.

Because regardless of how some of us might dislike C and its security-related issues, the truth is that no one is ever going to rewrite UNIX systems in another language, nor the embedded systems where even C++ has issues gaining market share.

So if one finally manages to get a safer C variant that finally wins the hearts of UNIX kernels and embedded devs, it is a win for all, even those that don't care about C on their daily work.

Until it happens, that lower layer all IoT devices and cloud machines will be kept in C, and not all of them will be getting security updates.

I do not question the usefulness of C; I use it in my daily work. What I am saying is that most C developers who use the language day-in-day-out know quite well what they are doing, and don't need yet another non-standard way of writing the code. Safety is a good point, but the initiative doesn't even mention the word, and there is no reason to assume the C-for-All extension targets safety at all.

"What I am saying is that most C developers that use the language day-in-day-out know quite well what they are doing"

I'm going to strongly disagree with that statement.

I might have agreed with you if this was C++. C is a small enough language that most programmers can understand most of it.

Toyota was full of programmers that apparently "understood what they were doing."

Fair point, but changing the language won't change the amount of global state and the associated complexity, nor will it make subsystem supervision work correctly. Changing the language will not prevent unbounded recursion or the associated stack overflows and subsystem failures. Changing the language will not fix mis-analysis of task-switching overhead. And changing the language will not fix manufacturing issues with the PCBs.

As far as I'm aware, one of the very few toolchains that even tries to improve on this over C is Ada/SPARK.

> but changing the language won't change the amount of global state

Global mutable state is marked as Unsafe in Rust.

> nor will it make subsystem supervision work correctly

Erlang is built specifically around this concept.

Perfect is the enemy of good here, throwing out a whole language due to one case doesn't help anyone.

> Global mutable state is marked as Unsafe in Rust.

It's also the simplest way to avoid dynamic allocation and the associated OOM issues. So, short of doing static analysis to bound heap usage at compile time, that makes things worse.

And Toyota already got the static analysis wrong for their stack usage. At least globals will fail to compile if they won't fit.

Are you referring to the "unintended acceleration" scandal from ~10 years ago? If so the NHTSA investigated[0] that and found no flaws in electronics. The problem was essentially people pushing the gas when they thought they were on the brakes. Pedal "misapplication" I think it's called in the report.

[0] https://www.transportation.gov/briefing-room/us-department-t...

> The problem was essentially people pushing the gas when they thought they were on the brakes

Oh dear no. Certainly not.

Read https://users.ece.cmu.edu/~koopman/pubs/koopman14_toyota_ua_...

My favourite part from it is, "Watchdog kicked by a hardware timer service routine".

A watchdog timer is a piece of hardware that decrements a counter every microsecond or similar. The control system's main loop, running on the CPU, "kicks" the watchdog by setting the counter to a value like 1000 each iteration. The result is that if the CPU fails to execute the main loop often enough, the watchdog will "fire". This a) tells you that you have a bug and b) typically reboots the system so it has a chance to recover.

Toyota used a timer service routine to kick the watchdog. This defeats the purpose of the watchdog. The control software can happily get stuck or crash and the watchdog will not notice. The fact that an engineer added this "feature" tells you that the watchdog was firing in development. That should have been addressed by fixing the buggy software, not by disabling the test.

The fact that the disabled watchdog made it into the production release is unforgivable.

In that investigation they seemed to place the blame more on sticky pedals and floor mats, not operator error.

That wasn't the final word, though. I believe this is what the GP was referring to:

> When NASA software engineers evaluated parts of Toyota’s source code during their NHTSA contracted review in 2010, they checked 35 of the MISRA-C rules against the parts of the Toyota source to which they had access and found 7,134 violations. Barr checked the source code against MISRA’s 2004 edition and found 81,514 violations.


> Their descriptions of the incredible complexity of Toyota’s software also explain why NHTSA has reacted the way it has and why NASA never found a flaw it could connect to a Toyota’s engine going to a wide open throttle, ignoring the driver’s commands to stop and not set a diagnostic trouble code. For one, Barr testified, the NASA engineers were time limited, and did not have access to all of the source code. They relied on Toyota’s representations – and in some cases, Toyota misled NASA.


Bad practices can carry over to any language. They’re clearly incompetent at using C, but that doesn’t mean that they won’t be incompetent in another language.

Even the ANSI/ISO C working group acknowledges that “trust the programmer” doesn’t quite work.

This is because trusting the programmer is fundamentally wrong, no matter the programming language. In any good development process the actual coding is the least amount of work - for a reason.

This was the statement I was referring to.


Spirit of C:

a. Trust the programmer.

b. Do not prevent the programmer from doing what needs to be done.

c. Keep the language small and simple.

d. Provide only one way to do an operation.

e. Make it fast, even if it is not guaranteed to be portable.

The C programming language serves a variety of markets including safety-critical systems and secure systems.

While advantageous for system level programming, facets (a) and (b) can be problematic for safety and security.

Consequently, the C11 revision added a new facet:

=> f. Make support for safety and security demonstrable.



Okay, this makes me rethink my previous statement. There are two kinds of trust here: A low level programming language should not impose restrictions on the code that prevent the programmer from doing what needs to be done, even if it looks wrong. This is how I read the 'Spirit of C" that you quoted. And certain applications would be impossible to write without it. But you need a development process to make sure that your system does exactly the right thing. So the quote should read "trust the process" rather than "trust the programmer".

The trouble with a "safer C variant" is that it must remove features, or at least more heavily constrain programs to a safer subset of the language. This makes it not backwards-compatible.

I think the only successful "subset of C" is MISRA.

SaferCPlusPlus[1], for example, is a safe subset of C++ that has compatible safe substitutes for C++'s (and therefore C's) unsafe elements. So migrating existing C/C++ code generally just requires replacing variable declarations, not restructuring the code.

For C programs, one strategy is to provide a set of macros to be used as replacements for unsafe types in variable declarations. These macros will allow you, with a compile-time directive, to switch between using the original unsafe C elements, or the compatible safe substitutes (which are C++ and require a C++ compiler).

The replacement of unsafe C types with the compatible substitute macros can be largely automated, and there is actually a nascent auto-translator[2] in the works. (Well, it's being a bit neglected at the moment :)

Custom conventions using macros to improve code quality are not that uncommon in organized C projects. Right? But this one can (optionally, theoretically) deliver complete memory safety. So you might imagine, for example, a linux distribution providing two build versions, where one is a little slower but memory safe.

[1] shameless plug: https://github.com/duneroadrunner/SaferCPlusPlus

[2] https://github.com/duneroadrunner/SaferCPlusPlus-AutoTransla...

Maybe Frama-C as well.

I remember reading a paper from around 2007 that asserted that most of MISRA did not catch or significantly prevent major bugs in code, indeed it asserted that much of the standard was useless. I am failing to find it now, as I cannot remember what terms I used, and I am not at a library computer and therefore I cannot search behind paywalls beyond abstracts.

Cannot say that this is unexpected, but I was interested to find some papers, presumably these two: Assessing the Value of Coding Standards: An Empirical Study [1], Language subsetting in an industrial context: A comparison of misra c 1998 and misra c 2004 [2]

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=

[2] http://leshatton.org/Documents/MISRA_comp_1105.pdf

Interesting, thanks for the papers.

What makes you think that a safer C variant would win the hearts of UNIX kernels and embedded devs any more than C++ (which started as just a C variant).

Because most UNIX kernel devs are religiously against the idea of ever touching C++, even if their beloved C compilers are implemented in C++.

I highly suspect it's not that Unix C diehards are against the idea of touching C++—it's that they're against the idea of using anything but C. I don't think anything can win over those kernel developers.

It's the complexity of the language vs. the benefits it provides. C has many pitfalls (UB) but it's simple. IMO a language that could migrate C programmers would have dependent types and an effect system with a nice syntax.


It can compile small enough to run on an Arduino: https://github.com/stepcut/idris-blink

Yep you are quite right.

Other OSes not tied to UNIX culture were always more open to reach out for C++, even if constrained to a certain subset.

Using a compiler partially written in C++ is very different from writing and maintaining a kernel (or whatever) in C++.

Maybe. The L4 people did write one of theirs in C++. L4-style kernels are among the fastest in existence. Some even fit in the L1 cache of the CPU.


I can see your point if you're saying compiler use lets them avoid a language they just don't want to use, which they couldn't if using it for an OS.

Other OSes, not impregnated with UNIX culture, have a different view on C++'s use on kernel space.

OS X and its descendants are the only exception.

Which goes back to NeXTSTEP using Objective-C and offering UNIX compatibility only as a path to bring software into the system, battling against SGI and Sun market space.

Even OS X's C++ usage is kind of a historical oddity. OPENSTEP drivers were written in ObjC for an API called DriverKit, but OS X/Darwin/xnu replaced that with the C++ IOKit. The rest of the kernel is still C. I wrote a comment a few years ago explaining why I think that decision was made:


Thanks for the hint.

Simplicity, no STL or templating, no OOP connotations

I don’t necessarily think it would, but if it did, those would all be reasons

[actually on the Cforall team] This is basically our pitch -- the last 30 years of language design features applied to a language that is not only source-compatible with C (like C++), but actually maintains the procedural paradigm of C (unlike C++) -- idiomatic C code should require minimal to no change to make it idiomatic Cforall code, and skilled C programmers should be able to learn the extension features orthogonally, rather than needing to figure out complex interactions between, say, templates and class inheritance. We are working in the same space as C++, and do draw inspiration from C++ where useful, but with the benefit of decades of watching C++ try things to see where the important features are.

There's also some really neat language-level concurrency support; work is ongoing on new features and a shorter summary, but you can see one of our Master's student theses for details: https://uwspace.uwaterloo.ca/handle/10012/12888

The existence of exceptions seems to belie the idea of minimal changes to make idiomatic C code idiomatic Cforall code.

While C does have longjmp and friends, usage of them is hardly idiomatic, so most C code assumes no non-local transfer of control happens when calling functions. Coding with non-local transfer of control and without it requires very different idioms.

It would probably be much smaller than the gigantic C++ standard.

Okay, so are the creators of this safer C promising that it isn't going to grow to the same size as C++? I guess if they're promising that, that makes sense.

C++ was a C variant once.

Because it's not C++, it's C.

It doesn't require them/us to change much, just add these flags and the compiler will warn you about things that are unsafe.

I've always found it quite sad that no one is interested in bettering C enough to push relevant changes through.

Because there is this myth that any good programmer is able to write safe C.

Only newbies make memory corruption errors.

Yet even Dennis acknowledged correct code mattered, and Johnson created lint in 1979!

Largely ignored until clang and its analyzers came into the scene.

counterexample: RedoxOS

Although it's more along the lines of Plan9 - a unix-like system that ignores the bits of POSIX that really suck.

If it doesn’t support 100% POSIX it isn’t UNIX.

Also how would you make memcpy() safe in a POSIX implementation on Redox?

Almost nothing is 100% POSIX. Most Linux distros aren't. Who cares as long as the important bits are there.

Then most linux distributions aren't UNIX, as they do tend to deviate from POSIX in places.

It's GNU/Linux, and GNU = GNU's Not Unix.

So it says Not Unix right on the tin.

To be perfectly fair, if it isn't UNIX it isn't UNIX.

> Also how would you make memcpy() safe

Easy, don't have memcpy().

Which means that the minimal requirement to win over kernel and embedded devs is to integrate well with the rest of the C ecosystem, including the myriad of C compilers, and to be really well suited for low-level work. This excludes pretty much all ideas except meta-languages that produce C code. It might even be necessary to promote the language itself not as a new language, but as a meta-preprocessor for C, to avoid alienating developers. But realistically this is neither feasible nor necessary. There are much more feasible ideas to improve safety than forcing half of the world to learn a lot of new things and change.

Unfortunately not all systems are Solaris running on SPARC, with memory protection enabled.

Something akin to Intel MPX.

They're not going to rewrite any of them in this dialect, either.

You can use C++14 on embedded devices just fine, and people do.

You can, as long as you stay away from templates and lambdas and STL containers; in short, most of the reasons you’d use C++.

STL, RTTI and Exceptions should be avoided on embedded platforms (talking about 8 bit µCs here). I've extensively used both templates and lambdas on 8 bit AVRs (both the Tiny and Mega series); actually, writing templated code for µCs is a great way to avoid overhead stemming from function pointers etc. while still having well maintainable code.

> Because regardless how some of us might dislike C and its security related issues, the truth is that no one is ever going to rewrite UNIX systems in other language

Nor will they in this.

> So if one finally manages to get a safer C variant that finally wins the hearts of UNIX kernels and embedded devs

A safer variant wouldn't be C. What makes C great for OS development is that it is just a step above assembly, and you as a developer are given a tremendous amount of power to do good and evil. C#/Java are programming languages with training wheels, and that's great for application development. But for the low-level coding required for OSes, network stacks, databases, etc., you really have to take the training wheels off.

I suppose you can try to make the C type system more stringent, but then it wouldn't be C. And considering they are aiming for backwards compatibility with existing C and its immense code infrastructure, they will have to keep the "flaws" in Cforall.

Time would be better spent making the libraries/kernel/etc sturdier but if they can pull it off and win the hearts and minds of OS developers, then so be it.

Also, people have been trying to sideline C for decades. Each attempt has only reinforced C's standing and reminded us why C is so essential for OS development. Anyone remember the ill-fated attempt by Sun with their JVM centered JavaOS?

The only way to make C safe without losing performance would be to accompany your C code with a formal proof that it avoids undefined behavior, and use a compiler which refuses to compile the code if the formal proof doesn't validate.

Which would be essentially impossible for any language like C.

KCC is an executable, formal semantics for C that does something like that. Runtime Verification Inc uses it for their bug-hunting tools.



A full-time C# dev here, can confirm. It's an awesome business-logic language, but the hot path eventually gets rewritten in a very C-like style, with all the LINQ, exceptions and allocations thrown out.

The new 7.x features are going to make it easier.

Some of us are still stuck in 3.5

Can you explain why? I assume those c# programs are used for backend stuff on servers, surely it isn't hard to get a 4.6+ runtime installed?


I can switch to 4.6, but don't want the risk of an experimental feature yet.

Right, forgot about Unity, that makes sense :)

Microsoft has already proven twice that languages with training wheels can be used for writing OSes.

Google is using languages with training wheels to write core components of Fuchsia (the TCP/IP stack and file-system tools are written in Go), as well as the new Android GPU debugger (also in Go).

Looks pretty ambitious. My take from skimming:

* switch, if, choose and case extensions look good.

* I can see the justification for labelled break/continue, but looks pretty hairy. Might discourage rethinking and refactoring to something simpler.

* I'm wary of exceptions.

* I don't like the 'with' clauses.

* Weird to add syntax just for mutexes, but they integrate concurrency/coroutines later, so maybe it makes sense.

* Tuples are generally useful, but C11's unnamed structs are generally good enough, ie. instead of [int, char] you can return "struct { int x0; char x1; }" or something.

* New declaration syntax is welcome, but the old syntax probably isn't going away, so I'm not sure it's a good idea.

* Constructors/destructors are good. Syntax looks weird though.

* Overloading is very welcome.

* Not sure about operators, but they have their uses.

* Polymorphism is welcome, though it looks a bit cumbersome, and it should come with a monomorphisation guarantee for C.

* Traits seem like too much for a C-like language. I can see the uses, and the compiler can optimize this well, but they're probably too powerful.

* Coroutines are cool.

* Streams look interesting, but the overloading of | will probably be confusing.

I'm more or less in agreement, but I just thought it was worth adding that the tuples could actually have a lot of merit; I think I'd like to see them (though I'm not sure the syntax is perfect parsing-wise. It might be smart to prefix them, like `tuple [int, char]` or something.).

It seems like anonymous structs fill the void, but a big problem with anonymous structs is that their types are never equal to any other, even if all the members are the exact same. So that means that if you declare the function as returning `struct { int x0; char x1; }` directly, it's actually mostly unusable, because it's impossible to declare a variable with the same type as the return type. Obviously, the fix is to forward declare the `struct` ahead of time in a header file somewhere and then just use that type name, but that gets annoying really fast when you end up with a lot of them. Tuples would allow you to achieve the same thing, but with a less verbose syntax, and they would be considered the same type even without forward declaring them.

> So that means that if you declare the function as returning `struct { int x0; char x1; }` directly, it's actually mostly unusable because it's impossible to actually declare a variable with the same type as the return type.

Are you sure about that? I remember playing with this last year and structural equality seemed to work when returning structures from functions. I was using clang, so it could conceivably have been an extension... (edit: some online C compilers do indeed return an error in this case)

If that's the case, then just make anonymous structs employ structural type equality and you have better tuples.

`gcc` definitely throws an error. It tells you something like "struct <anonymous> was expected but struct <anonymous> was found". It's a pretty fantastic error message /s

> If that's the case, then just make anonymous structs employ structural type equality and you have better tuples.

Yeah, that would work; I'd be fine with that. I don't think it's quite as good as a dedicated syntax though, just because the `struct` syntax is a lot more verbose than a concise tuple syntax could be, and defining `struct`s inline is pretty clumsy.

Yes it's more verbose, but it avoids adding a new primitive to the language for something that is probably not too common. I'm also not a fan of tuples because the fields aren't named. I mean, which field in a return type of [int, int] do I want exactly?

At least anonymous structs would name the fields and so the type serves also as documentation.

GNU C is probably my favorite extension of C. There's a lot of good stuff in there. The vector extensions make it really easy to write platform agnostic SIMD code.


Please don't use GNU C, or any other non-standardized version of C. A huge part of the reason C that is so widespread is because it's a well defined standard implemented by many compilers for many platforms. GNU C is defined by its implementation, which is awful.

> Please don't use GNU C, or any other non-standardized version of C.

Everything standard was once non-standard; if no one uses it, it will never be standardised, and we will be left with a poor status quo. For instance, there wouldn't be int8_t, etc. if people weren't using non-standard macros beforehand. Likewise for atomics, threads, etc.

I disagree with this. A lot of GNU C works on both GCC and Clang, which covers most platforms out there.

Those extensions are useful and allow better portability across architectures. E.g. the SIMD extensions are much better than writing two implementations with NEON and SSE intrinsics.

Please do use GNU C if you're going to use C. The viral nature of the Linux kernel has forced GNU C to be an important de facto standard. Take advantage of that!

Google has taken the effort to make Linux compile with clang, as part of their efforts to wipe gcc from Android toolchain.

There is a LLVM talk about it.

However I do agree with you.

clang and gcc cover most of the systems that matter today and their C extensions definitely make C a safer language.

There is nothing wrong with using GCC extensions.

Also, porting C is not that hard and does not require you to touch internals that much.

Why not? If I don't care about portability because, for example, I'm writing software that is meant to be used only on Linux (it uses Linux-specific libraries or system calls), and I know gcc is the standard there, why shouldn't I use the extensions if they can simplify my code?

I've used nested functions more often than I'd like to admit, because dealing with insane callback-heavy APIs is made a lot easier with them.

I've never really thought to use nested functions for that. I guess I've never really thought of what nested functions were for.

Those of us who had nested functions/procedures and “procedural types” in Pascal (or Algol, but that’s beyond my experience) back in the day have :-)

Then K&R came, and took our nice toys away.

Fortunately, the GNU Pascal compiler needed nested subroutines, so they exposed the feature in their C compiler, as well.

And now even C# has them. :)

On C++ we can fake them with lambdas.

Welcome to the lambda party ;). Passing nested functions as pointers is how lambdas often work behind the scene.

I love using nested functions when I have to write state machine code. It's a hell of a lot better than the old-school way of using macros to do the same thing.

Also, the compare-and-set operations that enable lock-free code are pretty cool.

Considering how hard it is to write truly exception-safe C++ and considering how major C++ code bases don't allow exceptions, adding exceptions to C does not seem like a good idea.

I've always liked the idea of djb's boringcc[0], except with different definitions of undefined based on what users were using C currently with. This would allow people to "upgrade" into boringcc with their current code bases. So with a single invocation of a compiler, you couldn't use more than one set of defined undefined behaviors.

[0]: https://groups.google.com/forum/m/#!msg/boring-crypto/48qa1k...

I would love a gcc optimization level, like -Og which only applies optimizations that don't interfere with debugging information, where all undefined behavior is specified.

Does anyone know if undefined behavior is specified in CompCert? Or does CompCert simply not allow you to write programs with undefined behavior?

Whether exceptions are good or bad depends on what error handling strategy your product needs. For some software, it's better to try and recover no matter what. For others, complete failure is preferable to operating with invalid state. Exceptions can be a blessing or a curse depending on what you need. Having them in your toolbox is certainly an advantage over having no choice.

>Having them in your toolbox is certainly an advantage over having no choice

I disagree. Dependencies, or coworkers, will use them despite your decision not to use them. When a dependency does use them, chances are the documentation is poor or non-existent.

"Recover no matter what" doesn't require exceptions. A common C idiom is to call a function like f(input, *err), where err points to memory where f can write error diagnostic info. Clunky, but I like how it makes the "exceptions" somewhat self-documented in the function signature.

> Considering how hard it is to write truly exception-safe C++

Is "writing truly exception-safe" something that necessary ? for me, the biggest benefit of exceptions is that I can have some code throw from anywhere and display a nice error pop-up message to my user which informs me of what went wrong and revert what was currently happening during the processing of the current event, since the "top-level" event loop is wrapped in a try-catch block. Often enough, the user can then just resume doing whatever he was working on.

> Is "writing truly exception-safe" something that necessary?

If you want your connections cleanly terminated, your temporary files removed, and your database transactions invalidated, yes.

> If you want your connections cleanly terminated, your temporary files removed, and your database transactions invalidated, yes.

Sure, and if you develop in C++ and put these in RAII classes, they will be cleaned up automatically.

Because they're exception safe. But it's also possible to use them in an exception unsafe way.

Well, that's what I don't understand about the OP's

> Considering how hard it is to write truly exception-safe C++

that's the default behaviour in C++ code, how hard can it be?

Exception safety is hardly a "default behavior" of C++, considering such gems[1] as:

    // This is unsafe.
    sink( unique_ptr<widget>{new widget{}},
          unique_ptr<gadget>{new gadget{}} );

    // This is safe.
    sink( make_unique<widget>(), make_unique<gadget>() );
[1] https://herbsutter.com/2013/05/29/gotw-89-solution-smart-poi...

The default nowadays is basic exception safety, where nothing leaks but objects can get put in invalid states. Strong exception safety (rollback semantics) is still pretty hard.

I'm pretty sure that's what is meant by "truly exception-safe".

If your code isn't exception-safe, then carrying on after an exception may crash the program.

Without continued development of the language, C will be unable to cope with the needs of modern programming problems and programmers; as a result, it will fade into disuse.

C11 is pretty nice! C99 is too. One might think that "almost once a decade" is kind of slow for updates, but M$ have enough trouble keeping up with the current schedule. Of course TFA describes a possible direction for C2x, but they could have a more charitable attitude...

> will fade into disuse.

That's one thing I'd like C to do. It was an adequate language in 1970, but in 2018, we have a few better approaches that have a chance to turn into viable alternatives to C.

Of course, I don't see C falling into disuse any time soon. The amount of critical code written in C is enormous, without a way in sight to reasonably replace. So keeping C in a good shape is important, whatever shortcomings the language may have.

Not even that much in 1970, if you look into old papers about the systems programming languages being used at other research institutes besides AT&T.

Indeed. Having started my CS degree work in the early 80s, I got to see “things other than C/C++”.

In retrospect, I often jokingly refer to the 80s as “the x86/PC disaster”, an extinction level event for programming languages where the choices were to run either assembler, or C, on the IBM PC to build software of any size due to the limitations of the hardware.

Rather than trying to improve curly brace languages, it’s time to bury them.

For starters, hiding identifiers after arbitrarily long type expressions, instead of starting a line/block/expression with a name, is an un-fixable PITA. (Sorry, AT&T, Algol had it right)

Requiring a “break” in a case statement is a botch, instead of some kind of “or” / “set” / “range” test.

C is useful as a portable assembler, I guess. Otherwise I don’t really like any of C’s offspring all that much. I sort of like Javascript in spite of being forced to look like Java/C++, but that’s only because I learned some Lisp (alas, not Scheme) back in the day.

Microsoft does not have any trouble keeping up with C; C++ and .NET Native are the future of systems programming on Windows.

C compatibility is kept to the extent of ANSI C++ requirements.

ANSI C++14 requires C99 library compatibility and ANSI C++17 was updated for C11 library compatibility.


Herb Sutter doesn't like C. That may be the reason that the compiler group at Microsoft don't put any effort into updating the C compiler to support newer features.

They do put a great deal of work into the C++ compiler, and seem to be doing a way, way, better job than they were in the late 90s.

That may be the other reason they don't update the C compiler with new features.

The reason is official and has been communicated many times.

C related improvements will only be done to the extent required by ANSI C++, or requests from key customers that might influence roadmap.

Anyone that really deeply wants to keep using C on Windows, and even enjoys using COM directly from C (the main ABI since Windows 7 and UWP core stack), can use any other C compiler.

In fact Microsoft has suggested clang multiple times, and has helped clang devs to make it work better on Windows.

They are improving! Six years is less than fifteen.

This is the wrong way to go about it. The syntax is great, what is needed is updates to reflect modern hardware:

-Vector types. (With arbitrary sizes, not just up to vec4 like on GPUs)

-New operators like clamp, and other intrinsics. (See GLSL and modern instruction sets)

-Min/max: |< >|


-Cross product: ><

-Swizzle: a.

-Qualifiers for warping behavior.

-Standard library with cache management hints.

-Define qualifiers for padding of structs to be defined.

I would also argue you could change some things in the spec to make it easier for compilers to optimize, like making it undefined behavior for a switch to be entered with a value that matches no case.

As for the syntax itself, I could find being able to type multiple break commands in a row to get out of more than one loop useful, but it's not a big thing.

I would probably drastically restrict the power of the pre-processor too.

(If you have too much time on your hands: https://www.youtube.com/watch?v=443UNeGrFoM)

Peter Buhr also teaches CS 343: Concurrent and Parallel Programming [0] at the University of Waterloo in a dialect of C++ that he has been working on [1], called uC++ [2].

[0] https://www.student.cs.uwaterloo.ca/~cs343/

[1] https://github.com/pabuhr/uCPP

[2] https://en.wikipedia.org/wiki/%CE%9CC%2B%2B

[on the Cforall team] One of our Master's students has incorporated the majority of the uC++ features into Cforall as well, with some neat extensions for multi-monitor locking in a way that doesn't introduce synchronization deadlocks not present in user code.

As an alumnus, I firmly believe there's room for a 4th year course in addition to CS343. There's quite a lot in advanced control flow which can be covered.

I've always found it unfortunate that the university has courses from first year all the way to fourth year in Data Structures and Algorithms (all the way up to CS466/CS666) but Control Flow is treated like a second-class citizen.

[also a CS 343 TA] I personally agree with you -- if it were up to me I'd refactor CS 343 into a pair of courses, maybe focusing on high-level concurrency constructs with a follow-up course on building that sort of runtime system.

I personally really hated the use of uC++ and would have loved to do the whole course in some MIPS dialect. I really liked how through most of second year the only language reference I needed fit on one piece of paper (https://www.student.cs.uwaterloo.ca/~cs241/mips/mipsref.pdf). The uC++ language, on the other hand, is not specified anywhere except in the enormous 600-page textbook, and even then it is very far from fully specified (e.g. there was a builtin function called rendezvous-something, where that string literally appeared only once in the book, and it was not defined at that place).

I'd argue that's the point of moving from a 2nd year course to a 3rd year one. You incrementally add more complexity. The ISO C++ spec is a similar heavy tome.

What a waste of tuition money.

IMHO there are better proposals for a "better C" language which fix some of the shortcomings of original C:

- http://c2lang.org/site/

- https://ziglang.org/

- https://nim-lang.org/

As long as the language is small and has good tooling, and (most importantly) can easily interoperate with C libraries (have a look at Nim for a really awesome C integration), it doesn't matter whether it is backward compatible with C.

C itself should stay what it is. A low level and simple language without surprises which is only very slowly and carefully extended. Languages that are developed "by committee" and add lots of new features quickly are usually also quickly ruined.

I've written in the past about C2[1]. Nim is aimed a bit higher-level than a "C replacement" should be. I don't really know much about Zig, but from what I do know, I like it.

1: https://www.reddit.com/r/programming/comments/7ugm8e/c2_c_wi...

The biggest problem with C is not the language but the fact that the C standard library is so anemic. Where's the "Boost" for C?

C is an evolving ISO standard.[1] We are currently at C11.

> The purpose of the project is to engineer modern language features into C in an evolutionary rather than revolutionary way.

It's also the purpose of the standards body. Why not propose these changes to the standard?

[1] - https://www.iso.org/standard/57853.html

Maybe because one needs to buy a seat at the table, by making a formal proposal, pay the travel expenses on their own and get to win the hearts of other members when voting comes to be.

There is still time, right? Lots of standards update efforts have moved in and out of ISO and IETF.

If there's one thing I would do to C it is to replace the normal datatypes with u8, u16, u32, u64 (ad infinitum) and their signed brethren s8, s16, s32, s64, ...

And don't start with the stupid (u)intXX_t.

They aren't stupid, and they are both standard and available on nearly every system going. People absolutely should use stdint.h

I wish more would.

> People absolutely should use stdint.h

Definitely. However, you have to admit the names (uint32_t) could be made less verbose, like "u32".


We are talking about the smallest possible datatypes that build up everything here. They don't have to be named variable_typeFactory_type_signed_t.

You are missing the point. The question was about if I could change something in C.

I use stdint.h (and inttypes.h since fuck printf without it amirite) but u8 ... are just better.

uint8_t is really not very hard to type.

Personally I would much prefer uint16_t over u16. Confusion over type handling is a source of some of the worst bugs. Remember, uint16 isn't the same as 16 bits if you consider endianness.

You could just use macros or typedefs for the (u)intXX_t types.

And therein lies the issue. Everyone does just that, causing needless clashes. If you're writing a library, you can either switch back to the standard type names in your external header file, or you include the typedefs/macros there as well. The former choice ruins some of the point of having these types, and might even mask cross-platform issues. The latter runs the risk of clashing with the user's own typedefs, or with another library writer's typedefs. (And god help you if exactly one library writer chose to do it via macros.)

From https://plg.uwaterloo.ca/~cforall/features:

    Exponentiation Operator

    New binary exponentiation operator '\' (backslash)
    for integral and floating-point types.

    2 \ 8u; // integral result (shifting), 256
   -4 \ 3u; // integral result (multiplication), -64
I hope that’s a documentation error. Otherwise, it seems designed for the “Obfuscated C for all” contest:

   int i = f();
   int j = i \ 3u;
Does what ‘\’ does really depend on the sign of i?

The "shifting" vs "multiplication" is just an optimisation since

  x * 2 == x << 1
for any non-negative x. As far as I'm aware, left-shifting a negative number is undefined, which is presumably why they use multiplication for negatives.

So the mailing list and the other stuff are "internal only"? How is it possible for external individuals to take part?

I'm actually on the Cforall team -- we've been running fairly low-profile for the moment (it wasn't one of us that posted the homepage to HN), but plan on making a beta release of the compiler and stdlib sometime this summer.

If you're interested in working on the project though, more hands are always welcome; I'd suggest contacting our team lead, Peter Buhr, his email is pabuhr AT the university domain Cforall is hosted on.

Thank you, I'm not able to provide a lot of help, but I would like to follow the discussion; perhaps the sensible thing to do at this point is to wait for the first beta, when the mailing list will probably be open to everybody. Thank you for replying.

Out of curiosity, why not do the development in public? Or is there a repo I missed?

Most programming languages do an initial period of internal development before a public release.

When I started on the project ~3 years ago, it took me about 2 weeks to work around the (then current) set of compiler bugs to make a 100-line benchmark program -- naturally a public release at that point would not have been fruitful. Today our compiler generally works, and we're looking forward to making a public release once we get a couple more features stabilized.

I think there is a difference between working on something in public and a public release. I'm always skeptical of the claims that things can't be done transparently because it's not open to input yet. Those things are orthogonal. You don't have to work in secret just to avoid some of the issues with publicity. Granted I know it happens often, I just don't think it needs to be that way.

If you want to work on something in public, you potentially have to write public-facing prose. You have to explain what the project is, where to find stuff, who to contact, documentation... set up a website, wrangle newcomers, commenters, press, etc. For a research project, that's a lot of work to do by people with limited time/budget for something that hasn't even been demonstrated to work.

Basically, day 1 transparency either requires a budget or misplaced priorities.

I specifically addressed this:

> I'm always skeptical of the claims that things can't be done transparently because it's not open to input yet. Those things are orthogonal.

You don't have to explain anything. You don't have to set up a website (the case here already had a website; the code was just hidden). I'm not sure where these requirements come from, but they're not true, and I'd argue you do more harm giving the impression of secret development than you do by not accepting input at an early stage. We all understand the latter, but many of us are wary of the former when someone says they're committed to openness and does the opposite.

I think it is more like "you can only make a first impression once".

Did they bother to ask any C developers what the pain points are? The kernel, qemu, libvirt, coreutils and gnulib developers really use C and POSIX to breaking point, and have some real problems that might be addressed in the language, but I don't see much evidence of those problems being addressed here.

Did they study existing code bases and bugs to find out how applicable these changes are to fixing real problems?

It's a research language:


You can check the various Master's theses at the bottom of the page.

In my opinion, what C is lacking isn't primarily language features. It's:

1) a common, readable style standard that people can agree on;

2) modern, agreed-upon idiomatic ways to write readable and safe code (there are certainly common C idioms, but many exist for historical reasons and not because they're necessarily the best things to do); and

3) a standard library that exemplifies the above two things, provides common functionality needed between projects, and fosters a sense of community.

Efforts like those of the poster don’t really address these things.

When I tell people I like C, the response differs depending on experience level. Less experienced people who haven’t spent time in C++ either will say “Gross, pointers”. People used to C++ will say “what about templates and constructors/destructors”, and “are you going to write your own library for vectors and maps every time?” And people who are experienced (more experienced than me) and like C have generally said that the main things they miss from C++ are constructors and destructors (and mostly destructors), and templates (but only for container types), but that they can live without them.

C is a small language that can be relatively easy to write (once you pick/write a suitable stl-equivalent), and extremely easy to read (and to the extent that you can look at a line and know exactly what’s going on under the hood). IMO it should stay that way.

Take a look at zproject [0] for what I think is a good effort to standardize C style and project structure. Even if you disagree with the particular design choices, the spirit of it is what I think is needed to keep people from just assuming C is for dinosaurs.

[0] https://github.com/zeromq/zproject/

I agree. The proposal adds a bunch of ugly and unnecessary syntax without addressing any of the real issues with C.

I have little experience with C but my feeling is that it's a waste of time to have to deal with memory in my program.

I don't just have to worry about the problem that I'm trying to solve but also about handling the memory properly.

Of course there are scenarios where writing a fast and lightweight app is part of the problem but for many projects that's not the case.

I'm fine with pointer arithmetic and all that but I'm not fine with the constant fear that I did some oversight and my app is going to crash and I'll have to spend time debugging what's going on.

Most programming problems are pretty trivial to solve if you're willing to ignore performance. It behooves us to realize that the main difference between mathematics and computer science is that computer scientists have to account for the several orders of magnitude difference between the runtime costs of different operations. If that weren't an obstacle, we'd basically only need one programming language, and it would look pretty much like standard mathematics.

So I interpret your comment as something like "I don't want to program, I just want to assemble mathematical statements", which is a fine and perfectly legitimate thing to do, but it also seems a bit lazy, impractical, and probably irrelevant, given the subject.

My point was about why invest time in C dealing with memory, with the risk of segfaults, when there are many other languages like JS, Java, and Go that would take care of that task for you.

Also I'm talking about projects (e.g. building a word processor), not programming problems (e.g. invert this binary tree). In my experience, projects are complex systems with many moving parts interacting, and most bugs come from the complexity of the system/problem. With C, I don't just have to think about the problem that I'm solving but also whether I did something wrong when dealing with memory allocation/releasing.

I don't have much experience in C beyond the basics, so I hope to hear from people who love C why they chose it, or whether it's just a constraint inherent to the problem they're dealing with.

I think I would prefer the Indiana Jones approach. "Leave it alone! It belongs in a museum!"

You would be right if OS kernels were not written in C.

Linux is pretty much deployed on a lot of hardware, it is still maintained, and it uses C.

Granted it is not the best language, but it's simple enough to do kernel development.

The same goes for COBOL; it's the best-paying language out there.

The (draft) ISO C11 spec is already nearly 700 pages[1]. How can anyone justify adding something like "choose" (a switch() without fall-throughs) to such a language?

Who are these people[2], and what is their mandate? Did they actually consult with the likes of Linus Torvalds, Greg KH, Mike Pall--those who actually use C every single day on mission-critical projects like OS kernels and virtual machines--and ask them what improvements to C they need?

1) http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf

2) https://plg.uwaterloo.ca/~cforall/people
