
Tell HN: C Experts Panel – Ask us anything about C - rseacord
Hi HN,

We are members of the C Standard Committee and associated C experts, who have collaborated on a new book called Effective C, which was discussed recently here: https://news.ycombinator.com/item?id=22716068. After that thread, dang invited me to do an AMA and I invited my colleagues so we upgraded it to an AUA. Ask us about C programming, the C Standard or C standardization, undefined behavior, and anything C-related!

The book is still forthcoming, but it's available for pre-order and early access from No Starch Press: https://nostarch.com/Effective_C.

Here's who we are:

rseacord - Robert C. Seacord is a Technical Director at NCC Group, and author of the new book by No Starch Press “Effective C: An Introduction to Professional C Programming” and C Standards Committee (WG14) Expert.

AaronBallman - Aaron Ballman is a compiler frontend engineer for GrammaTech, Inc. and works primarily on the static analysis tool, CodeSonar. He is also a frontend maintainer for Clang, a popular open source compiler for C, C++, and other languages. Aaron is an expert for the JTC1/SC22/WG14 C programming language and JTC1/SC22/WG21 C++ programming language standards committees and is a chapter author for Effective C.

msebor - Martin Sebor is Principal Engineer at Red Hat and expert for the JTC1/SC22/WG14 C programming language and JTC1/SC22/WG21 C++ programming language standards committees and the official Technical Reviewer for Effective C.

DougGwyn - Douglas Gwyn is Emeritus at US Army Research Laboratory and Member Emeritus for the JTC1/SC22/WG14 C programming language and a major contributor to Effective C.

pascal_cuoq - Pascal Cuoq is the Chief Scientist at TrustInSoft and co-inventor of the Frama-C technology. Pascal was a reviewer for Effective C and author of a foreword part.

NickDunn - Nick Dunn is a Principal Security Consultant at NCC Group, ethical hacker, software security tester, code reviewer, and major contributor to Effective C.

Fire away with your questions and comments about C!
======
nicoburns
Are there any plans to "clean up C"? A lot of effort has been put into
alternative languages, which are great, but there is still a lot of momentum
with C, and it seems that a lot of improvements could be made in a
backwards compatible way and without introducing much in the way of
complexity. For example:

\- Locking down some categories of "undefined behaviour" to be "implementation
defined" instead.

\- Proper array support (which passes around the length along with the data
pointer).

\- Some kind of module system, that allows code to be imported without the
possibility of name collisions.

~~~
msebor
There are "projects" underway to clean up the spec where it's viewed as either
buggy, inconsistent, or underspecified. The atomics and threads sections are a
couple of examples.

There are efforts to define the behavior in cases where implementations have
converged or died out (e.g., twos complement, shifting into the sign bit).

There have been no proposals to add new array types and it doesn't seem likely
at the core language level. C's charter is to standardize existing practice
(as opposed to invent new features), and no such feature has emerged in
practice. Same for modules. (C++ takes a very different approach.)

~~~
nabla9
> no such feature has emerged in practice

Arrays with length constantly emerge among C users and libraries. They are
just all incompatible because without standardization there is no convergence.

~~~
ATsch
typedef struct {uint8_t *data; size_t len;} ByteBuf; is the first line of code
I write in a C project.

~~~
mobilemidget
Could you add some extra information why this is so helpful or handy to have?
Think it will benefit readers that are starting out with C etc.

~~~
saagarjha
In C, dynamically-sized vectors don’t carry around size information with them,
often leading to bugs. This struct attempts to keep the two together.
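
To make the benefit concrete, here is a minimal sketch of how such a struct tends to be used. The `ByteBuf` type is the one from the comment above; the helpers `bytebuf_get` and `bytebuf_slice` are hypothetical names invented for this illustration, not part of any standard or library.

```c
#include <stddef.h>
#include <stdint.h>

/* The struct from the comment above: a pointer plus the number of valid bytes. */
typedef struct {
    uint8_t *data;
    size_t   len;
} ByteBuf;

/* Hypothetical helper: bounds-checked read. Returns 0 on success and -1 when
   the index is out of range, instead of reading past the end. */
static int bytebuf_get(ByteBuf buf, size_t index, uint8_t *out)
{
    if (index >= buf.len)
        return -1;
    *out = buf.data[index];
    return 0;
}

/* Hypothetical helper: a view of a sub-range. The requested range is clamped
   to the buffer, so the result never extends past the original data. */
static ByteBuf bytebuf_slice(ByteBuf buf, size_t start, size_t count)
{
    if (start > buf.len)
        start = buf.len;
    if (count > buf.len - start)
        count = buf.len - start;
    return (ByteBuf){ buf.data + start, count };
}
```

Because the length travels with the pointer, a callee can check bounds itself rather than trusting a separately passed size.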

------
beefhash
Now that C2x plans to make two's complement the only sign representation, is
there any reason why signed overflow has to continue being undefined behavior?

On a slightly more personal note: What are some undefined behaviors that you
would like to turn into defined behavior, but can't change for whatever
reasons that be?

~~~
cataphract
Signed overflow being undefined behavior allows optimizations that wouldn't
otherwise be possible.

Quoting [http://blog.llvm.org/2011/05/what-every-c-programmer-should-...](http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html)

> This behavior enables certain classes of optimizations that are important
> for some code. For example, knowing that INT_MAX+1 is undefined allows
> optimizing "X+1 > X" to "true". Knowing the multiplication "cannot" overflow
> (because doing so would be undefined) allows optimizing "X*2/2" to "X".
> While these may seem trivial, these sorts of things are commonly exposed by
> inlining and macro expansion. A more important optimization that this allows
> is for "<=" loops like this:

> for (i = 0; i <= N; ++i) { ... }

> In this loop, the compiler can assume that the loop will iterate exactly N+1
> times if "i" is undefined on overflow, which allows a broad range of loop
> optimizations to kick in. On the other hand, if the variable is defined to
> wrap around on overflow, then the compiler must assume that the loop is
> possibly infinite (which happens if N is INT_MAX) - which then disables
> these important loop optimizations. This particularly affects 64-bit
> platforms since so much code uses "int" as induction variables.

~~~
gwd
So in a corner case where you have a loop that iterates over all integer
values (when does this ever happen?) you can optimize your loop. As a
consequence, signed integer arithmetic is very difficult to write while
avoiding UB, even for skilled practitioners. Do you think that's a useful
trade-off, and do you think anything can be done for those of us who think
it's not?

~~~
andrepd
No, it's exactly the opposite. Without UB the compiler must assume that the
corner case may arise at any time. Knowing it is UB we can assert `n+1 > n`,
which without UB would be true for all `n` except INT_MAX. Standardising wrap-
on-overflow would mean you can now handle that corner case safely, at the cost
of missed optimisations on everything else.
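
A practical consequence of the point above: because the compiler may assume `n + 1 > n`, testing for overflow after the addition is unreliable. One common approach is to test before adding. The following is a minimal sketch; `checked_add` is a hypothetical name chosen for this example.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true and stores a + b in *sum when the addition cannot overflow.
   Returns false and leaves *sum untouched otherwise. The range check is done
   before the addition, so no signed overflow is ever evaluated. */
static bool checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}
```

GCC and Clang also offer `__builtin_add_overflow` for the same purpose, at the cost of portability.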

~~~
rbultje
I/we understand the optimization, and I'm sure you understand the problem it
brings to common procedures such as DSP routines that multiply signed
coefficients from e.g. video or audio bitstreams:

for (int i = 0; i < 64; i++) result[i] = inputA[i] * inputB[i];

If inputA[i] * inputB[i] overflowed, why are my credit card details at risk?
The question is: can we come up with an alternate behaviour that incorporates
both advantages of the i<=N optimization, as well as leave my credit card
details safe if the multiplication in the inner loop overflowed? Is there a
middle road?

~~~
qppo
Another problem is that there's no way to define it, because in that example
the "proper" way to overflow is with saturating arithmetic, and in other cases
the "proper" overflow is to wrap. Even on CPUs/DSPs that support saturating
integer arithmetic in hardware, you either need to use vendor intrinsics or
control the status registers yourself.
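
Both behaviours can already be spelled out explicitly in portable C, which is one way to avoid relying on what the language does on overflow. A minimal sketch, assuming 32-bit samples and a 32-bit `int`; the function names `mul_sat32` and `mul_wrap32` are made up for this example.

```c
#include <stdint.h>

/* Saturating multiply: compute in a wider type, then clamp to the int32_t
   range. The 64-bit product of two 32-bit values cannot overflow. */
static int32_t mul_sat32(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    if (wide > INT32_MAX)
        return INT32_MAX;
    if (wide < INT32_MIN)
        return INT32_MIN;
    return (int32_t)wide;
}

/* Wrapping multiply: unsigned arithmetic is defined to wrap modulo 2^32
   (assuming uint32_t is not promoted to a wider signed int). */
static uint32_t mul_wrap32(uint32_t a, uint32_t b)
{
    return a * b;
}
```

Neither version is as fast as a hardware saturating instruction, which is the gap the vendor intrinsics mentioned above fill.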

~~~
jononor
One could allow the overflow behavior to be specified, for example on the
scope level. Idk, with a #pragma ? #pragma integer-overflow-saturate

------
rwmj
Not a question, a request: Please make __attribute__((cleanup)) or the
equivalent feature part of the next C standard.

It's used by a lot of current software in Linux, notably systemd and glib2. It
solves a major headache with C error handling elegantly. Most compilers
already support it internally (since it's required by C++). It has predictable
effects, and no impact on performance when not used. It cannot be implemented
without help from the compiler.
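
For readers who have not seen it, this is roughly how the extension is used. A minimal sketch for GCC and Clang only, since the attribute is not standard C; `free_charp`, `use_buffer` and `cleanup_calls` are names invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>

static int cleanup_calls;   /* counts handler invocations, for demonstration only */

/* The handler receives a pointer to the annotated variable. */
static void free_charp(char **p)
{
    free(*p);
    cleanup_calls++;
}

static int use_buffer(int fail_early)
{
    /* free_charp(&buf) runs automatically whenever buf goes out of scope. */
    __attribute__((cleanup(free_charp))) char *buf = malloc(64);
    if (buf == NULL)
        return -1;              /* handler still runs; free(NULL) is a no-op */
    if (fail_early)
        return -2;              /* buf is released here ... */
    snprintf(buf, 64, "ok");
    return 0;                   /* ... and here, with no goto or cleanup label */
}
```

The cleanup is tied to the scope of the variable, not to the function, which matters for the discussion that follows.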

~~~
rseacord
My idea was to add something like the GoLang defer statement to C (as a
function with some special compiler magic). The following is an example of how
such a function could be used to cleanup allocated resources regardless of how
a function returned:

    
    
      int do_something(void) {
        FILE *file1, *file2;
        object_t *obj;
        file1 = fopen("a_file", "w");
        if (file1 == NULL) {
          return -1;
        }
        defer(fclose, file1);
      
        file2 = fopen("another_file", "w");
        if (file2 == NULL) {
          return -1;
        }
        defer(fclose, file2);
    
        obj = malloc(sizeof(object_t));
        if (obj == NULL) {
          return -1;
        }
        // Operate on allocated resources
        // Clean up everything
        free(obj);  // this could be deferred too, I suppose, for symmetry 
      
        return 0;
      }

~~~
rwmj
Golang gets this wrong. It should be scope-level not function-level (or
perhaps there should be two different types, but I have never personally had a
need for a function-level cleanup).

Edit: Also please review how attribute cleanup is used by existing C code
before jumping into proposals. If something is added to C2x which is
inconsistent with what existing code is already doing widely, then it's no
help to anyone.

~~~
rseacord
Yes, we have discussed adding this feature at scope level. A not entirely
serious proposal was to implement it as follows:

    
    
      #define DEFER(a, b, c)  \
         for (bool _flag = true; _flag; _flag = false) \
         for (a; _flag && (b); c, _flag = false)
    
      int fun() {
         DEFER(FILE *f1 = fopen(...), (NULL != f1), mfclose(f1)) {
           DEFER(FILE *f2 = fopen(...), (NULL != f2), mfclose(f2)) {
             DEFER(FILE *f3 = fopen(...), (NULL != f3), mfclose(f3)) {
                 ... do something ...
             }
           }
         }
      }
    

We are also looking at the attribute cleanup. Sounds like you should be
involved in developing this proposal?

~~~
a1369209993
Apropos of this, I'll toss in: please support do-after statements (and also
let statements).

    
    
      do foo(); _After bar();
      /* exactly equivalent to (with gcc ({})s): */
      ({ bar(); foo(); });
      #define DEFER(a, b, c) \
        _Let(a) if(!b) {} else do {c;} _After
    

(This is in fact an entirely serious proposal, though I don't actually expect
it to happen.)

------
hyc_symas
The standard string library is still pretty bad. This would have been a much
better addition for safe strcpy.

Safe strcpy

    
    
        char *stecpy(char *d, const char *s, const char *e)
        {
         while (d < e && *s)
          *d++ = *s++;
         if (d < e)
          *d = '\0';
         return d;
        }
    
        int main(void) {
          char buf[64];
          char *ptr, *end = buf + sizeof(buf);
    
          ptr = stecpy(buf, "hello", end);
          ptr = stecpy(ptr, " world", end);
          (void)ptr;  /* ptr == end would mean the buffer filled up */
          return 0;
        }
    
    

Existing solutions are still error-prone, requiring continual recalculation of
buffer len after each use in a long sequence, when the only thing that matters
is where the buffer ends, which is effectively a constant across multiple
calls.

What are the chances of getting something like this added to the standard
library?
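
One detail worth noting: as written, `stecpy` leaves the buffer unterminated when the source does not fit. Below is a sketch of a variant that always terminates, under the assumption that `e` points one past the end of the buffer. The name `stecpy_term` and the convention of returning `e` on truncation are choices made for this illustration, not part of the proposal above.

```c
/* Copies s to d without writing at or beyond e. Always NUL-terminates when
   d < e on entry. Returns a pointer to the terminator, or e if the source
   was truncated; a later call with d == e then does nothing. */
static char *stecpy_term(char *d, const char *s, const char *e)
{
    if (d >= e)
        return d;               /* no room at all, or an earlier call truncated */
    while (d < e - 1 && *s)
        *d++ = *s++;
    *d = '\0';
    return *s ? (char *)e : d;  /* e signals truncation to the caller */
}
```

The calling pattern stays the same as in the original: chain the calls and compare the final result against `end` once.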

~~~
doublesCs
What's wrong with:

    
    
        *p += sprintf(*p, "hello");
        *p += sprintf(*p, "world");

~~~
spc476
Well, that should be `snprintf()` to start with, but even with that, there are
issues. The return type of `snprintf()` is `int`, so it can return a negative
value if there was some error, so you have to check for that case. That out of
the way, a positive return value is (and I'm quoting from the man page on my
system) "[i]f the output was truncated due to this limit then the return value
is the number of characters which would have been written to the final string
if enough space had been available." So to safely use `snprintf()` the code
would look something like:

    
    
        int size = snprintf(NULL,0,"some format string blah blah ...");
        if (size < 0) error();
        if (size == INT_MAX)
          error(); // because we need one more byte to store the NUL byte
        size++;
        char *p = malloc(size);
        if (p == NULL)
          error();
        int newsize = snprintf(p,size,"some format string blah blah ...");
        if (newsize < 0) error();
        if (newsize >= size)
        {
          // ... um ... we still got truncated?
        }
    

Yes, using NULL with `snprintf()` if the size is 0 is allowed by C99 (I just
checked the spec).

One thing I've noticed about the C standard library is that it seems averse
to functions allocating memory (outside of `malloc()`, `calloc()` and
`realloc()`). I wonder if this has something to do with embedded systems?
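
The measure-then-allocate pattern above can be wrapped once so callers do not repeat it. A minimal sketch using only standard C99 calls; `format_alloc` is a hypothetical helper name.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Formats into a freshly allocated string. Returns NULL on any error;
   the caller frees the result. */
static char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap2;
    va_start(ap, fmt);
    va_copy(ap2, ap);                       /* the arguments are walked twice */

    int needed = vsnprintf(NULL, 0, fmt, ap);   /* first pass: measure only */
    va_end(ap);
    if (needed < 0) {
        va_end(ap2);
        return NULL;
    }

    size_t size = (size_t)needed + 1;       /* size_t, so INT_MAX + 1 is safe */
    char *p = malloc(size);
    if (p != NULL) {
        int written = vsnprintf(p, size, fmt, ap2);     /* second pass: write */
        if (written < 0 || (size_t)written >= size) {
            free(p);                        /* error, or output changed size */
            p = NULL;
        }
    }
    va_end(ap2);
    return p;
}
```

Doing the size arithmetic in `size_t` removes the need for the explicit `INT_MAX` check.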

~~~
SAI_Peregrinus
Not just embedded systems, also OSes. C's standard library should generally
work without the existence of a heap. After all, you have to create the heap
using C before you can allocate from it.

~~~
saagarjha
malloc is a required part of ISO C, though.

~~~
flatfinger
Functions like malloc are only required for hosted implementations. Many
operating systems are built using freestanding implementations.

Further, on many platforms, one should avoid using malloc() unless portability
is more important than performance or safety. Some operating systems support
useful features like the ability to allocate objects with different expected
lifetimes in different heaps, so as to help avoid fragmentation, or arrange to
have allocations a program can survive without fail while there is still
enough memory to handle critical allocations. Any library that insists upon
using "malloc()" will be less than ideal for use with any such operating
system.

------
clarry
Open up WG14 mailing list for non-members?

It's hard to appreciate what's going on at WG14 (or take part) when you can
see the results only from afar, with none of the surrounding discussion.

I recently read Jens Gustedt's blog on C2x where he casually recommended this
as a way to get involved: "The best is to get involved in the standard’s
process by adhering to your national standards body, come to the WG14 meetings
and/or subscribing to the committee’s mailing list."

Afaict (from browsing the wg14 site), the mailing list and its archives are
not open to access.

[https://webcache.googleusercontent.com/search?q=cache:TnEGL4...](https://webcache.googleusercontent.com/search?q=cache:TnEGL4_UNK4J:https://gustedt.wordpress.com/2018/11/+&cd=13&hl=en&ct=clnk&gl=fi)

EDIT: In general, how is one supposed to approach wg14 with ideas or need for
clarification on the standard's wording / interpretation?

~~~
AaronBallman
> In general, how is one supposed to approach wg14 with ideas or need for
> clarification on the standard's wording / interpretation?

I'm currently working on an update to the committee website to clarify exactly
this sort of thing! Unfortunately, the update is not live yet, but it should
hopefully be up Soon™.

Currently, clarifications and ideas both require you to find
someone on the committee to ask the question or champion your proposal for
you. We hope to improve this process as part of this website update to make it
easier for community collaboration.

------
tzs
If an old timer who used to be good with C wanted to use C again, would they
have to learn a whole bunch of weird new stuff or could they pretty much use
it like they did back in the stone age (i.e., the 20th century)?

Back in the '80s and '90s I was pretty good at C. I don't think there was
anything about the language or the compilers that I did not understand. I
used C to write real time multitasking kernels for embedded systems, device
drivers and kernel extensions for Unix, Windows, Mac, Netware, and OS/2. I
did a Unix port from swapping hardware to paging hardware, rewriting the
processes and memory subsystems. I tricked a friend into writing a C compiler.
I could hold my own with the language lawyers on comp.lang.c.

Somewhere in there I started using C++, but only as a C with more flexible
strings, constructors, destructors, and "for (int i = ...)", and later added
STL containers to that.

Sometime in the 2000s, I ended up spending more and more time on smaller
programs that were mostly processing text, and Perl became my main tool. Also
I ended up spending a lot of time helping out less experienced people at work who
were doing things in PHP, or JavaScript, or Java. My C and C++ trickled to
nothing.

I've occasionally looked at modern C++, but it is so different from what I was
doing back in '90s or even early '00s I sometimes have to double check that
I'm actually looking at C++ code.

Is modern C like that, or is it still at its core the same language I used to
know well?

~~~
DougGwyn
The main editing needed to bring "old C" source code up to snuff using a
"modern C" compiler is to make sure that the standard header-defined types are
used. No more assuming that a lot of things are, by default, int type. A
second, related editing pass is to make sure all functions are declared as
prototypes, no longer K&R style; K&R style is slated to be deprecated by the
next version of the Standard. (There are some rare uses for non-prototyped
functions, but evidently the committee thinks there is more benefit in forcing
prototypes.)
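
As a small illustration of those two editing passes, here is a before and after. The old form is shown only as a comment because newer language modes may reject it; `count_nonzero` is a made-up function for this example.

```c
#include <stddef.h>

/* Old style: implicit int return type, K&R parameter list, and a parameter
 * (len) that silently defaults to int:
 *
 *     count_nonzero(buf, len)
 *     unsigned char *buf;
 *     {
 *         ...
 *     }
 */

/* Modern style: a prototype declaration with explicit, header-defined types. */
static size_t count_nonzero(const unsigned char *buf, size_t len);

static size_t count_nonzero(const unsigned char *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0)
            n++;
    }
    return n;
}
```

With the prototype in scope, the compiler can check argument counts and types at every call site.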

~~~
defectbydesign
So the ISO committee breaks the backward compatibility of C on behalf of
modernity... but there is C++, guys!

A little effort and you could make C deprecated. ;-)

This makes me think that there are as many C++ gurus as Go(ogle) gurus who
want to kill C to be the new Java which brings you a bad coffee from a dirty
kitchen.

------
ux
Is there any plan to deal with the locale fiasco at some point?

Some hints on what I'm referring to can be found here:
[https://github.com/mpv-player/mpv/commit/1e70e82baa9193f6f02...](https://github.com/mpv-player/mpv/commit/1e70e82baa9193f6f027338b0fab0f5078971fbe)

Unrelated, but I also miss a binary constant notation (such as 0b10101)

~~~
eqvinox
I haven't read most of that rant, but a thread-local setlocale() would be a
godsend. Not sure if that's ISO C or POSIX though.

~~~
wahern
POSIX has added _l variants taking a locale_t argument to all the relevant
string functions. I can see how per-thread state would be convenient, but it's
not a comprehensive solution. With the _l variants you can write your own
wrappers that pass a per-thread locale_t object.
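
A sketch of such a wrapper, assuming a POSIX.1-2008 system (on some platforms the declarations live in a different header such as `<xlocale.h>`). The names `get_thread_locale` and `my_toupper` are hypothetical, and the locale object is never freed in this short example.

```c
#define _POSIX_C_SOURCE 200809L
#include <ctype.h>
#include <locale.h>

/* One locale_t per thread, created lazily. _Thread_local is C11. */
static _Thread_local locale_t thread_loc;

static locale_t get_thread_locale(void)
{
    if (thread_loc == (locale_t)0)
        thread_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    return thread_loc;          /* still (locale_t)0 if newlocale failed */
}

/* Wrapper that passes the per-thread locale to the _l variant, so the
   result does not depend on whatever setlocale() was last called with. */
static int my_toupper(int c)
{
    locale_t loc = get_thread_locale();
    return loc != (locale_t)0 ? toupper_l(c, loc) : c;
}
```

POSIX also provides `uselocale()`, which installs a locale for the calling thread only and is close to the thread-local `setlocale()` asked about above.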

------
eqvinox
What's the best way to deal with "transitive const-ness", i.e. utility
functions that operate on pointers and where the return type should
technically get const from the argument?

(strchr is the most obvious, but in general most search/lookup type functions
are like this...)

Added to clarify: the current prototype for strchr is

    
    
      char *strchr(const char *s, int c);
    

Which just drops the "const", so you might end up writing to read-only memory
without any warning. Ideally there'd be something like:

    
    
      maybe_const_out char *strchr(maybe_const_in char *s, int c);
    

So the return value gets const from the input argument. Maybe this can be done
with _Generic? That kinda seems like the "cannonball at sparrows" approach
though :/ (Also you'd need to change the official strchr() definition...)
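
For what it's worth, the `_Generic` approach fits in a few lines. A minimal C11 sketch; `find_char` is a hypothetical wrapper name, and the standard `strchr` is left untouched.

```c
#include <string.h>

/* Selects the result type from the argument type: a const char * input
   yields a const char * result, a char * input yields char *. The "+ 0"
   makes arrays and string literals decay to pointers before selection. */
#define find_char(s, c) _Generic((s) + 0,                   \
        const char *: (const char *)strchr((s), (c)),       \
        char *:       strchr((s), (c)))
```

With this, `*find_char(ro, 'b') = 'x';` fails to compile when `ro` is a `const char *`, while the same call on a writable `char` array still returns a writable pointer.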

~~~
DougGwyn
Many uses of strchr do write via a pointer derived from a non-const
declaration. When we introduced const qualifier it was noted that they were
actually declaring read-only access, not unchangeability. The alternative was
tried experimentally and the consequent "const poisoning" got in the way.

~~~
coliveira
I believe C is doing the right thing. Const as immutability is a kludge to
force the language to operate at the level of data structure/API design,
something that it cannot do properly.

~~~
moonchild
Have you ever used a high-level statically-typed language, e.g. haskell?

------
clarry
1\. Are there any plans for standardizing empty initializer lists?

    
    
        struct foo { int a; void *p; };
    
        struct foo f = {0}; // legal C, f.p initialized like a static variable
        struct foo f = {}; // not legal but supported by gcc
    

To me it would make sense that there is no need to specify a value for any of
the members that are intended to be initialized exactly like static variables
(and the first member is not special so I shouldn't have to explicitly assign
a zero?). However the syntax currently demands at least one initializer.

\--

2\. I recall seeing a proposal for allowing declarations after case labels:

    
    
        switch (foo) {
        case 1:
            int var;
            // ...
        }
    

This is currently not allowed and you'd have to wrap the lines after case in
braces, or insert a semicolon after the case label. Is this making it to c2x?

\--

3\. I've run into some recent controversy w.r.t. having multiple functions
called main (and this has come up in production code). In particular, I ran
into a program that has a static main() function (with parameters that are
not void or int and char *[]), which is not intended to be _the_ main
function that is the program's entry point.

gcc warns about this because the parameters disagree with what's prescribed
for the program entry point. It's not clear to me whether this is intended to
be legal or not.

\--

4\. Looking at the requirements for main brings up another question: it says
how main should be defined (no static or extern keyword). However, the
definition could be preceded by a static declaration, which then affects the
definition that follows:

 _If the declaration of an identifier for a function has no storage-class
specifier, its linkage is determined exactly as if it were declared with the
storage-class specifier extern._

 _For an identifier declared with the storage-class specifier extern in a
scope in which a prior declaration of that identifier is visible, if the prior
declaration specifies internal or external linkage, the linkage of the
identifier at the later declaration is the same as the linkage specified at
the prior declaration._

Therefore, it is possible to have a main function with internal linkage and a
definition that exactly matches the one given in the spec:

    
    
        static int main(int, char *[]);
    
        int main(int argc, char *argv[]) { /* ... */ }
    

As one might guess, this program doesn't make it through the linker when
compiled with gcc. Is this supposed to be legal? Should the spec perhaps
require main to have external linkage, and then allow other functions called
main with internal linkage (and parameters that do not match what is required
of the external one)?

EDIT: ---

Are the fixes w.r.t. reserved identifiers going to make it in c2x? Can I
finally have a function called toilet() without undefined behavior?
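
For reference, the two workarounds mentioned in item 2 look like this in C99 or C11. This is only an illustration of the current rules; `classify` is a made-up function.

```c
static int classify(int foo)
{
    int result = 0;

    switch (foo) {
    case 1: {                   /* braces open a block, so a declaration is fine */
        int var = foo * 10;
        result = var;
        break;
    }
    case 2: ;                   /* a null statement gives the label something to attach to */
        int other = foo + 100;
        result = other;
        break;
    default:
        break;
    }
    return result;
}
```

The braces version also limits the variable's scope to that one case, which the semicolon version does not.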

------
rmind
A lot of C programmers prefer to keep structures within the C source file
("module"), as a poor man's encapsulation. For example:

component.h:

    
    
        struct obj;
        typedef struct obj obj_t;
    
        obj_t *obj_create(void);
        // .. the rest of the API
    

component.c:

    
    
        struct obj {
            int status;
            // .. whatever else
        };
    
        obj_t *
        obj_create(void)
        {
            return calloc(1, sizeof(obj_t));
        }
    

However, as the component grows in complexity, it often becomes necessary to
separate out some of the functionality (in order to re-abstract and reduce the
complexity) into another file or files, which also operate on "struct obj".
So, we move the structure into a header file under #ifdef __COMPONENT_PRIVATE
(and/or component_impl.h) and sprinkle #define __COMPONENT_PRIVATE in the
component source files. It's a poor man's "namespaces".
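
A single-file sketch of that arrangement, with comments marking where each file would begin. The file name `component_util.c` and the accessor `obj_status` are invented for this illustration, and the macro is spelled `COMPONENT_PRIVATE` here because identifiers starting with a double underscore are reserved.

```c
#include <stdlib.h>

/* ---- component.h (public): callers only ever see an incomplete type ---- */
struct obj;
typedef struct obj obj_t;

obj_t *obj_create(void);
int    obj_status(const obj_t *o);

/* ---- component_impl.h (private) ---- */
#define COMPONENT_PRIVATE 1     /* each component source file defines this first */
#ifdef COMPONENT_PRIVATE
struct obj {
    int status;
    /* .. whatever else */
};
#endif

/* ---- component.c ---- */
obj_t *obj_create(void)
{
    return calloc(1, sizeof(obj_t));
}

/* ---- component_util.c: a second file that also needs the layout ---- */
int obj_status(const obj_t *o)
{
    return o->status;
}
```

Nothing stops outside code from defining the macro too, which is why this is a convention and not real encapsulation.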

Basically, this boils down to the lack of namespaces/packages/modules in C. Are
you aware of any existing compiler extensions (as a precedent or work in that
direction) which could provide a better solution and, perhaps, one day end up
in the C standard?

P.S. And if C ever grows such a feature, I really hope it will _NOT_ be the
C++ 'namespace' (amongst many other depressing things in C++). :)

~~~
pascal_cuoq
I am sorry I do not have an answer to your question. It's a very valid one and
I would be interested in any pointer to an answer.

What I _can_ say while we are on the subject, is that I have seen C code (most
often C code that started its life in the 1990s, to be fair) that instead of
showing an abstract struct in the public interface, showed a different struct
definition.

Please don't do this. Yes, when compiling nowadays, eventually every
compilation unit ends up as object files passed to a linker that doesn't know
about types, but this is undefined behavior. It makes it difficult to find
undefined behavior in the rest of the code because there is a big undefined
behavior right in the middle of it.

~~~
beefhash
Wait, doesn't this mean that the BSD sockets API is inherently dependent on
UB, casting different socket types to each other and sometimes only using the
first few members, or am I misunderstanding you?

~~~
pascal_cuoq
Yes and no.

The thing I am describing is when you link a compilation unit using:

    
    
      struct internal_state { int dummy; } state;
    

with another compilation unit that defined the same state differently:

    
    
      struct internal_state {
         int actual_meaningful_member_1;
         unsigned long actual_meaningful_member_2; } state;
    

As far as I know, BSD sockets do not do this. Zlib was doing this
([https://github.com/pascal-cuoq/zlib-fork/blob/a52f0241f72433...](https://github.com/pascal-cuoq/zlib-fork/blob/a52f0241f72433b69fd558100a32d927d9571e20/zlib.h#L1740)), but I have
had the privilege of discussing this with Mark Adler, and I think the
no-longer-necessary hack was removed from Zlib.

BSD sockets probably have a different kind of UB, related to so-called “strict
aliasing” rules, unless they have been carefully audited and revised since the
carefree times in which they were written. I am going to have to let you read
this article for details (example st1, page 5): [https://trust-in-soft.com/wp-content/uploads/2017/01/vmcai.p...](https://trust-in-soft.com/wp-content/uploads/2017/01/vmcai.pdf)
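
As a small illustration of what the strict aliasing rules are about: reading a `float` through an `unsigned *` is the kind of access they forbid, and copying the bytes with `memcpy` is the usual well-defined alternative. A minimal sketch, assuming a 32-bit IEEE 754 `float`; `float_bits` is a made-up name.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Returns the object representation of f as an integer. The bytes are
   copied, so no object is accessed through an lvalue of the wrong type. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Compilers typically turn the `memcpy` into a single register move, so the safe form usually costs nothing.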

~~~
loeg
BSD sockets are weird in that the first struct's (sockaddr) size wasn't big
enough, so APIs all take a nominal pointer to sockaddr but may require larger
storage (sockaddr_storage) depending on the actual address.

    
    
      /*
       * Structure used by kernel to store most
       * addresses.
       */
      struct sockaddr {
              unsigned char   sa_len;         /* total length */
              sa_family_t     sa_family;      /* address family */
              char            sa_data[14];    /* actually longer; address value */
      };
    
    
      /*
       * RFC 2553: protocol-independent placeholder for socket addresses
       */
      #define _SS_MAXSIZE     128U
      #define _SS_ALIGNSIZE   (sizeof(__int64_t))
      #define _SS_PAD1SIZE    (_SS_ALIGNSIZE - sizeof(unsigned char) - \
                                  sizeof(sa_family_t))
      #define _SS_PAD2SIZE    (_SS_MAXSIZE - sizeof(unsigned char) - \
                                  sizeof(sa_family_t) - _SS_PAD1SIZE - _SS_ALIGNSIZE)
      
      struct sockaddr_storage {
              unsigned char   ss_len;         /* address length */
              sa_family_t     ss_family;      /* address family */
              char            __ss_pad1[_SS_PAD1SIZE];
              __int64_t       __ss_align;     /* force desired struct alignment */
              char            __ss_pad2[_SS_PAD2SIZE];
      };

~~~
wahern
struct sockaddr_storage is insufficient as well. A Unix domain socket path can
be longer than `sizeof ((struct sockaddr_un){ 0}).sun_path`. That's a major
reason why all the socket APIs take a separate socklen_t argument. Most people
just assume that a domain socket path is limited to a relatively short string,
but it's not (except possibly Minix, IIRC).

~~~
asveikau
> A Unix domain socket path can be longer than `sizeof ((struct sockaddr_un){
> 0}).sun_path`

Hm, I didn't realize this, or if I knew this I had forgotten. It makes sense
because sun_path is usually pretty small, I believe 108 chars is the most
common choice, and typically file paths are allowed to be much longer.

Do you have a citation for this behavior? I can't seem to find it, though I'm
not looking very hard.

I guess you are right that any syscall taking a struct sockaddr * also has a
length passed to it... Some systems have sa_len inside struct sockaddr to
indicate length, but IIRC linux does not. I've often thought that length
parameter was sort of redundant, because (1) some platforms have sa_len, and
(2) even without that, you should be able to derive length from family. But
your Unix domain socket example breaks (2). Without being able to do that, I
start to imagine that the kernel would need to probe for NUL chars terminating
the C string anytime it inspects a struct sockaddr_un, rather than block-
copying the expected size of the structure -- that would be needlessly
complicated.

~~~
wahern
So I just reran some tests on my existing VMs and it turns out I remembered
wrong. Here's the actual break down:

* Solaris 11.4: .sun_path: 108; bind/connect path maximum: 1023. Length seems to be same as open. Interestingly, open path maximum seems to be 1023 (judged by trying ls -l /path/to/sock), although I always thought it was unbounded on Solaris.

* MacOS 10.14: .sun_path: 104, bind/connect path maximum: 253. Length can be bigger than .sun_path but less than open path limit.

* NetBSD 8.0: .sun_path: 104, bind/connect path maximum: 253. Same as MacOS.

* FreeBSD 12.0: .sun_path: 104, bind/connect path maximum: 104.

* OpenBSD 6.6: .sun_path: 104, bind/connect path maximum: 103 (104 - 1).

* Linux 5.4: .sun_path: 108, bind/connect path maximum: 108.

* AIX 7.1: .sun_path: 1023, bind/connect path maximum: 1023. Yes, .sun_path is statically sized to 1023! And like Solaris, open path maximum seems to be 1023 (as judged by trying ls -l /path/to/socket). Thanks to Polar Home, polarhome.com, for the free AIX shell account.

Note that all the above lengths are _exclusive_ of NUL, and the passed
socklen_t argument did not include a NUL terminator.

For posterity: on all these systems you can still create sockets with long
paths, you just have to chdir or use bindat/connectat if available. My test
code confirmed as much. And AFAICT getsockname/getpeername will only return
the .sun_path path (if anything) used to bind or connect, but that's a more
complex topic (see
[https://github.com/wahern/cqueues/blob/e3af1f63/PORTING.md#g...](https://github.com/wahern/cqueues/blob/e3af1f63/PORTING.md#getsockname-and-getpeername-on-af_unix-socket)).

~~~
asveikau
Linux also has the unusual extension of: if sun_path[0] is NUL, the path is
not a filesystem path and the rest of the name buffer is an ID. I don't
remember if that can have embedded NULs in that ID. I believe so.
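
A sketch of how such an address is built, which also shows why the separate length argument matters: the name is delimited by the `socklen_t` passed to bind/connect, not by a terminator. My reading of the Linux unix(7) man page is that NUL bytes inside the name have no special meaning. `make_abstract_addr` is a hypothetical helper, and it only fills in the structure without creating a socket.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fills *addr for a Linux abstract socket name and returns the length to
   pass to bind() or connect(). Returns 0 if the name does not fit. */
static socklen_t make_abstract_addr(struct sockaddr_un *addr,
                                    const void *name, size_t name_len)
{
    if (name_len > sizeof addr->sun_path - 1)
        return 0;
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    /* sun_path[0] stays NUL: that marks the abstract namespace. */
    memcpy(addr->sun_path + 1, name, name_len);
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + name_len);
}
```

Passing `sizeof(struct sockaddr_un)` instead of the returned length would make the trailing zero bytes part of the name.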

------
floatms
1\. How likely are named constants of any types to be included in C2x? I'm
referring to the idea of making register const values be usable in constant
expressions.

2\. Is there, or was there ever a proposal to make struct types without a tag
be structurally typed? This would not break backwards compatibility as far as
I can see, and would make these types much more useful as ad-hoc bags of data.
Small example:

    
    
      struct {size_t size; void *data;} data = get_data();
      int hash = hash_data(data);
    

I believe there was at least one proposal about error handling that more or
less relied on the above to be valid semantically.

3\. Is there any interest in making the variadic function interface a bit
nicer to use? I would like to bring back an old feature and have an intrinsic
to extract a pointer from the variadic parameter list, so that we can iterate
over it ourselves (or even index directly).

    
    
      void *arg_ptr = va_ptr(last);
    

More out there would be a parameter that would be implicitly passed to a
variadic function to indicate the number of arguments.

    
    
      void variadic(..., va_size count) {
      
      }
    
      variadic(10, 20, 30); // count would be three
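
For comparison, a sketch of how the count has to be supplied in today's standard C: the caller passes it explicitly, or a macro computes it. `sum_ints` and `SUM_INTS` are hypothetical names, and the macro trick only works when every argument is an int.

```c
#include <assert.h>
#include <stdarg.h>

/* Sum `count` int arguments. The caller must pass the count explicitly,
 * because standard C gives the callee no way to discover it. */
int sum_ints(int count, ...)
{
    assert(count >= 0);

    va_list ap;
    va_start(ap, count);
    int total = 0;
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Hypothetical convenience macro: counts the arguments at compile time via
 * the size of an unevaluated compound literal, so the count cannot drift
 * out of sync with the argument list. Requires at least one argument. */
#define SUM_INTS(...) \
    sum_ints((int)(sizeof((int[]){__VA_ARGS__}) / sizeof(int)), __VA_ARGS__)
```

With this, `SUM_INTS(10, 20, 30)` expands to a call whose first argument is 3.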

~~~
uecker
(disclaimer: also a WG14 member)

1\. I want this too.

2\. Here is my proposal:
[http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2366.pdf](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2366.pdf)

3\. Yes, variadic functions should be improved.

~~~
flatfinger
What is really missing from the C "aliasing" rules is a recognition that an
access to a pointer/lvalue which is visibly freshly derived from another is,
depending upon the form of derivation, either a definite or potential access
to the former [in the former case, anything that couldn't be accessed by the
original couldn't be accessed by the derived pointer/lvalue; in the latter
case, the derived pointer/lvalue might access things the original could not].

I think the authors of C89 most likely thought that principle was sufficiently
obvious that there was no need to expressly state it. Were it not for the
Standard's rule forbidding it, an implementation might plausibly have ignored
the possibility of an `int` being accessed via an `unsigned*`, but I don't
think the authors of the Standard imagined that a non-obtuse compiler writer
wouldn't allow for the possibility that something like:

    
    
        void inc_float_bits(float *f)
        {
          *((unsigned*)f)+=1;
        }
    

might affect the stored value of a `float`.

The present rules, as written, have absurd corner cases. Given something like:

    
    
        union U { float f[2]; unsigned ui[2]; } uu;
    

the Standard would, so far as I can tell, treat as identical the functions
test1, test2, and test3 below:

    
    
        float test1(int i, int j)
        {
          uu.f[i] = 1.0f;
          uu.ui[j] += 1;
          return uu.f[i];
        }
    
        float test2(int i, int j)
        {
          *(uu.f+i) = 1.0f;
          *(uu.ui+j) += 1;
          return *(uu.f+i);
        }
    
        float evil(unsigned *ui, float *f)
        { *f = 1.0f; *ui += 1; return *f; }
    
        float test3(int i, int j)
        {
          return evil(uu.ui+j, uu.f+i);
        }
    

If a dereferenced pointer to union member type isn't allowed to access the
union, the first example would be UB regardless of i and j, but that would
imply that non-character arrays within unions are meaningless. If such
pointers are allowed to access union objects, then test3 (and the evil
function within it) would have defined behavior even when i and j are both
zero.

BTW, I think any quality compiler should recognize the possibility of type
punning in the first two, though the Standard doesn't actually require either.
Neither clang nor gcc, however, recognizes the possibility of type punning in
the second even though the behavior of the [] operators in the first is
_defined_ as equivalent to the second.

------
rseacord
Many of your remaining questions have devolved into "When will I see my
favorite feature xyz appear in the C Standard?" The answer in most cases is
"that depends on how long it takes you to submit a proposal". Take a look at
[http://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log...](http://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log.htm)
for previous proposals and review the minutes to see which proposals have been
adopted. In general, the committee is not going to adopt proposals for which
there is insufficient existing practice or that haven't been fully thought
out. There are cases where people have come to a single meeting with a
well-considered proposal that was adopted into the C Standard. I wrote about
one such case here:
[https://www.linkedin.com/pulse/alignment-requirements-memory...](https://www.linkedin.com/pulse/alignment-requirements-memory-management-functions-robert-seacord/)
Alternatively, you can approach someone on the committee and ask us to
champion a proposal for you. It is likely that we'll agree or at least provide
you with feedback on your proposal.

~~~
sramsay
Thanks to one and all for this AMA! The massive number of comments testifies
to the continuing interest in C, and I think we're all grateful to all of you
for your expertise, your patience, and your even-handed responses.

~~~
flatfinger
Is the present purpose of the Standard to:

1\. define a highly-extensible abstraction model, which implementations
intended for various purposes should be expected to extend to suit those
purposes, or

2\. define an abstraction model which is sufficiently complete that programs
can do everything that would need to be done, without need for extensions?

Reading the C89 and C99 Rationale documents, it's clear that those standards
were intended to meet the former purpose. The way some compilers treat
"Undefined Behavior", however, suggests that the maintainers view the Standard
as aimed toward the latter purpose.

During the 1980s and 1990s, it was generally cheaper and easier for
implementations to extend the Standard's abstraction model by specifying that
many actions would be processed "in a documented fashion characteristic of the
environment" than it would have been to do anything else, so there was no need
to worry about whether the Standard allowed programmers to specify when such
behavior was required. That no longer holds true, however.

While it would be reasonable to deprecate code which relies upon such
treatment without explicitly demanding it, such deprecation would only make
sense if there were a means of demanding such treatment when required. For the
Committee to provide such means, however, it would have to reach a consensus
as to the purposes for which the Standard's abstraction model is meant to be
suitable. Are you aware of any such consensus?

------
hawski
What do you think about Zig language [0] and if you have any opinions on it,
what distinguishing features would you like to see adopted in the C world?

[0] [https://ziglang.org/](https://ziglang.org/)

------
0x09
Not about the language exactly, so maybe not fair game, but: how did you all
find yourselves joining ISO? And maybe more generally, what's the path for
someone like a regular old software engineer to come to participate in the
standardization process for something as significant and ubiquitous as the C
programming language?

~~~
AaronBallman
Great question!

Joining the committee requires you to be a member of your country's national
body group (in the US, that's INCITS) and attend at least some percentage of
the official committee meetings, and that's about it. So membership is not
difficult, but it can be expensive. Many committee members are sponsored by
their employers for this reason, but there's no requirement that you represent
a company.

I joined the committees because I have a personal desire to reduce the amount
of time it takes developers to find the bugs in their code, and one great way
to reduce that is to design features to make it harder to write the bugs in the
first place, or to turn unbounded undefined behavior into something more
manageable. Others join because they have specific features they want to see
adopted or want to lend their domain expertise in some area to the committee.

~~~
johannes1234321
Related to that: the C++ standards body seems to be quite open, allowing
non-members to participate (outside official votes, while respecting them when
looking for consensus). Is it just my limited observation, or is the C group
less open? Any plans in that regard?

~~~
msebor
Most of us on the committee would like to see more participation from other
experts. The committee's mailing list should be open even to non-members.
Attendance by non-members at meetings might require an informal invitation (I
imagine a heads up to the convener should do it).

~~~
DougGwyn
I think that's right. These days, much of the discussion occurs through study
subgroups (like the floating-point guys) and the committee e-mailing list.

------
commandersaki
A few years ago I came across this article Pointers Are More Abstract Than You
Might Expect In C [1].

I followed the article which attempted to interpret the C standard and come to
a conclusion. The conclusion is:

> The takeaway message is that pointer arithmetic is only defined for pointers
> pointing into array objects or one past the last element. Comparing pointers
> for equality is defined if both pointers are derived from the same
> (multidimensional) array object. Thus, if two pointers point to different
> array objects, then these array objects must be subaggregates of the same
> multidimensional array object in order to compare them. Otherwise this leads
> to undefined behavior.

Based on the above, I concluded that comparing two distinct malloc()'d
pointers for equality is itself undefined behaviour, since malloc() is likely
to return pointers to distinct objects that are not part of a sub-aggregate
object.

I know this is incorrect, but I don't know why I'm wrong.

[1]: [https://stefansf.de/post/pointers-are-more-abstract-than-you...](https://stefansf.de/post/pointers-are-more-abstract-than-you-might-expect/)

~~~
pascal_cuoq
The only thing that is not defined is comparing a pointer one-past-the-end to
a pointer to the very beginning of a toplevel object. Apart from this rule,
pointers of course do not need to be derived from the same object in order to
be compared with == and !=.

&a + 1 == &b is unspecified: it may produce 0 or 1, and it may not produce the
same result if you evaluate it several times.

Similarly, if both the char pointers p and q were obtained with malloc(10),
after they have been tested for NULL, all these operations are valid:

    
    
      p == q (false)
      p + 1 == q (false)
      p + 1 == q + 1 (false)
      p + 10 == q + 1 (false)
    

Only p+10 == q and p == q+10 are unspecified (of the comparisons that can be
built without invoking UB during the pointer arithmetic itself).
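
The four comparisons listed above can be written out as a small check. This is only a sketch with a hypothetical function name; it deliberately leaves out p+10 == q and p == q+10, whose results are unspecified.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 0 after checking the well-defined comparisons between two
 * separate 10-byte allocations, or -1 if either allocation fails. */
int compare_separate_allocations(void)
{
    char *p = malloc(10);
    char *q = malloc(10);
    if (p == NULL || q == NULL) {
        free(p);
        free(q);
        return -1;
    }

    assert(p != q);          /* two distinct objects */
    assert(p + 1 != q);      /* interior of p vs. start of q */
    assert(p + 1 != q + 1);  /* interior of p vs. interior of q */
    assert(p + 10 != q + 1); /* one past the end of p vs. interior of q */

    free(q);
    free(p);
    return 0;
}
```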

I have no idea what led that person to (apparently) write that &a==&b is
undefined. This is plain wrong. I do not see any ambiguity in the relevant
clause
([https://port70.net/~nsz/c/c11/n1570.html#6.5.9p6](https://port70.net/~nsz/c/c11/n1570.html#6.5.9p6)
). Yes, the standard is in English and natural languages are ambiguous, but
you might as well claim that a+b is undefined because the standard does not
define what the word “sum” means
([https://port70.net/~nsz/c/c11/n1570.html#6.5.6p5](https://port70.net/~nsz/c/c11/n1570.html#6.5.6p5)
).

~~~
azinman2
Why is this undefined if it’s all just pointers to addresses in memory,
regardless if the memory is valid for that object or not?

~~~
pascal_cuoq
Here is an example I have at hand that shows that when you are using an
optimizing compiler, there is no such thing as “just pointers to addresses in
memory”. There are plenty more examples, but I do not have the other ones at
hand.

[https://gcc.godbolt.org/z/Budx3n](https://gcc.godbolt.org/z/Budx3n)

~~~
pgy
Please correct me if I am wrong, but I think here the optimization is possible
because "* p = 2" is UB, because the compiler can assume that "p" points to
invalid memory. For this assumption, the compiler must know that "realloc"
invalidates its first argument.

How does it know that? The definition of "realloc" lives in the source of
"libc.so", so the compiler should not be able to see into it. Its declaration
in "malloc.h" does not have any special attributes. Do the standard and/or
the compiler handle "realloc" differently from other functions?

edit:

It looks like clang inserts a "noalias" attribute to the declaration of
"realloc" in the LLVM IR, so it seems it does handle "realloc" specially.

    
    
        declare dso_local noalias i8* @realloc(i8* nocapture, i64) local_unnamed_addr #3
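
Whatever the mechanism, the safe pattern on the caller's side is to stop using the old pointer as soon as realloc succeeds. A minimal sketch, with `append_int` as a hypothetical helper name:

```c
#include <assert.h>
#include <stdlib.h>

/* Append value to a heap array holding *len ints. Returns 0 on success; on
 * failure returns -1 and leaves *buf and *len untouched and still valid.
 * (Overflow of the size computation is not checked in this sketch.) */
int append_int(int **buf, size_t *len, int value)
{
    assert(buf != NULL && len != NULL);

    int *grown = realloc(*buf, (*len + 1) * sizeof **buf);
    if (grown == NULL)
        return -1;

    /* From here on the old value of *buf must not be used, even if the
     * block did not move: only the pointer realloc returned is valid. */
    grown[*len] = value;
    *buf = grown;
    *len += 1;
    return 0;
}
```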

------
MaxBarraclough
Does the following code fragment cause undefined behaviour?

    
    
        unsigned int x;
        x -= x;
     

There's a lengthy StackOverflow thread where various C language-lawyers
disagree on what the spec has to say about trap values, and under what
circumstances reading an uninitialised variable causes UB. I'd appreciate an
authoritative answer. Thanks for dropping by on HN!

[https://stackoverflow.com/q/11962457/](https://stackoverflow.com/q/11962457/)

~~~
msebor
Yes, it's undefined. It involves a read of an uninitialized local variable.
Except for the special case of unsigned char, any uninitialized read is
undefined.

~~~
emilfihlman
>Except for the special case of unsigned char, any uninitialized read is
undefined.

Could you expand on this?

~~~
msebor
An object of any type, initialized or not, can be read by an lvalue of
unsigned char (or any character type). That lets functions like memcpy (either
the standard one or a hand-rolled loop) copy arbitrary chunks of memory.

There's some debate about the effects of reading an uninitialized local
variable of unsigned char (like whether the same value must be read each time,
or whether it's okay for each read to yield a different value).

This special exemption doesn't extend to any other types, regardless of
whether or not they have padding bits or trap representations that could cause
the read to trap. Few types do, yet the behavior of uninitialized reads in
existing implementations is demonstrably undefined (inconsistent or
contradictory to invariants expressed in the code of a test case), so any
subtleties one might derive from the text of the standard must be viewed in
that light.
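
A sketch of the kind of hand-rolled loop meant here, assuming the two regions do not overlap. `copy_bytes` is a hypothetical name; every access goes through an unsigned char lvalue, which is what makes it valid for objects of any type.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-rolled equivalent of memcpy for non-overlapping objects: the bytes
 * are read and written only through unsigned char lvalues. */
void *copy_bytes(void *restrict dst, const void *restrict src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));

    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}
```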

~~~
MaxBarraclough
Thanks for your answers. A related question: this article [0] appears to
single out _memcpy_ and _memmove_ as being special regarding effective type.
Is it accurate? It seems to be at odds with your suggestion that there's
nothing stopping me writing my own memcpy provided I'm careful to use the
right types.

[0]
[https://en.cppreference.com/w/c/language/object#Effective_ty...](https://en.cppreference.com/w/c/language/object#Effective_type)

~~~
AaronBallman
I think that may be inaccurate -- IIRC, in C, you can do type punning via a
union but not memcpy, and in C++ you can do type punning via memcpy but not a
union and this incompatibility drives me nuts because it makes inline
functions in a header file shared between C and C++ really messy. (Moral of
the story: don't pun types.)

~~~
pascal_cuoq
The C standard also allows using memcpy to do type punning:

    
    
        If a value is copied into an object having no declared type using memcpy or memmove,
        or is copied as an array of character type, then the effective type of the modified
        object for that access and for subsequent accesses that do not modify the value is
        the effective type of the object from which the value is copied, if it has one
    

Simply memcpy into a variable (as opposed to dynamically allocated memory).

[https://port70.net/~nsz/c/c11/n1570.html#6.5p6](https://port70.net/~nsz/c/c11/n1570.html#6.5p6)
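
A short sketch of that approach, copying into a declared variable. The function names are hypothetical, and the bit patterns you get back assume float is a 32-bit IEEE 754 type, which the C standard does not require.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "assumes a 32-bit float");

/* Reinterpret the bytes of a float as a 32-bit unsigned integer. `bits` is
 * a declared variable, so its effective type stays uint32_t. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* The reverse direction: rebuild a float from its bit pattern. */
float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Optimizing compilers typically turn these fixed-size memcpy calls into a single register move.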

~~~
AaronBallman
I must be remembering incorrectly then, thank you!

------
rvp-x
A lot of you seem to be working on commercial solutions to C's insecurity.
Does this feel like a conflict of interest to you?

~~~
rseacord
Good question, but not at all! I've been working as hard as I can for the past
15 years to improve C language security, as have other security-minded members
of the committee. Generally speaking, we are in the minority, as performance is
still the major driver for the language. Any security solution that introduces
more than 5% overhead, for example, is a nonstarter. I think we all understand
that our jobs are completely safe no matter what security improvements we can
get adopted.

The committee works a lot like lobbying. A minority of people with a large
financial interest in the technology (such as compiler writers) have undue
influence because they participate in the process. I always encourage C
language users to take a more active role, but they usually don't. Cisco is an
example of a user community that actively takes part in C standardization.

~~~
pjmlp
I guess this is why vendors like Apple, Oracle, ARM and Google end up going
the hardware memory tagging route instead.

------
WalterBright
I wrote about a simple addition to C that could eliminate most buffer
overflows:

[https://www.digitalmars.com/articles/C-biggest-mistake.html](https://www.digitalmars.com/articles/C-biggest-mistake.html)

I.e. offering a way that arrays won't automatically decay to pointers when
passed as a function parameter.

~~~
quelsolaar
Arrays are pointers. If they aren't pointers then you need to copy the data
when you are giving an array as a function parameter. That's a lot slower.
Being able to prepare a set of data in an array and then giving a pointer to
a function is very useful. You could add a second type of array on top of what
you have in C that includes more stuff, but if that's what you want you can
implement that yourself with a struct.

~~~
napsy
An array is not a pointer. These are completely different data types. For
example, you can't apply pointer arithmetic to arrays without casting them to
pointers.

~~~
WalterBright
That's right. They are converted to pointers when passed to a function, even
if the function declares the parameter as an array.

~~~
napsy
They're not converted but can be implicitly cast to pointer types.

~~~
_kst_
No, they're converted. There is no such thing as an "implicit cast". And it's
not specific to arguments in function calls.

Array types and pointer types are distinct.

An expression of array type is, in most but not all contexts, implicitly
converted (really more of a compile-time adjustment) to an expression of
pointer type that yields the address of the 0th element of the array object.
The exceptions are when the array expression is the operand of a unary &
(address-of) or sizeof operator, or when it's a string literal in an
initializer used to initialize an array (sub)object. (The N1570 draft
incorrectly lists _Alignof as another exception. In fact, _Alignof can only
take a parenthesized type name as its operand.)

If you do:

    
    
        int arr[10];
        some_func(arr);
    

then arr is "converted" to the equivalent of &arr[0] -- not because it's an
argument in a function call, but because it's not in one of the three contexts
listed above in which the conversion doesn't take place.

Another rule that causes confusion here is that if you define a function
parameter with an array type, it's treated as a pointer parameter. For
example, these declarations are exactly equivalent:

    
    
        void func(int arr[]);
        void func(int arr[42]); // the 42 is quietly ignored
        void func(int *arr);
    

Suggested reading: [http://www.c-faq.com/](http://www.c-faq.com/),
particularly section 6, "Arrays and Pointers".

A conversion converts a value of one type to another type (possibly the same
one). The term "cast" refers only to an explicit conversion, one specified by
a cast operator (a parenthesized type name preceding the expression to be
converted, like "(double)42"). An implicit conversion is one that isn't
specified by a cast operator.
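
These rules can be checked at compile time with C11's _Generic and static_assert. The following is a sketch with hypothetical names (`sample`, `param_is_pointer`), not anything taken from the standard itself.

```c
#include <assert.h>
#include <stddef.h>

int sample[10];

/* Operand of unary &: no conversion takes place, so &sample has type
 * "pointer to array of 10 int", not "pointer to pointer to int". */
static_assert(_Generic(&sample, int (*)[10]: 1, default: 0),
              "&sample is int (*)[10]");

/* Operand of sizeof: no conversion, so this is the size of all 10 ints. */
static_assert(sizeof sample == 10 * sizeof(int),
              "sizeof sees the whole array");

/* Declared with an array parameter, but arr is adjusted to int *. */
int param_is_pointer(int arr[42])
{
    return _Generic(arr, int *: 1, default: 0);
}
```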

~~~
saagarjha
A little-known but useful C feature is static array indices, as in:

    
    
      void foo(int array[static 42]);
    

which means you can't pass in an array of fewer than 42 elements (and the
compiler can warn you if it notices you are).
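
A smaller sketch of the same feature, with `sum3` as a hypothetical name. Whether a too-short argument is diagnosed depends on the compiler and its warning settings; the language only makes it undefined behavior.

```c
/* The caller promises that v points to the first of at least 3 ints, so
 * the function may read v[0] through v[2] without further checks. */
int sum3(const int v[static 3])
{
    return v[0] + v[1] + v[2];
}
```

Passing a longer array is fine: `static 3` is a minimum, not an exact size.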

------
cperciva
When will C gain a mechanism for "do not leave this sensitive information
lying around after this function returns"? We have memset_s but that doesn't
help when the compiler copies data into registers or onto the stack.
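
For context, the usual workaround where memset_s is unavailable looks roughly like the sketch below (`secure_zero` is a hypothetical name). It relies on the compiler not eliding volatile stores, and it only clears the buffer it is given, so it has exactly the limitation described: copies in registers or spilled to the stack are untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Overwrite n bytes at p with zeros through a volatile lvalue, so the
 * stores are not removed as dead even if the buffer is never read again. */
void secure_zero(void *p, size_t n)
{
    assert(p != NULL || n == 0);

    volatile unsigned char *vp = p;
    while (n-- > 0)
        *vp++ = 0;
}
```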

~~~
pascal_cuoq
This is an entire language extension, as you note. The last time various
people interested in this were in the same room (it was in January 2020 in a
workgroup called HACS), what emerged was that the Rust people would try to add
the “secret” keyword to the language first, since their language is still more
agile than C, while the LLVM people would prepare LLVM for the arrival of at
least one front-end that understands secret data.

Is this enough to answer your question? I can look up the names of the people
that were involved and communicate them privately if you are further
interested.

~~~
stephencanon
Also worth noting that a language extension may not be sufficient for all
cases. E.g. the OS stores register state on a context switch; do you also need
a flag for the system to zero any memory used for this purpose following the
state restore, or is it OK to trust that it won’t leak through some mechanism?
For some applications, there may be contractual or regulatory requirements to
have an erasing mechanism for copies like this as well.

~~~
cperciva
I want to use this in the OS kernel too. ;-)

------
clarry
Why can't I have flexible array members in union? Consider this:

    
    
        struct foo {
            enum { t_char, t_int, t_ptr, /* .. */ } type;
            int count;
    
            union {
                char c[];
                int i[];
                void *p[];
                /* .. */
            };
        };
    
    

This isn't allowed, since flexible array members are only allowed in structs
(but the union here is exactly where you'd put a flexible array member if you
had only one type to deal with).

Furthermore, you can't work around this by wrapping the union's members in a
struct because they must have more than one named member:

    
    
        struct foo {
            enum { t_char, t_int, t_ptr } type;
            int count;
    
            union { /* not allowed! */
                struct { char c[]; };
                struct { int i[]; };
                struct { void *p[]; };
            };
        };
    

But it's all fine if we either add a useless dummy variable or move some prior
member (such as _count_ ) into these structs:

    
    
        struct foo {
            enum { t_char, t_int, t_ptr } type;
            int count;
    
            union { /* this works but is silly and redundant */
                struct { int dumb1; char c[]; };
                struct { int dumb2; int i[]; };
                struct { int dumb3; void *p[]; };
            };
        };
    

Of course, you could have the last member be

    
    
        union { char c; int i; void *p; } u[];
    

but then each element of u is as large as the largest possible member which is
wasteful, and u can't be passed to any function that expects to get a normal,
tightly packed array of one specific type.

------
rbultje
I'd love your opinion on the abundance of "undefined behaviour" (as opposed to
implementation-defined, or some new incantation such as "unknown result in
variable but system is safe") for relatively trivial things such as signed
(but not unsigned) integer overflows. I've heard that this is to allow for
non-twos-complement implementations. However, in practice, you notice that
most people use ugly workarounds which lead to ugly code that (because of e.g.
casting to unsigned and allowing the same overflow to happen anyway) only work
correctly on twos-complement anyway. Is this intended to be addressed in the
future in some way?

~~~
stephencanon
> (because of e.g. casting to unsigned and allowing the same overflow to
> happen anyway) only work correctly on twos-complement anyway

Unsigned arithmetic never overflows, and guarantees two's-complement behavior,
because unsigned arithmetic is always carried out modulo 2^n:

> A computation involving unsigned operands can never overflow, because a
> result that cannot be represented by the resulting unsigned integer type is
> reduced modulo the number that is one greater than the largest value that
> can be represented by the resulting type. (6.2.5, Types)

Doing the computation in unsigned always does the "right thing"; the thing
that one needs to be careful of with this approach is the conversion of the
final result back to the desired signed type (which is very easy to get subtly
wrong).
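
A sketch of that approach for 32-bit addition, with `wrapping_add_i32` as a hypothetical name. The conversion back to signed is spelled out by hand so that it does not depend on the implementation-defined result of converting an out-of-range value to int32_t.

```c
#include <stdint.h>

/* Two's-complement wrapping addition without signed overflow. */
int32_t wrapping_add_i32(int32_t a, int32_t b)
{
    /* The sum is reduced modulo 2^32 when stored in a uint32_t. */
    uint32_t sum = (uint32_t)a + (uint32_t)b;

    if (sum <= (uint32_t)INT32_MAX)
        return (int32_t)sum; /* value is representable as is */

    /* sum stands for the negative value sum - 2^32. UINT32_MAX - sum is
     * at most INT32_MAX, so every step below stays in range. */
    return -(int32_t)(UINT32_MAX - sum) - 1;
}
```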

~~~
shawnz
Wrapping around the modulus to me is an "overflow", although maybe the spec
doesn't use the word that way

~~~
GuB-42
There is also a difference in x86 assembly, and probably others.

For unsigned operations the carry flag is used, and for signed operations, the
overflow flag is used.

~~~
kwillets
Most compilers will translate unsigned (x + y < x) to CF usage.
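
Written out as a function, that idiom looks like the sketch below (`checked_add_uint` is a hypothetical name). The wrapped sum is smaller than either operand exactly when the true sum did not fit.

```c
#include <stdbool.h>

/* Stores x + y in *sum and returns true, or returns false and leaves *sum
 * alone if the mathematical result does not fit in unsigned int. */
bool checked_add_uint(unsigned x, unsigned y, unsigned *sum)
{
    unsigned r = x + y; /* wraps modulo UINT_MAX + 1, never undefined */
    if (r < x)
        return false;
    *sum = r;
    return true;
}
```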

------
rseacord
So what do people think about having a feature in the C language akin to the
defer statement in GoLang?

The GoLang defer statement defers the execution of a function until the
surrounding function returns. The deferred call's arguments are evaluated
immediately, but the function call is not executed until the surrounding
function returns. It looks like an interesting mechanism for cleaning up
resources.
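
For comparison, the idiom C programmers commonly use today for the same job is a single cleanup label reached with goto. A sketch, with `is_anagram` as a hypothetical example function that needs two scratch buffers:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for single bytes. */
static int cmp_byte(const void *x, const void *y)
{
    unsigned char a = *(const unsigned char *)x;
    unsigned char b = *(const unsigned char *)y;
    return (a > b) - (a < b);
}

/* Sets *result to 1 if a and b are anagrams, else 0. Returns 0 on success
 * or -1 if memory runs out. Every exit path passes through `cleanup`,
 * which is the work a defer statement would schedule automatically. */
int is_anagram(const char *a, const char *b, int *result)
{
    assert(a != NULL && b != NULL && result != NULL);

    int rc = -1;
    size_t na = strlen(a), nb = strlen(b);
    char *sa = NULL, *sb = NULL;

    sa = malloc(na + 1);
    if (sa == NULL)
        goto cleanup;
    sb = malloc(nb + 1);
    if (sb == NULL)
        goto cleanup;

    memcpy(sa, a, na + 1);
    memcpy(sb, b, nb + 1);
    qsort(sa, na, 1, cmp_byte);
    qsort(sb, nb, 1, cmp_byte);
    *result = na == nb && memcmp(sa, sb, na) == 0;
    rc = 0;

cleanup:
    free(sb); /* free(NULL) is a no-op, so partial setup is fine */
    free(sa);
    return rc;
}
```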

~~~
NickDunn
It could be very useful for cleaning resources. I've never used GoLang, but
can see how that could be useful in various circumstances. As we're talking
about C, I suspect a feature like that, with the potential to make things
safer, would also enable the unwary to shoot themselves in the foot more
easily.

------
BeeOnRope
When deciding on the behavior of some operation that maps to hardware [1], how
do you weigh the existing hardware behaviors?

For example, if all past, current and contemplated hardware behaves in the
same way, I assume that the standard will simply enshrine this behavior.

However, what if 99% of hardware behaves one way and 1% another? Do you set
the behavior to "undefined" to accommodate the 1%? At what point do you decide
that the minority is too small and you'll enshrine the majority behavior even
though it disadvantages minority hardware?

\---

[1] Famous examples include things like bit shift and integer overflow
behavior.

~~~
rseacord
I would say that the committee does pay attention to hardware variations, even
when there are no examples of existing hardware that implement a feature (for
example, a trap representation for integers other than _Bool). Some of the
thinking is that "if it was ever implemented in hardware, it could be again."
I'm not crazy about this thinking, and I largely think that language features
for which there are no existing hardware implementations should be eliminated
and then brought back if needed. However, the C Committee is much smaller than
the C++ committee so there is a labor shortage. More people getting involved
would certainly help.

We have dropped support for sign and magnitude and one's complement
architectures from C2x (a decision Doug Gwyn does not agree with). There was
some concern that Unisys may still use a one's complement architecture, but
that this may only be in emulation nowadays.

~~~
rseacord
Some examples of hardware variation (since you mentioned shifting and
overflow):

\- when signed integer overflow or division by zero occurs, a division
instruction traps on x86, while it silently produces an undefined result on
PowerPC

\- left-shifting a 32-bit one by 32 bits yields 0 on ARM and PowerPC, but 1 on
x86

\- left-shifting a 32-bit one by 64 bits yields 0 on ARM, but 1 on x86 and
PowerPC
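
Code that needs one answer on every platform has to pick it explicitly, because shifting by the operand width or more is undefined in C. A sketch with a hypothetical `shl32`; returning 0 for large counts is a choice made here, not something the standard mandates.

```c
#include <stdint.h>

/* Left shift with a defined result for every count: counts of 32 or more
 * shift all bits out and yield 0. The shift is done in 64 bits so that it
 * is never performed on a promoted signed type. */
uint32_t shl32(uint32_t x, unsigned count)
{
    return count < 32 ? (uint32_t)((uint64_t)x << count) : 0;
}
```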

~~~
BeeOnRope
On x86 it's actually mixed: scalar shifts behave as you describe, but
vectorised logical shifts flush to zero when the shift amount is greater than
the element size!

So x86 actually has both behaviors in one box (three behaviors if you count
the 32-bit and 64-bit scalar things you mentioned separately).

This is an example of where UB for simple operations actually helps even on a
single hardware platform: it allows efficient vectorization.

------
ancarda
As a C newbie, will there ever be "safe" C, i.e. no undefined behavior and
help with writing code that has fewer memory-related crashes/bugs? For
comparison, Rust has the `unsafe { }` block which lets you mark regions of
code as being able to do funky stuff. Could we get the opposite for C, i.e.
`safe { }` and for an entire file, `#pragma safe`?

I have a love-hate relationship with C - I like it for small projects, but
for anything serious I really need to use a safer language. I think
GCC has some flags that can help, and I've been using tools like splint, but
something baked into the standard would be amazing.

~~~
sramsay
I'm pretty happy with C as it is, but I will admit to being surprised that a
"minimalistic Rust" hasn't risen to prominence.

I guess what I mean by that is a language that has Rust's hyperactive,
strongly opinionated compiler, borrow checker, no NULL, immutable by default,
etc, but in a language that is no more syntactically ambitious than C89. I
would be way more into a language like that than Rust.

A language that sort of _feels_ like Go, but can actually be used for low-
level systems programming.

~~~
Leherenn
I think it's going to arrive, but some time is needed to see what works in
Rust or not. D is going this way as well, so should provide another data
point.

------
oldiob
Is the committee planning on working on the preprocessor? I don't see any
reason for not boosting it. It's time for C to have real meta-programming.
Would be nice to have local macros that are scoped.

On another note:

\- Official support for __attribute__

\- void pointers should offset the same size as char pointers.

\- typeof (can't stress this one enough)

\- __VA_OPT__

\- inline assembly

\- range designated initializer for arrays

\- some GCC/Clang builtins

\- for-loop once (Same as for loop, but doesn't loop)

Finally, stop putting C++ craps into C.

~~~
jparkie
+1 for Modern Metaprogramming.

I know some people are against metaprogramming because they believe the
abstractions hide the intrinsic of how the underlying code will execute, but I
would love to write substantial tests in C without relying on FFI to Python or
C++ to perform property-based testing, complex fuzzing, and whatever. I feel
metaprogramming would be a huge boon for C tooling and developer productivity.

~~~
oldiob
From my point of view, there's a difference between abstraction created by the
language, e.g. lambdas or virtual table in C++, and abstraction created by the
programmers via the CPP.

The former is compiler dependent and you cannot know how it's implemented. The
latter is simple text substitution and you're the one implementing it. I often
find myself creating small embedded languages in CPP for making abstraction,
and I know exactly what C code it's going to generate and thus the penalty if
there's any.

People that are afraid of the preprocessor simply don't understand how
powerful it is in good hands.
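
One well-known instance of this style is the "X-macro", sketched below with hypothetical names (`COLOR_LIST`, `color_name`). A single list generates both an enum and a matching table of names, and the generated C is easy to predict by reading the macros.

```c
#include <assert.h>

/* The single source of truth: one X(...) entry per color. */
#define COLOR_LIST(X) X(RED) X(GREEN) X(BLUE)

/* Expand the list once as enumerators... */
#define AS_ENUM(name) COLOR_##name,
enum color { COLOR_LIST(AS_ENUM) COLOR_COUNT };

/* ...and once as string literals, in the same order. */
#define AS_STRING(name) #name,
static const char *const color_names[] = { COLOR_LIST(AS_STRING) };

static_assert(sizeof color_names / sizeof color_names[0] == COLOR_COUNT,
              "one name per enumerator");

/* Map an enumerator to its name; out-of-range values give "unknown". */
const char *color_name(enum color c)
{
    return (unsigned)c < (unsigned)COLOR_COUNT ? color_names[c] : "unknown";
}
```

Adding a color means touching only `COLOR_LIST`; the enum and the table cannot drift apart.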

------
hsivonen
Does the committee have plans to deprecate (as in: give compilers license to
complain, such that compiler developers can appeal to the standard when users
complain back) locale-sensitive functions like isdigit, which is useless for
processing protocol syntax, because it is locale-sensitive, and useless for
processing natural-language text, because it examines only one UTF-8 code
unit?

~~~
DougGwyn
isdigit is likely to remain, because much existing code does use it (perhaps
in different contexts from the one you cited). If you need a different
function specification to do something different, it could be added in a
future release, but that doesn't mean that we need to force programmers to
change their existing code.

~~~
_kst_
What about giving isdigit and friends defined behavior for any argument value
that's within the range of any of char, signed char, or unsigned char?

The background (I know Doug knows this): isdigit() takes an argument of type
int, which is required to be either within the range of unsigned char, or have
the value EOF (required to be negative, typically -1).

The problem: plain char is often signed, typically with a range of -128..+127.
You might have a negative char value in a string -- but passing any negative
value other than EOF to isdigit() has undefined behavior. Thus to use
isdigit() safely on arbitrary data, you have to cast the argument to unsigned
char:

    
    
        if (isdigit((unsigned char)s[i])) ...
    

A lot of C programmers aren't aware of this and will pass arbitrary char
values to isdigit() and friends -- which works fine most of the time, but
risks going kaboom.

Changing this could raise issues if -1 is a valid character value and also the
value of EOF, but practically speaking -1 or 0xff will almost never be a digit
in any real-world character set. (It's ÿ in Unicode and Latin-1, which might
cause problems for islower and isalnum.)
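
A minimal sketch of the cast in context, assuming a NUL-terminated string that may contain bytes above 0x7F; the helper name `count_digits` is invented for this example:

```c
#include <ctype.h>
#include <stddef.h>

/* Count the decimal digits in a NUL-terminated string.
 * The cast keeps the argument inside the range of unsigned char,
 * which is what isdigit() requires, even when plain char is signed. */
static size_t count_digits(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (isdigit((unsigned char)*s))
            ++n;
    }
    return n;
}
```

Without the cast, a byte such as 0x80 would reach isdigit() as a negative int on platforms where char is signed.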

~~~
emmelaich
I remember that the various is* man pages noted that most of them are only
defined if isascii() is true. So I always used e.g. (isascii(x) && ispunct(x))

FWIW, just looked at the man page (macos) and iswdigit() and isnumber() are
mentioned.

~~~
_kst_
isascii() is not defined by ISO C. (It is defined by POSIX, but POSIX says it
may be removed in a future version.)

I see that POSIX explicitly says that isascii(x) is "defined on all integer
values" (it should have said "all int values").

Personally I'd rather cast to unsigned char.
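
For the protocol-syntax case raised at the top of this subthread, many code bases sidestep <ctype.h> entirely. A sketch of that approach, with a made-up name (`is_ascii_digit`); it tests the ASCII code values directly, so it depends on neither the locale nor the signedness of char, and it is defined for every int:

```c
#include <stdbool.h>

/* True only for the ASCII bytes 0x30..0x39 ('0'..'9' in ASCII).
 * No locale lookup and no table indexing, so any int is a valid argument. */
static bool is_ascii_digit(int c)
{
    return c >= 0x30 && c <= 0x39;
}
```

This is a deliberate narrowing: it answers "is this byte an ASCII digit", not "is this a digit in the current locale".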

------
jpfr
C11 has seen new features, such as Generic Selection. Is the current language
standardization converging (just adding clarifications, removing the surface
for undefined behavior, etc.) or is C still growing with new features?

In other words, will the C standard be effectively “done” at some time in the
future?

~~~
msebor
Fixing minor bugs or inconsistencies and reducing the number and kinds of
instances of undefined behavior are some of the efforts keeping the C
committee busy.

Reviewing proposals to incorporate features supported by common
implementations is another.

Aligning with other standards (e.g., floating point) and improving
compatibility with others (C++) is yet another.

In general, when an ISO standard is done it essentially becomes dead. So for
the C standard to continue to be active (on ISO's books) it needs to evolve.

~~~
ken
It's interesting to hear the standardization perspective, because it's pretty
much the opposite of my perspective as a user.

I see the classic path of any programming language -- regardless of
standardization -- is to continuously add features until it's too big and
complex that nobody wants to deal with it any more. Then it's replaced by a
newer, simpler language that takes the important bits and drops the
unnecessary complexities. At that point, everybody sees that the older
language was barking up the wrong tree, and they stop wasting time on it.

It's not the cessation of language change that _causes_ language death --
that's merely a symptom. You can't keep a language alive simply by changing it
every year. Some people sure have tried.

Alternatively, until it's evolved so much that there is so much diversity of
implementation that simply knowing a library is written in "language X"
doesn't tell me much about how it's written, or whether I can use it in my
program which is also written in "language X".

Then again, C is the exception to every rule, so maybe we can keep piling on
features indefinitely, and people will have to use it (even if they don't like
it), for the same reason they started using it decades ago (even if we didn't
like it).

------
watergatorman
Some random thoughts:

I appreciate the original simplicity of K & R, "The C Programming Language",
2nd Edition, and the relatively simple semantics of ANSI C89/ISO C90 compared
to C99 and later.

You don't need complex parsing methods for ANSI C89/ISO C90 and you do not
need the "lexer hack" to handle the typedef-name versus other "ordinary
identifier" ambiguity.

A surprising number of colleges still teach K & R 2nd Edition C.

Whenever someone brags about using recursive-descent parsing methods, I always
ask, are they using predictive, top-down parsing, or back-tracking?

I hope C never loses sight of its roots nor morphs into C++ under the guise
of creating a common subset, but which is really a disguised superset of C and
C++.

Please prevent the ever increasing demand for new features from overwhelming
C's simplicity so it can no longer be parsed with simple methods.

------
AvImd
Is there a possibility that a new rule will be introduced saying "if the
compiler detects UB it should abort the compilation instead of breaking the
code in the most incomprehensible way possible"?

Right now it's just scary to start a new project in C. It would be really
great if there was more emphasis on correctness of the produced code instead
of the insane optimizations.

~~~
klodolph
This can only be done at compile time in very specific cases. The huge problem
here is the compiler has no way of knowing which cases of undefined behavior
are _bugs in the program_ and which cases of undefined behavior are just
examples of unreachable code. If the compiler aborted compilation when it
detected undefined behavior, you’d be getting a lot of false positives for
unreachable code, and you’d need to solve that problem (figuring out how to
generate sensible errors and suppress them). This is not even remotely easy.

If you are concerned about safety there are ways to achieve that, like using
MISRA C, formally verifying your C, or by writing another language like Rust.

~~~
kzrdude
Good point, but could it not be required that the unreachable code would be
annotated to be unreachable? It could even have a (development only) assertion
in the location.

~~~
klodolph
That would be an immense undertaking. It’s not really just that some statement
or expression is unreachable (we have __builtin_unreachable() in GCC for stuff
like that) but that certain states are unreachable.

For example,

    
    
        int buffer_len(struct buffer *buf) {
            return buf->end - buf->start;
        }
    

There are at least three states that trigger undefined behavior: buf is not a
valid pointer, buf->end - buf->start doesn’t fit in int, and buf->end and
buf->start don't point to the same object.

I’m not sure how you would annotate this. At the function call site, you would
somehow need to show that buf is a valid pointer, and that start/end point to
same object and the difference fits in an int. It would start looking more
like Coq or Agda than C.

Honestly, I think if you really want this kind of safety, your options are to
use formal methods or switch to a different language.

There’s also this weird assumption here that the compiler detects undefined
behavior in your program and then mangles it. It’s really the opposite—the
compiler assumes that there is no undefined behavior in your program, and
optimizes accordingly. In practice you can turn optimizations off and get
something much closer to the “machine model” of C (which doesn’t really exist
anyway) but most people hate it because their code is too slow.
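
To make the point concrete, here is a sketch of what a defensive version of buffer_len can and cannot check at run time. The struct layout is an assumption (the thread never shows it), and `buffer_len_checked` is a made-up name:

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed layout; the original comment does not define struct buffer. */
struct buffer {
    char *start;
    char *end;
};

/* Store the length in *out and return true, or return false when a
 * checkable precondition fails. */
static bool buffer_len_checked(const struct buffer *buf, int *out)
{
    /* Only null pointers can be detected; a dangling pointer cannot. */
    if (buf == NULL || out == NULL || buf->start == NULL || buf->end == NULL)
        return false;

    /* Not checkable in portable C: whether start and end point into the
     * same object. If they do not, this subtraction is still undefined. */
    ptrdiff_t len = buf->end - buf->start;

    if (len < 0 || len > INT_MAX)
        return false;

    *out = (int)len;
    return true;
}
```

Two of the three undefined states survive only as comments, which is roughly the difficulty described above.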

~~~
kzrdude
Thanks, so it's definitely easier said than done! Good explanation.

------
natch
Can you please repeat this AMA at a later date and at a time of day when
people on the west coast of the USA are awake? Alternatively, please keep it
going for a few hours if you would be able to be so generous with your time!
Thank you for doing this!

Do you also answer questions about the standard libraries? This is not so much
a C question as a library question:

I'm wondering if Apple's Grand Central Dispatch ever made it into a more
integrated role in C's libraries, or if it will forever remain an outside
add-on. And whether there is anything else at that level (level in the sense of
high versus low level) in the standard libraries that plays such a role, that
I should read up on instead of GCD.

~~~
AaronBallman
> Alternatively, please keep it going for a few hours if you would be able to
> be so generous with your time!

We're remaining active while there are still people asking questions, so the
west coast folks should hopefully have the chance to ask what they'd like.

> Do you also answer questions about the standard libraries?

Sure!

> I'm wondering if Apple's Grand Central Dispatch ever made it into a more
> integrated role in C's libraries, or if it will forever remain an outside
> add-on.

GCD has not been adopted into C yet, and I don't believe it's even been
proposed to do so by anyone (or an alternative to GCD, either).

It would be an interesting proposal to see fleshed out for the committee, and
there is a lot of implementation experience with the feature, so I think the
committee would consider it more carefully than an inventive proposal with no
real-world field experience.

~~~
wahern
GCD relies on Blocks (closures) for ergonomics, and Blocks have been proposed
to WG14, for example N1451:
[http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1451.pdf](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1451.pdf)

------
Daemon404
What has been the rationale or hindrance for not adding locale-independent
versions of various stdlib functions?

Practically every second C codebase on earth has their own implementations of
these at some point, and it remains a huge problem for e.g. writers of
libraries, where you don't know how/where your library will be used.

~~~
msebor
First, there needs to be a proposal for adding a feature (I'm not aware of one
having been submitted recently). Second, any non-trivial proposed feature
needs to have some existing user experience behind it. For libraries that
typically means implementations shipping with operating systems or compilers
(but successful third party libraries might also be considered). Finally, it
also needs to appeal to people on the committee; that can be quite challenging
as well. Many proposals that meet the first two criteria die because they
simply don't get enough support within the committee.

~~~
Daemon404
Sounds mostly like the issue is nobody has bothered to submit a proposal for
it then? (There is _so_ much in-the-wild experience and code dealing with this
issue, I cannot imagine the second point being problematic.)

On the third point, I have trouble thinking of any technical objections to
such a proposal.

------
eska
There's a compiler attribute in GCC to promise that a function is pure, i.e.
free from side effects and only uses its inputs.

This is useful for parallel computations, optimizations and readability, e.g.

    
    
       sum += f(2);
       sum += f(2);
    

can be optimized to

    
    
       x = f(2);
       sum += x;
       sum += x;
    

Would the current motto of the consortium forbid adding a feature such as
marking a function as pure, that would not just promise, but also enforce that
no side effects are caused (only local reads/writes, only pure functions may
be called), and no inputs except for the function arguments are used?
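
For reference, a sketch of how the existing (promise-only) GCC attribute is spelled. The wrapper macro `PURE_FN`, the body of `f`, and `sum_twice` are all invented for illustration; the compiler does not verify the promise.

```c
/* GCC and Clang understand the attribute; elsewhere the macro is empty. */
#if defined(__GNUC__)
#define PURE_FN __attribute__((pure))
#else
#define PURE_FN
#endif

/* Hypothetical function whose result depends only on its argument. */
static int f(int x) PURE_FN;

static int f(int x)
{
    return x * x + 1;
}

/* With the promise in place the compiler is allowed, but not required,
 * to evaluate f(2) once and reuse the result. */
static int sum_twice(void)
{
    int sum = 0;
    sum += f(2);
    sum += f(2);
    return sum;
}
```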

~~~
kazinator
No enforcing! This is useful even when it's, strictly speaking, a lie.

Suppose I want to add some debug tracing into f():

    
    
       f.c: 42: f entered
       f.c: 43: returning 2
    

that's a side effect, right? But now the pure attribute tells a lie. Never
mind though; I don't care that some calls to f are "wrongly" optimized away; I
want the tracing for the ones that aren't.

In C++ there are similar situations involving temporary objects: there is a
freedom to elide temporary objects even if the constructors and destructors
have effects.

Even a perfectly pure function can have a side effect, namely this one:
triggering a debugger to stop on a breakpoint set in that function!

If a call to f(2) is elided from some code, then that code will no longer hit
the breakpoint set on f.

Side effect is all P.O.V. based: to declare something to be effect-free in a
conventional digital machine, you have to first categorize certain effects as
not counting.

~~~
flatfinger
Such attributes would be most useful if the semantics were that any time after
a program receives inputs that would cause a "pure" function to be called with
certain arguments, a compiler may at its leisure call the function with those
arguments as many or as few times as it sees fit.

The notion that "Undefined Behavior" is good for optimization is misguided and
dangerous. What is good for optimization is having semantics that are loose
enough to give the compiler flexibility in how it processes things, _but tight
enough to meet application requirements_.

Instead of saying that compilers can do anything they want when their
assumptions are violated, it would be far more useful to recognize what they
are allowed to do on the basis of certain assumptions. For example, given a
piece of code:

    
    
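        /* Assumed declarations so the fragment compiles on its own. */
        long long slow_function_no_side_effects(long long x);
        void doSomething(long long x);
    
        long long test1(long long x, int mode)
        {
          while(x)
            x = slow_function_no_side_effects(x);
          return x;
        }
    
        void test2(long long x, int mode)
        {
          x = test1(x, mode);
          if (!mode)
            x=0;
          doSomething(x);
        }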
    

It would generally be useful and safe to allow a compiler that determines that
no individual action performed by "test1()" could have any side effects to
omit the call to "test1()" if its value never ends up being used, without
having to prove that the slow function with no side effects will eventually
return zero. It is likewise useful and safe to say that if the generated code
observes either that the loop exits or that "mode" is zero, it may replace the
call "doSomething(x)" with "doSomething(0)". The fact that both optimizations
would be safe and useful individually, however, does not imply that it would
be safe and useful to allow compilers to change the code for "test2()" so that
it calls "doSomething(0)" or otherwise allow code to observe that the value of
"x" is zero when mode is non-zero, without regard for whether "test1()" would
complete.

~~~
kazinator
> flatfinger

[https://news.ycombinator.com/user?id=supercat](https://news.ycombinator.com/user?id=supercat)

?

If you contact the HN gods maybe there is a way to recover access to that
account.

------
sgawlik
When you're looking at an unfamiliar C code base for the first time, how do
you approach it? Which files do you look for? Which tools do you open up
immediately?

~~~
jhallenworld
cscope can help

~~~
clarry
Is there a vim-style cscope interface for emacs? I hate that xcscope brings up
its own persistent buffers (replacing other buffers that I had deliberately
placed on the screen). Vim, conveniently, just pops up the cscope interface
when I need to enter some input, and then hides it away. Also I don't think
xcscope works with evil's tag stack whereas in vim, I believe, you can just
return to where you were with ^T, whether using ctags or cscope.

------
ocithrowaway
A couple of (I hope easy) requests - 1\. Can we add separators in constants
(C++ does 0xFFFF'FFFF'FFFF'FFFF any other reasonable scheme is fine too?)

2\. I think many compilers already do this, but can the static initialization
rules be relaxed a bit?

    
    
      static const int a = 0;
      static const int b = a; /* This is not standard C afaik. */
    

Thank you, CodeandC
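
Until the rules are relaxed, the usual portable workaround is to name the value with something that is an integer constant expression, such as an enumeration constant. A sketch (the names `A_VALUE` and `sum_ab` are made up):

```c
/* An enumeration constant is an integer constant expression, so it may
 * initialise objects with static storage duration. */
enum { A_VALUE = 0 };

static const int a = A_VALUE;
static const int b = A_VALUE; /* portable, unlike `static const int b = a;` */

/* Small helper so that both objects are actually used. */
static int sum_ab(void)
{
    return a + b;
}
```

This only covers integer values; it does not help for floating-point or pointer initialisers.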

~~~
msebor
WG14 in general looks favorably at proposals to align C more closely with C++
(within the overall spirit of the language) and I'd expect (1) would be viewed in
that light.

I'd also say there is consensus that (2) would be beneficial. There are some
good ideas in
[http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2067.pdf](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2067.pdf)
although I don't think repurposing the register keyword for it was very
popular. Not just because it wouldn't be compatible with C++ which deprecated
register some time ago, but also because it's novel with no implementation or
user experience behind it. My impression is that this is waiting for a new
proposal.

------
teleonorax
What's up with `strlcpy` and `strlcat`? Are they getting standardized?

~~~
rseacord
No one has proposed making these standard. I doubt they would gain much
support as they are similar to the Annex K Bounds Checked Interface functions
strcpy_s and strcat_s but not quite as good IMHO.

~~~
rseacord
There were a number of recent proposals to adopt various POSIX functions by
Martin Sebor into C including:

    
    
      N2353 2019/03/17 Sebor, Add strdup and strndup to C2X
      N2352 2019/03/17 Sebor, Add stpcpy, and stpncpy to C2X
      N2351 2019/03/17 Sebor, Add strnlen to C2X
    

He is lurking on this thread as well. These proposals can all be found in the
document log at
[http://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log...](http://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log.htm)

~~~
rmind
There have been some disagreements on strlcpy/strlcat (BSD vs glibc crowd),
although by now the debate has died off and these functions are pretty widely
used. Also, while here, it would be lovely to have strchrnul() included.

~~~
loeg
glibc still refuses to add the functions because they are not required by a
standard.
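
For readers who have not met these functions, here is a sketch of the strlcpy contract as documented on the BSDs: copy at most size - 1 bytes, always NUL-terminate when size is non-zero, and return the length of the source so the caller can detect truncation. The name `my_strlcpy` is made up to avoid clashing with a libc that already provides the real one; this is not any particular libc's implementation.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst (capacity `size` bytes), NUL-terminating when size > 0.
 * Returns strlen(src); a result >= size means the copy was truncated. */
static size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len = strlen(src);

    if (size > 0) {
        size_t n = (src_len < size - 1) ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

A caller checks `my_strlcpy(buf, s, sizeof buf) >= sizeof buf` to find out whether the string fit.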

------
parenthesis
Could we have variadic macros with zero arguments in the standard? I'm not
using any compiler that doesn't allow it.

~~~
pascal_cuoq
The C standard description does not allow a function that does not have at
least one normal argument before the variadic arguments.

Conceptually, something must indicate to the function how many arguments it is
supposed to request next, and with what types. Yes, you could write a function
where this information is passed through a static-lifetime variable, but in
practice the first mandatory argument is almost always used for that anyway.

~~~
david2ndaccount
You’re replying to a comment about macros, not about functions.
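
On the macro side: in C99 and C11, a macro declared as `M(fmt, ...)` must be invoked with at least one argument for the `...`, so `M("hi")` is not strictly conforming even though common compilers accept it. A frequently used workaround is to fold the last named parameter into the variadic part. A sketch, with a made-up macro name (`FORMAT_INTO`):

```c
#include <stdio.h>

/* The format string travels inside __VA_ARGS__, so the variadic part is
 * never empty, whether or not extra arguments follow the format. */
#define FORMAT_INTO(buf, size, ...) snprintf((buf), (size), __VA_ARGS__)
```

Both `FORMAT_INTO(buf, sizeof buf, "hello")` and `FORMAT_INTO(buf, sizeof buf, "%d-%d", 1, 2)` are valid invocations. The cost is that the macro can no longer refer to the format string by name.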

------
jmckinley
It is 2020. You are looking at a series of projects your company has teed up.
All are greenfield efforts - no legacy. What would be the attributes of a
project that would have you recommend C as the programming language?

~~~
quelsolaar
Anything high performance: game engine, scientific computation, deep packet
inspection, image analysis, machine learning, rendering engines, high
frequency trading.... The list is long!

~~~
tayistay
AFAIK, few seem to choose C for game engines or rendering engines. Not
familiar with the other domains.

------
pcr910303
As there are a lot of C-masters lurking in this thread:

How can one process unicode (UTF-8) properly in C? As a CJK person, I wish
there was a robust solution. Are there any standardized ways or proposals?
(Using wchar doesn't count.)

~~~
DougGwyn
UTF-8 encoding works "as is" based on byte strings (char[]). The latest
versions of the draft standard provide somewhat more support.

I recommend heading toward a future where only UTF-8 encoding is used for
multibyte characters and UCS-2 or similar for wchar_t. There is no need to
support several different encodings.
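
As a small illustration of working on UTF-8 "as is" in char arrays, the sketch below counts code points by skipping continuation bytes (those of the form 10xxxxxx). It assumes the input is already valid UTF-8, and the function name is made up. Note that it counts code points, not user-perceived characters.

```c
#include <stddef.h>

/* Count code points in a NUL-terminated, valid UTF-8 string.
 * Every code point starts with exactly one byte that is NOT 10xxxxxx. */
static size_t utf8_count_code_points(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```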

~~~
ori_b
UCS-2 is a bad choice -- it fails to represent most unicode characters. If you
meant UTF-16, that's also a bad choice, because UTF-16 is _also_ a variable
width encoding, forcing programmers to use some form of "extra-wide char".

I'm of the opinion that wchar_t should become an alias for char32_t.

~~~
a1369209993
UTF-32 is _also_ a variable-width encoding; eg 00000044 00000308 aka "D̈".

~~~
DougGwyn
I thought it was strictly one character per 32-bit code. Anyway, whatever it
is called it is what wchar_t should be.

~~~
a1369209993
There are no fixed width encodings with range of encodable characters anywhere
near that of Unicode.

~~~
flatfinger
It's too bad Unicode wasn't designed around the concept of easily-recognizable
grapheme clusters and "write-only" [non-round-trip] forms that are normalized
in various ways. A text layout engine shouldn't have to have detailed
knowledge of rules that are constantly subject to change, but if there were a
standard representation for a Unicode string where all grapheme clusters are
marked and everything is listed in left-to-right order, and an OS function was
available to convert a Unicode string into such a form, a text-layout engine
using that OS routine would be able to accommodate future additions to the
character set and glyph-joining rules without having to know anything about them.

~~~
a1369209993
You can't do that without committing to _not_ supporting pathological text,
otherwise you're stuck adding new special cases to the layout engine every
update _anyway_.

I do have some ideas for a better encoding (like, I assume, anyone competent
with sufficient free time and interest in text encoding), but there's a lot of
reluctance to put effort into something that's already completely eclipsed by
a technically inferior but not completely unusable alternative, so I've had it
mostly shelved.

------
potiuper
Any plans to add semantics for exceptional situations such as divide by zero
and dereferencing a null pointer?
[https://blog.regehr.org/archives/232](https://blog.regehr.org/archives/232)

Or incorporating features from this 14 item list?
[https://blog.regehr.org/archives/1180](https://blog.regehr.org/archives/1180)

As it appears these have failed:
[https://blog.regehr.org/archives/1287](https://blog.regehr.org/archives/1287)

~~~
rseacord
I don't know of any plans to add semantics for divide-by-zero or dereferencing
a null pointer. I'm guessing this is not viable because there are no
agreed-upon semantics among different implementations.

Making C friendlier is always a good idea, and I think the committee is
(slowly) working towards this goal. I would have to examine these papers by
John Regehr in more detail. Looking quickly at his proposals I can see why
he couldn't find consensus for these ideas as some of them do appear
controversial.

An example of a friendly dialect of C is C0 (C-naught) from CMU. I
don't think I'm exaggerating when I say that this language has not "caught
on".

------
blocks_plz
Thanks for the AMA

1\. Will the Apple's Blocks extension, which allows creation of Closures and
Lambda functions, be included in C2X?

2\. Are there any plans to improve the _Generic interface (to make it easy to
switch on multiple arguments, etc.)?

~~~
AaronBallman
> 1\. Will the Apple's Blocks extension, which allows creation of Closures and
> Lambda functions, be included in C2X?

We haven't seen a proposal to add them to C2x, yet. However, there has been
some interest within the committee regarding the idea, so I think such a
proposal could have some support.

> 2\. Are there any plans to improve the _Generic interface (to make it easy
> to switch on multiple arguments, etc.)?

I haven't seen any such plans, but there is some awareness that _Generic can
be hard to use, especially as you try to compose generic operations together.

~~~
blocks_plz
1\. The reason I asked was because I remember reading the proposal as N2030[1]
and N1451[2] a while back. Were these never actually presented for voting?
(not sure how the committee works)

[1]:
[http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2030.pdf](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2030.pdf)

[2]:
[http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1451.pdf](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1451.pdf)

~~~
AaronBallman
Ah! No, those just predate my joining the committee and haven't really come up
since they were presented.

Basically, every paper that gets submitted by an author will get some amount
of discussion time at the next available meeting as well as feedback from the
committee on the proposal. I'm not certain what feedback these papers received
(we could look through the meeting minutes for the meetings the papers were
discussed at to find out, though).

------
emreiyican
I know this opinion is unpopular and contradicts a core value of the C
standardization committee, but I personally think that at some point the C
standard should abandon supporting the legacy codebase. I think bool and stdint
definitions should be available as part of the standard feature set and
shouldn't need including their respective headers. These and some other
features are available at the core of every modern language but C, and C has
to provide them via other means. Is the sentiment of discontinuing legacy
support shared within the committee, by any proportion?

~~~
loeg
I'd love it if we could do away with all the headers.

Just #include <stdc.h> and be done with it. No need to remember stdio, stdint,
stdbool, limits, assert, signal.h, etc, etc.

This new header comes with a guarantee that use of identifiers in the
standard-reserved namespace will break your code. Perhaps compilers could even
enforce this preemptively.

~~~
DougGwyn
You can easily create your own stdc.h include file. Something similar was done
on Plan 9.

Note that by including the content of all the headers, you're increasing the
chance for collisions with application identifiers. You might consider that
more of a benefit than a drawback.
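
A sketch of what such a home-grown header might look like. The file name stdc.h and the guard macro are hypothetical, and the list of headers is just a sample; nothing here is defined by ISO C.

```c
/* stdc.h: hypothetical project-local umbrella header, not part of ISO C. */
#ifndef STDC_H
#define STDC_H

#include <assert.h>
#include <limits.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#endif /* STDC_H */
```

A project would place this on its own include path and pull it in with `#include "stdc.h"`.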

------
pjmlp
Microsoft's "Checked C" seems to be the last attempt to fix C security flaws.

From the outside, after Annex K adoption failure, WG14 doesn't seem to be
willing to make C safer in any way.

Are there any plans to take efforts like Checked C in consideration regarding
the future of ISO C?

------
cyber1
Ken Thompson, Rob Pike, Brian Kernighan, Russ Cox, Robert Griesemer are guys
who created Unix, B, C, Go, UTF-8, etc. Maybe it would be useful to invite
these guys (or one of them) to the C Standards Committee to help improve and
design new language features?

~~~
rseacord
I think a lot of these dudes are retired. A lot of good C people like P.J.
Plauger, John Benito, and Clark Nelson have all retired recently. Anyway, they
are all invited back. As an incentive, we typically have free coffee and
snacks at most of the meetings. :)

------
billfruit
I do find that C is difficult to use for large programs. Are there any thoughts
about introducing features like namespaces?

Another thing that is very cumbersome to do in C is object creation; creating
instantiable objects is possible but very cumbersome. Is there some feature in
the thought process to deal with it? To make it clear, in C we can create a data
structure like a stack or a queue easily. But if the program needs 10 stacks
then there is presently no simple way of achieving it.
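
For context, the idiom most C code bases use today is to keep all state in a struct and pass a pointer to every operation, so each variable of that type is a separate instance. A minimal sketch with made-up names and a fixed capacity; whether this counts as "simple" is exactly the question being asked.

```c
#include <stdbool.h>
#include <stddef.h>

enum { INT_STACK_CAPACITY = 16 };

/* Each variable of this type is an independent stack. */
struct int_stack {
    int    items[INT_STACK_CAPACITY];
    size_t count;
};

static void int_stack_init(struct int_stack *s)
{
    s->count = 0;
}

/* Returns false when the stack is full. */
static bool int_stack_push(struct int_stack *s, int value)
{
    if (s->count == INT_STACK_CAPACITY)
        return false;
    s->items[s->count++] = value;
    return true;
}

/* Returns false when the stack is empty; otherwise stores the top in *out. */
static bool int_stack_pop(struct int_stack *s, int *out)
{
    if (s->count == 0)
        return false;
    *out = s->items[--s->count];
    return true;
}
```

Ten stacks are then just `struct int_stack stacks[10];`, each initialised with `int_stack_init(&stacks[i])`.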

~~~
DougGwyn
In BRL's MUVES project, we used a 2-character prefix indicating category.
E.g., all the external identifiers for our fancy memory allocator began with
"Mm", where Mm.h documented the interface for the Mm package only.

To minimize the external identifiers, one could make just the name of a
container structure the sole entry access handle, with structure members
pointing to the functions. Then use it like:

    
    
      #include <Mm.h>
      if ((new = Mm.allo(size)) == NULL)
        Er.abort("out of memory");
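
A sketch of how the `Mm` side of that usage could be declared. Only `Mm` and `allo` appear in the example above; the struct tag, the `release` member, and the forwarding to malloc/free are assumptions made for illustration, not the MUVES implementation.

```c
#include <stddef.h>
#include <stdlib.h>

/* The package's interface: a table of function pointers. */
struct mm_interface {
    void *(*allo)(size_t size);     /* allocate */
    void  (*release)(void *ptr);    /* hypothetical counterpart to allo */
};

/* In a real project these would be static functions inside Mm.c, and
 * Mm.h would contain only: extern const struct mm_interface Mm; */
static void *mm_allo(size_t size)
{
    return malloc(size);
}

static void mm_release(void *ptr)
{
    free(ptr);
}

/* The single external identifier exported by the package. */
const struct mm_interface Mm = { mm_allo, mm_release };
```

Callers then write `Mm.allo(size)`, and the linker sees one external name per package.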

~~~
jfkebwjsbx
Tip: you can use four leading spaces to write code.

    
    
        Like this

~~~
steveklabnik
You only need two!

    
    
      like this

~~~
DougGwyn
I tried, but two spaces yielded what you saw.

~~~
dang
Huh, it also needed an extra line break before the first line of code. I
didn't realize that! I've fixed it now.

~~~
steveklabnik
I didn't realize that either, but it's described in formatdoc as such. So if
you changed that behavior, probably should change the docs too.

~~~
dang
I didn't change the behavior - I just added a newline. Sorry that wasn't
clear.

------
beefhash
C has been making strides towards complete Unicode support. I've been having
trouble following along though: Am I correct in assuming that there's no
_actual_ multi-byte UTF-8 to UTF-32 Rune function and the best approximation
depends on whatever wchar_t is? How would I best handle pure Unicode input and
output scenarios on a "hostile" OS whose native character encoding is some
EBCDIC abomination or a Windows codepage?

~~~
loeg
Probably link libicu rather than rely on libc.

~~~
rurban
libicu is a 40MB mess where you need only 5Kb of it. Only case folding and one
normalization is needed, with tiny tables.

Additionally the used UNICODE_MAJOR and _MINOR are needed. They are always
years behind, and you never know which tables versions are implemented.

~~~
loeg

      -Wl,--gc-sections

------
pantalaimon
Will C eventually get something like C++'s constexpr?

~~~
AaronBallman
C has some basic support for constant expressions already, but there has not
yet been a proposal to bring 'constexpr' over from C++. Personally, I would
_love_ this feature to be in C!

~~~
loeg
You and me both!

------
psherbet
I love how small of a language C is and get concerned when people recommend
adding feature x,y and z.

What's the plan for C over the next 5 - 10 years?

~~~
DougGwyn
There is no grand goal that I know of. I wish more importance were being
placed on keeping existing well-written code working, which includes continued
support for what might be considered near-obsolete. If one wanted to design a
new (not fully compatible) language, that could have lofty goals; just don't
call it "C".

------
JoshTriplett
What are the chances of typeof, or statement expressions, finding their way
into the C standard? They're already widely implemented.

~~~
msebor
Several of us discussed typeof and I'd expect a proposal for a feature along
these lines to be well received. (I recall someone even saying they're working
on one but that shouldn't stop anyone from submitting one of their own.)

~~~
JoshTriplett
I'm glad to hear that.

What about statement expressions? They're quite useful, and supported by
multiple independent compilers.

~~~
msebor
I'm not aware of recent proposals for those but we have discussed ideas along
those lines (closures: N2030, C++ lambdas, Apple Blocks: N1451, and I think
there was one from Cilk). I think there was interest but not enough support
for the details and likely also concerns from implementers.
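
For readers who have not used the two extensions, this is the classic case that combines them. It is GNU C, accepted by GCC and Clang, and not ISO C at the time of this thread. The macro and helper names are made up.

```c
/* GNU C only: __typeof__ declares temporaries of the arguments' types, and
 * the statement expression ({ ... }) yields the value of its last
 * expression, so each argument is evaluated exactly once. */
#define MAX_ONCE(a, b)              \
    ({                              \
        __typeof__(a) a_ = (a);     \
        __typeof__(b) b_ = (b);     \
        a_ > b_ ? a_ : b_;          \
    })

/* Hypothetical helper with a visible side effect, to show single evaluation. */
static int calls = 0;

static int next_value(void)
{
    return ++calls;
}
```

With the textbook `((a) > (b) ? (a) : (b))` macro, `next_value()` would be called twice whenever it compares greater; with `MAX_ONCE` it is called once.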

------
packetlost
I'm about a mid-level experienced developer, and have been attempting to learn
C via a few side projects. I come from mostly Python and Go, which both have
very robust standard libraries, so I was quite surprised to find that string
parsing is _very_ poorly supported in C. Is there a reason that very common
string parsing cases are missing from the C stdlib?

------
hsivonen
What’s the current committee thinking on providing locale-independent
conversions from potentially-invalid UTF-8 to valid UTF-8, from
potentially-invalid UTF-8 to valid UTF-16, and from potentially-invalid UTF-16
to valid UTF-8 (i.e. replacing ill-formed sequences with the REPLACEMENT
CHARACTER)?

~~~
DougGwyn
If you changed UTF-16 to UTF-32 or UCS-4 I'd support it. I think there are
already implementations that use the replacement character for all
"impossible" codes.

~~~
hsivonen
What’s your use case for UTF-32?

~~~
DougGwyn
There are several multibyte character manipulations that are easier if there
is a uniform-sized encoding (wchar_t).
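
A sketch of the kind of conversion being discussed: decoding possibly invalid UTF-8 into 32-bit code points, substituting U+FFFD for anything ill-formed. The function name and its resynchronisation policy are assumptions; in particular this simplified version does not implement the "maximal subpart" replacement practice recommended by the Unicode Standard, so the number of U+FFFD it emits for some inputs may differ from other decoders.

```c
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT_CHARACTER 0xFFFDu

/* Decode one code point from s[0..len). Stores it in *out and returns the
 * number of bytes consumed (0 only when len == 0). Ill-formed input
 * (stray or missing continuation bytes, overlong forms, surrogates,
 * values above U+10FFFF) produces U+FFFD. */
static size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *out)
{
    size_t need;      /* continuation bytes expected after the lead byte */
    uint32_t cp;      /* code point being assembled */
    uint32_t min;     /* smallest value allowed for this length */

    *out = REPLACEMENT_CHARACTER;
    if (len == 0)
        return 0;

    if (s[0] < 0x80) {
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        need = 1; cp = s[0] & 0x1Fu; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {
        need = 2; cp = s[0] & 0x0Fu; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {
        need = 3; cp = s[0] & 0x07u; min = 0x10000;
    } else {
        return 1;     /* stray continuation byte or invalid lead byte */
    }

    for (size_t i = 1; i <= need; ++i) {
        if (i >= len || (s[i] & 0xC0) != 0x80)
            return i; /* truncated sequence: resume at the offending byte */
        cp = (cp << 6) | (s[i] & 0x3Fu);
    }

    if (cp >= min && cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF))
        *out = cp;
    return need + 1;
}
```

Calling it in a loop, advancing by the return value each time, turns an arbitrary byte string into a sequence of valid code points.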

------
rudchenkos
Are any concurrency primitives planned for introduction in future C revisions?

~~~
AaronBallman
We currently have not seen papers proposing to add new concurrency primitives
for C2x, but we have been actively working on the concurrency object model and
would welcome proposals for new primitives or concurrency-related fixes.

One goal is to re-unify C with the concurrency object model used by C++ to
make std::atomic<T> and _Atomic(T) be ABI compatible as intended in C11. Some
small fixes in this area are the removal of ATOMIC_VAR_INIT, clarifying
whether library functions can use thread_local storage for internal state, and
things along those lines. However, we expect there to be more efforts in this
area as we progress the standard.
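
As a reminder of what the C11 side looks like today, here is a small sketch using <stdatomic.h>. The counter and function names are made up. The object has static storage duration, so its default zero-initialisation already leaves it in a valid state and no ATOMIC_VAR_INIT is needed.

```c
#include <stdatomic.h>

/* Static storage duration: zero-initialised to a valid atomic state. */
static atomic_int hit_count;

/* Safe to call from several threads at once: the increment is a single
 * atomic read-modify-write. Returns the value after the increment. */
static int record_hit(void)
{
    return atomic_fetch_add(&hit_count, 1) + 1;
}

/* Atomically read the current total. */
static int hits_so_far(void)
{
    return atomic_load(&hit_count);
}
```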

------
lemaudit
Hi,

Do you think Annex K of C11 will be widely adopted by programmers or unused?
Why aren't people adopting it?

Do you see the use of any analysis tools that are particularly effective for
finding memory safety issues?

C++ added in smart pointers to its specification. Are there any plans to do
something similar in future C specifications?

Thanks!

~~~
AaronBallman
> Do you think Annex K of C11 will be widely adopted by programmers or unused?
> Why aren't people adopting it?

So far, it's not been widely adopted. Part of the issue is that there are
specification issues relating to threads and the constraint handlers, and part
of the issue is that popular libc implementations have actively resisted
implementing the annex.

That said, I field questions about Annex K on a regular basis and there are a
few implementations in the wild, so there is user interest in the
functionality.
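
Because support is optional, code that wants Annex K usually has to test for
it. The following is only a sketch: `copy_string` is a hypothetical helper,
and the fallback branch is one possible bounded copy, not a replacement
blessed by the standard.

```c
#define __STDC_WANT_LIB_EXT1__ 1  /* request Annex K; must precede <string.h> */
#include <stddef.h>
#include <string.h>

/* Copies src into dst, which has room for dstsz bytes including the
   terminator. Returns 0 on success and -1 if src does not fit. */
int copy_string(char *dst, size_t dstsz, const char *src)
{
#ifdef __STDC_LIB_EXT1__
    /* Annex K path. What happens on a runtime-constraint violation depends
       on the currently installed constraint handler. */
    return strcpy_s(dst, dstsz, src) == 0 ? 0 : -1;
#else
    /* Fallback for the many implementations without Annex K. */
    size_t len = strlen(src);
    if (len >= dstsz) {
        if (dstsz > 0)
            dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);
    return 0;
#endif
}
```

On an implementation that does not define `__STDC_LIB_EXT1__`, only the
fallback branch is compiled.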

> Do you see the use of any analysis tools that are particularly effective for
> finding memory safety issues?

<biased opinion>I think CodeSonar does a great job at finding memory safety
issues, but I work for the company that makes this tool.</biased opinion>

I've also had good luck with the memory and address sanitizers
([https://github.com/google/sanitizers](https://github.com/google/sanitizers))
and tools like valgrind.

> C++ added in smart pointers to its specification. Are there any plans to do
> something similar in future C specifications?

We currently don't have any proposals for adding smart pointers to C. Given
that C does not have constructors or destructors, we would have to devise some
new mechanism to implement or replace RAII in C, which would be one major
hurdle to overcome for smart pointers.
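
For comparison, GCC and Clang offer a non-standard `cleanup` attribute that
gives scope-based release today. The sketch below is not ISO C, and the names
`free_char_ptr`, `AUTO_FREE`, `copied_length`, and `cleanup_calls` are
invented for the example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static int cleanup_calls;  /* only here so the effect can be observed */

/* Cleanup handler: receives a pointer to the variable leaving scope. */
static void free_char_ptr(char **p)
{
    free(*p);
    cleanup_calls++;
}

/* GCC/Clang extension: run free_char_ptr when the variable goes out of
   scope. */
#define AUTO_FREE __attribute__((cleanup(free_char_ptr)))

/* Returns the length of a temporary copy of s, or 0 if allocation fails.
   The copy is released on every return path without an explicit free(). */
size_t copied_length(const char *s)
{
    AUTO_FREE char *copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return 0;  /* the handler still runs; free(NULL) is a no-op */
    strcpy(copy, s);
    return strlen(copy);
}
```

This covers only the "release at end of scope" part of RAII; ownership
transfer and copying would still need conventions of their own.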

~~~
hedora
I’ve had good luck (in C++) replacing the underlying memory allocator with one
that tracks leaks by allocation type (which is fast enough for production
use).

This can be done in C, but the calling code has to spell malloc and free
differently.

In debug mode, configuring malloc to poison (and add fences) on allocation and
free finds most of the remaining things.
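
A much-simplified sketch of the idea, assuming hypothetical names
`tracked_malloc` and `tracked_free`. It keeps one global count rather than
per-type counts, uses fill patterns instead of fences, and is not
thread-safe.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Header stored in front of each block. The union keeps the user data
   aligned for any object type. */
typedef union {
    size_t size;
    max_align_t align;
} alloc_header;

static size_t live_allocs;  /* blocks currently allocated */
static size_t live_bytes;   /* user bytes currently allocated */

void *tracked_malloc(size_t n)
{
    if (n > SIZE_MAX - sizeof(alloc_header))
        return NULL;
    alloc_header *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->size = n;
    live_allocs++;
    live_bytes += n;
    memset(h + 1, 0xA5, n);  /* poison fresh memory to expose uninitialized reads */
    return h + 1;
}

void tracked_free(void *p)
{
    if (p == NULL)
        return;
    alloc_header *h = (alloc_header *)p - 1;
    live_allocs--;
    live_bytes -= h->size;
    /* Poison on free to expose use-after-free. An optimizer may drop this
       store as dead, so a real implementation needs a non-elidable fill. */
    memset(p, 0xDD, h->size);
    free(h);
}
```

If `live_allocs` is non-zero at exit, something leaked. Callers do have to
use the new names, which is the "spell malloc and free differently" cost.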

These techniques tend to have much lower runtime overhead than valgrind
(2-digit percentages vs 5-10x), so they can be left on throughout testing and
partially enabled in production.

They find >90% of the memory bugs that I write (assuming valgrind finds 100%).
YMMV.

------
knz42
What is the story behind the removal of VLAs from C99 in later revisions?

~~~
AaronBallman
VLAs are still present in C17 and have not been removed. They are, however, an
optional feature with a truly weird (IMHO) feature testing macro. If
'__STDC_NO_VLA__' is defined to 1, then the implementation does not support
VLAs.
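
A sketch of how code might use the macro, assuming a C99-or-later compiler.
`HAVE_VLA` and `sum_of_squares` are hypothetical names, and the cap on `n` is
just a way to keep the stack usage of the example bounded.

```c
#include <stddef.h>
#include <stdlib.h>

#if defined(__STDC_NO_VLA__) && __STDC_NO_VLA__ == 1
#define HAVE_VLA 0
#else
#define HAVE_VLA 1  /* assumes C99 or later */
#endif

/* Sums 0*0 + 1*1 + ... + (n-1)*(n-1) through a scratch array of n elements.
   Returns -1 if n is 0, larger than 1000, or allocation fails. */
long sum_of_squares(size_t n)
{
    if (n == 0 || n > 1000)
        return -1;
#if HAVE_VLA
    long scratch[n];                               /* variable length array */
#else
    long *scratch = malloc(n * sizeof *scratch);   /* heap fallback */
    if (scratch == NULL)
        return -1;
#endif
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        scratch[i] = (long)(i * i);
        total += scratch[i];
    }
#if !HAVE_VLA
    free(scratch);
#endif
    return total;
}
```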

IIRC, this macro was added to C11 along with a batch of other "these are
optional" macros for atomics, complex, threads, etc. However, I don't recall
whether C99 adopted the features as optional features and missed the feature
testing macro, or if they were required features in C99 that we made optional
in C11.

~~~
stephencanon
Complex and VLA were required by C99, but made optional in C11. The others
were new in C11.

------
cyber1
I think C has been an exceptionally good language for a long time, but the
world is changing, and maybe C must evolve with new trends and new research in
programming languages.

In my view C and C++ are now almost different languages, with different
philosophies of programming, different futures, and different language designs.

It would be sad if "modern" C++ almost replaced C. Many C++ developers use
"Orthodox C++"
[https://gist.github.com/bkaradzic/2e39896bc7d8c34e042b](https://gist.github.com/bkaradzic/2e39896bc7d8c34e042b),
which suggests that people would be more comfortable with C plus some really
useful features (namespaces, generics, etc.) than with modern C++. I very
often hear from colleagues and from many other people who work with C++ how
terrible modern C++ is
([https://aras-p.info/blog/2018/12/28/Modern-C-Lamentations/](https://aras-p.info/blog/2018/12/28/Modern-C-Lamentations/),
[https://www.youtube.com/watch?v=9-_TLTdLGtc](https://www.youtube.com/watch?v=9-_TLTdLGtc))
and how good it would be to see and use a new C with some extra features.
Maybe it is time to start thinking about evolving C, for example:

    
    
      - Generics. Something like generics in Zig, Odin, Rust. etc.
      - AST macros, for example Rust or Lisp macros.
      - Lambda
      - Defer statement
      - Namespaces
    

What do you think?

[https://ziglang.org/documentation/master/#Generic-Data-
Struc...](https://ziglang.org/documentation/master/#Generic-Data-Structures)

[https://odin-lang.org/docs/overview/#parametric-polymorphism](https://odin-
lang.org/docs/overview/#parametric-polymorphism)

[https://doc.rust-lang.org/rust-by-example/generics.html](https://doc.rust-
lang.org/rust-by-example/generics.html)

------
dboon
What are two or three C codebases that are elegantly and cleanly written, and
that every mid-level C programmer should read for sake of knowledge?

~~~
pascal_cuoq
I would recommend musl, although the style is a bit idiosyncratic in places:
[https://www.musl-libc.org](https://www.musl-libc.org)

Mbed TLS, since I have it in mind from another thread, is also a pretty clean
C library for the problem it tries to solve; it's a testament to its design
that we (TrustInSoft, who had not participated in its development) were able
to verify that some uses of the library were free of Undefined Behavior:
[https://tls.mbed.org](https://tls.mbed.org)

~~~
uasm
> "I would recommend musl, although the style is a bit idiosyncratic in
> places: [https://www.musl-libc.org](https://www.musl-libc.org)"

Opened a random part of musl out of sheer boredom. Here's what I see:

[https://git.musl-libc.org/cgit/musl/tree/include/aio.h](https://git.musl-
libc.org/cgit/musl/tree/include/aio.h)

A bunch of return codes #defined like so (see [https://git.musl-
libc.org/cgit/musl/tree/src/aio/aio.c](https://git.musl-
libc.org/cgit/musl/tree/src/aio/aio.c)):

    #define AIO_CANCELED 0
    #define AIO_NOTCANCELED 1
    #define AIO_ALLDONE 2

    #define LIO_READ 0
    #define LIO_WRITE 1
    #define LIO_NOP 2

    #define LIO_WAIT 0
    #define LIO_NOWAIT 1

Why weren't they using an enum instead? I wouldn't sign off on this code (and
I don't think it lives up to best practices).

~~~
pdw
musl is implementing POSIX. POSIX requires those constants to be preprocessor
defines. (Generally, musl assumes the reader is quite familiar with the C and
POSIX standards, which makes sense since it's a libc implementation.)
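
One practical difference between the two forms is that only a macro is
visible to the preprocessor. A small sketch with made-up names
(`MODE_AS_MACRO`, `MODE_AS_ENUM`):

```c
#define MODE_AS_MACRO 1      /* hypothetical constant in macro form */
enum { MODE_AS_ENUM = 1 };   /* the same value as an enumerator */

/* A macro can be tested in #if (and detected with #ifdef). */
#if MODE_AS_MACRO == 1
#define MACRO_SEEN_BY_PREPROCESSOR 1
#else
#define MACRO_SEEN_BY_PREPROCESSOR 0
#endif

/* In #if, an identifier that is not a macro is replaced by 0, so the
   enumerator is effectively invisible here. */
#if MODE_AS_ENUM == 1
#define ENUM_SEEN_BY_PREPROCESSOR 1
#else
#define ENUM_SEEN_BY_PREPROCESSOR 0
#endif

int macro_seen(void) { return MACRO_SEEN_BY_PREPROCESSOR; }
int enum_seen(void)  { return ENUM_SEEN_BY_PREPROCESSOR; }
```

Here `macro_seen()` returns 1 and `enum_seen()` returns 0, so portable code
that checks such constants with `#if` or `#ifdef` depends on the macro form.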

------
dpipemazo
One of my favorite features recently while developing C for embedded systems
has been the --wrap linker flag that allows me to effectively test code that
interacts with hardware without modifying the source.

By passing -Wl,--wrap=some_function at link time with test code we can then
define

    
    
      __wrap_some_function
    

that will be called instead of some_function. Within __wrap_some_function one
can also call __real_some_function, which resolves to the original version,
if you still want to call the original one. This is especially useful if
trying to observe certain function calls in tests that interact with hardware.

Do you have any other recommendations/preferences to help with unit-testing C
code?

------
tayistay
I'm no C expert, but my two wishes for C would be:

\- Basic type inference to reduce keystrokes, and prevent ripples when
changing types. (like auto in C++)

\- Equality operators defined for structs. Perhaps even lexicographical
comparison, if I'm dreaming.

Any thoughts on either of those?

------
grok22
Things I would like C to have:

\- stricter type-checks on typedef types (useful when passing function
parameters)

\- gcc's 'warn_unused_result' attribute for functions (ensure error returns
are checked)

\- on-entry/on-exit qualifiers for functions (to do things like make sure you
lock/unlock semaphores for instance before entry/exit of function)

\- D language's 'scope' feature (better handling of error path)

\- loops in the c pre-processor! (better code-gen)

Any chance any of this is on the radar for the next-gen C standard? Some of
these are just ergonomics, but the first two might have saved me some grief
a few times.

~~~
_kst_
typedef, in spite of the name, doesn't create a new type. It only creates a
new name for an existing type. Changing that would break existing code.

I wouldn't mind seeing a new feature that _does_ define a new type (one that's
identical to, but incompatible with, an existing type), but we can't call it
"typedef".

In a sense that feature already exists. You can define a structure with a
single member of an existing type. But you have to refer to the member by name
to do anything with it.
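
A short sketch of that single-member-struct workaround; `meters`, `seconds`,
and `speed` are made-up names for illustration.

```c
/* Two distinct types with the same representation. */
typedef struct { double value; } meters;
typedef struct { double value; } seconds;

/* Swapping the arguments is a compile-time error, because the two struct
   types are incompatible. With plain typedefs of double it would compile. */
double speed(meters distance, seconds elapsed)
{
    return distance.value / elapsed.value;  /* members accessed by name */
}
```

For example, `speed((meters){10.0}, (seconds){2.0})` compiles, while
`speed((seconds){2.0}, (meters){10.0})` is rejected.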

------
atum47
Yeah, I don't program much in C and I don't have a question. I'm here just to
congratulate everyone involved for this amazing thing. It's awesome to see
people take the time to help each other. Nice job!

------
overfl0w
Can memory safety be ensured in the C programming language? By static analysis
at compile time for example?

~~~
pascal_cuoq
It is possible to guarantee that a C program does not have any undefined
behavior, which includes all the memory errors that are often also security
vulnerabilities.

“Static analysis” may be the wrong name to classify the tools that work in
that area, because “static analysis” is usually used for purely automatic
tools, whereas the tools used to guarantee the absence of undefined behaviors
are not entirely automatic except for the simplest of programs.

Results of a static analyzer are often characterized in terms of “false
positives” and “false negatives”. It is a possible design choice to make an
analyzer with no false negatives. It is absolutely not impossible! (Some
people think it is fundamentally impossible because it sounds like a computer
science theorem, but it isn't one. The theorem would apply if one intended to
make an analyzer with no false positives and no false negatives—and if
computers were Turing machines.)

Analyzers designed to have no false negatives are called “sound”. In practice,
this kind of analyzer may prove automatically that a program is free of
Undefined Behavior if the program is a simple example of 100 lines, but for a
more realistic software component of at least a few thousand lines, the result
will be obtained after a collaborative human-analyzer process (in which the
analyzer catches reasoning errors made by the human, so the result is still
better than what you can get with code reviews alone).

Here is what the result of this collaborative human-analyzer process may look
like for a library as cleanly designed and self-contained as Mbed TLS
(formerly PolarSSL):
[https://trust-in-soft.com/polarSSL_demo.pdf](https://trust-in-soft.com/polarSSL_demo.pdf)

------
hsivonen
Does the committee have any plans to document the rationale for each kind of
Undefined Behavior?

Does the committee have any plans to make NULL pointer arguments to memcpy
non-UB when the size argument is 0?

~~~
AaronBallman
> Does the committee have any plans to document the rationale for each kind of
> Undefined Behavior?

In the C99 timeframe, we had a rationale document that was separately
maintained. My understanding (this predates my joining the committee) is that
this was prohibitively labor-intensive and so we stopped doing it for C11. I
don't know of any plans to start doing this again, even in a limited sense for
justifying UB. That said, we do spend time considering whether an aspect of a
proposal requires UB or not, so the rationale exists in the proposals and
committee minutes.

> Does the committee have any plans to make NULL pointer arguments to memcpy
> non-UB when the size argument is 0?

I have not seen such a proposal, and suspect that implementations may be
concerned about losing their optimization opportunities from such a change.
(Personally, I'd be okay losing those optimization opportunities as this does
not seem like a situation where UB is necessary.)
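
Until something like that changes, callers that may end up with a null
pointer and a zero size can guard the call themselves. A minimal sketch,
where `copy_bytes` is a hypothetical wrapper:

```c
#include <stddef.h>
#include <string.h>

/* Like memcpy, but a zero-length copy never reaches memcpy, so null
   pointers are harmless when n == 0. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n == 0)
        return dst;
    return memcpy(dst, src, n);
}
```

The extra branch is the price of not relying on behavior the standard leaves
undefined.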

------
neop1x
In my opinion C is good as it is. C++ is a terribly complicated mess, always
has been, and adding more and more "modern" functionality isn't helping it
much. There are great standard functions, e.g. for strings in C, whereas it is
often very inconvenient or complicated to do simple things like uppercasing a
string in C++. I always ended up basically using C with just basic OOP
functionality from C++. But I am not writing in C/C++ daily so my opinion is
not very important...

------
emilfihlman
Why isn't there a binary prefix in the standard? Like 0b0111010?

------
radford-neal
The syntax used in the following function definition is said to be obsolescent
in C11:

int f (a, n) int n; int a[n][n]; { return a[n-1][n-1]; }

How could one define this function without using the obsolete syntax?

~~~
AaronBallman
You couldn't in that parameter order. However, you could do this:

    int f(size_t n, int a[n][n]) { return a[n-1][n-1]; }

([https://godbolt.org/z/DV9c-C](https://godbolt.org/z/DV9c-C))

Btw, that definition was obsolescent in C89 too.
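
For completeness, here is how a caller would use the prototype-style
definition. This is only a usage sketch; `last_diagonal_of_3x3` is a made-up
name, and the code assumes the implementation supports variably modified
parameter types.

```c
#include <stddef.h>

/* n is declared before it is used in the type of a. Requires n > 0. */
int f(size_t n, int a[n][n])
{
    return a[n-1][n-1];
}

/* Passes an ordinary 3x3 array; it decays to a pointer to int[3], which
   matches the parameter type when n is 3. */
int last_diagonal_of_3x3(void)
{
    int m[3][3] = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };
    return f(3, m);
}
```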

~~~
radford-neal
Well, yes. But putting the array argument(s) first is the more natural order,
in my opinion. And it is surely odd that only one order is allowed in this
context, when otherwise C is happy with changing the order of parameters to be
whatever you like.

Plus, of course, there may be existing code using such functions, with
parameters in the order that would become impossible if this syntax were
disallowed.

------
ori_b
What do you think of a variant on this?

[https://blog.regehr.org/archives/1180](https://blog.regehr.org/archives/1180)

~~~
pascal_cuoq
I still want to write at least one sequel to that post, on the theme “Alright,
can we make a Friendly C Compiler by disabling the annoying optimizations,
then?”.

Obviously the people who want a Friendly C Compiler do not want to disable
_all_ optimizations. This would be easy to do, but these users do not want the
stupid 1+2+16 expressions in their C programs, generated through macro-
expansion, to be compiled to two additions with each intermediate result
making a round-trip through memory.

So the question is: can we get a Friendly C Compiler by enabling only the
Friendly optimizations in an unfriendly compiler?

And for the answer to that, I had to write an entire other blog post as
preparation, to show that there are some assumptions an optimizing compiler
can make:

\- that may be used in one or several optimizations, but the compiler authors
did not really keep track of where they were used,

\- that cannot be disabled and that the compiler maintainers will not consider
having an option to disable,

\- and that are definitely unfriendly.

Here is the URL of the blog post that I had to write in preparation for the
upcoming blog post about getting ourselves a Friendly C Compiler:
[https://trust-in-soft.com/blog/2020/04/06/gcc-always-
assumes...](https://trust-in-soft.com/blog/2020/04/06/gcc-always-assumes-
aligned-pointers/) . I recommend you take a look, I think it is interesting in
itself.

You will have guessed that I'm not optimistic about the approach. We can try
to maintain a list of friendly optimizations for ourselves, though, even if
the compiler developers are not helping. This might still be less work than
maintaining a C compiler.

~~~
ori_b
> _Here is the URL of the blog post that I had to write in preparation for the
> upcoming blog post about getting ourselves a Friendly C
> Compiler:[https://trust-in-soft.com/blog/2020/04/06/gcc-always-
> assumes...](https://trust-in-soft.com/blog/2020/04/06/gcc-always-assumes..).
> . I recommend you take a look, I think it is interesting in itself._

So, it's definitely interesting -- I think a lot of odd stuff you can do
should probably be undefined. Eliminating pointer accesses after a null check
sounds A-ok to me, because your program should never dereference null.

Another interesting thought is requiring more of these things that lead to
miscompilation to produce compile time diagnostics.

------
modeless
Can you do anything to push Microsoft to implement recent C standards? Their
failure to fully implement even C99 in Visual Studio is holding the language
back.

~~~
AaronBallman
Not really -- vendors are free to ignore newer releases of the standard that
do not meet their customers' needs and the committee can't do much about it.

However, as a user, you can help apply pressure on the vendor to support newer
standards. For instance, with Microsoft, you could support this feedback
request:
[https://developercommunity.visualstudio.com/idea/387315/add-...](https://developercommunity.visualstudio.com/idea/387315/add-c11-support.html)

------
pornel
Would you consider adding a built-in way to safely multiply two numbers?

Numeric overflows in things like calculation of buffer sizes can lead to
vulnerabilities.

Signed overflow is UB, and due to integer promotion signs creep in unexpected
places.

It's not trivial to check if overflow happened due to UB rules. A naive check
can make things even worse by "proving" the opposite to the optimizer.

And all of that is to read one bit that CPUs have readily available.
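
For reference, the portable way to do this today is to test the operands
before multiplying, so that the overflowing multiplication is never
evaluated. The sketch below uses a hypothetical helper `mul_int_checked`.
GCC and Clang also provide a `__builtin_mul_overflow` extension that reports
overflow directly, but it is not part of the standard.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a * b in *out and returns true if the product fits in int.
   Returns false, leaving *out untouched, if it would overflow. All checks
   use division on the operands, so no overflowing expression is evaluated. */
bool mul_int_checked(int a, int b, int *out)
{
    if (a > 0) {
        if (b > 0) {
            if (a > INT_MAX / b) return false;   /* positive * positive */
        } else {
            if (b < INT_MIN / a) return false;   /* positive * non-positive */
        }
    } else {
        if (b > 0) {
            if (a < INT_MIN / b) return false;   /* non-positive * positive */
        } else {
            if (a != 0 && b < INT_MAX / a) return false;  /* both non-positive */
        }
    }
    *out = a * b;
    return true;
}
```

Four divisions to recover one flag bit is exactly the kind of overhead the
question is about.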

~~~
DougGwyn
There are a lot of arithmetic conditions for which C could generate special
code. There are div_t-related functions for the other direction. I for one
would like a good way to obtain, using some Standard C coding pattern, fast
"carry" for multiple-precision integer arithmetic.

Several places in support functions, I have coded unusually to avoid wrap-
around etc. I bet you could devise something like that for (unsigned)
multiplication.
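
For the unsigned case, such a check might look like the following sketch for
buffer-size calculations. `mul_size_checked` is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stores count * elem_size in *out and returns true if the product fits in
   size_t. Returns false, leaving *out untouched, if it would wrap around. */
bool mul_size_checked(size_t count, size_t elem_size, size_t *out)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return false;
    *out = count * elem_size;
    return true;
}
```

A caller would check the result before passing `*out` to malloc.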

~~~
tropo
A horrifying case was multiplication in an x86 emulator. The opcode handler
needed to multiply a pair of unsigned 16-bit values, then return a 64-bit
result.

The uint16_t got promoted to an int for the multiplication, causing undefined
behavior. (if I remember right, the result was assigned to a uint16_t as well,
making the intent clear) The compiler then assumed that the 32-bit
intermediate couldn't possibly have the sign bit set, so it wouldn't matter if
promotion to a 64-bit value had sign extension or zero extension. Depending on
the optimization level, the compiler would do one or the other.

This is truly awful behavior. It should not be permitted.
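
A sketch of the usual workaround: convert one operand to a wide unsigned
type before multiplying, so the arithmetic never happens in signed int.
`mul16` is a made-up name, and the commented-out variant shows the
problematic form on a platform with 32-bit int.

```c
#include <stdint.h>

/* Problematic with 32-bit int: both operands are promoted to signed int, so
   0xFFFF * 0xFFFF overflows int, which is undefined behavior.

   uint64_t mul16_promoted(uint16_t a, uint16_t b) { return a * b; }
*/

/* Converting one operand to uint64_t first makes the whole multiplication
   unsigned 64-bit arithmetic, which cannot overflow for 16-bit inputs. */
uint64_t mul16(uint16_t a, uint16_t b)
{
    return (uint64_t)a * b;
}
```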

~~~
tropo
Found it:

[http://kqueue.org/blog/2013/09/17/cltq/](http://kqueue.org/blog/2013/09/17/cltq/)

~~~
flatfinger
See post above. There is no good way for compilers to handle that case, but
gcc gets "creative" even in cases where the authors of C89 made their
intentions clear.

------
bonzini
Is there any reason to keep the undefined behavior for shifts of negative
numbers, instead of making it implementation defined? Most compilers (for
twos-complement architectures at least) are not using that latitude, and I
would also guess that most programs that are written for twos-complement
arithmetic are likewise not expecting undefined behavior for non-overflowing
left shifts of negative numbers. Thanks!

~~~
DougGwyn
"Implementation-defined" is a nuisance, because then you need to add code for
all the variations, which also requires a set of standard macros, etc. It is
easier and less trouble-prone to just avoid using the currently undefined
behavior.
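
One common way to avoid it is to shift in an unsigned type. The sketch below
uses a hypothetical helper `shl_i32`; note that the final conversion back to
`int32_t` is itself implementation-defined when the value does not fit,
although mainstream two's-complement compilers simply wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Shifts the bit pattern of x left by n places, for 0 <= n < 32. The shift
   is performed on uint32_t, where it is fully defined. */
int32_t shl_i32(int32_t x, unsigned n)
{
    assert(n < 32);  /* shifting by the width or more is undefined */
    return (int32_t)((uint32_t)x << n);
}
```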

------
commandersaki
Will Effective C cover the strict aliasing rule and also why the BSD sockets
API seems to get away with it (e.g. (sockaddr *) &sockaddr_in)?

~~~
AaronBallman
I don't think the book covers strict aliasing, at least not in detail.

------
graycat
(1) Explain just how malloc() and free() work _under the covers_ and the
implications for multi-threading, _memory leaks_ , virtual memory paging, etc.

Maybe also cover some means, algorithms, and code for reporting on the _state_
, status, etc. of the memory use by malloc() and free().

By the way, I know and have known well for longer than most C programmers have
lived JUST what the _heap_ data structure, as used in "heap sort", is. But
what is the meaning of "the heap" in C programming language documentation?

(2) Cover in overwhelmingly fine detail the "stack" and the chuckhole in the
road, _stack overflow_.

(3) Where to get a reliable, reasonable package of code for handling character
strings -- what I saw and worked with in C is not reasonable.

(4) From the C programming I did, it looks like a large C program for
significant work involves some hundreds, maybe tens of thousands, of
_includes_ , _inserts_ , whatever, and what a linkage editor would call
_external references_. There must somewhere be some tools to help a programmer
make sense of all those includes and references, the resulting memory maps,
issues of locality of reference, _word boundary alignment_ , etc.

(5) How can C exploit a processor with 64 bit addressing and main memory in
the tens of gigabytes and maybe terabytes?

(6) How can C support, i.e., exploit, integers and IEEE floating point in 64
and/or 128 bit lengths?

(7) How to handle exceptional conditions with, say, non-local gotos and
without danger of memory leaks?

(8) Sorry, but far and away my favorite programming language long has been and
remains PL/I, especially for its scope of names rules, handling of aggregates
with external scope, its _data structures_ , and its exceptional conditional
handling with non-local gotos and freeing _automatic_ storage and, thus,
avoiding _memory leaks_. Of course I can't use PL/I now, but the problems PL/I
solved are still with us, also when writing C code. So, how to solve these
problems with C code?

(9) For C++, please explain how that works _under the covers_. E.g., some
years ago it appeared the C++ was defined as only a source code pre-processor
to C. Is this still the case? If so, then explaining C++ _under the covers_
should be feasible and valuable.

~~~
aw1621107
> Explain just how malloc() and free() work under the covers and the
> implications for multi-threading, memory leaks, virtual memory paging, etc.
>
> Maybe also cover some means, algorithms, and code for reporting on the
> state, status, etc. of the memory use by malloc() and free().

Strictly speaking, these are implementation details that the C standard leaves
unspecified. If you want to know how the memory allocation functions work or
methods for inspecting the state of the heap you'll need to look at a specific
implementation (e.g., glibc, musl, jemalloc, etc.) since the details can vary
wildly between implementations.

> Cover in overwhelmingly fine detail the "stack" and the chuckhole in the
> road, stack overflow.

Neither of these is really specific to C, and there should be a lot of
resources you can find that explain these concepts ([0], [1] for some example
general explanations). Did you have more specific questions in mind?

> How can C exploit a processor with 64 bit addressing and main memory in the
> tens of gigabytes and maybe terabytes?
>
> How can C support, i.e., exploit, integers and IEEE floating point in 64
> and/or 128 bit lengths?

I think pointer/integer sizes are implementation details. C specifies pointer
behavior and minimum integer sizes (and optional fixed-width types), but the
precise widths are chosen by the implementation. In the case of floating-
point, the sizes are specified by IEEE 754 widths.

In other words, you don't really need to do anything special as long as you
pick the appropriate types as defined by your implementation.

> For C++, please explain how that works under the covers. E.g., some years
> ago it appeared the C++ was defined as only a source code pre-processor to
> C. Is this still the case?

As far as I know no (production-quality?) C++ compiler has been implemented as
a source-level preprocessor for basically the entirety of C++'s existence [2].
The very first "compiler" for C++ was Cpre, back when C++ was still the C
dialect "C with classes" (around October 1979), and that was indeed a
preprocessor. That was replaced by the Cfront front end around 1982-1983,
about when "C with classes" started gaining new features and got a new name.
Cfront is a proper compiler front end that output C code, and I think from
that point on C++ compilers used "standard" compiler tech.

[0]:
[https://stackoverflow.com/questions/79923/what-and-where-are...](https://stackoverflow.com/questions/79923/what-and-where-are-the-stack-and-heap)

[1]:
[https://en.wikipedia.org/wiki/Stack_overflow](https://en.wikipedia.org/wiki/Stack_overflow)

[2]:
[http://www.stroustrup.com/hopl2.pdf](http://www.stroustrup.com/hopl2.pdf)

~~~
graycat
Thanks.

> Did you have more specific questions in mind?

On stack overflow, my understanding was that one could encounter that fatal
condition from a suddenly too deep _call stack_ , that is, too many calls
without a return. So, if the "stack" is a, say, finite resource, then the
programmer should know in the code how much of that resource is being used and
act accordingly.

For a preprocessor for C++, IIRC at one point the definition of C++ was in
terms of a preprocessor -- I was just thinking of the definition, that is, of
getting a more explicit definition of C++. I've always understood that C++
implementations have always, or nearly always, used ordinary _compilation_.
The issue is that at least at one time it seemed difficult to be precise about
C++ semantics, that is, what the code would do and how it would do it. Maybe
now C++ is beautifully documented.

~~~
aw1621107
> So, if the "stack" is a, say, finite resource, then the programmer should
> know in the code how much of that resource is being used and act
> accordingly.

And this is true, but IIRC statically determining stack bounds for arbitrary
programs is not an easy problem to solve, especially if you call into opaque
third-party libraries.

> For a preprocessor for C++, IIRC at one point the definition of C++ was in
> terms of a preprocessor

I wouldn't know about defining C++ in terms of transformations to C, and
searching for that is more difficult. I would guess that the abandonment of
the preprocessor approach to compilation would also have meant the abandonment
of defining C++ in terms of C, especially once C++ really started picking up
features.

> The issue is that at least at one time it seemed difficult to be precise
> about C++ semantics, that is, what the code would do and how it would do it.
> Maybe now C++ is beautifully documented.

C++ has had a formal specification since 1998, which might count as
documentation for you.

~~~
flatfinger
If the Standard were to make recursion an optional feature, many programs'
stack usage could be statically verified. Indeed, there are some not-quite-
conforming compilers which can statically verify stack usage--a feature which
for many purposes would be far more useful than support for recursion.

------
Uptrenda
What would you say to people who claim that writing "secure C code" is
impossible [not me but I'm curious what you all think]?

~~~
AaronBallman
I'd ask them if they really meant "impossible" or just "harder than I wish it
was".

I've typically found that the tradeoffs between security, performance, and
implementation efforts are usually more to blame for why writing secure C code
is a challenge. There are a ton of tools out there to help with writing secure
code (compiler diagnostics, secure coding standards, static analyzers,
fuzzers, sanitizers, etc), but you need to use all the tools at your disposal
(instead of only a single source of security) which adds implementation cost
and sometimes runtime overhead that needs to be balanced against shipping a
product.

This isn't to suggest that the language itself doesn't have sharp edges that
would be nice to smooth over, though!

------
bythckr
Hi Team "C",

I am a beginner level programmer and C is not one of the languages for which I
have even bothered to write a "hello world". That is my level.

As the people who "run" C, why do we need C? Forget the legacy systems. There
are fancy languages like Go, Rust, Elixir, Python and millions of others, and
of course the "offspring" like C++ & C#.

What was the use case that C was designed for (I have read from sources like
Wikipedia, would love to hear straight from source)? In 2020, how relevant is
C? If someone is going to write a system/application today, why consider C? Do
you think, C will be relevant in 5 yrs (I know 1 yr in computing is like 10
yrs for humans)? With all your combined experience in computing over the years
and as the members of a team that is guiding a valuable thing like "C". What
is your advice/wisdom/thought for us?

------
wcarey
I'm teaching C to high schoolers as their first language, which is quite the
adventure. Do you have any good advice or resources on how to introduce the
way C treats the function stack and heap allocated memory? Most of my students
struggle (naturally) with making sense of function scoped identifiers and
pass-by-value semantics.
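
For readers following along, the distinctions in question fit in a few
lines. This is only a classroom-style sketch, and all of the function names
are made up.

```c
#include <stdlib.h>

/* Receives a copy of the caller's int. Only the local copy changes, and it
   disappears when the function returns. */
void increment_copy(int n)
{
    n = n + 1;
    (void)n;  /* silence "set but not used" warnings */
}

/* Receives the address of the caller's int and changes it in place. */
void increment_in_place(int *n)
{
    *n = *n + 1;
}

/* Returns 11: the first call has no visible effect, the second one does. */
int demo_pass_by_value(void)
{
    int counter = 10;              /* lives in this function's stack frame */
    increment_copy(counter);       /* counter is still 10 */
    increment_in_place(&counter);  /* counter is now 11 */
    return counter;
}

/* Heap memory outlives the function that allocated it. The caller owns the
   result and must free() it. Returns NULL if allocation fails. */
int *make_counter_on_heap(int start)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = start;
    return p;
}
```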

~~~
pascal_cuoq
This service has been designed to try out small self-contained C examples
online (in a manner reminiscent of Compiler Explorer):

[https://taas.trust-in-soft.com/tsnippet/](https://taas.trust-in-
soft.com/tsnippet/)

One advantage is that it identifies a LOT of undefined behaviors during
execution for which traditional compilation and execution only give puzzling
results.

One drawback is that some of the undefined behaviors it identifies are
obscure, and for others the message may be unusual. For instance, using a
standard function without including the appropriate header may result in a
warning about the mismatch between the type in the header and the type of the
arguments the standard function was applied to after arguments promotions.

Overall, you may still find it useful for teaching.

~~~
wcarey
Thanks! Definitely an interesting tool. Two of my students are fascinated by
the idea of undefined behavior right now (having run into it in practice; the
idea that off-by-one errors sometimes crash their program and sometimes behave
"normally" is really odd to them), so I'll point them at this to play with.

------
Tronic2
char effectively behaves as a signed type, making it unsuitable for binary
operations (e.g. UTF-8 manipulation). I/O functions deal with char pointers,
so using unsigned type like uint8_t requires casting back and forth. Is there
any way out of this problem, and am I already breaking the aliasing rules with
that cast?

~~~
emilfihlman
There are no aliasing differences between uint8_t and char as far as I know.

~~~
hsivonen
In practice not. In theory, it’s implementation-defined whether there are
differences.

~~~
emilfihlman
At least from what I've heard that's because stdint values are optional.

6.2.5p17 The three types char, signed char, and unsigned char are collectively
called the character types. The implementation shall define char to have the
same range, representation, and behavior as either signed char or unsigned
char. 48)

and

5.2.4.2.1 says that width of char, signed char and unsigned char are the same
(8).

~~~
radford-neal
I don't think it's anything to do with uint8_t being optional. It's because a
char might have more than 8 bits.
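
Both concerns (the signedness of char and its width) can be handled without
changing the pointer type: check `CHAR_BIT` once and convert each byte to
`unsigned char` before doing bit operations. A sketch with hypothetical names
`is_utf8_continuation` and `utf8_code_points`:

```c
#include <limits.h>
#include <stddef.h>

#if CHAR_BIT != 8
#error "this sketch assumes 8-bit bytes"
#endif

/* True if c is a UTF-8 continuation byte (bit pattern 10xxxxxx). The
   conversion to unsigned char makes the test independent of whether plain
   char is signed. */
static int is_utf8_continuation(char c)
{
    return ((unsigned char)c & 0xC0u) == 0x80u;
}

/* Counts code points in a NUL-terminated string that is assumed to be valid
   UTF-8, by counting the bytes that are not continuation bytes. */
size_t utf8_code_points(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (!is_utf8_continuation(*s))
            count++;
    }
    return count;
}
```

The buffers stay `char *`, so they can still be handed to the I/O functions
without casts.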

~~~
flatfinger
A conforming implementation could extend the language with an 8-bit type
__nonaliasingbyte which has no special aliasing privileges, and define uint8_t
as being synonymous with that type.

On the other hand, the Standard should never have given character types
special aliasing rules to begin with. Such rules would have been unnecessary
if the Standard had noted that an access to an lvalue which is freshly visibly
derived from another _is_ an access to the lvalue from which it is derived.
The question of whether a compiler recognizes a particular lvalue as "freshly
visibly derived" from another is a Quality of Implementation issue outside the
Standard's jurisdiction.

------
flatfinger
Rather than trying to come up with "compromise aliasing rules", the Standard
needs to recognize that different tasks require different features, and
allowing all possible optimization opportunities that would be useful for some
tasks would make an implementation totally unsuitable for others.

I would suggest that the Standard define directives to demand three modes,
with the proviso that a compiler may reject code which demands a mode it
cannot accommodate:

1\. clang/gcc mode, which would be adjusted to match the way clang and gcc
actually behave, as well as anything they want to do but their interpretation
of the Standard wouldn't allow.

2\. precise mode, which behaves as though all loads and stores of objects
whose address are taken behave according to a precise memory-based abstraction

3\. sequence-based mode, which would allow compilers to hoist, defer,
consolidate, and eliminate loads and stores in cases where they honor data
dependencies that are visible in the code sequence, but would require that
compilers recognize visible dependencies which clang and gcc presently ignore,
and would also require that the definition of "based on" used by "restrict"
recognize that any pointer formed by adding or subtracting an integer from
another pointer be recognized as "at least potentially based on" the former,
even in corner cases where clang and gcc would ignore that.

Recognizing mode #1 would allow clang and gcc to keep using their
aliasing logic with programs that can tolerate it. Mode #2 would ensure that
all programs that have trouble with that logic could have defined behavior by
adding a directive demanding it. Mode #3 would allow most of the same useful
optimizations as mode #1, but work with a wide range of programs that would
presently require `-fno-strict-aliasing`.

If one recognizes the need for different modes, the effort required to
describe all three modes would be tractable, compared to the obviously
intractable problem of reaching consensus on a single mode that would need
to serve all purposes.
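
For context, here is a sketch of type punning that has defined behavior regardless of which aliasing mode a compiler applies: copy the bytes with `memcpy` instead of reading through a cast pointer. The helper name `float_bits` is hypothetical, and the sketch assumes `float` is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "this sketch assumes a 32-bit float");

/* Hypothetical helper: return the object representation of a float.
 * memcpy copies bytes, so no object is accessed through an lvalue of an
 * incompatible type. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Code written as `*(uint32_t *)&f` is the kind of program that currently tends to need `-fno-strict-aliasing`.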

------
cyber1
How closely does the C Standard Committee work with Linux kernel developers?
Does Linux kernel development influence the C standard?

~~~
AaronBallman
There's not an official collaboration between the committee and the kernel
developers (that I'm aware of), but we do have people on the committee who
need to support Linux kernel development (such as GCC maintainers), so there
is some level of indirect influence there.

------
stwcx
One feature of C which I do not use often is enums. Support for constants
beyond the range of an int is not portable. I also try to avoid putting
enums inside structs, because there is no portable way to enforce the size or
the alignment of the enum's base type.

Will this be addressed in future revisions of the C standard?
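
The workaround I mean looks roughly like this sketch: store the value in a fixed-width integer field and convert at the edges. All names here are illustrative, and the `static_assert` documents an assumption about padding rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative enum and struct; the names are hypothetical. */
enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };

struct pixel_tag {
    uint8_t color;   /* holds an enum color value in exactly 8 bits */
    uint8_t alpha;
};

/* Assumption: no padding is inserted between or after the two bytes. */
static_assert(sizeof(struct pixel_tag) == 2, "unexpected padding");

static inline void pixel_set_color(struct pixel_tag *p, enum color c)
{
    p->color = (uint8_t)c;
}

static inline enum color pixel_get_color(const struct pixel_tag *p)
{
    return (enum color)p->color;
}
```

The cost is that the field loses its enum type, so the compiler no longer checks assignments to it.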

------
Lucasoato
Is there any new programming language that you particularly love? Do you like
the way programming is evolving?

~~~
pascal_cuoq
As a member of the development team for a C static analyzer, I use OCaml,
which is also my favorite programming language, but that is because I'm from
the generation in which it was the new thing (I learnt it when it had the same
level of maturity as Rust, at a time when Rust didn't exist). It helps that
it's perfect for writing compilers and static analyzers.

There are a lot of problems that seem a good match for Rust, and Rust is first
in my list of programming languages I will never find the time to learn but
wish I could.

~~~
artursapek
Why won't you ever find time? It should only take a good 20 hours of reading
and playing with code before you start to grok it.

~~~
rseacord
I spent the early part of my career bragging about how many programming
languages I knew, and the later part of my career complaining about how I
don't know any of them well enough.

~~~
artursapek
I certainly wouldn't go for quantity there, but if you really want to learn
Rust you should. It brings some groundbreaking new ideas to programming and is
more than "just another language".

------
gautamcgoel
Curious what the committee members think of the new competitors to C, e.g. Go,
Rust, and Zig. Any comments?

~~~
VWWHFSfQ
Go isn't a competitor to C.

~~~
pjmlp
F-Secure apparently thinks otherwise,

[https://www.f-secure.com/en/consulting/foundry/usb-armory](https://www.f-secure.com/en/consulting/foundry/usb-armory)

As does Google,

[https://github.com/google/gvisor](https://github.com/google/gvisor)

[https://github.com/google/gapid](https://github.com/google/gapid)

Naturally if one is talking about specific use cases like IoT with a couple
of KBs, MISRA-C, or UNIX kernels, then yes Go is not a competitor.

------
RandNOx
\- Which differences between the C abstract machine and actual modern
CPUs/hardware have proven most difficult to deal with in the language?

\- Are you planning any addition regarding modeling of how modern CPUs work
(e.g. pipelines, branches, speculative execution, cache lines, etc)?

PS: Thank you for doing this!

~~~
AaronBallman
> \- Which differences between the C abstract machine and actual modern
> CPUs/hardware have proven most difficult to deal with in the language?

For me, I think it's 'volatile' because, by its nature, you can't describe
what it means in the abstract machine very well. For instance, consider a
proposal to add something like a "secure clear" function for clearing out
sensitive data. The natural inclination is to pretend that data is volatile so
the optimizer won't dead-code strip your secure clear function call, but that
leaves questions about things like cache lines, distributed memory, etc.
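
To make that concrete, a sketch of the "pretend it's volatile" approach follows. The name `secure_clear` is hypothetical, and whether stores through a volatile-qualified pointer to a non-volatile object must be preserved is itself debated, so treat this as an illustration of the idea rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: zero a buffer through a volatile-qualified pointer
 * so the optimizer is discouraged from discarding the stores as dead.
 * This says nothing about copies in registers, caches, or other memory. */
void secure_clear(void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    volatile unsigned char *p = buf;
    while (len--)
        *p++ = 0;
}
```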

> \- Are you planning any addition regarding modeling of how modern CPUs work
> (e.g. pipelines, branches, speculative execution, cache lines, etc)?

Maybe? ;-) We tend to talk about features at a higher level of abstraction
than the hardware because hardware changes at such a rapid pace compared to
the standards process. So we largely leave hardware-specific considerations as
a matter of QoI for implementers.

However, that doesn't mean we wouldn't consider proposals for more concrete
things like a defensive attribute to help mitigate speculative execution
attacks.

------
shric
Is there a rule that any new proposals must already be a feature in an
existing major implementation?

~~~
AaronBallman
Yes, the C2x charter has this requirement:
[http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2086.htm](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2086.htm)

~~~
shric
Thanks, so from "Only those features that have a history and are in common use
by a commercial implementation should be considered", this precludes stuff
that may only exist in clang, gcc, glibc, etc.? If so, why?

~~~
AaronBallman
I wouldn't read into "commercial" there, I think we meant "production-quality"
instead. (We should fix that!)

Basically, we prefer seeing features that real users have used as opposed to
an experimental branch of a compiler that doesn't have usage experience.
Knowing it can be implemented is one thing, but knowing users want to use it
is more compelling.

------
jpizza
Hello,

First off thank you so much for taking the time to answer questions.

As a new programmer starting with C, I am trying to learn how to go from
beginner to intermediate. Any recommendations of projects to help learn C?

It is difficult for me to find projects that I see as "valuable", for lack
of a better term.

Thank you!

~~~
DougGwyn
One possibility is to modify some existing program to include an additional
new feature. You should soon develop a sense for what works well versus what
causes problems.

------
quelsolaar
A few proposals:

Why not mandate a warning every time the compiler detects and makes use of UB?
It would solve SO many issues. If you are looking to improve security of C
programs, then letting the user know what the compiler does should be number
one.

Converting as many UBs as possible to platform-specific behavior would also
be a big help.

I would love to see native vector types. It's time. Vector types are now more
common in hardware than float was when it was included in the C spec. Time to
make it a native type. Hoping the compiler does the vectorization for you is
not good enough.

Allow for more than one break.

    for(i = 0; i < n; i++)
        for(j = 0; j < n; j++)
            if(array[i][j] == x)
                break break;

is equal to:

    for(i = 0; i < n; i++)
        for(j = 0; j < n; j++)
            if(array[i][j] == x)
                goto found;
    found :
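
For reference, a compilable sketch of the `goto` form as it has to be written today. The function name `find_value` and the fixed size `N` are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { N = 3 };   /* illustrative array size */

/* Hypothetical helper: search a 2D array and leave both loops with a
 * single goto once the value is found. */
bool find_value(int array[N][N], int x, size_t *row, size_t *col)
{
    size_t i, j;
    assert(row != NULL && col != NULL);
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            if (array[i][j] == x)
                goto found;
    return false;
found:
    *row = i;
    *col = j;
    return true;
}
```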

~~~
clarry
> Why not mandate a warning every time the compiler detects and makes use of
> UB? It would solve SO many issues.

Because that's hardly ever what happens, except when it actually does, and
compilers do an increasingly good job of issuing diagnostics in that case. If
you actually mandated it, no compiler today would come close to being
standards compliant. This comes close to making the language unimplementable.

The most common issue with UB and optimizations is not that "compiler detects
UB and does something with it," it's that compiler analyzes and optimizes code
_with the assumption that UB doesn't actually happen._ It doesn't know
whether it does (and in general, it is impossible to tell whether it would
happen -- it's something that might or might not happen at run time, and
proving it one way or another amounts to solving the halting problem), it just
assumes it doesn't.

And if one mandated compilers to report every time they make an optimization
that is valid under the assumption that the program is well behaved, then you
would never finish reading compiler output. Or you would turn off
optimizations.

~~~
quelsolaar
They need to do better than silently removing NULL checks. You can read all
about it in Linus's rants. Every time the compiler breaks things, they blame
the C standard for letting them do whatever. That's what's wrong with C today.
The C standard hasn't put its foot down.

~~~
clarry
I want my compiler to remove redundant checks (without any noise), and that is
why I pass it an optimization flag. If you don't want such optimizations, then
maybe you should not ask the compiler to make them.

~~~
quelsolaar
This attitude is terrible! It's an attitude that says that unless you know
exactly every pitfall in the language by heart you have no place writing
code. I guess you don't use a debugger either because you never write bugs,
right? And you think that every piece of software that helps the user is for
noobs, right?

There is an endless list of bugs that have been produced by very competent C
programmers, because the compiler has silently removed things for some very
shaky reasons.

~~~
clarry
Huh? I just want performant code. That's why I write C, and that's why I use
an optimizing compiler, and that's why I ask my compiler to optimize.

I also want to write code that is reasonably generic. Thus, it will have
checks and branches that cover important corner cases; they are required for
completeness and correctness. But very often, all of these checks turn out to
be redundant in a specific context, and an optimizing compiler can figure it
out, and eliminate these checks for me.

So I don't manually need to go and write two or three versions of each
function like do_foo and assume_x_is_not_null_and_do_foo and
assume_y_is_less_than_int_max_minus_sizeof_z_and_do_foo and make damn sure not
to call the wrong one.

I just write one version, with the right checks in place, and if after macro
expansion, inlining, range analysis, common subexpression elimination, and
other inference from context, with C's semantics at hand, the compiler can
figure out that some of these checks are redundant, then it will optimize them
out.

I ask for it, and I'm glad compiler developers deliver it. You don't need to
ask for it. Just turn off these optimizations (or, rather, don't enable them)
if you prefer slow and redundant code.

~~~
spc476
Why can't the following be a warning?

    
    
        int foo(bar *x)
        {
          x->blah = 0;
          if (x == NULL) ... 
          ...
        }
    

And produce something like "NULL check removed---pointer used before check"?

~~~
clarry
In theory? No reason.

In practice it's a special case of a more widely applicable optimization where
you actually do want to remove redundant checks. So someone has to go out of
their way to figure out a rule that makes the compiler warn but only in cases
where a human reader finds the optimization surprising and undesirable. It's a
fuzzy thing and can easily lead to lots of false positives and noise (and more
whining because it didn't warn in a situation that someone considered
surprising).

I think that kind of logic can easily become a support & maintenance
nightmare, so I'm not surprised that compiler developers take their time and
are conservative when it comes to adding such things. I would probably just
ask you to either stop dereferencing NULL pointers, or turn off the
optimization if you want to dereference NULL pointers and eat your cake too.

------
magicbanana
Is there a chance to ever see C++-template-like features appear in C?

For instance, a lot of redundant code (or ugly macro business) could be neatly
replaced by function templates. Even just template functions with only POD
values allowed would be a great readability improvement.

~~~
MiKom
It's already there. It's called C++ templates

------
emilfihlman
Will you ever add / have you considered adding sane formatting options for
fixed length variables in printf? Say %u32 or %s64 ?

Have you considered adding access to structure members by index or by string
name? Have you considered dynamic structures?

~~~
rmind
Just FYI -- there are macros for the fixed-length types, e.g.:

    
    
        printf("U32: %" PRIu32 ", U64: %" PRId64, (uint32_t)1, (int64_t)2);
    

Perhaps not as handy as %u32 or %s64, but it's here.

~~~
emilfihlman
Yeah, and the issue is exactly with those macros. They make writing code
really damn annoying, and relying on C string literal concatenation
breaks the flow quite a lot.

~~~
_kst_
Which is why I usually convert to intmax_t or uintmax_t, or to some type that
I know is wide enough:

    
    
        uint64_t foo = ...;
        printf("foo = %ju\n", (uintmax_t)foo);
        /* OR */
        printf("foo = %llu\n", (unsigned long long)foo);
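
A self-contained sketch comparing the cast approach with the `<inttypes.h>` macro. Both helper names are hypothetical, and `snprintf` is used instead of `printf` only so the results can be inspected.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: format via a cast to uintmax_t and the j modifier. */
int format_u64_cast(char *buf, size_t n, uint64_t v)
{
    assert(buf != NULL);
    return snprintf(buf, n, "%ju", (uintmax_t)v);
}

/* Hypothetical helper: format via the PRIu64 macro, which expands to the
 * right conversion specifier for uint64_t on the current platform. */
int format_u64_macro(char *buf, size_t n, uint64_t v)
{
    assert(buf != NULL);
    return snprintf(buf, n, "%" PRIu64, v);
}
```

Both produce the same text; the cast version trades a little noise at the call site for a format string that reads as one literal.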

------
wpietri
As experts, where do you see C going? In particular, given the many languages
now out there built on decades of learnings from C, where will C have unique
strengths? What projects starting today and hoping to run for 20 years should
definitely pick C?

~~~
rseacord
I don't really see C going anywhere. It's not going away, and it's not going
to evolve into Java. It's going to remain especially useful for memory
constrained and performance critical applications such as IoT and embedded.

~~~
wpietri
That sounds reasonable, but the resource-constrained space seems to me to be
an ever-shrinking share of the field. So is it fair to say you see C becoming
a specialist niche language going forward?

------
emilfihlman
Thank you for taking time to take questions!

Have you ever considered or will you consider deprecating char, int, long,
(s)size_t, float, double and etc in favour of specific length types?

Will you ever add / have you considered adding [su]\d+ and f\d+ as synonyms
for those mentioned stdint.h?

Since char is signed on most platforms, arm eabi being an exception and even
there it's really just a matter of compile time flags, will you ever just drop
char from being able to be either and just say it's signed, as int is also
signed?

Will you ever define / have you considered defining signed overflow behaviour?

~~~
rseacord
I don't think we'll ever deprecate char, int, long, float, double, or size_t.
ssize_t is not part of the C Standard, and hopefully never will be as it is a
bit of an abomination. The main driver behind the evolution of the C Standard
is not to break existing code written in C, because the world largely runs on
C programs.

C does provide fixed width types like uint8_t, uint16_t, uint32_t, and
uint64_t. These are optional types because they can't be implemented on
implementations that don't have the appropriate word sizes. We also have
required types such as

uint_least8_t, uint_least16_t, uint_least32_t, and uint_least64_t.
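
As a sketch of how the required types can stand in for the optional ones: do the arithmetic in a least-width type and mask to the width you want. The helper names below are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

static_assert(sizeof(uint_least16_t) * CHAR_BIT >= 16,
              "uint_least16_t holds at least 16 bits");

/* Hypothetical helper: addition modulo 2^16 that does not depend on an
 * exact-width uint16_t being available. */
uint_least16_t add_mod_2_16(uint_least16_t a, uint_least16_t b)
{
    uint_least32_t sum = (uint_least32_t)a + (uint_least32_t)b;
    return (uint_least16_t)(sum & 0xFFFFu);
}

/* Hypothetical helper: report whether the optional uint16_t exists here.
 * <stdint.h> defines UINT16_MAX only when it provides the type. */
int has_exact_u16(void)
{
#ifdef UINT16_MAX
    return 1;
#else
    return 0;
#endif
}
```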

~~~
tropo
Those types should not be optional. CHAR_BIT needs to be 8. It is clearly
possible to implement the types even on a 6502 or Alpha. From the early days
of pre-ANSI C, the language supported types for which the hardware did not
have appropriate word sizes. There was a 32-bit long on the 16-bit PDP-11
hardware.

I would go beyond that, requiring all sizes that are a multiple of 8 bits from
8-bit through 512-bit. This better supports cryptographic keys and vector
registers.

~~~
saagarjha
> CHAR_BIT needs to be 8.

Why?

~~~
souprock
Everything breaks if it isn't.

I was on an OS development team in the 1990s. We were using the SHARC DSP,
which was naturally a word-addressed chip. Endianness didn't exist in
hardware, since everything was whatever size (32, 40, or 48 bits) you had on
the other end of the bus. Adding 1 to a hardware pointer would move by 1 bus
width. The chip vendor thought that CHAR_BIT could be 32 and sizeof(long)
could be 1.

We couldn't ship it that way. Customers wanted to run real-world source code
and they wanted to employ normal software developers. We hacked up the
compiler to rotate data addresses by 2 bits so that we could make CHAR_BIT
equal to 8.

That was the 1990s, with an audience of embedded RTOS developers who were
willing to put up with almost anything for performance. People are even less
forgiving today. If strangely sized char couldn't be a viable product back in
the 1990s, it has no chance today. It's dead. CHAR_BIT is 8 and will forever
be so.

~~~
emilfihlman
This was a really interesting and enlightening comment and a small story!
Thank you!

------
kazinator
Here is a library suggestion: a "m" mode for fopen.

"m" is the same as "w", but does not truncate the file. In POSIX terms, it
doesn't add O_TRUNC to the flags.

There is "r+", of course; but "r+" requires that the file exists already. In
POSIX terms, "r+" does not include the O_CREAT flag.

fopen("foo", "m") creates the file if it does not exist, and opens it for
writing. The stream is positioned at the beginning of the file without
truncating it.
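
The behavior described here (create if missing, never truncate, start at offset zero) can be approximated with the existing modes. A sketch, with the hypothetical name `fopen_no_truncate`; note that the two steps are not atomic and that the resulting stream is also readable, unlike a true write-only mode:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper approximating the proposed "m" mode. */
FILE *fopen_no_truncate(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "a");   /* creates the file if missing, keeps contents */
    if (f == NULL)
        return NULL;
    if (fclose(f) != 0)
        return NULL;
    return fopen(path, "r+");     /* positioned at the start, no truncation */
}
```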

We can sort of emulate it with fopen("foo", "a"), then fclose, then open with
"r+".

------
0xDEEPFAC
Dear god, is the precedence of the "&" operator ever going to be fixed?

~~~
rseacord
I can't imagine it will ever be changed, since this would be a breaking change
to the language.

~~~
0xDEEPFAC
I disagree that this would be a "breaking" change, as many people have already
resorted to using extra () and such a change might actually "fix" broken
code which makes the reasonable assumption that & binds more tightly than
==.

[https://ericlippert.com/2020/02/27/hundred-year-
mistakes/](https://ericlippert.com/2020/02/27/hundred-year-mistakes/)

int x = 0, y = 1, z = 0;

int r = (x & y) == z; // 1

int s = x & (y == z); // 0

int t = x & y == z; // 0 UGH

~~~
DougGwyn
If you're using parentheses, as has been recommended for decades, there is no
problem. Otherwise, it is likely that such a change would adversely impact
previously working code. There just isn't a pressing need to change it.

~~~
0xDEEPFAC
Besides the fact that it's unintuitive and could lead to low-level or
hard-to-find bugs?

It seems to me that C would benefit greatly from ironing out its many
inconsistencies, which is exactly the kind of thing people expect in new
revisions of the language.

Also, I don't see how it would impact previous working code when compilers
already do things like allow selections between versions of languages a la
C99, C2x, etc. Users could just avoid the new version if they don't feel like
changing.

~~~
DougGwyn
I don't think most users of C want things changing underfoot. Keeping track of
all the version combinations is infeasible, especially when you consider that
an app and its library packages are likely to have been developed and tested
for a variety of environments. To the extent that existing correct code has to
be scanned and revised when a new compiler release comes out, one of the
primary goals of standardization has failed.

~~~
0xDEEPFAC
I disagree with your view of standardization - as restricting changes to be
additions to the runtime seems pointless as users could easily use other
(often more optimized) libraries.

But, I do see the benefit of having a language "frozen in time" which never
really changes and can be mastered painlessly without having to refresh on new
versions. Perhaps C is special/sacred in this regard.

------
nchelluri
Hello, just a quick note; I wanted to buy the book so I went to the website
and when I picked my country as Canada it started giving me a strange list of
provinces (definitely not Canadian) so I abandoned the process for now.

~~~
billpollock
I've asked our Operations Manager to look into this issue. Thanks for bringing
this to our attention. We'll get it sorted out. Please email info@nostarch.com
so that they can help troubleshoot.

------
Tronic2
Why is the struct tm* returned by localtime() not thread-local like errno and
other similar variables are (at least in implementations)? Do you have any
plans to improve calendar support for practical uses?

~~~
pascal_cuoq
Both questions would get better answers if they were asked to a panel of
experts on POSIX (which could include members of the POSIX standardization
committee).

For the first one, I can attempt a guess: maybe it was feared that making the
result of localtime thread-local would break some programs? You could build
such a program on purpose, although I am not clear how frequently one would
write one by accident.

Anyway, localtime_r is the function that one should use if one is concerned by
thread-safety. A more likely answer is that no Unix implementation bothered to
fix localtime because the proper fix was for programs to call localtime_r.
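
A minimal sketch of the `localtime_r` pattern, assuming a POSIX system. The helper name `format_local_time` is made up; the point is that the `struct tm` lives in storage owned by the caller's thread rather than in a shared static object.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* localtime_r is POSIX, not ISO C */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format t as local time. Returns the number of
 * characters written, or 0 on failure. */
size_t format_local_time(time_t t, char *buf, size_t n)
{
    struct tm tm_buf;   /* per-call storage, so no shared state */
    assert(buf != NULL);
    if (localtime_r(&t, &tm_buf) == NULL)
        return 0;
    return strftime(buf, n, "%Y-%m-%d %H:%M:%S", &tm_buf);
}
```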

------
ellis0n
Hi, I've been a developer for 20 years and I love C. C is my second language
after assembler. Those were good days, with Turbo C, a 20 MB hard drive, and
an 8086, without IT marketing and viruses. I am working on a real-time
reverse debugger for a new programming platform. It can debug C code and
prevent NULL and memory exceptions. I created my own language based on C with
all keywords removed, and it works perfectly. I want to make a gcc backend
for my programming language so that all its features will be available to any
C program.

How can I find help for this?

------
hellofunk
I really like the relative simplicity of C compared to C++ and recently wrote
a project in C, but eventually rewrote it in C++ for just a few seemingly
trivial reasons that nonetheless were important time savers. I'd love to know
if the C standard, as can run on GPUs also, will ever evolve to offer:

1) namespaces, so function names don't need to be 30 characters to avoid
naming collision

2) guaranteed copy elision or RVO -- provides greater confidence for common
idioms and expressivity compared to passing out parameters

------
gnachman
Since 1999, a lot of undefined behavior has been added to the language to
improve compilers’ ability to optimize. For example, pointer aliasing rules.
How have you measured the benefit?

------
AvImd
What is your vision of C, its future and its past? What was it supposed to
become and did it become that thing? What is it now? What will it evolve into
in the near and far future?

~~~
msebor
The C charter and the C committee's job is to standardize existing practice.
That means codifying features that emerge as successful in multiple
implementations (compilers or libraries), and that are in the overall spirit
of the language.

------
jfim
Out of curiosity, if there was anything you could change about C, and not have
to worry about breaking existing code or any other practical concern, what
would it be, and why?

------
DougGwyn
Back from lunch. Any West Coasters?

------
BeeOnRope
When deciding on standardized behavior for C operations or data representation
that may favor some hardware over others [1], who argues the side of the
various hardware vendors, if they have no members on the standardization
committee?

Is it fair to assume that hardware-related decisions occur in an environment
where members who are sponsored by vendors argue their employers' case, rather
than a neutral one?

\---

[1] E.g., because some hardware's behavior may more naturally implement the
operation.

~~~
AaronBallman
> When deciding on standardized behavior for C operations or data
> representation that may favor some hardware over others [1], who argues the
> side of the various hardware vendors, if they have no members on the
> standardization committee?

The C committee has a number of implementation vendors on it (GCC, Clang, IBM,
Intel, sdcc, etc) and these folks do a good job of speaking up about the
hardware they have to support (and in some cases, they're also the hardware
vendor). If needed, we will also research hardware from vendors who have no
active representation on the committee, but this is usually for more broad
changes like "can we require 2's complement?".

> Is it fair to assume that hardware-related decisions occur in an environment
> where members who are sponsored by vendors argue their employers' case,
> rather than a neutral one?

In my experience, the committee members typically do a good job of
differentiating between "this is my opinion" and "this is my employer's
opinion" during discussions where that matters. However, at the end of the
day, each committee member is there representing some constituency (whether
it's themselves or their company) and votes their own conscience.

~~~
BeeOnRope
Thanks for your quick and honest answer.

------
troukistan
Is there a way to append/extend a MACRO value ?

For example, I have an arbitrary number of includes, each of which declares a
struct that needs to be listed later on.

    
    
      #define MOD_LIST // start with an empty list
    
      #include "mod/a.c"
      // MOD_LIST is: a,
    
      #include "mod/b.c"
      // MOD_LIST is: a,b,
      
      Module modules[] = {
        MOD_LIST
      };
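
As far as I know the preprocessor cannot redefine a macro in terms of its own previous value, so a true "append" is not available. One common workaround is an X-macro: keep the list in one place and expand it wherever it is needed. The sketch below assumes a hypothetical `Module` type with a single `name` field; unlike the layout above, each module has to be named in the central list rather than registering itself from its own include.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical module descriptor for the example. */
typedef struct Module { const char *name; } Module;

/* Central list: one X(...) entry per module. */
#define MOD_LIST(X) \
    X(a)            \
    X(b)

/* Expand the list into array initializers. */
#define MOD_ENTRY(id) { #id },
static const Module modules[] = { MOD_LIST(MOD_ENTRY) };
#undef MOD_ENTRY

enum { MODULE_COUNT = sizeof modules / sizeof modules[0] };
static_assert(MODULE_COUNT == 2, "two modules are listed above");
```

The same `MOD_LIST` can be expanded again with a different `X` to generate declarations, enum constants, or `#include`-free dispatch tables.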

------
ender1235
Hi I took an amazing course in college that focused heavily on C. Do you have
any recent examples of small side projects you’ve worked on using C?

~~~
DougGwyn
How about a Sudoku solver? Send me a request via e-mail.

~~~
dang
Doug, the email address in your account is private by default, but you can
make it public by putting it in the About field of your profile at
[https://news.ycombinator.com/user?id=DougGwyn](https://news.ycombinator.com/user?id=DougGwyn).

ender1235, if you don't see an email address there, email hn@ycombinator.com
and I'll put you in touch.

~~~
DougGwyn
Okay, check my About text. I'll soon remove it, to avoid getting a lot of
spam.

------
Koshkin
Why not keep C a simple little language with fast compile times and delegate
all "enhancements" (such as 'cleanup') to C++?

~~~
lioeters
From reading all their comments up to now, my feeling is that's exactly their
plan.

When asked, "Where do you think C is going?", one of them said, "I don't see
it going anywhere." I took that as a good thing, meaning, they're concerned
about backward compatibility, compiler performance, and only adding features
when there's a wide consensus in implementation - which is a high enough
hurdle that avoids the feature bloat of C++.

Overall, I felt the "conservatism" refreshing, to keep the language small.

On the other hand, there are several common feature requests I see in this
thread that probably will never be part of the language, since it moves slow
relative to other languages.

------
baybal2
Hello, I coded in C as a high schooler. Now, 16 years later, I have to code C
again semiprofessionally after a very long break.

Big question, how to start programming in C on a high professional level for
somebody self schooled in it? Is there a way to cut the corner, without having
to go through 10+ years trial and error to gain experience?

Anything for somebody ready to sit, study, and practice for a few hours a day?

~~~
Nemerie
There was a nice discussion recently
[https://news.ycombinator.com/item?id=22519876](https://news.ycombinator.com/item?id=22519876)

~~~
lioeters
I'm in a similar situation as the parent comment, wanting to re-learn C after
more than a decade (or two). Thanks for the link to a recent discussion! For
the parent, here are some of the recommended books:

Head First C - Griffiths and Griffiths

Expert C Programming: Deep C Secrets - Peter van der Linden

Modern C - Jens Gustedt

C Programming: A Modern Approach - K. N. King

21st Century C: C Tips from the New School - Ben Klemens

Understanding and Using C Pointers - Richard Reese

C Interfaces and Implementations: Techniques for Creating Reusable Software -
David R. Hanson

The Standard C Library - P. J. Plauger

------
smlckz
another proposal:

    
    
        _If, _Ifdef, _Ifndef

inside function macros

for example:

    
    
        #ifdef SOME_CONST
        #define WHATEVER(w, h, a, t, e, v, e, r) \
            ... common part ... \
            ... for SOME_CONST ... \
            ... common part continued ...
        #else
        #define WHATEVER(w, h, a, t, e, v, e, r) \
            ... common part ... \
            ... when SOME_CONST not defined ... \
            ... common part continued ...
        #endif
    

With _Ifdef, the above could be written like:

    
    
        #define WHATEVER(w, h, a, t, e, v, e, r) \
        ... common part ... \
        _Ifdef(SOME_CONST, \
            (... for SOME_CONST ...) , \
            (... when SOME_CONST is not defined ...)
        ) \
        ... common part continued ...
    

With these, one could also do:

    
    
        #define FACTORIAL(n) _If((n) == 0, 1, (n) * FACTORIAL((n) - 1))
        int f = FACTORIAL(6);
    

turns into: int f = (6) * (5) * (4) * (3) * (2) * (1) * 1;

That would be very useful, I think. It might help with code duplication in
function macros.

Maybe _Switch/_Case thereafter.

~~~
clktmr
Why not write it like this:

    
    
        #ifdef SOME_CONST
        #define HAS_SOME_CONST \
            ... for SOME_CONST ...
        #else
        #define HAS_SOME_CONST \
            ... when SOME_CONST not defined ...
        #endif
        
        #define WHATEVER(w, h, a, t, e, v, e, r) \
            ... common part ... \
            ... HAS_SOME_CONST ... \
            ... common part continued ...

------
polishdude20
What is your favorite language other than C and why?

~~~
pascal_cuoq
I answered a similar question in another thread:
[https://news.ycombinator.com/item?id=22866242](https://news.ycombinator.com/item?id=22866242)

------
rs23296008n1
Modern C language features:

\- Why no sized text strings?

\- Why is there no hash data type?

\- Where's the linked list?

\- Why no package management as part of ecosystem?

What is the modern rationale?

Caveats:

\- I'm not implying any need for object-orientation (OOP)

\- I'm fully aware I can write these myself and can access third party
libraries that have each laboriously implemented their own versions.

\- I'm interested in why these are not native C constructs in 2020. I
appreciate why not in 1980.

------
seamyb88
Thoughts on Gnome glib, gobject, vala etc?

I tend to use glib for my (academic) code for pretending C is a high-level
language. It also seems to make up for implementation-dependent functions in C
and many portability issues. Also, IMO, vala > C++.

My question is, really, are there any other tools for high-level C programming
and do you know of any disadvantages of the Gnome stack?

------
faehnrich
I've been waiting for a book on C from No Starch Press, so I'm really excited
for this one.

This might not be too deep a question on the C language in regards to this
book, but I've been wondering, why did you decide to have an eldritch horror
as the book's cover?

~~~
rseacord
It's a longish story, but people do seem to like the cover. We started
equating the idea of C == Sea, so we had some early drawings of the robot
riding various undersea creatures including a giant squid. I thought that
looked overly phallic, so I suggested the robot ride Cthulhu instead, an
unofficial mascot of NCC Group.

~~~
faehnrich
I like how Cthulhu is shown as kind of a guide for the robot.

The C==Sea brings to mind the book Expert C Programming: Deep C Secrets.

~~~
beardedwizard
Deep c secrets, a classic.

------
loeg
Has Annex K been axed yet, and if not, why not?

~~~
rseacord
It has not. The C Committee has taken two votes on this, and in each case, the
committee has been equally divided. Without a consensus to change the
standard, the status quo wins.

Sounds like you don't care for Annex K. What don't you like about it?

~~~
loeg
I think my complaints are summed up nicely in some of your coauthors' report:

[http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1967.htm](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1967.htm)

(1) runtime constraint handler callbacks are a terrible API.

(2) The additional boilerplate doesn't buy us anything — the user can still
specify the wrong size.

(3) The Annex invents a feature out of whole cloth, rather than standardizing
existing practices. There are no performant real-world implementations that
anyone uses. Microsoft's similar functionality is non-standard.

------
emilfihlman
Have you considered adding multiplexing capability to the standard? It would
be great to have a directly portable one.

~~~
DougGwyn
We would need a specific proposal and assurance that nearly all computers can
efficiently provide that service. It is more likely in the POSIX standard.

~~~
emilfihlman
Though it's interesting that threads were added to the standard. Perhaps they
filled a niche that wasn't already as well covered as multiplexing was by
select/poll/epoll/kqueue/etc., since the pthread API is perhaps harder to use.

~~~
DougGwyn
I thought it would be best to standardize just a single thread, which should
be the basic unit to be embedded in a good parallel-processing model. However,
others prevailed.

------
arunc
What do you think about D language's mode to work as a better C
alternative[0]? It seems to even do printf format validation. Can this be the
future of C?

[0] [https://dlang.org/spec/betterc.html](https://dlang.org/spec/betterc.html)

------
om42
Not particular to the C language, but what are your opinions on build systems,
particularly for the embedded space? There's a couple vendor specific embedded
IDEs and toolchains and having to glue together make/cmake files to support
all of them can be a pain.

~~~
msebor
Robert's upcoming book has a survey of a few popular IDEs.

------
begriffs
What does the presence or absence of __STDC_ISO_10646__ indicate exactly? I
found this part of the C99 spec obscure.

For instance, the macOS clang environment does not define this symbol. Is
their implementation of wchar_t or <wctype.h> lacking some aspect of Unicode
support?

~~~
AaronBallman
If that macro is defined, then wchar_t is able to represent every character
from the Unicode required character set with the same value as the short code
for that character. Which version of Unicode is supported is determined by the
date value the macro expands to.

Clang defines that macro for some targets (like the Cloud ABI target), but not
others. I'm not certain why the macro is not defined for macOS though (it
might be worth a bug report to LLVM, as this could be a simple oversight).

~~~
begriffs
Would the following be a correct way to determine whether there's a problem?

* First call setlocale(LC_CTYPE, "en_US.UTF-8")

* Next feed the UTF-8 string representation of every Unicode codepoint one at a time to mbstowcs() and ensure that the output for each is a wchar_t string of length one

* If all input codepoints numerically match the output wchar_t UTF-32 code units, then the implementation is officially good, and should define __STDC_ISO_10646__?

~~~
AaronBallman
I think this is correct, assuming that locale is supported by the
implementation and wchar_t is wide enough, but I am by no means an expert on
character encodings.
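
The check described above could be sketched roughly as follows. This is only an illustration: `utf8_encode` and `wchar_matches_unicode` are hypothetical helper names, the `en_US.UTF-8` locale name is taken from the question and may not exist on a given system, and surrogates are skipped on the assumption that they have no valid UTF-8 encoding.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/* Encode one code point as UTF-8 into out (room for 5 bytes, NUL-terminated).
   Returns the number of bytes written, or 0 for surrogates and values
   above U+10FFFF. */
int utf8_encode(unsigned long cp, char *out)
{
    int n;
    assert(out != NULL);
    if (cp >= 0xD800UL && cp <= 0xDFFFUL) return 0;
    if (cp > 0x10FFFFUL) return 0;
    if (cp < 0x80UL) {
        out[0] = (char)cp;
        n = 1;
    } else if (cp < 0x800UL) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000UL) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        n = 3;
    } else {
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
        n = 4;
    }
    out[n] = '\0';
    return n;
}

/* Returns 1 if every encodable code point converts to a single wchar_t with
   the same numeric value, 0 if some code point does not, and -1 if the
   locale cannot be selected. U+0000 is skipped because it is the terminator. */
int wchar_matches_unicode(void)
{
    unsigned long cp;
    if (setlocale(LC_CTYPE, "en_US.UTF-8") == NULL) return -1;
    for (cp = 1; cp <= 0x10FFFFUL; cp++) {
        char mb[5];
        wchar_t wc[2];
        if (utf8_encode(cp, mb) == 0) continue;
        if (mbstowcs(wc, mb, 2) != 1) return 0;
        if ((unsigned long)wc[0] != cp) return 0;
    }
    return 1;
}
```

A result of 0 on a platform with a 16-bit wchar_t would be expected, since code points above U+FFFF cannot fit in one unit there.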

------
loeg
Have any of you looked at the CHERI hardware architecture and fat capability
pointers, broadly?

------
begriffs
Has there been a survey to determine what percentage of known compilers
support each C version, like C89, C99, C11? I've been sticking to C99 because
I assumed later versions won't be widely adopted for a long time to come. Is
this accurate?

~~~
DougGwyn
There is a Web page I saw a few days ago that does that, probably findable by
grepping Wikipedia. Unfortunately I forget its URL.

------
hedora
I frequently rely on reading and writing uninitialized struct padding in code
that compare-and-swaps the underlying struct representation with some (up to
128bit) integer.

I could use a union type, but that adds extra memory operations, and is
finicky.

Is there a better way?

------
iamed2
What's an example of a codebase where _Generic has had a notable positive
impact?

~~~
AaronBallman
Not necessarily a code base, but _Generic is what makes <tgmath.h>
implementable for the type-generic math functions.
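
As a rough illustration of that kind of dispatch, here is a minimal sketch of a type-generic macro built on `_Generic`. The `my_abs_*` functions and the `my_abs` macro are made-up stand-ins for a family such as fabsf/fabs/fabsl, not how any particular <tgmath.h> is written; real implementations may rely on compiler builtins instead.

```c
#include <assert.h>

/* Hypothetical stand-ins for a function family such as fabsf/fabs/fabsl. */
float       my_abs_f(float x)        { return x < 0 ? -x : x; }
double      my_abs_d(double x)       { return x < 0 ? -x : x; }
long double my_abs_ld(long double x) { return x < 0 ? -x : x; }

/* Pick a function from the static type of the argument; anything that is
   not float or long double (including integers) goes to the double version. */
#define my_abs(x) _Generic((x),        \
        float:       my_abs_f,         \
        long double: my_abs_ld,        \
        default:     my_abs_d)(x)

/* The result type follows the argument type. */
static_assert(sizeof(my_abs(1.0f)) == sizeof(float), "float in, float out");
```

With this, `my_abs(-2.5f)` calls the float version and `my_abs(-3)` converts the integer to double first.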

------
shivambw
What is the difference between C objects(#aka region of data storage in the
execution environment, the contents of which can represent value) and objects
in C++ in terms of representations and usage?

# from Chp2 of Effective C

------
kizer
How should I represent Unicode in memory?! UTF8? 16? 32bit integers? I keep
hearing pros and cons of all the previous three. Is there a consensus? In
cases where you don’t need a full blown Unicode lib.

------
jcranmer
Are there any plans to add support for multiple register return values to C?

~~~
DerekL
What are you asking for? Do you mean that if you return a small struct from a
function, the fields are placed in registers instead of memory, if they can
fit? This is up to the ABI, not the standard, to define, and some ABIs already
do that.
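
For illustration, a minimal sketch of returning two values through a small struct. The names `div_result` and `divide` are hypothetical; whether the two fields travel in registers depends on the ABI and compiler, and the source itself does not promise anything about that.

```c
#include <assert.h>

/* Two results bundled into one small struct. */
struct div_result { int quot; int rem; };

/* Returns both the quotient and the remainder by value. */
struct div_result divide(int num, int den)
{
    struct div_result r;
    assert(den != 0);
    r.quot = num / den;
    r.rem  = num % den;
    return r;
}
```

The standard library's div() and div_t follow the same shape.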

------
badrabbit
Hi, I have two questions.

1) Are there any plans or discussions on having a subset/extension of C that
is designed for formal verification? Much like SPARK with ADA.

2) Is there no plan to support GC? Even as an extension of C?

------
oscoder
A more chill question for you - What's your favourite string library?

------
Jahak
Tell me where I can get the C89 standard for free (pdf or other formats)

~~~
pascal_cuoq
The last time I needed it, archive.org had a link to a PDF of it.

I couldn't find that again in one minute, but here is the text version:
[http://web.archive.org/web/20030222051144/http://home.earthl...](http://web.archive.org/web/20030222051144/http://home.earthlink.net/~bobbitts/c89.txt)

~~~
Jahak
Thanks

------
mey
This is a subjective question. From the array of tools in your belt, when do
you personally/professionally reach for C, or maybe more interestingly, when
do you _not_ reach for C?

~~~
DougGwyn
Since I do almost all my software development in a Unix environment, usually I
check the toolbox to see if there is already a program that has nearly the
functionality I want, and if so then I cobble together a shell script.
Sometimes (as with the Sudoku solver) it will be necessary to build a new
component, and for that I usually use C since I am comfortable and experienced
with it. (Also, if coded in Standard C, odds are that I can install it on
whatever platform I need, with little or no adaptation.)

------
douglascorrea
I'm trying to learn C during this quarantine times. I'm looking for good
beginner-friendly opensource projects to learn from. Can you please suggest
some repositories to look into?

~~~
woodrowbarlow
(obviously, i'm not one of the panel members, just chiming in.)

if you're interested in looking at how C can be used in embedded realtime
operating systems, i recommend diving into:

[https://github.com/ARMmbed/littlefs](https://github.com/ARMmbed/littlefs)

(i'm not affiliated.)

it's a lean, logging flash filesystem implementation and i recommend it
because the research, rationales, documentation, organization, codebase, test
harness, and public API ergonomics all impressed me a lot. it was written for
the mbed OS, but it is so well designed that i could integrate it into any
realtime OS without too much trouble. and the documentation is thorough enough
that after skimming the wikipedia article for filesystems, and maybe an
article on how flash chips read and write data, you'll be able to work your
way through it. i learned a lot by reading through that repository.

------
7532yahoogmail
A ton of comments here talk about arrays and basic preconditions. See
frama-c.com. Even in C++, where encapsulation helps with class/function
contracts, unit testing is a must.

------
thesmileyone
What did you think of the Stuxnet code from your perspective? Was it clear who
made it from the start, and what its purpose was? (Iran or China vs India?).
Thanks.

------
axelf4
In C89 is there a portable way to figure out the alignment requirement for a
struct, to be able to, say, store it after the NUL terminator in the same
allocation as a C string?

~~~
DougGwyn
I'm not sure what your requirement is. Usually things work out if you're
careful not to assume any specific value for alignment etc. It may mean a few
unused bytes here and there, but keeping things simple and portable often pays
off.
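
One idiom that is often suggested for this in C89 is to measure the offset of the struct inside a probe struct that starts with a single char. The sketch below assumes a made-up `struct payload`; `align_probe`, `PAYLOAD_ALIGN`, `round_up` and `payload_offset` are hypothetical names, and the approach relies on the allocation itself coming from malloc so that its start is suitably aligned.

```c
#include <assert.h>
#include <stddef.h>

struct payload { double weight; int id; };   /* hypothetical struct to store */

/* The offset of p after a single char is a multiple of payload's alignment
   requirement (in practice, usually exactly that requirement). */
struct align_probe { char c; struct payload p; };
#define PAYLOAD_ALIGN offsetof(struct align_probe, p)

/* Round n up to the next multiple of align. */
size_t round_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* Offset at which a payload may be placed after a string of length len
   and its NUL terminator, counted from the start of the allocation. */
size_t payload_offset(size_t len)
{
    return round_up(len + 1, PAYLOAD_ALIGN);
}
```

The total allocation would then be `payload_offset(len) + sizeof(struct payload)` bytes, at the cost of a few unused bytes between the terminator and the struct.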

~~~
quelsolaar
Being able to know your alignments is VERY important for a lot of network
implementations. They are all defined by the ABIs, but it's very annoying that
the standard keeps thinking that alignment is unknowable, when in fact it's
impossible to implement an ABI without defining it. One of the reasons I stick
to C89.

~~~
DougGwyn
Note that the ABIs cover endianness as well as value range and/or object
widths. In general, one needs to have explicit marshaling and unmarshaling
functions to map from network octet array and C internal data representation.
Failure to get this right is (or used to be) a common bug for code developed
and tested on too few architectures.
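
A minimal sketch of what such explicit marshaling and unmarshaling functions might look like for a 32-bit value in network (big-endian) octet order. The names `put_u32_be` and `get_u32_be` are made up for illustration; the point is that only shifts and masks are used, so the host's endianness and alignment never enter into it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write v as four octets, most significant first. */
void put_u32_be(unsigned char *buf, uint32_t v)
{
    assert(buf != NULL);
    buf[0] = (unsigned char)((v >> 24) & 0xFF);
    buf[1] = (unsigned char)((v >> 16) & 0xFF);
    buf[2] = (unsigned char)((v >> 8) & 0xFF);
    buf[3] = (unsigned char)(v & 0xFF);
}

/* Read four octets, most significant first, back into a host value. */
uint32_t get_u32_be(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

The same code produces the same octets on little-endian and big-endian hosts, which is what casting a struct pointer onto a packet buffer does not give you.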

~~~
quelsolaar
Sure, it won't be portable between any architectures, but a lot of times you
know you will be on a little endian platform where types are aligned to their
sizeofs. That covers a lot of ground and the performance gains you get from
optimizing with this in mind is significant. There is value in C being able to
be portable, but there is also a huge value in being able to write non-
portable code that takes advantage of what you know about the platform. C
needs to acknowledge that that is a legitimate use case.

------
Javantea_
Do you think that static analysis is a valuable tool for security research? Do
you recommend static analysis software to a single developer with a limited
budget or an amateur?

~~~
msebor
Yes, both :) There are a few in public domain that might be helpful to
experiment with. Clang has had a static analyzer for a while and GCC 10 adds
one as well (and the maintainer is looking for help with implementing checkers
so that's a good way to gain experience with writing one).

------
oreally
About time someone advocated for code in lower level styles of programming.
Hope it goes well!

Anyway, here's some questions:

\- What kind of programs would you say C is a good fit for?

\- There is some catching up to do for C. Is there a roadmap for C
improvement, or even a recommendation of C++ things that fit somewhat in the
style/philosophy of C? For example, I'd recommend not using the C++ smart
pointers stuff, while still using C++ threads and lambdas.

Also, you should include programmers from other fields in your committee. Game
(engine) developers, HFT programmers are used to lower level styles of coding
and align with your perspective.

------
aray
When do you think MISRA will be updated to C11 or a more recent version of C?
Do you all have any influence on "Safety Critical C" standards?

~~~
AaronBallman
The MISRA committee is a separate organization from the C standards committee,
but there is overlap between the two groups and an official liaison process
for the committees to collaborate. So there's a bit of bidirectional influence
between the two groups.

I am not on the MISRA committee, but I believe they talk a bit about their
public roadmap in this video:
[https://vimeo.com/190304951](https://vimeo.com/190304951)

------
shivambw
What are your recommendations on going about learning the C language properly?
And How to go about learning all levels of abstraction of the language?

------
smlckz
I want this feature in the standard.

If there exists any memory block allocated using malloc() / calloc() /
realloc() which has not been free()'d, at the end of the program, they would
be free()'d automatically.

One can easily do it by keeping a linked list and using atexit(), but can
it be added to the standard?
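
A minimal sketch of that linked-list-plus-atexit() approach, for concreteness. All of the `tracked_*` names are hypothetical, the list is not thread-safe, and a request that makes malloc() return a null pointer (which malloc(0) is allowed to do) is simply reported as a failure.

```c
#include <assert.h>
#include <stdlib.h>

/* One list node per live allocation. */
struct track_node { void *ptr; struct track_node *next; };

static struct track_node *track_head = NULL;
static size_t track_count = 0;
static int track_registered = 0;

/* Free every block that is still tracked; registered with atexit(). */
void tracked_free_all(void)
{
    while (track_head != NULL) {
        struct track_node *n = track_head;
        track_head = n->next;
        free(n->ptr);
        free(n);
    }
    track_count = 0;
}

/* Like malloc(), but remembers the block so it can be freed at exit. */
void *tracked_malloc(size_t size)
{
    struct track_node *n;
    if (!track_registered) {
        if (atexit(tracked_free_all) != 0) return NULL;
        track_registered = 1;
    }
    n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->ptr = malloc(size);
    if (n->ptr == NULL) { free(n); return NULL; }
    n->next = track_head;
    track_head = n;
    track_count++;
    return n->ptr;
}

/* Like free(), but also removes the block from the list. */
void tracked_free(void *p)
{
    struct track_node **link = &track_head;
    struct track_node *n;
    if (p == NULL) return;
    while (*link != NULL && (*link)->ptr != p)
        link = &(*link)->next;
    assert(*link != NULL);          /* p must come from tracked_malloc() */
    n = *link;
    *link = n->next;
    free(n->ptr);
    free(n);
    track_count--;
}

/* Number of blocks currently tracked. */
size_t tracked_live(void) { return track_count; }
```

The linear search in tracked_free() keeps the sketch short; it is the kind of cost a real design would have to weigh.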

A general question, will anything, any feature, which is "easy" to implement
in pure C, like array knowing its own length or pascal strings, NOT be allowed
to be in C standard, even if it is widely used, maybe almost everywhere?

~~~
bch
The operating system handles this for you on process deletion. Lots of “one
shot” programs count on this (and (e.g.) file descriptors being automatically
closed).

~~~
smlckz
So, should I not care about that so much, and write programs without
free()'ing allocated memory?

If OSes do that, why not standardize that in C?

------
ativzzz
Other than these experts, what kind of companies do C developers work at? What
does the compensation look like compared to doing web development?

~~~
pascal_cuoq
I do not actually develop in C (other than short examples to feed the C
analyzer that I work on, which is not written in C) but our customers do
employ plenty of C developers. These customers are developing embedded
software that reads inputs from sensors, processes them, and sends the final
results of the computations to actuators, in fields such as IoT, aeronautics,
rail, space, nuclear energy production, autonomous transportation, …

The list is very much biased by the sort of analyzer we provide. There are
certainly plenty of non-embedded codebases in C and of developers paid to
maintain and extend them, it's just that we currently do not work with them as
much.

I do not know about whether the compensation is better or worse than for other
technologies.

------
kazinator
CAN I HAZ UNNAMED UNUSED PARAM

    
    
       void callback(int x, void *) // VOID STAR UNUZED, SO ANON
       {
          foo(x);
       }

~~~
DougGwyn
Why is there a second argument which is not used?

~~~
tropo
It could be for function pointer type compatibility.

There is an array of pointers, or there is a callback interface, or something
like that. The type is set in stone.
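
A small sketch of that situation, using the common workaround of naming the parameter and discarding it with a cast to void. The `callback_fn` type and the function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed callback interface that every handler must match. */
typedef void (*callback_fn)(int x, void *ctx);

static int last_seen = 0;

/* Needs only x; ctx is named but explicitly discarded, which also
   quiets unused-parameter warnings on common compilers. */
void record_value(int x, void *ctx)
{
    (void)ctx;
    last_seen = x;
}

/* Uses both parameters: ctx points at a running total. */
void add_to_total(int x, void *ctx)
{
    assert(ctx != NULL);
    *(int *)ctx += x;
}

/* Caller that only knows the shared function pointer type. */
void dispatch(callback_fn fn, int x, void *ctx) { fn(x, ctx); }

int last_recorded(void) { return last_seen; }
```

Both handlers fit the same pointer type even though only one of them looks at `ctx`.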

------
tridentboy
Know it's not exactly related to what you do. But do you have some
recommendations of books/online classes to learn C?

------
ebg13
How accurate, relevant, and useful today is
[http://c-faq.com](http://c-faq.com) ?

~~~
_kst_
It's a bit dated (it hasn't been updated since 2005), but apart from that I'll
say that _parts_ of it are excellent.

In particular, section 6 is the best resource I know of for explaining the
often counterintuitive relationship between arrays and pointers.

------
asimpletune
Why is shifting by a negative amount undefined?

~~~
kps
Because people want `c = a << b` to compile into `shl c, a, b` and C89 made
the giant mistake of calling it ‘undefined’ instead of ‘implementation-
defined, possibly fatal’.
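
Code that cannot rule out a negative or oversized shift count has to pick a meaning itself. The sketch below shows one possible choice (a negative count shifts the other way, and counts at or beyond the width give 0); `shift_signed_amount` is a hypothetical helper, and nothing in the standard says this is the "right" semantics.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "uint32_t is 32 bits wide");

/* Shift a left by n bits, or right by -n bits when n is negative.
   Counts of 32 or more in either direction yield 0 instead of
   reaching the undefined cases of << and >>. */
uint32_t shift_signed_amount(uint32_t a, int n)
{
    const int width = 32;
    if (n <= -width || n >= width) return 0;
    return n >= 0 ? (uint32_t)(a << n) : (uint32_t)(a >> -n);
}
```

The range check runs before `-n` is computed, so even INT_MIN is handled without overflow.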

------
kerkeslager
How do modern C developers approach writing secure network code in C? Are
there any tools for verifying network code?

------
freemind
1\. What is the easiest way to build cross-platform (native) GUI with C?

2\. Why is it harder to find lgpl licenced libraries to access windows
directories over network like jcifs pysmb (and libraries overall) when needed
to close most part of software source to sell small softwares to businesses?

3\. If you needed to combo C with another language to do everything you need
to do forever and never look back what other language would that be?

------
tayistay
To what extent does compiler complexity factor into your thinking about the
evolution of C?

Thanks for this!

~~~
AaronBallman
When the committee considers proposals, we do consider the implementation
burden of the proposal as part of the feature. If parts of the proposal would
be an undue burden for an implementation, the committee may request
modifications to the proposal, or justification as to why the burden is
necessary.

~~~
tayistay
Thanks. Do you have an example of a proposal that the committee considered an
undue burden for an implementation but was otherwise sound?

~~~
AaronBallman
Not off the top of my head, but as an example along similar lines, when
talking about whether we could realistically specify twos complement integer
representations for C2x, we had to determine whether this would require an
implementation to emulate twos complement in order to continue to support C.
Such emulation might have been too much of a burden for an implementation's
users to bear for performance reasons and could have been a reason to not
progress the proposal.

------
jasonhansel
Can/should the C language be extended to better support vector processors and
GPGPU?

------
jpfr
Quite a few new languages generate C code for the “backend” of their compiler.
For example ATS and the ZZ language.

This helps bring these languages to embedded targets with closed toolchains
(with an existing C compiler).

Will there be developments to use a subset of C as a “portable assembly” in a
standard way? Like there is WebAssembly for JavaScript.

~~~
msebor
That doesn't seem likely. There have been no proposals for anything like it
and there is a general resistance to subsetting either C or C++ (the exception
being making support for new features optional).

------
papermachete
How and why will C combat Rust?

~~~
pascal_cuoq
In my opinion, the two languages are going to co-exist for a long time. C has
billions of lines of legacy software written in it… In recent news, COBOL
developers were sought after in order to update existing COBOL software, so
the same thing will happen with C, perhaps to the end of humanity (I have
become pessimistic as to humanity's future).

There are pieces of software that should be given priority for a rewrite in
Rust, but most of C software is never going to be rewritten, because there is
simply too much of it.

Therefore, even if C did not have any advantage of its own over Rust, there
would still be legacy software to maintain and to extend.

The advantages of C include that sometimes, an embedded processor with a
proprietary instruction set is provided by the chipmaker with its own C
compiler, which is the only compiler supporting the instruction set; that C is
still currently used to write the runtimes of higher-level languages (I'm
familiar with OCaml, but it isn't too much of a stretch to imagine that the
runtimes of Python, Haskell,… are also written in C).

~~~
sramsay
There's tons of legacy C around, we have to maintain it, it's not ideal unless
you're on some niche platform, lots of stuff should probably be written in a
better language . . .

I sincerely hope this is not the general attitude of the standards committee.
Some of us actually _prefer_ C, and would like to see the language continue to
flourish.

~~~
pascal_cuoq
Note that among the C experts participating in this AMA, I am not one who is
in the standardization committee. At 14:59 EDT, just before the AMA was
posted, we were joking between ourselves about me having to post this
disclaimer but I guess there was a hidden truth in the joke.

------
mesaframe
How to become a compiler engineer if you don't have a degree in CS?

------
ndesaulniers
What GNU C extensions do you think ISO WG14 would more readily accept?

------
complangc
Anybody know where Dan Pop went and what he's up to these days?

------
rand0mstring
will we ever see compile time programming in C like constexpr in C++?

------
7532yahoogmail
pascal_cuoq - Pascal Cuoq is the Chief Scientist at TrustInSoft and co-
inventor of the Frama-C technology

This looks to be a hell of a good toolchain. I've been playing with it since yesterday.

------
mcguire
Any chance of getting something like Frama-C officially blessed?

------
freemind
Do you think object oriented languages are better than C to develop GUI-based
cross-platform programs?

The licenses of the majority of third-party libraries available for C are GPL,
do you think this makes harder reusing code to sell software?

------
a-bit-of-code
Any chance that we could have an STL equivalent in C. Of course, templating
and other features being absent it won't be as generic as CPP. However, having
even something close to STL will help in the long run. Thanks!

~~~
rseacord
There is always a chance. We would need to see a proposal based on experience
with an existing implementation.

------
networkimprov
Has there been consideration of async/await semantics?

------
rafaelturk
Something in the works for Async & Await?

------
dhhwrongagain
Is memset(malloc(0), 0, 0) undefined behavior?

~~~
DougGwyn
Let's assume the types have been corrected. malloc((size_t)0) behavior is
defined by the implementation; there are two choices: (a) always returns a
null pointer; or (b) acts like malloc((size_t)1) which can allocate or fail,
and if it allocates then the program shall not try to reference anything
through the returned non-null pointer. Now, memset itself is required (among
other things) to be given as its first argument a valid pointer to a byte
array. In particular, it shall not be a null pointer. Tracking through the
conformance requirements, if the malloc call returns a null pointer then the
behavior is undefined. Thus, you should not program like this.
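
A minimal sketch of the defensive version, assuming a hypothetical helper named `alloc_zeroed`: the pointer is checked before memset ever sees it, so the null-pointer case described above never arises regardless of which choice the implementation makes for a zero-size request.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocate n bytes and zero them. memset() is called only when malloc()
   returned a non-null pointer; the caller still has to check the result. */
void *alloc_zeroed(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        memset(p, 0, n);
    return p;
}
```

For a plain zero-filled buffer, calloc(1, n) does the same job in one call.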

~~~
dhhwrongagain
What observable difference is there between malloc(0) and malloc((size_t)0)?

~~~
saagarjha
None.

~~~
dhhwrongagain
I agree but he said the types needed to be corrected. As far as I know the
types were already correct.

~~~
DougGwyn
The argument "0" is not automatically converted to the right type unless there
is a prototype in scope. It isn't as important in this case because it is
highly likely that the appropriate prototype has been #included, but it is a
bigger deal if we're dealing with arguments for a variadic function. Anyway,
it's good to be reminded what the declared types are.
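
To illustrate the variadic case, here is a small sketch with a made-up function `sum_sizes` that reads its trailing arguments as size_t. Because no prototype covers those arguments, the caller has to supply the right type explicitly.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Sum `count` trailing arguments, each of which must have type size_t. */
size_t sum_sizes(int count, ...)
{
    va_list ap;
    size_t total = 0;
    int i;
    assert(count >= 0);
    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, size_t);   /* reads exactly a size_t each time */
    va_end(ap);
    return total;
}
```

`sum_sizes(2, (size_t)0, (size_t)8)` is fine, whereas `sum_sizes(2, 0, 8)` passes two ints, which is not something va_arg(ap, size_t) can be relied on to read correctly on platforms where size_t is wider than int.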

~~~
dhhwrongagain
Are you serious? Of course the question comes with the reasonable assumption
that the proper declaration has been made especially since it’s a well known
standard function. Additionally memset() is not a variadic function.

You said the types were corrected, you didn’t say you were reminding about the
declaration types. The types were correct from the start.

------
pakwlau
What do you think about Web Assembly?

------
Bambo
What is your favourite design pattern?

------
SaxonRobber
can we get compile time constant variables? something cleaner than enums and
defines

------
rand0mstring
is there no way to make C "memory-safe" during compilation?

~~~
zzzcpan
There are a bunch of research projects that did just that. And even just
compiling with address sanitizer makes it "memory-safe" to a significant
degree.

~~~
rand0mstring
can you link any to check out?

------
holografix
In a time of Rust and Golang, how is C still relevant? (Sincere question)

~~~
saagarjha
There's millions of lines of C code that isn't going anywhere, and still many
platforms that those languages don't support.

------
DougGwyn
Some simple instructions about how to use a thread for conversation would be
appreciated. Thanks!

~~~
rseacord
Nothing to it! Just hit the reply button on comments you want to respond to.
You can also upvote anything you like by clicking on the up arrow to the left
of the comment.

~~~
DougGwyn
Okay, is there a starting thread for today's C Experts panel? I miss the old
net newsgroups.

~~~
dang
The thread is
[https://news.ycombinator.com/item?id=22865357](https://news.ycombinator.com/item?id=22865357),
which is the page you've been posting to. It's now listed on the front page of
the forum, [https://news.ycombinator.com/](https://news.ycombinator.com/),
which is a list of the stories people have upvoted today.

You're not the only person who misses the old newsgroups! The format that
Hacker News uses is one that became sort of standard on the web in the early
2000s. It works differently than usenet did, but you get threaded comments in
the sense that replies are nested under the posts they're replying to.

------
bumblebritches5
Hey guys,

How likely would the standard be to accept a proposal to add compile time
reflection to the preprocessor, or even adopt C++'s constexpr?

My use case is creating a global array in a header from static compound
literals in multiple source files at compile time, and outside of some crazy
clang-tblgen type solution, or very platform specific linker hacks, it's
completely unsupported by C.

------
mlvljr
How much UB does your own code contain, folks (and what practices do you
follow to avoid it)?

Cheers from the shadowland :)

------
zabana
Is it worth it to learn C in 2020 ? Will it still be a prominent language for
systems programming in the future ?

~~~
stephencanon
Yes.

\- Languages like Rust will gain more mindshare over the next decade, and be
used in more and more new projects, but there are billions of lines of
existing code in C, and those aren't going away.

\- Hardware architects, for better or worse, largely think about software in
terms of [a somewhat dated and idealized mental model of] C. So if you want to
be able to converse with architects (which anyone doing systems programming
should want to do), you need to have some basic fluency with C.

------
brainzap
How do you join three float values into a comma separated string, and then
split it again?

~~~
emilfihlman
Not sure what you mean but would

    
    
      float your = 1.5f, three = 2.5f, values = 3.5f;
      char buf[160]; /* enough space for three "%f" floats */
      snprintf(buf, sizeof(buf), "%f,%f,%f", your, three, values);
      sscanf(buf, "%f,%f,%f", &your, &three, &values);
    

Do the job?

~~~
KMag
I think that the GP was making a commentary on the sorry state of locale
handling in C.

You need to first store the current locale, change the locale to one that
doesn't use a comma as the decimal point, perform the above, and set the
locale back. Plus, there's no threadsafe way to do this, since the locale is
process-wide.
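
The save, switch, and restore sequence described above might look roughly like this. `join3`, `split3` and the two static helpers are hypothetical names; the "C" locale is used because its decimal point is '.', and, as noted, none of this is safe if another thread touches the locale at the same time.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Switch LC_NUMERIC to "C". Returns a malloc'd copy of the previous
   locale name (the string from setlocale() may be overwritten by later
   calls), or NULL on failure. */
static char *enter_c_numeric(void)
{
    const char *cur = setlocale(LC_NUMERIC, NULL);
    char *saved;
    if (cur == NULL) return NULL;
    saved = malloc(strlen(cur) + 1);
    if (saved == NULL) return NULL;
    strcpy(saved, cur);
    setlocale(LC_NUMERIC, "C");
    return saved;
}

/* Restore the locale saved by enter_c_numeric() and release the copy. */
static void leave_c_numeric(char *saved)
{
    setlocale(LC_NUMERIC, saved);
    free(saved);
}

/* Write "a,b,c" into buf. Returns 0 on success, -1 on failure or truncation. */
int join3(char *buf, size_t size, float a, float b, float c)
{
    char *saved;
    int n;
    assert(buf != NULL && size > 0);
    saved = enter_c_numeric();
    if (saved == NULL) return -1;
    n = snprintf(buf, size, "%f,%f,%f", a, b, c);
    leave_c_numeric(saved);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Parse "a,b,c" from buf. Returns 0 if all three values were read. */
int split3(const char *buf, float *a, float *b, float *c)
{
    char *saved;
    int n;
    assert(buf != NULL && a != NULL && b != NULL && c != NULL);
    saved = enter_c_numeric();
    if (saved == NULL) return -1;
    n = sscanf(buf, "%f,%f,%f", a, b, c);
    leave_c_numeric(saved);
    return n == 3 ? 0 : -1;
}
```

Note that "%f" prints six decimals, so values that need more precision would not survive the round trip unchanged.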

------
orsenthil
Why is the learning curve for C still so high?

* Why can't the learning curve be solved using tools?

* Why don't we actively promote more higher level languages which are implemented in C (by fewer people)?

~~~
pantalaimon
Do you find the learning curve for C to be high? I find it quite the opposite.
It's a simple language with only a few concepts to learn, once you got those,
that's it. There might be some preprocessor tricks you'll pick up later, but
the base language and library is pretty comprehensive IMHO.

~~~
throw_m239339
> It's a simple language with only a few concepts to learn

I mean by that logic, Assembly could be deemed even simpler, yet writing OR
reading programs in Assembly is absolutely not simple at all.

At the end of day, one has to write programs that solve (complicated)
problems, and learning how to do that in C is difficult, thus the learning
curve is deemed higher when it comes to writing professional C.

I can guarantee you that writing professional Go or Java and writing correct
programs in both takes way less effort than with C, for use cases that would
make Go or Java viable.

~~~
quelsolaar
Modern assembly languages have huge instruction sets, which makes them hard
to learn, but the concept is still easy to learn.

~~~
DougGwyn
Many antique computers are simulated by SIMH. If you have the corresponding
software, you can operate on your desktop a simulated computer's software
development system. For example, DEC VAX (VMS or Unix) has a relatively simple
and sane assembly language.

~~~
quelsolaar
I think, learning a tiny bit of assembler, even if in an emulator, is very
valuable to teach the basics.

------
rurban
1\. When will we get proper strings in the stdlib?

2\. When will we get the Secure Annex K extensions?

3\. When will we get mandatory warnings when the compiler decides to throw
away statements it thinks it doesn't need? Like memset or assignments.
Compilers are getting worse and worse, and certainly not better.

ad 1) Strings are Unicode nowadays, not ASCII. Nobody uses wchar but
Microsoft. Everybody else is using utf8, but there's nothing in the standard.
Not even search functions with proper casing rules and normalization.
Searching for strings should be pretty basic enough.

ad 2) The usual glibc answer is just bollocks. You either do compile-time
bounds checks or you don't. But when you don't, you have to do it at runtime.
So it's either the compiler's job or the stdlib's job, but certainly not the
user's.

~~~
rseacord
Going to try to answer these separately. For (1) if you mean strings that are
primitive types my guess is never. We had an hour-long discussion on this topic
at a London meeting where we were discussing new features for C11 and my
takeaway was that this would never happen because it would require a significant
change to the memory model for the language.

~~~
rurban
For the u8 type sure. Nobody needs a new type.

But at least wcsnorm and wcsfc, as I implemented them in safeclib, are
required. Not even coreutils, grep, awk, ... can search Unicode strings.

And u8 library variants of str* and wcs* are definitely needed, maybe just
with uchar* not char*.

~~~
DougGwyn
Why would the utilities not handle unicode searching? Unicode characters match
properly, the null terminator works the same, and non-ANSI codes are just one
or more random 8-bit values which can be compared, copied, etc.
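
A small sketch of both sides of this exchange, using a hypothetical wrapper `utf8_find` around strstr(). Byte-wise search finds a UTF-8 needle at a character boundary when the bytes are identical, but it has no notion of normalization, so a decomposed spelling of the same text does not match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte offset of the first occurrence of needle in haystack, or -1.
   Both strings are treated as plain NUL-terminated byte sequences. */
long utf8_find(const char *haystack, const char *needle)
{
    const char *hit;
    assert(haystack != NULL && needle != NULL);
    hit = strstr(haystack, needle);
    return hit != NULL ? (long)(hit - haystack) : -1L;
}
```

For example, searching the UTF-8 bytes of "naïve café" (with precomposed é, `"caf\xc3\xa9"`) for the same precomposed "café" succeeds, while searching for "cafe" followed by the combining acute accent (`"cafe\xcc\x81"`) does not.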

------
defectbydesign
I am sorry to say this, but the C programming language no longer needs the ISO
committee, since it introduced non-de-facto-standard features such as VLAs.

For reference I still use The C Programming Language by KERNIGHAN/RITCHIE and
The Standard C Library by PLAUGER.

In my view what programmers need the most is good practices rather than any
syntactic sugar.

I prefer C rather than any other programming language for its conciseness.

There are opportunities for any new programming language to replace C if it is
at least backward compatible with K&R C SE (aka ISO C90) and provides
portable access to de facto standard hardware acceleration such as SIMD
instructions for vector computing.

For now we have to write SIMD-optimized libraries in assembly language in
order to get the full calculation power of modern processors.

For programmers who expect C to bring them a hot drink, I would recommend them
to stick with the bloated C++ framework which sometimes enlarges your p __*s.
:-P

~~~
defectbydesign
No answer but -2 points.

It seems cowards don't have any argument. :-)

------
WFHRenaissance
A bit off topic, but what are your views on Golang? I'm leaving this pretty
open-ended, but I'm curious how you see it interacting with the C/C++
ecosystem in the future.

