
C2x: the next real revision of the C standard - ingve
https://gustedt.wordpress.com/2018/11/12/c2x/
======
Aardwolf
They should add something to run code when leaving a scope, no matter how you
leave the scope (break, return, ...).

That would allow cleanup and other such actions without resorting to such
hacks as: "goto cleanup" pattern, separate functions, fake for loop to break
out of, ...
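
For readers who have not met it, here is a minimal sketch of the "goto cleanup" pattern mentioned above. The function name `copy_first_line` and its parameters are made up for illustration; the point is only that every exit path funnels through a single label.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: copies the first line of a file into out.
 * Returns 0 on success and -1 on any failure. */
int copy_first_line(const char *path, char *out, size_t outsz)
{
    int rc = -1;            /* assume failure until everything succeeds */
    FILE *fp = NULL;
    char *line = NULL;

    fp = fopen(path, "r");
    if (fp == NULL)
        goto cleanup;

    line = malloc(outsz);
    if (line == NULL)
        goto cleanup;

    if (fgets(line, (int)outsz, fp) == NULL)
        goto cleanup;

    snprintf(out, outsz, "%s", line);
    rc = 0;

cleanup:
    /* Every exit path ends up here; free(NULL) is a no-op. */
    free(line);
    if (fp != NULL)
        fclose(fp);
    return rc;
}
```

A scope-exit feature would presumably let the two release calls sit next to the acquisitions instead of at the bottom of the function.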

~~~
majewsky
The idiomatic way to do that is with destructors.

EDIT: Sorry for the confusion, I misread the article and thought this was
about C++2x.

~~~
mouldysammich
C doesn't have destructors, does it?

~~~
stinos
You can use C++ as some kind of C with destructors though :P Scoped cleanup is
really a nice thing..

But then you also want templates of course to get rid of your void*. And oh,
wait, what's that? A cross-platform thread implementation (well, sort of,
depends on platform)? A string thing which I can search without strstr?
Anonymous functions which capture? Auto? Gimme gimme :]

~~~
beefhash
> You can use C++ as some kind of C with destructors though :P Scoped cleanup
> is really a nice thing..

C++ is not a strict superset of C. You actually have to cast void pointers,
for example.
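
A small sketch of that difference (the function name `make_int` is just for illustration): the allocation below is valid C, while a C++ compiler rejects it unless the result of `malloc` is cast.

```c
#include <stdlib.h>

/* Valid C: void * converts implicitly to any object pointer type.
 * A C++ compiler would require something like
 * static_cast<int *>(malloc(sizeof *p)) here. */
int *make_int(void)
{
    int *p = malloc(sizeof *p);   /* no cast needed in C */
    if (p != NULL)
        *p = 0;
    return p;
}
```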

> But then you also want templates of course to get rid of your void*.

No, actually, I think I can live without that, but thank you.

> And oh, wait, what's that? A cross-platform thread implementation (well,
> sort of, depends on platform)?

C11, threads.h. Been there, got that.

> A string thing which I can search without strstr?

Okay, I'll give you that, string handling is hell.

> Anonymous functions which capture?

Have their uses, but I'm not sure if C is really the right place for it.

> Auto?

auto (as used in C++11 and later) is a solution in need of a problem in C.
When you don't have piles of templates and iterators, you also tend not to
need to automatically derive types. In a world where signed/unsigned integer
comparison can have fatal consequences, you probably do want to be very sure
about your types.
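
To make the signed/unsigned point concrete, here is a small sketch (both function names are illustrative). In a mixed comparison the `int` operand is converted to `unsigned int`, so -1 turns into a very large value.

```c
/* Naive comparison: a is converted to unsigned before comparing,
 * so a negative a compares as a huge positive number. */
int is_less_naive(int a, unsigned int b)
{
    return a < b;
}

/* Checked comparison: handle the negative case before converting. */
int is_less_checked(int a, unsigned int b)
{
    return a < 0 || (unsigned int)a < b;
}
```

With these definitions `is_less_naive(-1, 1u)` is 0 while `is_less_checked(-1, 1u)` is 1.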

~~~
rubber_duck
>C++ is not a strict subset of C.

I think you flipped this :) ?

~~~
saagarjha
I mean, it’s not wrong…

------
nrclark
Request for C++14-style constexprs. It would be a great way to spice things
up, and I think would be very compatible with the C philosophy.

C is a great language, and I love working with it. But constexpr is something
I really feel would improve the language (by eliminating the need for
complicated #define macros). C++14's constexpr was a breath of fresh air when
I first used it. It basically lets me write real code which evaluates at
compile-time and can generate constants. I'd love to see constexpr ported over
to C.

~~~
rwbt
I think having a primitive templating system would also reduce the usage of
macros. While we're at it throw in AST macros too.

~~~
AnimalMuppet
Your second request seems considerably outside the style and philosophy of C.
Or was that sarcasm?

~~~
rwbt
Yep, it was made jokingly trying to make a point about the slippery slope of
'bloat'. But didn't go that well.

~~~
nrclark
Out of curiosity, would you consider constexpr to be bloat?

I feel like it has tons of applications, especially for embedded C
programmers. One example that comes to mind: CRC calculation. Most
microcontroller CRC libraries are lookup-based, or a mixture of a lookup table
and other methods.

Currently, a lookup table can be done only 3 ways in C: pre-compute it and
store 'magic numbers' (either while developing, or with some custom pre-
processing), compute at program initialization (which costs code-space and
startup time), or a compile-time computation based on some very ugly macros.
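
As a rough sketch of the second option (compute at program initialization), assuming the common reflected CRC-32 polynomial 0xEDB88320; the names `crc32_init` and `crc32_compute` are mine, not from any particular microcontroller library:

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[256];

/* Fills the lookup table at run time. This is the code-space and
 * startup cost that a compile-time table would avoid. */
void crc32_init(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int bit = 0; bit < 8; bit++)
            c = (c & 1u) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        crc_table[i] = c;
    }
}

/* Table-driven CRC-32 over a buffer; crc32_init() must run first. */
uint32_t crc32_compute(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    while (len--)
        crc = crc_table[(crc ^ *p++) & 0xFFu] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;
}
```

The table-filling loop is exactly the kind of function one would like to mark constexpr and evaluate at compile time.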

With constexpr support, I could just write a C function that calculates the
table values. Then I could use it to populate a constexpr table at compile
time, or even use the function at runtime if I needed to do some debugging.

C already has sizeof, which is a compile-time computation. Why not let users
write their own compile-time functions too?

I recently wrote a command-lookup library in C++14. Using constexpr, I could
precompute the hashes of each string and populate a switch() statement. Plus I
could use the exact same function to hash my incoming strings, even if they're
of unknown length. That's not possible in C at all, not even with loop-based
macro trickery.

~~~
rwbt
I think constexpr is a great feature. But without templates (or meta
programming in general) I feel like it's severely limited in its uses
(although you cite a great use case). Adding templates is another whole can of
worms (for templates to be really useful, add operator overloading too? so
on..)

------
pjmlp
> Improve array bound propagation and checks

Looking forward to this one, especially since Annex K ( Bounds Checking
Interfaces) has proven to have not solved anything and is now scheduled for
possible removal.

[http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1969.htm#adi](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1969.htm#adi)

These other ones are also interesting from a security point of view.

> Add a new calling conventions with error return for library functions that
> avoids the use of errno

> Extend specifications for enum types

> Add a simple specification of nullptr, similar to C++

~~~
rurban
> Looking forward to this one, especially since Annex K ( Bounds Checking
> Interfaces) has proven to have not solved anything and is now scheduled for
> possible removal.

It has solved the usual bounds errors for those who use it, i.e. MS,
Embarcadero and safeclib. Since glibc, musl and FreeBSD libc refuse to use
Annex K, it's easy to blame just them, not the safe extension, which does
serve its goal.

With safeclib you even get compile-time checks, more than with the simple
_FORTIFY_SOURCE=2 checks. Only Android Bionic got a bit better lately.
safeclib is as fast or even faster than the fragile assembler bits in glibc.

> Add a new calling conventions with error return for library functions that
> avoids the use of errno

That's of course part of Annex K.

What's urgently missing is still a proper string API. ASCII str* goes nowhere,
wide char wcs* is not widely used, 80% use utf8. wcslwr, wcsfc, wcsnorm are
even missing. The whole u8* string API is missing, and not even discussed.
ICU, libutf8 and libunistring don't get you far. coreutils still cannot handle
non-ASCII strings.

~~~
pjmlp
Not at all, because it doesn't fix out of bounds errors caused by copy-paste
errors where the given buffer size doesn't match the actual size of the
declared buffer or string.
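
A sketch of the failure mode I mean, in plain ISO C without compiler extensions. `bounded_copy` is a hypothetical stand-in for an Annex K style interface, not the real `strcpy_s`: it can only check the source against the size the caller claims.

```c
#include <string.h>

/* Hypothetical bounds-checked copy. It trusts dstsz to describe dst.
 * Returns 0 on success, -1 if src does not fit in the stated size. */
int bounded_copy(char *dst, size_t dstsz, const char *src)
{
    size_t n = strlen(src);

    if (dstsz == 0)
        return -1;
    if (n >= dstsz) {
        dst[0] = '\0';      /* reject and leave an empty string behind */
        return -1;
    }
    memcpy(dst, src, n + 1);
    return 0;
}
```

Given `char name[8];`, a call such as `bounded_copy(name, sizeof name, input)` is properly checked. A copy-pasted `bounded_copy(name, 64, input)` compiles just the same, and nothing inside the function can tell that 64 is wrong.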

A situation which is described in the field report I linked to.

Only C++ compilers like Visual C++ which provide overloaded versions of Annex
K without size parameter thanks to the improved C++ type system, do actually
provide a real usable version of Annex K.

But then it isn't C any longer.

~~~
rurban
> Not at all, because it doesn't fix out of bounds errors caused by copy-paste
> errors where the given buffer size doesn't match the actual size of the
> declared buffer or string.

That's wrong, check the implementation.

C++ only has advantages because there g++ is not as broken as gcc with
constexpr, but clang since 5.0 is doing fine.

[https://github.com/rurban/safeclib](https://github.com/rurban/safeclib)

~~~
pjmlp
As you mention under "Compile-time constraints", these extensions to Annex K
require a supporting compiler.

It is perfectly viable to have a 100% ISO C99/C11 compliant compiler that will
happily overwrite the target buffer, read more bytes than actually exist or
search the whole memory block for a '\0' terminator, because Annex K does not
require the existence of the compiler extensions used by your safeclib.

~~~
rurban
Right. Microsoft does it wrong.

But neither ISO nor POSIX requires _FORTIFY_SOURCE either, which uses the same
object_size (bos) checks. The alloc_size builtin or the CPU-supported bounds
checks are not used at all with the major libcs.

Problem is not the Annex K, but the refusal to implement and use it. There's
no problem with state as in errno or locale. The criticism is easily
debunkable. It's outright NIH bias.

~~~
pjmlp
What I would like to see would be tagged memory like the SPARC has, which was
put to good effect on Solaris, to become common across all major platforms.

Apparently at least Android might adopt it on ARM.

And I do concede that maybe the extensions you have used should actually
become part of Annex K as well, instead of deprecating it.

Because despite my dislike for C, I am fully aware that UNIX derived platforms
will stay with us for the years to come, so we need to improve the current
state of affairs somehow.

------
0x09
The linked discussion on the memory model is particularly interesting to me as
it covers a number of issues and ambiguities with the current specification
for strict aliasing and what kind of accesses should/shouldn't be allowed,
including for example the problem that there's no way to obtain untyped space
on the stack (Q75).

Since this as a whole is among the least consistently implemented and
(arguably based on the number of questions it generates) least well understood
aspects of the standard it's nice to see some authoritative efforts to clarify
the intended behaviors.

[http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2294.htm](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2294.htm)

~~~
saagarjha
> the problem that there's no way to obtain untyped space on the stack (Q75)

As in a standard alloca?

~~~
gpderetta
Fun fact: alloca is in neither (any) C standard nor POSIX. It is just a very
old extension, implemented in many compilers with somewhat compatible,
underspecified semantics.

~~~
saagarjha
I am aware of that, which is why I said “ _a_ standard alloca” instead of “
_the_ standard alloca”. I’m asking if they’re proposing adding a version of
alloca to the standard.

------
beefhash
Wishlist: the arc4random family (would also put up with it being in POSIX),
strlcat and strlcpy (yes, glibc, just for you), getline (fgets sucks),
reallocarray and recallocarray, that signed integer overflow becomes at least
implementation-defined rather than pure undefined behavior, ideally fix the
notion of locales and guarantee (u)int8_t == unsigned char if (u)int8_t exists
(lots of code actually relies on that).
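
For anyone unfamiliar with why reallocarray is on the list: its main job is to refuse allocations where `count * size` wraps around. A minimal sketch of that check, with `alloc_array` as an illustrative name (this is not the BSD implementation):

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocates room for count elements of size bytes each, or returns
 * NULL if the multiplication would overflow size_t. */
void *alloc_array(size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;    /* count * size would wrap around */
    return malloc(count * size);
}
```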

~~~
anticensor
> (u)int8_t == unsigned char if (u)int8_t exists

This alone will make C incompatible with its original platform, PDP/11 where
char is either 7 or 9 bits.

~~~
lifthrasiir
But the current C standard no longer allows a 7-bit byte anyway, as it
mandates that CHAR_BIT >= 8.
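
A short sketch of what can already be asserted at compile time, assuming a C11 compiler. Since `uint8_t` must have exactly 8 bits and no padding, its mere existence implies `CHAR_BIT == 8` on that platform. `bits_per_byte` is just an illustrative helper.

```c
#include <limits.h>
#include <stdint.h>

_Static_assert(CHAR_BIT >= 8, "ISO C requires at least 8 bits per char");

#ifdef UINT8_MAX
/* uint8_t is optional; where it exists, bytes are exactly 8 bits wide. */
_Static_assert(CHAR_BIT == 8, "uint8_t implies 8-bit bytes");
_Static_assert(sizeof(uint8_t) == 1, "uint8_t occupies one byte");
#endif

/* Illustrative helper so the value can be inspected at run time. */
int bits_per_byte(void)
{
    return CHAR_BIT;
}
```

What the standard does not promise is that `uint8_t` is a typedef for `unsigned char` specifically, which is the guarantee being asked for.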

~~~
anticensor
Sorry, it should have been PDP-10, the point stays the same: still one of the
founding architectures of original UNIX. PDP-11 used 16-bit words and 8-bit
bytes.

------
kragniz
> Add a type aware print utility with format specification syntax similar to
> python

Does anyone have more information about this proposal?

------
vkazanov
I will just pray for the better error handling proposals.

Both are fine. For modern code with no global state, especially the
multithreaded kind, everything is better than the errno.h mess.

~~~
jstimpfle
I don't think "error handling" should be a thing in C. Errors are just data,
not a special case.

errno.h isn't so much builtin to C. It's just part of the standard library, and
more so of POSIX. It's only used for interfacing with the OS. And it's not
_that_ bad, since errno is a thread local variable.
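
A minimal sketch of what "errors are just data" can look like while still calling an errno-based function. `struct parse_result` and `parse_long` are hypothetical names, and using EINVAL for "not a number" is my own choice, not something strtol reports.

```c
#include <errno.h>
#include <stdlib.h>

/* The error travels with the value instead of living in errno. */
struct parse_result {
    int  err;     /* 0 on success, otherwise an errno-style code */
    long value;
};

struct parse_result parse_long(const char *text)
{
    struct parse_result r = { 0, 0 };
    char *end = NULL;

    errno = 0;                      /* strtol signals range errors via errno */
    r.value = strtol(text, &end, 10);
    if (errno == ERANGE)
        r.err = ERANGE;
    else if (end == text || *end != '\0')
        r.err = EINVAL;             /* no digits, or trailing garbage */
    return r;
}
```

The caller then checks `r.err` on the returned struct and never has to touch errno directly.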

~~~
vkazanov
C is not a holy cow, is it? It's just a very, very popular language. Error
handling is important in programming in general, and having ways to implement
it conveniently helps a lot.

Yes, errors are just data. And everything is data, including the code itself,
right? It's a question of abstractions introduced to the language and the
program in question.

errno.h is bad. It's inconvenient, it's global, even if it's a thread-local
something, it's cumbersome to use.

AFAIK, there are 2 error-related proposals for inclusion. One is making errors
special, the other is a bit more generic. Both would improve on the current
situation.

~~~
unwind
I'm being a bit nit-picky, but in C code is not a lot like data, no.

You cannot take sizeof a function, and you can't copy functions around, for
instance. The address of a function is a value ("data"), but not the function
itself.
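
A tiny sketch of the distinction (all names are illustrative): the function pointer is an ordinary value that can be stored, copied and compared, while the function itself cannot be.

```c
static int add_one(int x)
{
    return x + 1;
}

typedef int (*unary_fn)(int);

/* Returns the address of add_one. The pointer is data; the function
 * body is not, and sizeof add_one is not valid ISO C. */
unary_fn get_add_one(void)
{
    return add_one;
}

/* Calls through whatever pointer it is handed. */
int apply(unary_fn f, int x)
{
    return f(x);
}
```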

~~~
vkazanov
Yeah... C is not Lisp, okay. :-)

But that only strengthens the point: language features are just abstractions
we use to reason about data. One can always strip all the abstractions and end
up working with raw bits.

If there's a useful way to think about certain kinds of data - it might be
useful to codify that way as a language feature. Such as specialised error
handling.

~~~
kazinator
In Lisps, a function is also just a reference, and not the object itself. You
can't take its size or copy it. (You could copy it if your dialect provided a
_copy-function_ function, of course. I don't think I've ever seen one. Such a
thing could be useful if it provided a frozen copy of a closure's lexical
environment, that would be unaffected by mutations when the original copy of
the function is invoked.)

~~~
kazinator
Okay, I will try to explain it to the downvoter. Suppose we have a lambda like
this:

    
    
    (let ((counter 0))
      (lambda () (incf counter)))
    

When we evaluate this we get a function. It contains the captured lexical
environment. If we call that function, the captured _counter_ variable
mutates.

Now suppose we had a _copy-function_ library function. I would expect that
if we apply it to this function, we get an object which carries a duplicate of
the lexical environment. This means it has its own instance of _counter_. If
we call the original function, the copied function's _counter_ stays the same
and _vice versa_.

I don't remember seeing such a _copy-function_ in any Lisp dialect; there
isn't one in ANSI CL. It seems it might be useful, same as the ability to copy
a structure or OOP object.
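
Since this thread is about C, here is a rough sketch of the same idea with a closure modelled by hand as a code pointer plus an environment struct. All names are made up for illustration; copying the struct plays the role of the imagined _copy-function_.

```c
/* A hand-rolled closure: function pointer plus captured environment. */
struct counter_closure {
    int (*call)(struct counter_closure *self);
    int counter;                       /* the captured variable */
};

static int counter_call(struct counter_closure *self)
{
    return ++self->counter;            /* plays the role of (incf counter) */
}

struct counter_closure make_counter(void)
{
    struct counter_closure c = { counter_call, 0 };
    return c;
}

/* Structure assignment duplicates the environment, so the copy gets
 * its own counter from that point on. */
struct counter_closure copy_counter(const struct counter_closure *src)
{
    return *src;
}
```

Calling the original after the copy leaves the copy's counter untouched, and vice versa.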

------
man-and-laptop
Will anything be done about this problem?
[https://github.com/mpv-player/mpv/commit/1e70e82baa9193f6f027338b0fab0f5078971fbe](https://github.com/mpv-player/mpv/commit/1e70e82baa9193f6f027338b0fab0f5078971fbe)

~~~
pilif
This reads like a problem with the POSIX libraries and not with the C language
itself, so it's probably not up to the language to fix it.

C actually comes pre-equipped in a pretty nice position with its default 8 bit
char data type (which is perfect for UTF-8).
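
As a small sketch of why plain char arrays work well for UTF-8: counting code points only needs a byte-wise scan that skips continuation bytes. `utf8_length` is an illustrative name and it assumes the input is already valid UTF-8.

```c
#include <stddef.h>

/* Counts code points in a NUL-terminated UTF-8 string by skipping
 * continuation bytes, which all have the bit pattern 10xxxxxx. */
size_t utf8_length(const char *s)
{
    size_t count = 0;

    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0u) != 0x80u)
            count++;
    }
    return count;
}
```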

The POSIX standard, however, I agree is in serious need of an overhaul for
locale features and you're probably better off ignoring it completely and just
going ICU these days anyways - though then you have to go the extra length to
still use the OS provided means (LC_* env under Unix) to get to the user's
preferred locale.

But once you have that, yeah, ICU is probably the way to go for actual string
formatting.

~~~
detaro
Locales and the most basic functions around them are defined in the C ISO
standards. (e.g. section 7.11 of
[http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf))

------
vortico
C is a good language, but why does the standard library need to be in the
language standard? It's perfectly possible for there to be competing modern
libraries which implement everything from the stdlib which can be built
completely from pure C source. Suppose I build libvortico, which had proper
string handling, threading, filesystem, etc. There would then be no need to
deal with stdlib quirks. Just use that as an alternative.

~~~
umanwizard
You're perfectly welcome to build your own library and put whatever you want
in it.

The reason there is a "standard library" is to make portability possible.

~~~
vortico
But portability has nothing to do with whether the library is standardized.
One could port any API, whether from the standard or a simpleton like me, and
port it in pure C and OS calls to any environment. One could even port a
better API to wrap the stdlib calls, to add support for at least what the
standard offers.

~~~
umanwizard
Your argument goes through just as well for the base language itself. What's
the point of having a C standard at all? Since you could just port your
compiler to any platform...

~~~
vortico
The point of a language standard is to have multiple compilers and to be able
to write code to some standard. The difference between these points is that
with a "stdlib" alternative library, you can still have multiple compilers and
write library function calls without following _the_ standard, but simply the
library's API. The reason I believe the stdlib shouldn't be a standard is for
the same reason that libpng, libjpg, libzip, etc. aren't part of the C
standard. It should be the software vendor's choice of which standard library
to use.

------
rwbt
I hope they make real progress on improving the C language, not just slap on
yet another library/header to the standard lib (cough.. <complex.h> ..cough).

------
baybal2
C is older than me, but it still has so many things to improve. And I am very
glad that it is still being actively improved.

------
The_rationalist
> char8_t: A type for UTF-8 characters and strings, see N2231

What does this mean? C11 already brought the u8 type

~~~
rurban
char8_t is the type, u8 is just a prefix for constants, like u8"xxx"; And it's
only in C11++, not C11 AFAIK.

~~~
The_rationalist
You are right, this is only for string literals (so read-only). But Wikipedia's
C11 article mentions the feature; what do you mean by C11++? Btw, C17 added zero
features.
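
For what it's worth, a small sketch assuming a C11 or later compiler: the `u8` prefix on string literals is in the C11 text, and what C11 lacks is a distinct `char8_t` type (the literal's elements are plain `char` there). The names `e_acute` and `e_acute_size` are illustrative.

```c
#include <stddef.h>

/* u8"..." asks for UTF-8 encoding of the literal regardless of the
 * compiler's execution character set. U+00E9 encodes as 0xC3 0xA9. */
static const unsigned char e_acute[] = u8"\u00e9";

/* Two UTF-8 bytes plus the terminating NUL. */
size_t e_acute_size(void)
{
    return sizeof e_acute;
}
```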

------
kevin_thibedeau
Next revision: Please stop kowtowing to recalcitrant compiler vendors too lazy
to invest in updating their code.

------
GautamGoel
What about concurrency?

~~~
muricula
There was a lot of work done on that in the C11 release with updates to the
memory model and threads.h. I have heard some criticisms of the threads.h APIs
which could be addressed, such as the missing ability to specify stack size.

There are C++2x proposals to add a green thread/goroutine style mechanism. I'm
not convinced that it's something which belongs in the C core language or C
standard library though.

------
capsicum80
Previous attempts at doing this have introduced unnecessary ugliness to the
language such as the possibility to mix declarations with code. I can only see
the language getting worse by introducing new pointless features. Unnecessary
bloated C already exists and it is called C++

~~~
Twisol
By "mix declarations with code", do you mean the ability to put local
variables like `int x;` in locations other than immediately at the start of a
function? I don't understand why this is a bad thing.

~~~
capsicum80
This, and especially in the first argument of 'for'. Not backward compatible, very
inelegant and luckily discouraged in many projects' coding guidelines.

~~~
capsicum80
Actually it is block, not function. It makes it very difficult to read the
code if you are looking for the scope of the variables, or investigating stack
usage.
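
For readers who want to compare the two styles being argued about, a minimal side-by-side sketch (function names are illustrative):

```c
/* C89 style: all declarations sit at the start of the block. */
int sum_c89(const int *values, int count)
{
    int i;
    int total = 0;

    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* C99 style: the loop variable is declared in the for statement and
 * its scope ends with the loop. */
int sum_c99(const int *values, int count)
{
    int total = 0;

    for (int i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

Both compute the same result; the disagreement is only about where a reader should have to look to find `i`.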

