I used Kagi to generate a summary of the video and a list of key moments:
The podcast discusses recent and upcoming changes to the C programming language in C23 and beyond. It explores new features adopted from C++, such as auto type inference and attributes, as well as improvements to variably modified types. These changes aim to improve compatibility while staying true to C. The presentation highlights the addition of bit-precise integer types that allow specifying exact bit widths and avoid integer promotions. It also covers new Annex F functions for improved support of IEEE 754 floating-point operations. Overall, the podcast examines exciting updates to C that enhance functionality, performance, and security for developers.
- C has seen its first significant changes in over 12 years, with C23 introducing many new features.
- Some of the new features in C23 are inspired by C++, such as attributes and auto type inference, but aim to stay true to C's spirit.
- Attributes allow providing additional information to the compiler without changing code behavior, and can be standard or vendor-specific.
- Auto type inference makes generic macros easier to write by inferring the types of temporaries from the macro's arguments.
- Enumerations can now specify their underlying type, like long or unsigned int, for more control over sizes.
- Variably modified types allow array sizes to depend on non-type parameters without stack allocation issues.
- New bit-precise integer types like _BitInt(N) specify exact bit widths and avoid integer promotions.
- New macros help with endianness and bit manipulation operations in a portable way.
- Annex F for floating point is now in parity with IEEE 754-2008 for more consistent floating point behavior.
- Memory-clearing functions like memset_explicit were added to improve security by overwriting secrets before freeing memory.
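Here's a minimal sketch of a few of the features above, assuming a C23-capable compiler (recent gcc or clang with -std=c2x); the identifiers are illustrative:

    #include <stdio.h>

    enum status : unsigned char { OK, WARN, FAIL };  // explicit underlying type

    [[nodiscard]] int open_device(void);             // standard attribute syntax

    // auto makes type-agnostic macros easier: tmp takes the type of its operand
    #define SWAP(a, b) do { auto tmp = (a); (a) = (b); (b) = tmp; } while (0)

    int main(void) {
        _BitInt(24) x = 1;  // bit-precise: exactly 24 bits, exempt from integer promotion
        int i = 1, j = 2;
        SWAP(i, j);
        printf("%d %d %d\n", i, j, (int)x);
        return 0;
    }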
The video actually makes the important point that this is not the case, since C and C++ are different languages with different design philosophies. However, in the areas where they overlap, it makes sense to eliminate silly differences (like 'func()' vs 'func(void)' or '= {};' vs '= {0};').
Auto and constexpr fix specific problems that also exist in C (auto is useful in type-agnostic macros, and constexpr finally fixes the problem that a const isn't actually a constant).
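A minimal sketch of the constexpr point (assuming -std=c2x): a const object still isn't a constant expression in C, but a C23 constexpr object is:

    const int n1 = 16;
    // static char buf1[n1];  // error: in C, n1 is not a constant expression
    constexpr int n2 = 16;    // C23: a genuine compile-time constant
    static char buf2[n2];     // OK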
One interesting tidbit (which I didn't know yet) is that the C committee requires two real-world implementations for a proposal to even be considered (with the C++ standard counting as an "implementation"), while the C++ committee doesn't require an implementation. Meaning C++ users are essentially guinea pigs for the C standard ;)
Explains why C++ has become such a hot mess, while C has been mostly spared any serious f*ckups (I can only think of one: VLAs, but those have essentially been removed from the standard in C11).
VLAs were not removed. VLAs are almost always better than the next best alternative:
- They are better than alloca due to proper scoping and standard compliance (see the sketch after this list).
- They use less stack than regular stack arrays sized for the worst case (e.g. a divide-and-conquer algorithm that may need O(N^2) stack space without VLAs could potentially be written to use O(N log N))
- VLAs can allow more accurate bounds checking than worst-case-sized arrays.
- VLAs are faster than heap allocation.
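A sketch of the scoping point: a VLA's storage is released when its enclosing block ends, while alloca() memory lives until the function returns (process_rows is an illustrative name):

    #include <string.h>

    void process_rows(int nrows, int ncols, const double *data) {
        for (int r = 0; r < nrows; r++) {
            double row[ncols];                     // VLA: reclaimed every iteration
            memcpy(row, data + (size_t)r * ncols, sizeof row);
            // ... work on row ...
        }   // stack use stays O(ncols); alloca() inside the loop would keep
            // accumulating, up to O(nrows * ncols), until the function returns
    }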
There are issues with VLAs: if the size is controlled by an attacker, this can cause security problems. This is largely mitigated by -fstack-clash-protection, which turns the issue into a DoS (the same as unbounded heap allocation), and you want stack clash protection anyway. Static analysis tools and compiler flags can also help detect cases where the size is controlled by input from the network. The generated assembly for VLAs is worse than for fixed-size arrays, though avoiding them for that reason gives up the space savings. But, again, people who blindly avoid VLAs because of these issues often end up using something worse.
Also note that most other languages except C++ also have VLAs.
They were made optional in C11, which for real-world purposes puts them on the same level as a compiler-specific language extension. For instance, MSVC will (most likely) never support VLAs:
The difference is that if they are there, you get consistent behavior across all compilers that support them. MSVC was essentially stuck with pre-C99 for a long time, so nobody in their right mind would use it for C programming if they had a choice.
Now they are catching up. Let's see how this goes. We made variably modified types mandatory in C23. I hope we make VLAs mandatory again in the next revision.
> Also note that most other languages except C++ also have VLAs.
I don't think that's actually true.
Lots of modern languages have "pointer plus length" types; they call them lots of different things, but I think "slices" is a common term. But those aren't VLAs.
Some languages have variable vectors backed by a compile-time fixed-size array (See Zig: BoundedArray). But, again, not a VLA.
I'm trying very hard to think of a non-GC language that allows you to allocate a run-time length array on the stack other than C, and I'm not coming up with one.
C has variably modified types (in CS usually known as dependent types), where the length is encoded in the type. A VLA has a dependent type: char buf[n]; and a pointer to a VLA has one too: char (*buf)[n]. This is a super powerful and theoretically sound concept, although not yet really exploited in C. But the bound travels with the type, and you can get bounds checking at run-time:
    char buf[n];      // VLA: n is a run-time value that becomes part of buf's type
    auto foo = &buf;  // foo has type char (*)[n]; the bound travels with it
    buf[n] = 1;       // out of bounds: a run-time bounds check is possible
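And because the bound is part of the type, a function can declare exactly what it expects (a sketch; fill is an illustrative name):

    #include <stddef.h>

    void fill(size_t n, char (*buf)[n]) {
        for (size_t i = 0; i < n; i++)
            (*buf)[i] = 'x';  // a checker can verify i < n from the type alone
    }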
Pascal, Ada, Fortran, and D have VLAs, and certainly more languages do.
Between the fact that most modern languages don't have VLAs and the fact that the languages you mention are most certainly not modern, your statement that "Also note that most other languages except C++ also have VLAs." is not even close to correct.
I guess this depends on whether you count higher-level languages with GC, which often have some kind of automatically managed dynamic array / vector type but don't necessarily call it a VLA.
I also did not say "modern" - not that this is clearly defined. If you only count Zig and Rust as modern, then those two do not have VLAs as far as I know. But this was not my main point anyway.
Most of the stuff it imported from C++ are relatively minor convenience things, not radical paradigm shifts. You make it sound like C23 started adding stuff like classes or templates.
I think it's more that unsafe/unpredictable preprocessor features are being implemented in the compiler proper so that more errors can be checked at compile time.
Has anyone ever encountered trigraphs in the wild? (in real code, IOCCC doesn't count)
I've seen K&R C a few times, even in modern (well, "modern") code, but I don't think I've ever encountered trigraphs. The proposal mentions finding "instances of trigraphs being deliberately used in production code" via a codesearch.com query, but that's public code only (and probably only recent, and incomplete?)
Never seen intentional trigraphs in the wild, and I doubt most people have, given that they're disabled by default in GCC. A search did turn up whatever the fuck this is though:
Doesn't turn into anything meaningful with trigraph replacement. I'd guess that it's essentially a giant regex matching a header in some horrible, unknown dialect, but I'm not brave enough to look through the codebase to find out.
> Has anyone ever encountered trigraphs in the wild? (in real code, IOCCC doesn't count)
Here’s some real code which starts with trigraphs. Note it only uses the trigraph for the initial pragma to establish the EBCDIC code pages; later on it doesn’t use them. (Also, that file contains some invalid characters because it was converted from EBCDIC to ASCII using the wrong code page.)
This is a tool for storing dumps (basically the mainframe equivalent of core dumps) on a networked Unix filesystem. Someone wrote it because they didn’t have enough space on their mainframe to store the dumps, but did on networked Unix storage. And then they released the source code (it’s not clear under what license).
I've seen accidental trigraphs such as ??! come up a few times in pre-ANSI codebases. Trigraphs are still used in EBCDIC codebases[0], but I wouldn't expect to see them anywhere else or in public code.
I've never seen them. It doesn't surprise me. Trigraphs were added to support machines without full ASCII support, lacking characters like ~ # { } and so on. Such machines are extremely rare in practice. For example, even an early-1980s Apple //e has those characters on its keyboard and in its character set.
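For reference, a complete trigraph-era program could look like this (a sketch; modern compilers need an explicit flag such as gcc's -trigraphs, and C23 removes trigraphs entirely):

    ??=include <stdio.h>  /* ??= is the trigraph for # */
    int main(void) ??<    /* ??< and ??> stand for { and } */
        printf("hi??/n"); /* ??/ stands for backslash, so this prints "hi" plus a newline */
        return 0;
    ??>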
I feel like it's been one of the great identifiers of blog spam: articles on "worst C mistakes" where someone writes about all the problems trigraphs cause, I guess because they are banned in some standards like MISRA.
Not in our lifetime, I know, but we sort of understand why we had the year 2000 problem. It seems that, as humans, we assume everyone will know what the missing digits of a year are, or that "reverse numbering" is OK.
As long as the C committee avoids publishing standards in 2089, 2099, 2111, 2114, 2117 and 2123, the numbering scheme can go on for at least another 100 years...
What's the opinion here on decimal floating point (_Decimal32/64/128)? I'm guessing there's a significant amount of business software out there using double for monetary amounts that could benefit from this being standardized. I think it has been in gcc for over a decade as an extension, though...
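For illustration, a minimal sketch assuming a toolchain with DFP support (e.g. recent gcc on x86-64; the dd suffix marks _Decimal64 literals):

    #include <stdio.h>

    int main(void) {
        double b = 0.10 + 0.20;          // binary: 0.30000000000000004...
        _Decimal64 d = 0.10dd + 0.20dd;  // decimal: exactly 0.30
        printf("binary double: %.17g\n", b);
        // printf support for decimal types is spotty, so compare exactly instead:
        printf("decimal sum == 0.30dd: %s\n", d == 0.30dd ? "yes" : "no");
        return 0;
    }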
The extension was the reference implementation of the TS. As for its uses, IBM was the main organisation pushing for this (they implemented the DFP extension in gcc too), so I guess their customers (most probably in the mainframe business) have a need for it.
Or just don't use NUL-terminated strings, seriously: zero-terminated arrays are very uncommon except when it comes to char arrays, in which case they're everywhere.
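A minimal sketch of the usual alternative, a pointer-plus-length view (str_view and SV are illustrative names, not standard C):

    #include <stddef.h>

    typedef struct { const char *data; size_t len; } str_view;

    // Build a view from a string literal; the length travels explicitly,
    // so embedded zero bytes are fine and strlen() is never needed.
    #define SV(lit) ((str_view){ (lit), sizeof(lit) - 1 })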