
Depressing and faintly terrifying days for the C standard [pdf] - signa11
http://www.yodaiken.com/2018/05/20/depressing-and-faintly-terrifying-days-for-the-c-standard/
======
copper_think
At Microsoft the compiler team (Visual C++) and the Windows team are joined at
the hip. I'm sure the same was true at Sun. This can lead to good decisions
about undefined behavior that I hope would make Linus smile.

I recently learned of one such good engineering decision (I hope I'm
remembering it correctly). Let's say you have a struct with an int32 and a
byte in it. That's 5 bytes, right? But the platform alignment is a multiple of
4 bytes, so there's 3 bytes of padding (sizeof the struct is 8 bytes). If we
stack-allocate an array of 11 of these and zero-initialize with = { 0 }, what
would you expect to see in memory after initialization?

It turns out the answer _was_ that the first element of the array would have
its 5 bytes zeroed, but the 3 bytes of padding would be left uninitialized.
Then, the remaining 10 elements of the array would be zeroed with a memset
that actually zeroed all 80 remaining bytes. It sounds weird but this is a
legal thing to do from the standard's perspective. All they're obligated to
zero out are the non-padding bytes. This UB was leading to disclosure of
little bits of kernel memory back into user mode because Windows engineers
assumed that = { 0 } was the same as leaving the variable uninitialized and
then memsetting the whole thing to zero. Nope!
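
A minimal sketch of the distinction (struct and function names are mine, not from the Windows code):

```c
#include <stdint.h>
#include <string.h>

/* A 4-byte int32 plus 1 byte; on a typical ABI with 4-byte alignment,
 * sizeof(struct item) == 8, leaving 3 bytes of padding. */
struct item {
    int32_t id;
    uint8_t flag;
};

/* What "= { 0 }" obliges the compiler to do: zero every member.
 * The padding bytes may legally be left holding whatever was on the stack. */
void zero_members(struct item a[11]) {
    for (int i = 0; i < 11; i++) {
        a[i].id = 0;
        a[i].flag = 0;
    }
}

/* What the Windows engineers assumed "= { 0 }" meant: every byte zeroed,
 * padding included. This is the behavior the compiler team made the default. */
void zero_all_bytes(struct item a[11]) {
    memset(a, 0, 11 * sizeof(struct item));
}
```

On a stack that previously held kernel-derived data, the difference between the two is exactly the information leak described above.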

The compiler team fixed this by always zeroing out padding too. Problem
solved. There are some cases where it's not quite as fast. But it's the right
engineering decision by the compiler team for their customers, both internal
and external.

~~~
bigcheesegs
The problem with this kind of approach is over time it removes the ability to
use any other implementation. You are no longer using C, you are using
<Implementation>C.

This becomes a problem when a different implementation adds amazing tools for
finding bugs (the sanitizer suite, for instance), and you can't use them
because your code doesn't build with any other implementation.

~~~
iforgotpassword
Luckily clang does a good job at being GCC compatible, so I don't worry too
much about using GCC extensions. It's quite unlikely one of them will go away
anytime soon, and the two pretty much cover all architectures/platforms that
have ever existed.

~~~
pjmlp
Usually true, unless you want to target embedded, mainframes or some
industrial OSes.

------
notacoward
Like phkahler, I don't want C to grow any more. I've been a C programmer for a
long time, the vast majority of my day-to-day work is still in a C codebase,
and I expect to continue working in C for a while yet. Nonetheless, its time
has passed. It will be around for a while, just like FORTRAN and COBOL are,
but there's no good reason for new code to be written at that poor level of
abstraction. Even for systems software - what I write - there are always
better choices that provide higher-level data and control structures. They
variously use garbage collection, reference counting, ownership rules, or
whatever you call that hot mess C++ has. Writing safe, secure C code is
certainly possible, but it's too much unnecessary work - especially in the
concurrent and/or parallel world that any non-trivial code has to live in
nowadays.

That said, I really wish proponents of other languages would get their stuff
together about creating libraries that can be used from other languages. A
library written in C can be used by anyone else. Many other languages are avid
consumers of this functionality, many advertise it as a key feature, but very
few return the favor by _producing_ reusable code. The industry doesn't need
such Balkanization. If you're one of the very many people who look down their
noses at C and want to get rid of it, do your part.

~~~
blobjectivism
>A library written in C can be used by anyone else.

This is because of how much of the unix clone ecosystem has been built around
C workflows, and this wasn't true on Windows until linux compatibility was
developed on it.

>If you're one of the very many people who look down their noses at C and want
to get rid of it, do your part.

Convincing Linus and sysadmin greybeards to modernize Linux is no small task;
until then we'll all still just be scripting over archaic C APIs.

"science progresses one funeral at a time"

~~~
notacoward
> This is because of how much of the unix clone ecosystem has been built
> around C workflows

Absolutely correct.

> convincing linus and sysadmin greybeards to modernize linux

That's not what I'm suggesting. Anything that's already written in C can and
probably should continue to be so. What I'm suggesting is that people who
prefer to work in other languages should have an easy way to make their work
available beyond their own language community. I'm not talking about
interpreted/scripting languages here. I'm talking about compiled/systems
languages. Stuff that gets linked together, or that should be able to use some
sort of dlopen/FFI back and forth fluidly despite multiple languages being
involved. There's some work to be done there, but everyone seems to prefer
hiding in their own language bunker instead of reaching out to others.

~~~
slededit
The problem is that other languages necessarily place more restrictions on how
their data can be used in order to gain all their nice features. Since you
can't control the caller you lose all those guarantees. Because of that it
will never be easy to interop across higher-level languages. C works well here
because it's low-level enough that it expects few guarantees. Just keep the
stack aligned and balanced and it will mostly be happy.

------
buserror
I also think that the committee is out of touch. C99 was an awesome
improvement to the language, and since then it has gone downhill. We don't
need the extra weird C11 syntax things or the duplication of existing
libraries; we want tools that scope C better, or extensions that have proven
helpful and stable (one such example is gcc's switch(x) { case A ... B: }
range syntax!).
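
For readers who haven't seen it, the extension looks like this (a GCC/Clang extension, not ISO C; the classifier is just an illustrative toy):

```c
/* GNU C case ranges: `case A ... B:` matches any value in the
 * inclusive range [A, B]. Not part of any ISO C standard. */
int classify(int c) {
    switch (c) {
    case '0' ... '9':
        return 1;   /* digit */
    case 'a' ... 'z':
    case 'A' ... 'Z':
        return 2;   /* letter */
    default:
        return 0;
    }
}
```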

I want strict boundary checking, I want an array base type that can't be cast
to a pointer. I want some sort of scoping mechanism (i.e. blocks), I want a
bit of standardisation of memory barriers and such. I want #pragma once FFS --
it has been proven a good idea for 25 years.

Basically there's tons of stuff that could help make the language better --
C99 did that; C99 is a masterpiece, for example, in how you can statically
initialise extremely complex data in a single block, without having to use
code. It's used all over the Linux kernel (amongst other things; for example
my own simavr is heavily based on that feature [0]).
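
The feature in question is C99 designated initializers; a small sketch (the struct and names here are invented, not taken from simavr):

```c
#include <stdint.h>

struct reg  { const char *name; uint16_t addr; };
struct chip {
    const char *model;
    uint32_t    flash_size;
    struct reg  regs[4];
};

/* C99 designated initializers: fields can be named in any order, nested
 * aggregates and array indices included; anything omitted is zeroed.
 * The whole table is built statically, with no runtime setup code. */
static const struct chip megax = {
    .model      = "megax",
    .flash_size = 32 * 1024,
    .regs = {
        [0] = { .name = "PORTA", .addr = 0x22 },
        [2] = { .name = "PORTC", .addr = 0x28 },  /* [1] left zeroed */
    },
};
```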

* Standardise the stupid bit order in bitfield declaration FFS. I've been wanting to use that feature for 30 years and I can't because they 'forgot' to make up their mind!

* Coroutine standardisation would be awesome (stack swap primitives, with boundary checks etc)....

* gcc 'sub functions' (or a derivative) would be awesome if improved to make them safe.

* Reference counted allocator (basically, get libtalloc and roll it in [1])

There are so many things that could be improved, without diverging into weird
stuff nobody _needs_ (complex math anyone??!?!).

* In fact I want SIMD. I don't need these complex types.

[0]:
[https://github.com/buserror/simavr/blob/master/simavr/cores/...](https://github.com/buserror/simavr/blob/master/simavr/cores/sim_megax.h)

[1]:
[https://talloc.samba.org/talloc/doc/html/index.html](https://talloc.samba.org/talloc/doc/html/index.html)

~~~
xtrapolate
Honest question: you keep expecting all of these to be readily available for
you in C (as part of the standard). Why don't you instead just use a
language/ecosystem which already offers all (or most) of these for you today?
(i.e. D/Go/Rust/Nim)

> "* Reference counted allocator (basically, get libtalloc and roll it in
> [1])"

Why and how should that be standardised exactly? Memory allocation is
platform-dependent, hardware-dependent and generally case-specific. malloc()
and free() are the lowest common denominators the standard can assume,
anything beyond that is simply restrictive. If you need a "reference counted
allocator", why not just find/implement one that simply suits your needs?

> "* Coroutine standardisation would be awesome (stack swap primitives, with
> boundary checks etc)...."

Again, what makes you think this can be standardised across the infinite span
of platforms and compilation-targets, where C is often used?

> "* In fact I want SIMD. I don't need these complex types."

I'm not following your point. You're simply asking for a better abstraction
for SIMD. Also, as I'm sure you're well aware, SIMD is not available
everywhere. Wherever available, you have clear instruction-set APIs/ABIs you
need to follow to make it work. What else is missing?

~~~
buserror
I don't see your point. There's tons of stuff in C11, for example, that is not
applicable to the vast majority of places where C is used. Even in C99, basic
stuff such as floating point or malloc is not available on a lot of hardware;
that doesn't stop there being a standard way of using them /when applicable/.

I know there are traps to fall into -- when I see people writing floating
point code on an 8 bit AVR, I cringe, but well, 'it works'.

As far as changing language goes, you just answered your own question by
mentioning 4 of the myriad of them that aren't ported to as many platforms as
C, require runtimes of unknown quality, and also require a body of developers
that... doesn't exist.

I've had a long enough time in the industry to have seen quite a few times a
whole bunch of software done by someone who was following the fancy trendy
language of the day, and required a complete rewrite in... C to be able to
move on from it.

Heck, _I've_ done similarly as well: 20+ years of C++, gradually trying to
scope down the subset of what I was using, only to realize I might just be
better off with plain C -- and magic happened -- stuff still compiles/works
years after it was made... And anyone/everyone can just dive in and use the
codebase.

------
maxlybbert
I found the title a little misleading: there isn’t much about where the
standard is heading or even where it’s been.

I would vote for the title “a rant on undefined behavior in C.”

——

Simple example: the article complains that unsigned integer overflow is
defined in C while signed integer overflow is not. There is very little in the
article about this except for the claim that the performance for incrementing
a signed int should match the performance for incrementing an unsigned int.
The writer refuses to believe otherwise, even though he accepts that undefined
behavior “supposedly” allows the compiler to omit overflow checks.

It’s the “supposedly” that makes this a rant. The article’s sources mention
that Clang does omit overflow checks and that the Clang team believes this
makes loops up to 20% faster (“up to” because the optimization can’t be
applied to all loops, and the performance increase will depend on how tight
the loop is, i.e., how much overhead there is in incrementing and testing the
loop variable in comparison to the loop body).

~~~
loup-vaillant
> _the article complains that unsigned integer overflow is defined in C while
> signed integer overflow is not._

The article complains that signed integer overflow is undefined, while
unsigned integer overflow is not. There's a difference.

> _the Clang team believes this makes loops up to 20% faster_

I'd love to see the benchmarks they conducted. Anyway, undefined behaviour was
the wrong way to solve the problem. The C language should have grown a `for`
loop that's more than a thin veneer of syntax sugar over `while`. Seriously,
how hard would the following be?

    
    
      for (int i = start; end) {
          // body
      }
    

No explicit comparison, evaluate `start` and `end` at the start of the loop,
maybe cast them to the type of i, and increment implicitly. While loops can
handle the weird cases. There you have your 20%.

~~~
maxlybbert
> > the article complains that unsigned integer overflow is defined in C while
> signed integer overflow is not.

> The article complains that signed integer overflow is undefined, while
> unsigned integer overflow is not. There's a difference.

You are correct.

> > the Clang team believes this makes loops up to 20% faster

> I'd love to see the benchmarks they conducted.

It wouldn’t surprise me if they have micro benchmarks. I convinced myself they
were telling the truth based on crude instruction counting: a loop must be
converted into something like:

- loop body (assume it contains at least one machine instruction)

- increment loop variable

- (for unsigned): clear overflow flag

- test loop variable, exit loop if appropriate

- goto beginning of loop

I believe they don’t have to actually check the overflow flag, it’s OK to let
the overflow happen. But they do have to clear the flag to avoid a spurious
error if the flag gets looked at later.

I’m no expert, so it’s possible this oversimplifies things, but removing the
instruction to clear the flag does remove a big chunk of this loop. But it’s a
big chunk _only_ because it’s a tight loop.

For the record, I happily use the foreach loop constructs in C++, D, Java, C#,
Python, Perl, etc. but I originally avoided them until I saw a comment by
Walter Bright that there is no performance penalty in D (the compiler rewrites
the loop appropriately; there may be a penalty in Java because the feature
might be defined in terms of their relatively heavy iterators).

~~~
xorblurb
Microbenchmarks can be very misleading compared to the real impact in real
programs. Still, the gains allowed by UB of signed overflow (when you are
lucky enough that this transformation is actually correct in the context of
what the original programmer had in mind...) are positive and probably
measurable even in real programs; or, if hardly measurable, maybe they at
least permit a few percent of whole-system perf improvement when using SMT
processors. But they are more suited to other programming languages than C,
and actually yes, in C++ (and probably in most languages at this point) it is
better both for code readability (most important!) and performance (nice to
have, but very secondary compared to code readability) to use for-each
constructs rather than maintaining an index yourself.

Technically there is no overflow flag to reset; it is just that some CPU
instruction sets do not support indexing with a 32-bit register when using
64-bit addressing, so you have to insert an extra sign-extend instruction if
you want to support two's-complement signed overflow on 32-bit indexes. So you
typically already have no cost if your indexes are already
size_t/ptrdiff_t, but ptrdiff_t signed overflow is _still_ UB according to the
C standard, which is also a shame, because it allows for far less interesting
"optimizations" at this point (maybe a + w >= a --> true if w is positive, but
that's actually typically dangerous, because that was historically what was
used to check for overflow at source level, and now the compiler is
suppressing all the checks!)
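
The `a + w >= a` hazard is worth spelling out; a sketch of the broken idiom and one conventional safe replacement (function names are mine):

```c
#include <limits.h>

/* UB-based check: since signed overflow is undefined, the compiler may
 * assume `off + len` cannot wrap when `len` is non-negative, and fold
 * this whole expression to 1 -- silently deleting the check. */
int unsafe_fits(int off, int len) {
    return off + len >= off;
}

/* Safe form: rearrange so no overflow can occur in the check itself. */
int safe_fits(int off, int len) {
    return len >= 0 && off <= INT_MAX - len;
}
```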

So all of these really are only trade-offs, and in the modern age (with e.g. a
security picture that is kind of worrying, etc.) some people are arguing that
it was a terrible idea to use this approach so carelessly. Most experts now
think that no non-trivial codebase exists with no potential UB in it, so it is
not just rants all around; some are even working on a mathematical model of
the LLVM optimizer to make it actually sound (for now it seems that it is not,
even internally -- so unfortunately, with this approach to optimisation, there
is currently no mathematical justification for why the optimizations performed
are actually correct even under the hypothesis of strict conformance to the C
standard, so I let you imagine what happens in practice when almost no program
is actually conforming...)

~~~
maxlybbert
If there are microbenchmarks, I didn’t write them. And I’ll acknowledge that
my instruction-counting approach has limits, especially since I don’t really
know the details of the platform. And my approach also doesn’t account for
pipelining.

But I would expect someone complaining about this optimization to do more than
simply hand wave with a “supposedly.” They could instead say that the
optimization can be applied when the compiler can prove x < x + 1, which it
can show when both the beginning and end of the loop are known at compile
time. In fact, I think it’s better to say “omit the pessimization that applies
when the compiler has to allow for overflow.”

But going no farther than labeling it a “supposed optimization” turns the
complaint into a standard rant.

------
justboxing
Direct link to PDF: [http://www.yodaiken.com/wp-
content/uploads/2018/05/ub-1.pdf](http://www.yodaiken.com/wp-
content/uploads/2018/05/ub-1.pdf)

------
jpfr
You are invited to bring forward concrete proposals for changes to the C
standard in the relevant committee.

[http://www.open-std.org/jtc1/sc22/wg14/](http://www.open-
std.org/jtc1/sc22/wg14/)

Since UB allows the compiler to do anything in those situations, we can reduce
the amount of UB without breaking existing "legal" code, simply by defining
more behavior.

Whether a proposal is reasonable needs to be discussed in the standardization
committee. All other discussions are a nice hobby. But eventually moot.

~~~
vyodaiken
Here is a start.
[https://docs.google.com/document/d/1xouelPcphQ-o7DmdSwz5UcL4...](https://docs.google.com/document/d/1xouelPcphQ-o7DmdSwz5UcL42M6bdA3t93Nm_5Hbomc/edit?usp=sharing)

~~~
jpfr
If desired, I can comment on your document over a private channel. In that
case, please give a short heads-up when ready.

Note that I am not directly affiliated with WG14. So take my comments with a
grain of salt.

~~~
vyodaiken
I think you should be able to make private comments on that document.
Otherwise: victor.yodaiken@gmail.com

------
nn3
My understanding is that a lot of the undefinedness in the original C89
standard came from a desire to support non-two's-complement machines like the
Burroughs. Of course, in hindsight, that was a total mistake. Yes, it would be
a great idea to come up with a modern update to the C standard that is
specified at, let's say, the level of Java.

~~~
Someone
There’s also the fact that registers may be larger than the numbers stored in
them.

The best known example is that of the 80-bit floats of the 8087 FPU, but
that’s relatively rare compared to loading integers into larger registers.

For example, if you compile

    
    
       if( foo + 10 > bar) goto baz
    

to assembly similar to

    
    
       move foo to R1
       move bar to R2
       add 10 to R1
       subtract R2 from R1
       branch-if-greater-than-zero baz
    

and _foo_ and _bar_ are 32-bits, that _add_ can overflow if R1 and R2 also are
32-bits, but never overflows if R1 and R2 are 64-bit. That changes the result
of the comparison if, for example

    
    
       foo = INT_MAX
       bar = INT_MAX
    

The designers of the CPU with 64-bits registers may not want to add variants
of _add_ (and _subtract_ , _mult_ , etc) that work on 32 bits (and 16 bits,
and 8 bits).

They also won’t want to have their C compiler slowing down this kind of code,
only because other (often relatively old and slow) CPUs exist.

~~~
Asooka
But that's trivially solvable (on x86 at least). You do your arithmetic in rax
and then compare the result in eax. You've got the "mask the high 32 bits"
operation for free. At worst you'll need to do one bitwise and before using
the value for cases where the high bits being poisoned matters. Or let me add
annotations to arithmetic expressions that "this expression will definitely
not need expensive cleaning up afterwards, promise", which I will need to use
in maybe a dozen super-hot loops in the whole codebase. FWIW, we compile all
our code with -fwrapv and don't notice any slowdown.

~~~
BeeOnRope
Of course it's trivially solvable in x86-64 because it has a full complement
of 32-bit operations inherited from its x86 lineage. The GP is presumably
talking about _other_ architectures when he mentions designs that might not
want to add a full complement of 32-bit operations.

In any case, the problem doesn't really apply to x86 because all x86 compilers
I'm aware of use the "expected" size for the various types rather than larger-
than-needed-for-speed, exactly because the smaller operations are all
generally available.

------
raverbashing
It's really annoying

It seems the committee is more interested in adding optimization gotchas (that
nobody cares about for the most part) while making it impossible to write
correct code without relying on convoluted code (which denies the benefit of
"extra optimization") and "UBing everything"

C is broken, let's face it

\- Builtin strings are a joke. stdlib functions are even worse.

\- It was not developed with modern systems in mind.

\- DIY Memory management: malloc/free is like giving a kid a chainsaw to play
with. Sure, every C programmer had to work out with this crap. But yeah,
please complain how my static void * should really be a volatile char * .
Idiot. (of course it's not wrong to complain, but that's like complaining
about a faulty blinker in a car without brakes). And of course the compiler is
going to ignore the 'volatile' part of the pointer because f. you and item 7c
of paragraph 3 of the spec lets us do that, even though it is blatantly stupid
to do so.

Rust is a step in the right direction.

Making string and (memory) slices a _fundamental_ part of the language helps.
You can have null-terminated memory pointers for interoperability, but having
it as a fundamental construct eliminates several problems. Also
greenlets/threads/multiprocesses.

~~~
pjmlp
C was broken from the beginning, if you go search for what compiler
researchers from the Algol school of languages had to say about it during the
early 80's.

"Oh, it was quite a while ago. I kind of stopped when C came out. That was a
big blow. We were making so much good progress on optimizations and
transformations. We were getting rid of just one nice problem after another.
When C came out, at one of the SIGPLAN compiler conferences, there was a
debate between Steve Johnson from Bell Labs, who was supporting C, and one of
our people, Bill Harrison, who was working on a project that I had at that
time supporting automatic optimization...The nubbin of the debate was Steve's
defense of not having to build optimizers anymore because the programmer would
take care of it. That it was really a programmer's issue....

Seibel: Do you think C is a reasonable language if they had restricted its use
to operating-system kernels?

Allen: Oh, yeah. That would have been fine. And, in fact, you need to have
something like that, something where experts can really fine-tune without big
bottlenecks because those are key problems to solve. By 1960, we had a long
list of amazing languages: Lisp, APL, Fortran, COBOL, Algol 60. These are
higher-level than C. We have seriously regressed, since C developed. C has
destroyed our ability to advance the state of the art in automatic
optimization, automatic parallelization, automatic mapping of a high-level
language to the machine. This is one of the reasons compilers are ...
basically not taught much anymore in the colleges and universities."

\-- Fran Allen interview, Excerpted from: Peter Seibel. Coders at Work:
Reflections on the Craft of Programming

However, C and UNIX are symbiotic, so there is no way we can get rid of C,
while keeping POSIX based OSes around us.

So, it will stay around for many decades until quantum computers or something
else eventually takes off.

------
antirez
I have the feeling that only a language fork from the GCC/clang teams could
have any chance of success. It looks like the C standard committee is not
interested in changing its point of view.

~~~
pcwalton
GCC and Clang have already essentially forked the C language. Perhaps the most
important C project—the Linux kernel—is written in the GNU dialect of C.

~~~
pjmlp
Although Google has managed to compile it with clang, as they removed gcc from
Android.

~~~
pcwalton
Right, that's the point—a useful C compiler nowadays needs to be a GNU C
compiler, not a standard C compiler.

~~~
pjmlp
To reinforce your point that is also how Microsoft decided to go on Azure
Sphere.

[https://www.mediatek.com/products/azureSphere/mt3620](https://www.mediatek.com/products/azureSphere/mt3620)

------
phkahler
I want C to not grow any more. The problem with having groups to oversee
things is that they usually feel the need to change or grow the thing they're
supposed to be watching. Sometimes this is good, but often it's not. C is the
foundation of a lot of stuff, and it'd be better to rewrite that stuff in a
better language than to try and turn C into something it's not - that's what
C++ was for.

~~~
andrewmcwatters
Well, you're in luck. A lot of people don't care to use anything beyond C89.

~~~
iforgotpassword
Oh _please_. I've been using C99 for almost 5 years now.

~~~
jschwartzi
I would never go back. And the amount of campaigning I did at my last job to
use C++11 was because all the new language features to allow compile-time
checking helped us not do dumb things at run-time.

Except for the stupid TI compiler, which is still stuck on C++03.

~~~
pjmlp
Or C++98, depending on the chip, if some of their docs are still up to date.
:\

------
armitron
I can't be the only one that 99.9% of the time doesn't care _one iota_ about
these mythical optimizations that compilers can introduce by exploiting
undefined behavior. I just want to write straightforward, predictable code
that tries its best to be safe but for one reason or another have to stick
with C.

Is there a guide/reference on how to disable these optimizations in modern
compilers? A list of GCC/Clang arguments that disable as much of this as
possible would be greatly appreciated. I've seen a lot of posts and articles
discussing C undefined behavior but almost nothing describing how to counter
it.

~~~
pcwalton
Sure. Compile at -O0.

A huge number of seemingly-trivial optimizations depend on assuming that
undefined behavior will never happen. The number of optimizations that don't
depend on UB _in any way_ is quite small. For example, if you want to get
pedantic about it, even automatic promotion of local variables to registers is
exploiting undefined behavior—who's to say you didn't have a pointer that just
happened to point to one of them?
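
That said, if the goal is less UB exploitation rather than no optimization at all, there is a middle ground: flags that give common UB a defined meaning. A sketch (coverage differs slightly between GCC and Clang and across versions, so check your compiler's docs):

```shell
# Define away the most commonly exploited UB instead of letting the
# optimizer assume it never happens:
#   -fwrapv                          signed integer overflow wraps (two's complement)
#   -fno-strict-aliasing             disable type-based alias analysis
#   -fno-delete-null-pointer-checks  keep null checks the optimizer deems redundant
# Then catch much of what remains at runtime, in debug builds:
#   -fsanitize=undefined
cc -O2 -fwrapv -fno-strict-aliasing -fno-delete-null-pointer-checks -o prog prog.c
```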

~~~
esrauch
Most languages have no concept of undefined behavior and they still have
compilers and runtimes that can do a huge amount of optimizations.

~~~
xenadu02
Because they define-away the scenarios where UB is an issue for most code.
Sometimes they do this by having a GC, but Rust and Swift prove that isn't
necessary.

Such languages typically do provide some form of "no promises, here there be
dragons" in the form of unsafe blocks or functions. Restricting this region of
unsafety is a huge benefit for programmers and compilers. Programmers because
relatively few pieces of code need to be carefully reviewed. Compilers because
most of the program contains little or no UB.

As a typical example, in C you can alias and type pun. Therefore to avoid all
UB the compiler would need to be extremely careful in any function containing
a pointer. You could return a reference to a local through a separate opaque
function call, or receive multiple pointers all aliasing the same memory. To
completely avoid all UB means inserting checks after every potentially mutable
machine instruction, or emitting duplicate function bodies that take different
paths when the pointers alias; that's assuming the compiler even has enough
type information to answer the question!
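
To make the aliasing point concrete, a sketch of the classic type pun in its UB form and its well-defined form (function names are mine):

```c
#include <stdint.h>
#include <string.h>

/* UB under strict aliasing: reading a float object through a uint32_t
 * lvalue violates the effective-type rules. */
uint32_t pun_ub(float f) {
    return *(uint32_t *)&f;
}

/* Well-defined: memcpy between same-sized objects. Optimizing compilers
 * typically lower this to the same single move instruction. */
uint32_t pun_ok(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```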

With typical C#, Rust, Swift, etc you just can't cause those kinds of problems
without deliberate subterfuge and use of Unsafe types or blocks.

~~~
eddyb
Note that `unsafe` code blocks (or having some unsafe primitives)
fundamentally result in some kind of UB in the language as a whole, and you
_totally_ can create tons of problems from them: the rest of the language has
invariants it can't _itself_ violate, but the unsafe code can (potentially
much more easily than in C, if there are more invariants). That is the "UB",
and those violable invariants are what the compiler optimizes based on.

Lest we forget, _any_ language with a C FFI capability must have some notion
of UB, because the FFI _effectively includes_ the C code in its own semantics
(unless fully sandboxed, which may be too expensive to be done).

~~~
pjmlp
True, but they are quite easy to track down.

In systems like Unisys ClearPath, you can even configure the system such that
only admins can execute applications with unsafe blocks.

In C any line of code, if care is not taken to use the correct compiler flags,
can be a possible source of UB.

~~~
umanwizard
> In C any line of code, if care is not taken to use the correct compiler
> flags, can be a possible source of UB.

What do you mean?

~~~
pjmlp
You need to turn on all warnings as errors, static analysers and pedantic
modes, depending on how each compiler exposes them.

C11 has around 200 documented cases of UB, and each compiler might have
additional ones; are you sure you can know all of them by heart while looking
at a random line of C code?

------
Paul-ish
How much of the UB in C is carried over to C++? Does the C++ standards
committee carry a lot of the UB over into C++, or try to fix the issue? If C++
doesn't have the same UB, maybe the solution is to compile C code with a C++
compiler.

~~~
jcranmer
The intent of C++ is that the subset that is valid C code ends up being
largely semantically identical in C++ and C. There are some differences,
particularly related to the type system (for example, in C, a ternary
expression is an rvalue, where it can be an lvalue expression in C++). Most,
perhaps all, of the undefined behavior is retained C++.

There is potentially a case where C eliminated undefined behavior that C++
retained: the "union trick" for getting access to the bits of a float was made
legal in C99, but the C++ language (though modified to support unrestricted
unions) suggests that it is undefined behavior in C++.
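
The "union trick" in question, for the record (legal in C99 and later; formally UB when the same code is compiled as C++):

```c
#include <stdint.h>

/* Write one union member, read another: C99 (via a footnote added in TC3)
 * defines this as reinterpreting the object representation. C++ does not. */
union pun {
    float    f;
    uint32_t u;
};

uint32_t float_bits(float f) {
    union pun p;
    p.f = f;
    return p.u;   /* fine in C; type punning through unions is UB in C++ */
}
```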

------
cryptonector
Strict aliasing needs to go out the window. That's for starters.

~~~
burfog
That may be. It sure is broken. What they did:

Things of similar base type (so ignoring "signed", "const", etc.) may alias.
Everything may alias with "char".

What would make more sense:

Any unsigned integer type may alias with any type of the same size; this is
not transitive (so a floating-point type may not alias with a pointer type,
but there likely are unsigned integer types that may alias with both). Structs
may alias other structs from the beginning up until the point at which the
types of their contents diverge.

Even with that better default, you'd still want easy ways to override it in
both directions.

------
jancsika
Naive question:

Are there any compiler optimizations that -Ox makes possible which couldn't
have been written originally as readable C code?

~~~
BeeOnRope
For sure.

Consider idiom recognition, where a compiler takes a series of operations that
implement some operation that isn't expressible as a language primitive and
turns it into a single machine instruction that performs that operation.
Rotate built up from a couple of shifts for example.
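
A concrete instance: the portable rotate idiom below is recognized by GCC and Clang at -O1 and up and compiled to a single rotate instruction (e.g. `rol` on x86), an operation no C expression names directly:

```c
#include <stdint.h>

/* Standard rotate-left idiom. Masking the count with 31 keeps the shift
 * amounts in range, avoiding the UB of shifting a 32-bit value by 32. */
uint32_t rotl32(uint32_t x, unsigned n) {
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}
```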

Consider auto-vectorization for which the language has no direct support.

Consider inlining and value-propagation and all the simplification can that
occur when you combine them.

Basically compile any non-trivial C program without optimization (or without
some interesting optimization option) and with optimization, and look at the
assembly. In many cases you wouldn't be able to write C code at all to
reproduce the optimized version with the not-optimizing compiler. In other
cases you could, but at the cost of code duplication or other things that
would reduce the quality of your code.

------
esaym
I didn't know the C std was still being modified/enhanced. Interesting. Does
sound scary.

~~~
andrewmcwatters
It hasn't been updated in any meaningful sense beyond C89, though.

~~~
pjmlp
C'mon it even has some kind of primitive generics support nowadays, which
other modern languages keep ignoring.

~~~
platinumrad
C11 generic selection is basically not worth using and almost worse than not
having "generics" at all. It exists so tgmath.h can be written without
implementation magic but tgmath.h is kind of a garbage fire itself.
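
For reference, generic selection dispatches on the static type of an expression at compile time; this toy sketch (names are mine) shows the mechanism a tgmath.h-style macro is built from:

```c
/* C11 _Generic chooses one association based on the type of the
 * controlling expression; the selection happens entirely at compile time. */
enum kind { KIND_INT, KIND_DOUBLE, KIND_CHARP, KIND_OTHER };

#define kind_of(x) _Generic((x), \
    int:     KIND_INT,           \
    double:  KIND_DOUBLE,        \
    char *:  KIND_CHARP,         \
    default: KIND_OTHER)
```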

------
cryptonector
Faintly?!

------
mabynogy
We need a -Osafe flag.

------
backpropaganda
[https://github.com/ziglang/zig](https://github.com/ziglang/zig) and Jonathan
Blow's JAI language seem to be contenders for C replacement.

~~~
bokglobule
Arguably the Rust language is a replacement candidate for 'C' (and C++).

~~~
1bent
I'm studying Rust, though I wouldn't claim I know it yet. I believe Rust is an
effort to create a language in which you can write code that's as fast and
portable as C, and yet for which the compiler can help you coordinate a
"single owner permitted to write" strategy for thread safety -- at the expense
of increased bother and fuss coaxing your program into that pattern. I
wouldn't call it a C replacement; while I'm glad to have Rust available, I
prefer concurrency (when I need it at all) via processes, rather than threads,
with no shared memory.

~~~
kbenson
> while I'm glad to have Rust available, I prefer concurrency (when I need it
> at all) via processes, rather than threads, with no shared memory.

Are you implying that Rust can't do multi-processing through processes?

~~~
1bent
I don't know yet.

I've heard about at least two languages that adore threads, and have issues
with multi-processing. Perl6 has threads in its runtime helping its GC --- and
calling fork kills its programs. I don't know yet if rust has threads
_required_ in a fashion that makes it crash if you try to fork().

perl6 cannot fork; and golang behaves badly if you fork in a program that uses
goroutines. Ends up unhappy, since shared library manipulation is kinda
dependent on the process loader. Oops.

I'm tempted to hope that the Rust lang devs have kept use of threads out of
its required runtime. To answer your question, I didn't mean to imply that
Rust cannot call fork(2), and I hope that isn't true. I meant to say that I
wouldn't call Rust a C replacement; it's an effort to solve a problem (threads
with shared heap or stack) that C flat-out doesn't attempt to consider. I love
C, and feel obliged to learn Rust.

~~~
nicoburns
Rust has no more runtime than C. Threads are implemented in the standard
library, but they'll only be used if you explicitly create one (or a library
you are using does). Fork works fine :)

Rust has a very different design philosophy to C, and in that respect you are
right that it isn't a complete replacement. That said, Rust is good for far
more than just multithreading. It solves memory safety as well as thread
safety.

~~~
steveklabnik
> Fork works fine.

It _can_ work but it’s quite unsafe. The standard library doesn’t make any
guarantees about fork safety. The language itself doesn’t understand what fork
is, and so can’t guarantee anything.

------
pm24601
Wow... I remember a simple, elegant language. The C I knew had close to a
1-to-1 mapping to assembly. The compiler didn't overthink things.

How can you muck this up!

I thought it was C++'s job to turn something simple/elegant into an
indecipherable mess!

