
A perspective on friendly C - ingve
http://www.philipreames.com/Blog/2016/01/03/a-perspective-on-friendly-c/
======
pcwalton
Fantastic article. This nails down in a very precise way what I've always said
whenever the "compilers should just stop doing optimizations that rely on
undefined behavior" meme comes up here. You depend on undefined behavior for
performance; you don't realize you're depending on it because the
optimizations "just work" most of the time; the reason they work is
undefined behavior, but it's easy to overlook that and blame compiler
developers for being language lawyers when in fact GCC and LLVM are doing what
you want them to do.

~~~
Analemma_
Fair enough, but this isn't exactly an argument in C's favor. If you're saying
"But we _need_ this unsafe behavior for performance!", that only adds to the
urgency to abandon C as soon as possible in favor of things like Rust, which
should eventually have all of the speed and none of the UB insanity.

~~~
valarauca1
>which should eventually have all the speed and none of the UB insanity

Judging by the benchmark game [1]
[https://benchmarksgame.alioth.debian.org/u64q/rust.html](https://benchmarksgame.alioth.debian.org/u64q/rust.html)
Rust has almost caught up with (or eclipsed) C. The biggest issue for the
benchmarks it's behind on is stabilized SIMD support in Rust (nightly has it,
but it's not finalized at an RFC level).

[1] Look, I'll save you the comment. Yes, the benchmarks game isn't always
indicative of real-world performance. It's a bunch of small micro-benchmarks
that basically demonstrate how fast regexes/hashing/math are in each
language. While that's not everything you do when programming, it's a decent
chunk.

~~~
bluejekyll
While I completely agree with migrating towards Rust, these games are a little
misleading. I can't find the source, but when I was reading some of the
authors' comments in the past, it sounded like unsafe code was used.

Does anyone have a link to the source used in these tests?

~~~
igouy
_Hiding in plain sight._

The program names on the website are links to the source code, the task names
are links to measurements for the task, _etc etc_

[http://benchmarksgame.alioth.debian.org/u64q/measurements.php?lang=rust](http://benchmarksgame.alioth.debian.org/u64q/measurements.php?lang=rust)

~~~
bluejekyll
Wow. I feel like an idiot.

~~~
igouy
And that is not how looking at the website should make you feel!

However, underline for all link text makes a distracting mess. Hmm.

------
kazinator
Start by removing _gratuitous_ undefined behavior from the dialect: undefined
behavior which has no reason to exist, because it arises from the lack of
requirements regarding some translation aspect of the language, and not from
anything related to a run-time difficulty.

For instance, the unspecified order in which function arguments are evaluated,
or the operands of most operators, or initializers in a declarator and such.
In combination with certain uses of side effects, this creates undefined
behavior (and even when not undefined, it can create surprises and bugs). This
is purely a C semantics problem, not connected to the way the machine works,
or the large scale way in which programs and their data fit together.

And let us note that _even if evaluation order is pinned down by the language
semantics, the compiler can still re-arrange it anyway, when it can recognize
opportunities in which re-ordering evaluation makes no difference (the result,
including all effects, is still as if it had been evaluated left to right)._

Contrast that with, say, bounds checking arrays, where the neglect to do so is
justified by the fact that pointers don't carry the size information, since
they are mapped to machine addresses, which is a translation decision that has
deep, non-localized semantic implications. Wider pointers don't fit into a
register, which affects how they are passed between functions. They take up
more space in every data structure which has pointers, and so on. There is a
tangible, external difference, and performance impact.

A reasonable dialect of C should sacrifice safety only when there is a
tangible performance issue.

Undefined behaviors at translation time (particularly in the preprocessor!)
should be completely banished.

Here is one: there is no reason why a preprocessor token-pasting operation (A
## B) which results in an invalid token should be undefined behavior. The
execution of the preprocessor is quasi cost-free. Whether or not two elements
pasted together form a valid token can be checked and diagnosed. The few
preprocessor cycles that requires are worth it!

The fact that it's not practical to banish or diagnose _all_ undefined
behaviors (nor desirable or reasonable to do so, since UB is an area for
useful extensions) shouldn't be used as an excuse not to eliminate some of the
silly ones.

~~~
kevin_thibedeau
The undefined behavior is there so that the language can be used across
heterogeneous systems. 9-bit chars, 36-bit words, 1's complement arithmetic,
decimal registers, NULL somewhere other than address 0? You can have all that
with standard compliant C.

~~~
marvy
Fine, so keep those aspects. There still remains plenty of relatively
pointless undefined behavior. For instance, is the following well defined?

    int f(int* p, int* q) { return ++*p + ++*q; }

It depends! It depends on whether p == q. If so, it is not defined, and on
high optimization levels, compilers will tend to give the wrong answer, but on
low levels not so much. I bet a lot of people (but probably not all) would be
willing to take the speed hit so that this becomes well defined, as it is in
most other languages. (Note: I haven't fact checked anything I wrote here; no
doubt some language lawyer will correct me if I'm wrong.)

~~~
kazinator
We don't even have to take the speed hit. Why? Because here is how the
unspecified evaluation order benefits the above scenario: the compiler doesn't
have to reason about whether p == q.

But since C99 we have had a tool by which we can tell the compiler, "trust me,
p != q". Namely:

    
    int f(int * restrict p, int * restrict q) { ... }
    

So with this, even if we have strict left to right order, it can still reorder
the code the same as before.

------
scott_s
Regehr recently wrote a response to his own proposal, called "The Problem with
Friendly C",
[http://blog.regehr.org/archives/1287](http://blog.regehr.org/archives/1287).
HN discussion:
[https://news.ycombinator.com/item?id=10786512](https://news.ycombinator.com/item?id=10786512)

~~~
StefanKarpinski
That post is linked to in the first sentence of this post.

~~~
scott_s
Somehow I missed the reference to the follow-up, but I mostly wanted to link
this HN discussion to the one from two weeks ago.

------
twic
This seems like a crazy conversation to be having in 2016. We're seriously
considering what semantics to attach to illegal operations? How about doing
what every other language of the last couple of decades does and just not
allow them?

> I doubt anyone is willing to pay 2x performance for their C code to be more
> friendly. If they were, why are they writing in C?

Beats me why anyone is writing in C full stop.

~~~
pcwalton
> How about doing what every other language of the last couple of decades does
> and just not allow them?

I fully agree. But this is not without tradeoffs: eliminating undefined
behavior pretty much means (a) the performance and runtime overhead of a GC
(most languages); (b) forbidding malloc (verified "mission-critical" variants
of Ada, C, etc.); (c) requiring programmers to learn a lot of new concepts
(Rust†). I do think that there is rapidly becoming little reason to use C
except for throwaway programs, and as an industry we need to be more open to
(c) if we are ever going to move beyond making the same memory management
mistakes we've been continuously making since the 1960s. But I also understand
the reasons why programmers continue to choose C and C++.

† It's interesting to me that the biggest reason C++ programmers give for
bouncing off Rust is fundamentally "I want my undefined behavior back", though
very few actually word it like that.

~~~
rbehrends
This is really overkill. When I'm writing C, it's generally for one of two
reasons:

1. Interoperability code with some library or the OS.

2. C as the closest thing to a portable assembler there is (e.g. to implement
something like the Ruby interpreter).

What I need for this isn't perfect safety; it's the ability to reason with
some confidence about the code I'm writing (I may want to be able to rely on
code review by merely mortal programmers or create my own tools to enforce
this). If I wanted a safe, expressive language, I wouldn't use C in the first
place.

But right now even something as simple as a malloc implementation is riddled
with traps even for reasonably skilled programmers. The Linux kernel uses
-fno-strict-overflow and the FreeBSD kernel uses -fwrapv because the price of
the undefined behavior enabled by -fstrict-overflow was too high.

~~~
pcwalton
> 1. Interoperability code with some library or the OS.

If by "interoperability" you mean "binding a more modern language to a C
library", no argument there. Binding to C is a legitimate reason to write C.

> 2. C as the closest thing to a portable assembler there is (e.g. to
> implement something like the Ruby interpreter).

My question is: why do you want a portable assembler? Why not write Ruby in,
for example, Go? It could certainly be done; look at JRuby, for instance.

The vast majority of the time, the answer to this question is "performance".
Which brings us back to the point of the article. Without compiler
optimizations enabled by undefined behavior, you can easily lose 2x
performance or more. A 2x performance loss for MRI is unacceptable.

Malloc implementations embody this even more. In fact, it's kind of hard to
think of any piece of code that's more performance critical than malloc. Many
large applications (apps, games) spend 10%-20% of their time in the malloc and
free functions. Allocator performance is so important that Facebook and Google
have invested a huge number of man-hours into these two routines (jemalloc and
tcmalloc respectively). It all comes down to performance again: under these
extreme constraints, malloc simply can't afford a 2x performance loss. A
Friendly C that produced slower code would have little chance of being adopted
by allocator writers.

~~~
rbehrends
> My question is: why do you want a portable assembler? Why not write Ruby in,
> for example, Go? It could certainly be done; look at JRuby, for instance.

I would in principle like to do this in another language. Go isn't that,
because Go's runtime makes some very specific assumptions (about things like
stack layout and how it interoperates with syscalls). I may not want to be
weighed down by these assumptions. For similar reasons, you may want to eschew
JRuby, as it locks you into the JVM ecosystem (which is great if that's what
you need, not so great if you don't want to suffer from the poor interop with
non-JVM libraries).

> The vast majority of the time, the answer to this question is "performance".

The answer for me is generally that C is ecosystem-agnostic (what I'd expect
from a portable assembly language; practically any other language that isn't
called BCPL makes more ambitious assertions about its environment). Note that
when I talk about interoperability above, I don't just mean interoperability
between one language and C, but also to build bridges between two non-C
languages (which is actually fairly important for some of my current work).
Performance is a nice additional benefit.

> A 2x performance loss for MRI is unacceptable.

We have a few points to chew through here. First, I didn't say that undefined
behavior is unacceptable. My problem is with undefined behavior that is
difficult to reason about or to investigate (as an extreme case, when an
infinite loop is turned into a no-op).

Second, I don't buy the 2x performance loss. What makes most bytecode
interpreters slow is how C compilers optimize the dispatch loop (poorly), not
the exploitation of undefined behavior or the lack thereof. For example,
-fwrapv has virtually no effect on Ruby's performance (even though clang/gcc
unnecessarily disable strength reduction in some cases where they don't have
to). Ruby would likely benefit a lot more from having its bytecode dispatcher
rewritten in assembler LuaJIT-style (I'm talking about the LuaJIT interpreter,
not compiler) than it would lose from not exploiting undefined behavior on a
large scale.

Finally: code that crashes really fast (or may suddenly start crashing for
inexplicable reasons because the compiler was upgraded) is not going to help
anyone. This is not a hypothetical concern; you'd be surprised how much C code
doesn't survive an encounter with UBSAN.

------
munificent

        int foo(int* p_int, float p_float) {
            int a = *p_int;
            *p_float = 0.0;
            return a - *p_int;
        }
    

Should that be `float* p_float`?

------
helmut_hed
_I doubt anyone is willing to pay 2x performance for their C code to be more
friendly. If they were, why are they writing in C?_

To me this is the fundamental issue with the friendly/boring C proposals. The
whole reason to use C/C++ anymore is for performance, in return for which you
are responsible for certain things - like using only defined behavior.

If you are writing e.g. crypto code, why not use a higher level language that
provides more checks and guarantees, and is generally easier to reason about?
Or segregate the performance-critical "engine" type code into C++ and use
something higher level for everything else?

~~~
rixed
Crypto may be one of the fields where you need tight control over speed of
execution to alleviate timing attacks. The mere presence of a GC can be
exploited.

------
nkurz
_As a simple example, let’s consider trying to establish semantics for stray
(i.e. out of bounds) read and writes. We can start by trying to define what
happens for a stray read. That’s fairly easy, we can simply return an
undefined value._

I don't disagree with this piece, but I think it's missing the thrust of the
"friendly C" argument. The desire (at least I feel it) is simply for the
compiler to "do what I mean", rather than making unexpected optimizations
based on undefined behavior. There is no expectation that the resulting
language will be "safe" as the word is normally used. If one wants safety (and
can accept the performance compromise that this entails) then there are
alternative languages to choose from.

Instead, "friendly C" has much simpler ambitions: replace undefined behavior
with compiler specified behavior. The spec could simply guarantee that if you
read an out-of-bounds value, that the compiler will generate assembly that
attempts to read the value at that address. Having the program segfault on the
attempt is perfectly acceptable. The only thing that is not acceptable is for
the compiler to reason that the read will be out of bounds, and thus decide to
omit the error checking code that follows.

~~~
pcwalton
> The spec could simply guarantee that if you read an out-of-bounds value,
> that the compiler will generate assembly that attempts to read the value at
> that address

You didn't understand the example. Your proposal destroys load-load
forwarding, as the article demonstrates.

~~~
nkurz
_You didn't understand the example._

Certainly this may be true, but I've re-read it again with your prompting and
still think I understand it.

 _Your proposal destroys load-load forwarding, as the article demonstrates._

I'm fine with a compiler that optimizes out the write to a value on the stack
on the assumption that it can't be at the same address as a passed pointer,
whether that parameter is an int or a float. I'm fine with a compiler that
creates assembly that simply returns 0 here, with or without 'restrict'.

That you and the author think this example represents a conflict with John's
"friendly C" proposal makes me more certain that you don't mean the same thing
by "friendly C" as I do, or as I think John does. Really, I think we're asking
for something quite different, and much simpler to achieve.

~~~
pcwalton
Are you proposing that compilers emit actual load instructions for every
memory access you write in your code? That is, preventing the compiler from
optimizing out memory loads? I can't read your comment any other way, but I
may have missed something.

In any case, if you require that a compiler emit an actual load instruction
for every memory access then you've destroyed SROA, which is one of the most
important optimizations, especially in C++. Without SROA C++ can easily be
4x-5x slower.

~~~
nkurz
_Are you proposing that compilers emit actual load instructions for every
memory access you write in your code?_

No, if the compiler can reason that there is no aliasing between
pointers, there is no need to emit a load. Reordering operations so a value
remains in registers is desirable, even if an out-of-bounds write would
otherwise have had the side-effect of changing the value.

The problem (in this context) is limited to optimizations that reason that the
value of the read is "undefined" (as opposed to "implementation defined"), and
that the compiler no longer has any obligation to be faithful to the source
code for any actions that follow.

For example (conceptual rather than exact): if I set all the bits in a region
to 1's, and then somehow manage to perform an unaligned load from within this
region, I do not want the compiler to reason that this value is undefined and
that all consequent code can be ignored.

I'm fine with a compile time error, and I'm fine with generating assembly that
attempts the load and uses the result. Getting SIGBUS for the load on some
processors is perfectly acceptable. I'd probably even be OK with skipping the
load and using a constant, although I'd hope this would at least be
accompanied by a warning message.

What I don't want is for the compiler to silently omit the load and all the
rest of the code within that function on the theory that once undefined
behavior has been encountered it no longer has any obligations, not even the
obligation to inform the programmer that it has perversely optimized the
function to nothingness.

(And while most of the lack of clarity is probably my own fault, I do now
realize that I was being thrown off by an unfortunate typo in the original
post: presumably he meant for the second argument to be a 'float *' rather
than a 'float'. Please ignore my comment about 'stack variables' higher in the
thread.)

~~~
pcwalton
> No, if the compiler can reason that that there is no aliasing between
> pointers, there is no need to emit a load.

Without strict aliasing? Then you destroyed a bunch of optimizations that
lower for loops to memset. 4x performance loss if AVX would have been used in
memset.

See: [http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html](http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html)
("Violating Type Rules")

> What I don't want is for the compiler to silently omit the load and all the
> rest of the code within that function on the theory that once undefined
> behavior has been encountered it no longer has any obligations, not even the
> obligation to inform the programmer that it has perversely optimized the
> function to nothingness.

No, you need that optimization too. After aggressive inlining, especially of
templated code like the STL in C++, it's often the case that large swathes of
code are never dynamically reached, and people want that dead code to be
removed. However, the only—or at least the easiest—way the compiler can tell
that this is the case is by observing that, for example, a null pointer would
have to be dereferenced to get to that dead code. Therefore the compiler is
doing you a favor by eliminating that obviously dead code.

Again, all of these optimizations do not stem from compiler authors being
language lawyers for the fun of it. They stem from their customers filing bugs
on real-world missed optimizations.

~~~
nkurz
_Without strict aliasing?_

Generally, I think strict aliasing is a fine thing. It would have been nicer
to have "restrict" be the default and "alias" an optional specifier, but
likely there was no way to do this without breaking too much existing code.
Warnings are often nice, but I don't recall having any issues with compilers
optimizing based on assumptions of lack of aliasing.

 _especially of templated code like the STL in C++_

I'm an opinionated C programmer, but have no opinion of how C++ should handle
things. I'm only considering the impact on C. If the STL requires an
unfriendly C++ compiler, I'm OK with that, but this doesn't affect my thinking
about C compilers.

 _However, the only—or at least the easiest—way the compiler can tell that
this is the case is by observing that, for example, a null pointer would have
to be dereferenced to get to that dead code._

Or, for example, that a signed integer addition would have to overflow. Or a
full-width shift. Or a variety of other things that might make sense in both
the programmer's mental model and on the processor on which the compiled
program will run.

 _Therefore the compiler is doing you a favor by eliminating that obviously
dead code._

Do you really believe that the compiler is doing you a favor by silently
removing a check for signed integer overflow that you explicitly included? If
so, we have found (at least one of) the point(s) of our disagreement!

If on the other hand you feel that removing security related error checking is
an unfortunate negative consequence of a necessary optimization, perhaps we
can find a way to mitigate the consequences?

~~~
nkurz
Here's a timely example of the behavior I'm referring to:
[https://github.com/websnarf/bstrlib/issues/8](https://github.com/websnarf/bstrlib/issues/8)

I'd be happy to concede that this is not good programming practice, but I do
not accept that the compiler is doing us a favor by silently creating this
security problem. I do appreciate that gcc at least offers a warning for some
of these cases.

------
foxhill
i believe the premise for a "friendly C" (whether attainable or not) is that
existing C programs can be compiled with a new compiler which could, in
principle, prevent certain classes of bugs.

designing a new language somewhat hinders this - unless some sort of source to
source translator tool could be produced, perhaps.

in my opinion, C should roughly stay the way it is. we have higher level
languages which provide a lot more safety, and we have the clock cycles
available to absorb the reduced performance that they may incur. more
importantly, compiler implementations for these languages can continue to
reduce the gap between "high level" and "native C".

~~~
pjmlp
The thing is, such languages already existed outside AT&T before C had any
value outside UNIX.

------
exDM69
The "2x slower" is an extremely conservative estimate if all loads and stores
were bounds checked. Given the indirect effects on program optimization, it
might be closer to 10x for practical programs.

------
stcredzero
Let's say someone developed a variant of Go with manual memory management. How
far would this be from a "friendly C"?

EDIT: _" I doubt anyone is willing to pay 2x performance for their C code to
be more friendly. If they were, why are they writing in C?"_

Because it's friendly? Because the cost/benefit in terms of avoided bugs and
productivity is worth it to the company/shop.

~~~
pcwalton
> Let's say someone developed a variant of Go with manual memory management.
> How far would this be from a "friendly C"?

Depends on what the semantics of that hypothetical Go variant are. How would
that Go variant solve the load-load forwarding problem described in the
article?

> Because it's friendly? Because the cost/benefit in terms of avoided bugs and
> productivity is worth it to the company/shop.

I think the point the author was making is that if you are OK with that
performance hit you're probably OK with Java, which banishes undefined
behavior in the C sense entirely.

~~~
stcredzero
I could see someone being ok with 100% overhead, but not with 400%. Last I
checked (which was years ago) hitting 50% native speed in a JIT managed
language was an admirable feat and not guaranteed.

------
dmitrygr
> In the example above, your normal C compiler could return “0” because it
> assumes the intervening write can’t change the value at p_int

not unless you use the restrict keyword...

~~~
pcwalton
Strict aliasing. The article is correct.

~~~
dmitrygr
You're right. You'd have to change int to char to make it valid

