
What Every C Programmer Should Know About Undefined Behavior (2011) - arunc
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html?m=1
======
bhaak
Some of the "performance penalties" look like FUD to me. At least to the
extent implied in the article.

For example "use of an uninitialized variable". Given that the compiler already
has to analyze the program flow to determine if the variable could be used
uninitialized in order to output that warning, the optimization step of not
initializing those variables that are written to before being used doesn't seem
so far away. This happens at compile time.

But, there's also the fact that a modern OS gives you heap memory zeroed
mostly for free. A call to calloc is highly optimized nowadays. It needs to be
for security reasons, as a program must never receive memory with data left
from a different process.

Likewise the signed integer overflow check. Modern CPUs are so superscalar
internally that an overflow check (which doesn't get flagged most of the time
anyway) doesn't add anything to the execution speed.

And finally we are able to use that without resorting to assembler. GCC 5 has
added support for built-in functions for arithmetic with overflow checking.
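
For illustration, a minimal sketch of what using one of those built-ins might
look like. `checked_add` is a made-up helper name; `__builtin_add_overflow` is
a GCC 5+/Clang extension, not standard C, so this assumes one of those
compilers.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: adds a and b with an explicit overflow check.
 * Returns true and stores the sum in *out when it fits in an int.
 * Returns false and leaves *out untouched when the sum would overflow. */
bool checked_add(int a, int b, int *out)
{
    int sum;
    /* The built-in returns true if the mathematically exact result
     * does not fit in the type of sum. */
    if (__builtin_add_overflow(a, b, &sum)) {
        return false;
    }
    *out = sum;
    return true;
}
```

The caller decides what to do on failure, which is the part plain `a + b`
gives you no way to express.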

~~~
haberman
> But, there's also the fact that a modern OS gives you heap memory zeroed
> mostly for free. A call to calloc is highly optimized nowadays. It needs to
> be for security reasons, as a program must never receive memory with data
> left from a different process.

Memory often is coming from previously-freed heap space within the process.
There is no security need to zero such memory. Zeroing it wouldn't even help,
because the process is free to read the memory before malloc returns it.

And calloc, no matter how optimized it is, will still be more expensive than
malloc which doesn't have to zero the memory.

> Modern CPUs are so superscalar internally that an overflow check (which
> doesn't get flagged most of the time anyway) doesn't add anything to the
> execution speed.

"Doesn't add anything" is not correct. These checks are not free:

[http://danluu.com/integer-overflow/](http://danluu.com/integer-overflow/)

[https://news.ycombinator.com/item?id=8765714#up_8765979](https://news.ycombinator.com/item?id=8765714#up_8765979)

~~~
barrkel
Zeroed memory should not be particularly expensive on modern machines, because
we have a lot of parallelism available to zero it off working threads. There
might be some constraint on memory bandwidth, but IMO memory latency is a
bigger issue, usually preventing maximizing memory throughput, so there should
be some spare.

The danluu link doesn't really support your argument. Yes, the checks aren't
free, but they aren't very costly either in practice, and much of the cost is
down to C compilers not being optimized for generating range-checking code.

I feel you're also possibly missing a longer term feedback loop issue. When
software starts relying on things like overflow checks and zeroed memory,
hardware manufacturers like Intel start building and optimizing hardware to
make them faster.

The "prefer performance over correctness" bias that's existed in the C world
has been responsible for a lot of damage. We need to get out of it.

~~~
Narishma
Zeroing memory is certainly not free, and in some types of real-time
applications it can have a significant cost.

[http://randomascii.wordpress.com/2014/12/10/hidden-costs-of-...](http://randomascii.wordpress.com/2014/12/10/hidden-costs-of-memory-allocation/)

~~~
ArkyBeagle
But if zeroing memory is necessary for the program to be correct, hang the
cost. I don't know what sort of "real time applications" would require
gambling on the initial state of memory.

If you really can't afford to use dynamic memory, don't. 'C' will cheerfully
allow you to hog as much as you want outside the heap.

~~~
shadowfox
> But if zeroing memory is necessary for the program to be correct, hang the
> cost.

Sure. And your application should bear that cost and not necessarily every
application written in C. Maybe there is a case for using a safer malloc
implementation for your program?

> I don't know what sort of "real time applications" would require gambling on
> the initial state of memory.

This isn't really gambling. If you know you are going to initialize the memory
with sane values immediately (and this is a common practice), it is a waste to
do that initial zeroing.
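
A minimal sketch of that pattern, assuming a hypothetical helper named
`make_filled`: every element is written before anything reads it, so having
the allocator zero the block first would be redundant work.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n ints and immediately sets each one to
 * value. Returns NULL on allocation failure or if the size would overflow.
 * The caller owns the returned buffer and must free() it. */
int *make_filled(size_t n, int value)
{
    /* Guard the multiplication so the byte count cannot wrap around. */
    if (n > SIZE_MAX / sizeof(int)) {
        return NULL;
    }
    int *buf = malloc(n * sizeof *buf);
    if (buf == NULL) {
        return NULL;
    }
    /* Every slot is initialized here, before any read can happen. */
    for (size_t i = 0; i < n; i++) {
        buf[i] = value;
    }
    return buf;
}
```

Swapping `malloc` for `calloc` here would not change what the caller sees,
only add a pass over the memory.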

~~~
ArkyBeagle
Ah - in that case, I'd agree. I couldn't draw that conclusion from the text
provided. That's just as good as zeroing and, as you say, renders the zeroing
moot.

------
to3m
Obligatory links:
[http://blog.metaobject.com/2014/04/cc-osmartass.html](http://blog.metaobject.com/2014/04/cc-osmartass.html),
[http://robertoconcerto.blogspot.co.uk/2010/10/strict-aliasin...](http://robertoconcerto.blogspot.co.uk/2010/10/strict-aliasing.html)
(I post these every time this subject crops up. TL;DR - stand
up for yourself, and don't let your C compiler treat you like shit.)

~~~
userbinator
See also the "maybe it's not the programmers who are being stupid, it's the
language that is in need of change" proposal at
[http://blog.regehr.org/archives/1180](http://blog.regehr.org/archives/1180)
with discussion
[https://news.ycombinator.com/item?id=8233484](https://news.ycombinator.com/item?id=8233484)

~~~
lmm
There are better alternatives to C if a friendly language is what you want. C
has its niche and I don't see C changing, and I've yet to see a "dialect" be
really successful. If you're willing to switch to "friendly C", why wouldn't
you already be switching to Rust, or D, or even Haskell?

~~~
pascal_cuoq
> If you're willing to switch to "friendly C", why wouldn't you already be
> switching to Rust, or even Haskell?

Because you are already using it. You, the median C developer, are already
writing code assuming that it will be compiled by a friendly C compiler.
Friendly C is one of the most widely used programming languages on the planet
as of 2015. The most widely used SSL/TLS stacks are written in friendly C.

The “friendly C” manifesto is not intended for C developers, an amazing
proportion of whom already write friendly C. The manifesto is aimed at C
compiler developers.

Friendly C just needs a compiler, or perhaps several of them.

~~~
to3m
Yes. To quote one of the links I posted: "The tl;dr version of this is that
the C compiler writers have no idea who their users are, so they keep fucking
up working code [...] All of this only hurts the systems-level programmer (and
I claim is of dubious value to the high-level guys). Which is why I insist
that the compiler writers don't know their own customers. C is a systems-
programming language! Why do you want to screw the systems-programmer to help
the high-level programmer, when the high-level guy has long since moved to C++
and is bumbling around with STL?"

That whole post is worth reading. Though something of a tossed-off ranty blog
post, I think it very accurate.

~~~
pascal_cuoq
> Though something of a tossed-off ranty blog post

Touché.

~~~
to3m
Think of it not as a criticism, nor a suggestion that there is anything wrong
with its author writing it that way. It's simply an admission in advance that
this thing I'm recommending might come across, on first reading, as more
easily dismissed than I think it should be.

Perhaps I phrased it poorly, though. My first go sounded wishy-washy, so I
rejigged it... I could well have over-tightened things.

------
iopq
Anyone know any other websites that talk about surprising things about C? I
saw a really good one about syntax and semantics of C, but I lost it.

~~~
GregBuchholz
You may like:
[http://www.gowrikumar.com/c/index.php](http://www.gowrikumar.com/c/index.php)

...and "Expert C Programming: Deep C Secrets":

[https://www.google.com/search?q=expert+c+programming+deep+c+...](https://www.google.com/search?q=expert+c+programming+deep+c+secrets)

------
spott
I'm actually curious: is there anything in C that would prevent some sort of
annotation being made part of the standard that essentially meant "Do what I
say"?

For example:

    
    
    [[dwis]]
    {
        if ( signed_int + another_signed_int < signed_int ) {
            // do stuff if there is an overflow
        }
    }
    

Or something similar, which wouldn't be optimized away?
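
Not an answer to the annotation question, but for reference, the same test can
be written today without ever overflowing, so there is nothing for the
optimizer to remove. A minimal sketch; `add_would_overflow` is a made-up name.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true when a + b would not fit in an int.
 * The comparison is made against INT_MAX / INT_MIN before any addition,
 * so no signed overflow occurs inside this function. */
bool add_would_overflow(int a, int b)
{
    if (b > 0) {
        /* INT_MAX - b cannot overflow because b is positive. */
        return a > INT_MAX - b;
    }
    /* INT_MIN - b cannot overflow because b is zero or negative. */
    return a < INT_MIN - b;
}
```

GCC and Clang also accept `-fwrapv`, which makes signed overflow wrap for the
whole translation unit, but that is a compiler flag rather than something the
standard provides.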

------
youngthugger
This makes C look like it's full of land mines. The minesweeper language.

~~~
jokoon
It's not the 80s anymore, there are many cases where there are much better
alternatives to C.

But it's still a language you need for certain things, like responsiveness or
real time applications. As Linus Torvalds said, C is more like a comprehensive
assembly generator than anything else.

Having such a language allows you to change every little boring detail to do
exactly what you want. It has its use, and it's surely not intended to be
mainstream as of today. But there are many cases where C comes back as a
constant common denominator.

Sure, it's kinda shitty that there's nothing really better than a language
designed in the 70s, and I guess you could blame language designers for not
making anything better. There are also business/political issues and
preferences for certain things that come into play.

All in all, C just seems to be pretty good when you want to have code that
works well with systems. Maybe there's just inertia as to where computer
development is headed, which might be heavily related to C.

All I wish is to someday be able to learn a more "virtuous" language like
Haskell, Scheme, or Smalltalk.

~~~
pjmlp
The 80's already had Mesa and Modula-2, but then those guys at Berkeley had to
create a few successful startups.

------
zvrba
He forgot one: modifying & using a variable between sequence points (e.g., i =
i++ + i++;). EDIT: not forgot, but only mentioned briefly at the end.
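
Since `i = i++ + i++;` has no defined meaning, any rewrite has to guess at the
intent. A minimal sketch of one possible reading (add the two values the
increments would yield, then advance `i` by two), with a made-up function name
and each modification in its own statement:

```c
/* Hypothetical helper: one guess at what `i++ + i++` was meant to compute.
 * Reads *i once, derives both operand values from it, then applies both
 * increments in a single, separately sequenced statement. */
int sum_two_and_advance(int *i)
{
    int first = *i;       /* value the first i++ would yield  */
    int second = *i + 1;  /* value the second i++ would yield */
    *i += 2;              /* both increments applied          */
    return first + second;
}
```

Whether the result should then be assigned back to `i` is exactly the kind of
question the original expression leaves unanswerable.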

