
Who Says C is Simple? (2010) - StylifyYourBlog
http://www.eecs.berkeley.edu/~necula/cil/cil016.html
======
copsarebastards
All of the examples here are _really_ horrible code. This is the second
article in a few days on Hacker News to list out a few examples of how hard C
is. And for little reason; there's _absolutely no value_ in being able to
write something like:

        return ({goto L; 0;}) && ({L: 5;});

It probably has a bug, will be hard to debug, and isn't more performant than
writing it in a clearer way. And unfortunately, while the examples here are
probably all contrived, there are plenty of real-life cases where code as bad
as this gets into production systems.

So why are we still writing code like this?

The answer is backward compatibility. Not just of compilers, but of tools and
skillsets: people are unwilling to support multiple versions of C and want
their code to run forever.

Objective-C and C++ do things to add functionality to C, but they don't remove
the functionality of C that allows these kinds of problems.

This points to a need for a new language that avoids these issues. I think
Rust is the answer, but I would like to see more languages try to fill that
gap--competition is healthy.

~~~
sclangdon
> This points to a need for a new language that avoids these issues. I think
> Rust is the answer, but I would like to see more languages try to fill that
> gap--competition is healthy.

Programs written in C and C++ may have issues because the languages assume the
programmer knows what they are doing. This assumption leads to some great
solutions to hard problems because the programmer is essentially free to do
what they want.

Of course, this assumption, as with most others, doesn't always hold true.
This doesn't mean there is a problem with the language. The problem is with
the programmer.

If you're going to write something like "return ({goto L; 0;}) && ({L: 5;});",
no language is going to save you.

C and C++ are still used today, in part, because modern languages try to
restrict the programmer. Rather than assume the programmer knows what they are
doing, they assume the programmer is stupid and needs help to cross the road.
By assuming stupidity, the restrictions modern languages put in place prohibit
certain solutions and as such C and C++ will remain the go-to systems
languages.

We do not need new languages. What we need is programmers who won't abuse the
languages we already have.

~~~
copsarebastards
> Programs written in C and C++ may have issues because the languages assume
> the programmer knows what they are doing. This assumption leads to some
> great solutions to hard problems because the programmer is essentially free
> to do what they want.

There’s an assumption here that it would be impossible to design a language
which would make these solutions available without being as error prone.
Existing languages may be less capable than C, but that’s only because equally
capable languages with less risk haven’t been created (Rust may be a solution,
I'm not sure yet).

What exactly do you think _can’t_ be done in a language that is less error-
prone?

> We do not need new languages. What we need is programmers who won't abuse
> the languages we already have.

You’re part of the problem. It takes incredible hubris to say something like
this, to think that it’s even possible for a human to do this.

Every nontrivial networking program written in C has security holes caused by
memory management issues. If you’re going to claim that these errors are
caused by bad programmers, then every C programmer is a bad programmer,
because every C programmer has written bugs like this. If you’re claiming that
bugs caused by C’s error-prone semantics are programmers abusing the language,
then _using C is equivalent to abusing C_. The very best programmers writing C
write bugs in C that they wouldn’t write in a language like Rust.

A system which depends on humans being perfect is bound to fail. There’s
simply no way you can reasonably debate this fact.

Every other engineering field has redundancy: multiple layers of error
checking that catch mistakes.

Until you see this as a problem then you’re a danger to any mission-critical
product you work on. Not understanding that using C is a risk displays a
shocking level of naiveté for a professional in this field. I’m not saying C
is never a good choice. I write a lot of C myself, but I do so with the
awareness that my code is not being checked adequately and that I have to take
extreme measures to ensure that my code is well-validated.

------
cbd1984
C is simple to compile into nearly-optimal code, or at least it was back in
the 1970s, when computers had single-opcode dispatch or trivial pipelines, no
SIMD hardware, no other parallelism worth mentioning, and it wasn't worth
worrying about cache too much. (Running in the registers was a neater trick.)

That meant it was relatively simple to 'see' the assembly language 'behind' a
given C function or stretch of code; it didn't take much to get inside the
head of a C compiler, so you could be reasonably sure that a simple piece of C
would result in a similarly simple piece of assembly out the other end.

That, of course, was well and good when it was reasonably simple to predict
actual performance from glancing at assembly code, which assumes opcode
performance (as opposed to, say, cache performance) dominates how fast the
code runs.

Now... how many of those things still hold true on desktop and server class
hardware?

~~~
revelation
But nothing that came after C has improved on this at all. In fact, the
languages that dominate now aren't even compiled.

So as it stands, C is still your best bet when you are looking for that
optimal translation. Intel has recently made some effort to augment it in ways
that fully utilize new CPUs' various parallel pipelines and specific
functionality:

[https://ispc.github.io/](https://ispc.github.io/)

~~~
yoklov
ISPC is cool, but what it helps with (writing the SIMD kernel) has never
really been the bottleneck in my experience.

The data still has to be arranged optimally for the hardware for SIMD code to
have any benefit (and once it is, writing the SIMD code itself is
straightforward). You also still need to know the hardware's capabilities to
have any chance of writing good ISPC code (though that's true of C, as well as
any shading language).

That said, using it to target SSE and AVX with the same code is attractive.

------
Moral_
The first two invoke UB and are thus completely illogical.

~~~
kyberias
UB? I don't get the first one.

~~~
EpicEng
Reading an uninitialized variable is UB.

~~~
barrkel
Is it a local? I think the int declaration is for showing the type of x. It's
just a code fragment as is.

~~~
EpicEng
Fair point, we don't have context.

------
kazinator
I stopped reading after the explanation of _return ((1 - sizeof(int)) >>
32);_

Unless size_t is wider than 32 bits, it has undefined behavior. That's why it
returns 0; it could as well be 42, or the program could terminate with or
without a diagnostic message, etc.

------
taeric
Who says simple things always yield simple results?

~~~
cbd1984
Depends on how you define simple:

If something is simple for the compiler-writer, then simple things do yield
simple results.

If something is simple for the programmer, simple things often yield quite
complex results.

For example, in a language that's simple for the compiler-writer, (1/10) times
10 is only very rarely 1. 0 is a common answer, as is some fraction which is
almost, but not completely, unlike 1.

In a language which is simple for the programmer, Heaven, Earth, and minor
deities will be moved to make (1/10) times 10 come out to the obvious, simple
answer.

~~~
taeric
It really only depends on if you define simple as "can only derive simple
results."

And, you do realize that one of the simplest languages for compiler writers,
lisp, doesn't have to move heaven/earth to make that calculation work out how
you want it.

~~~
copsarebastards
> And, you do realize that one of the simplest languages for compiler writers,
> lisp, doesn't have to move heaven/earth to make that calculation work out
> how you want it.

I've written a C compiler and am currently writing a Lisp compiler, and I'm
not sure where you get the idea that Lisp is a simple language for compiler
writers. Lisp's simple representation belies a very complicated runtime, to
the point that the majority of Lisp implementations don't support compilation
at all--they're interpreted only.

~~~
taeric
This seems to get back to the other debate that crops up with "simplest." Just
because I posit that it is one of the simplest languages does not mean I imply
it is, by definition, simple.

~~~
copsarebastards
I'm pretty comfortable with asserting that there's no reasonable definition of
"simple" which would make a mature Lisp compiler simpler than a mature C
compiler.

~~~
taeric
Fair, though I am focusing on the less mature situations. Specifically, a
naive lisp evaluator is much easier than a C compiler.

Are there any mature compilers, for any language, that would qualify as
simple?

~~~
copsarebastards
> Specifically, a naive lisp evaluator is much easier than a C compiler.

That's true, but a naive Lisp evaluator is a) not a compiler, and b) not
correct. A naive implementation of Lisp leaks memory very rapidly, and obvious
memory management schemes fail. Garbage collection was invented for Lisp to
deal with these problems, and even assuming an immature implementation, GC
isn't trivial. The simplest implementation of mark-and-sweep garbage
collection requires a lot of discipline to make sure that objects are
correctly allocated so that in-scope objects are kept live and out-of-scope
objects are discoverable as dead. Even if you're writing your own allocator
for C, that allocator is pretty trivial in comparison.

------
agounaris
who said C is simple? :S

~~~
EpicEng
C _is_ simple, but that doesn't mean it's not powerful, and with power
comes... yada yada yada. There are many edge cases to be sure, but as a
language, its constructs and features are very simple.

------
dang
Can anybody figure out the year on this one? Internet Archive says 2011 but it
seems it might be earlier.

~~~
mmozeiko
This URL
[http://www.eecs.berkeley.edu/~necula/cil/cil002.html#toc1](http://www.eecs.berkeley.edu/~necula/cil/cil002.html#toc1)
mentions cil version 1.3.7. Changelog
[http://www.eecs.berkeley.edu/~necula/cil/changes.html](http://www.eecs.berkeley.edu/~necula/cil/changes.html)
says 1.3.7 was released on 2009-04-24.

~~~
dang
But 2010 may just have been when they stopped updating a version number. It
looks like early 2000s material to me, based on nothing in particular. There's
also this:

 _When I (George) started to write CIL I thought it was going to take two
weeks. Exactly a year has passed since then and I am still fixing bugs in it_

The CIL paper was published in 2002. Actually the whole project looks
interesting—arguably more so than the currently posted page. It should have
its own HN thread sometime.
[https://news.ycombinator.com/item?id=836735](https://news.ycombinator.com/item?id=836735)
was a while ago!

------
iopq
I've been looking for this for months; I wanted to link it to my friend who
said he prefers C to Java for his CS classes.

~~~
tdsamardzhiev
Come on, almost any language I can think of is a better fit for CS classes
than Java.

~~~
iopq
But get this, he said Java was HARD. Compared to C?

