

C: A Technological Landmine - jonshea
http://blog.expandrive.com/2009/07/31/technological-land-mines/

======
plinkplonk
Ugh, this is close to being a troll post. Yes, C has its weaknesses and domains
appropriate to its use. But sentences like

"Lacking a strong and expressive type system, C not only permits but
encourages its programmers to sacrifice correctness, safety, robustness,
testability, and maintainability in favor of some highly underdeveloped and
ill-measured ideas about “performance”. Much of the infrastructure of the
Internet is built out of this garbage."

and especially words like "garbage" only expose the author as someone who
doesn't know what he is writing about. (OK, I could have used the shorter word
"fool" here, but ...).

The "infrastructure of the internet" (including the underlying operating
systems) _is_ one of the domains in which C shines.

There is good reason that even today, large chunks of "infrastructure" code
are written in C/C++.

"anybody who considers C for high-level application development at this point
in history, is in a grievous state of sin"

With "high level" being conveniently undefined and without any examples, that
statement means next to nothing.

What a terrible, ill thought out article.

~~~
jacquesm
C is advanced assembler. It is an absolutely _great_ language for system level
stuff, especially because everything is explicit, no relying on side effects
or stuff hidden from view.

The thread of execution is extremely easy to follow.

The only thing I would change if we could revisit the past is that I would add
a string primitive to the language with a half decent set of string operators.
That would have made my life a lot easier at some point in the past.

The funny thing is that most languages that people use that criticize 'C' are
usually at the core levels written in C.

There is probably a good case to be made for the claim that Unix would not
exist if it weren't for the C language.

~~~
evgen
> The funny thing is that most languages that people use that criticize 'C'
> are usually at the core levels written in C.

The difference is that the people using these higher level languages only
depend on a single set of maintainers who need to get the primitives right
once instead of every random coder needing to manage buffers, garbage
collecting unused memory, threading, and a host of other landmines on each and
every project.

~~~
jacquesm
I think those 'random coders' would grow up quite rapidly if they had to learn
how to program for real.

It's the high level language cruft that is a prime source of all the endless
layering of glue on top of other glue that we're stuck with.

Less scripting, more binaries.

I cringe every time I install some minor system level package and I have to
include perl, awk, php and python or some other combination of stuff.

~~~
evgen
Good thing those coders wading around what you seem to consider the shallow
end of the coding pool are not writing things like our nameservers, web
servers, operating systems, or anything else important. Oh, that's right, they
are. And they keep fucking up. Repeatedly.

When this keeps happening over and over again it is time to consider the
possibility that the problem isn't the craftsman but the tool.

The advantage of using lots and lots of glue languages is that it forces the
important bits to be loosely coupled, adds flexibility to the development and
deployment process, and leads to systems that are easier to comprehend and
reason about. The only things that should be opaque binaries are system
libraries and VMs/runtimes. Everything else should be "glue".

~~~
jacquesm
Given the number of lines of C code out there, the number of fuckups is a lot
lower than you'd expect. Because C is usually used in gatekeeper situations
(operating systems, compilers, servers, network stacks), when there is a breach
it is serious.

I find it hard to conceive of a posix compliant OS written in a higher level
language, especially because of the lack of deterministic behaviour when
handling interrupts and allocating memory. C is mind numbingly simple at that
level which is exactly why it is used in these situations.

But every language makes it possible to write insecure code; C just has its
own unique challenges.

It isn't that long ago that somebody managed to get an exploitable scenario
out of UTF-8. It took me a long time looking at the code to see how it was
even possible. In a higher level language that sort of thing is more difficult
to achieve, that's for sure.

------
hemancuso
For a long time people built huge buildings with very very thin measures in
place for worker safety. Buildings cost a lot less and went up a lot faster -
but it came at the cost of workers lives.

OSHA's rules make it much more expensive and tedious for American cities to
grow - but the growth isn't coming on the backs of construction workers. It's
a trade off we've decided to make because we value safety and we value not
getting our pants sued off for negligence.

You can write some well designed quick-and-dirty C code that does what you
want, and does it fast. But once in a while you'll make a mistake that you
probably won't notice and might cost you your company.

------
psyklic
Ironically, the article referenced by the author does not blame the C language
for this problem. Instead, it blames the CA for issuing the certificates in
the first place:

"Marlinspike said since there is no legitimate reason for a null character to
be in a domain name, it’s a mystery why Certificate Authorities accept them in
a name."

~~~
olefoo
Yes, and there's no reason for browsers to accept more than one domain name in
a CN field. However a quick look through rfc3280 and an ASN.1 reference make
me think it is a less than trivial task to figure out what would and would not
be a legal termination for a string encoded in the Subject field of a
certificate. But it is perfectly reasonable to expect the CA to check for
that.
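
To make the null-character problem concrete, here is a rough sketch, assuming the certificate name reaches the C code as raw bytes plus an explicit length. The function names and the example hostnames are made up for illustration and are not taken from any real TLS library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical naive check: treats the name as a C string, so the
   comparison silently stops at the first embedded NUL byte. */
static int name_matches_naive(const char *name, const char *expected)
{
    return strcmp(name, expected) == 0;
}

/* Hypothetical length-aware check: the name's full byte length must
   equal the expected length, so an embedded NUL cannot hide a suffix. */
static int name_matches_len(const char *name, size_t name_len,
                            const char *expected)
{
    size_t n = strlen(expected);
    return name_len == n && memcmp(name, expected, n) == 0;
}
```

With a name such as the bytes of "example.com", then a NUL, then ".attacker.test", the naive version reports a match against "example.com" while the length-aware one does not.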

------
TallGuyShort
The reason I like C is that every action is so specific. Yes, that means it's
not suited for "high level" applications, like web apps, and situations in
which development time needs to be cut. But that specificity and control over
every action is exactly why it's good for network and hardware programming. I
haven't seen C used outside of those realms in a long time.

edit: Furthermore, its low-levelness makes it very versatile. It centers
around the universal abstractions used in Unix - the ability to open, read,
write, and close files. That, combined with structs, unions, and its basic
data types, allows you to use it for virtually ANY protocol.

~~~
creachadair
> That, combined with structs, unions, and its basic data types, allows you to
> use it for virtually ANY protocol.

Sadly, when you use C to implement low-level binary wire protocols, you
quickly discover that structs, bit-fields, and unions are nearly useless
because they are incompletely defined. Byte order is undefined. Structure
layout is mostly undefined -- you pick field order, but you can't choose
packing, alignment, or padding rules. The sizes of the integer types vary by
platform and compiler. Bit field layout, packing, and alignment are almost
completely undefined.

What you're left to work with are unsigned characters, pointers, and bitwise
operations. You have to pack and unpack everything manually, or your code
won't port. It's enough to get the job done, but it's like using a wrench to
pound in screws.
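
As a small illustration of that manual packing, here is a sketch for one 32-bit field, assuming a protocol that uses big-endian (network) byte order. The helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value as four bytes, most
   significant byte first, regardless of the host's byte order. */
static void put_u32_be(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)((v >> 24) & 0xFF);
    buf[1] = (unsigned char)((v >> 16) & 0xFF);
    buf[2] = (unsigned char)((v >> 8) & 0xFF);
    buf[3] = (unsigned char)(v & 0xFF);
}

/* Hypothetical helper: rebuild the value from four big-endian bytes. */
static uint32_t get_u32_be(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

Because only shifts and masks on unsigned bytes are involved, the result does not depend on struct layout, padding, or host endianness.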

I could do with a little less specificity of action, myself.

------
sophacles
One thing this guy doesn't mention, that I would think relevant to the
discussion: Every language currently used by more than 4 people has a notion
of FFI via C. This is nice as it allows for the old "profile it and write the
slow bits in C" type programming. I particularly like that style of
programming, because in the end, you only need to do C style intensity for a
small bit of code. Over time, the number of these small, but useful bits
accumulates, and the result is a decent, bottom up style library, without the
pain of having started in C. (It also helps avoid the cruft...).

------
slackerIII
This article in particular crystallized a thought I've had about this site,
and sites like this in general. I would love to see a wiki-editable block
attached to each submission that tries to describe, in as few words as
possible, what information the article contributes.

Think of it as compression, where a basic knowledge of computing is assumed.
More interesting articles would have a lower compression ratio, which might be
a fun thing to filter on. This article might go down to, "C is generally
unsafe, and you probably aren't skilled enough to make it safe, so don't use
it". Or maybe, "I needed to write something for my company blog, so I found a
recent security hole and added some vaguely related platitudes".

------
dryicerx
C is used for low level libraries for its lean and mean performance. It
sacrifices checks and safety features for this, and allows the programmer full
control. Do you see professional race cars with ABS and Automatic
Stabilization? No, you give the Driver FULL and TOTAL control, same with C and
other low level languages. C has only a few data types that are as basic as
you can get. I mean, what do you expect, to use something like STL strings?

If you start having type checking and various other easy-to-code and
child-safety features, you are bloating and giving up performance in the low
level libraries. If this happens, imagine what the performance on the higher
up application level would be.

~~~
krschultz
I'd prefer my libraries be rock solid secure even if I lose some (or even a
lot of) performance for it.

Hardware is always getting cheaper.

Losing data integrity and the trust/confidence of your users is extremely
expensive, and can be fatal for a startup.

Performance is not the most important metric for a lot of applications.

I'd prefer the safe but slow 5 star crash test rated sedan with a good alarm
over the race car that is going to blow up after a few races, in library
terms.

~~~
pyre
> Hardware is always getting cheaper.

This is a poor justification. A few years ago a house was a good investment
because 'housing prices will always be going up.'

~~~
evgen
Please provide even a single deluded fantasy in which the price/performance
ratio for a particular piece of hardware or component in the hardware stack
will not continue to trend in the direction of more bang for the buck.

~~~
tow21
I don't know if this counts as "deluded", but how about: resource exhaustion
of raw materials required in hardware manufacture.

See, for example:
[http://blogs.wsj.com/informedreader/2007/05/25/a-metal-scare...](http://blogs.wsj.com/informedreader/2007/05/25/a-metal-scare-to-rival-the-oil-scare/)
which talks primarily about LCD displays (we're fast running out of
Gallium/Hafnium/Indium), but points out that copper is likely to get
_significantly_ more expensive throughout this century.

That's going to increase the price/performance ratio of practically
_everything_.

~~~
krschultz
Silicon makes up much more of the cost of a computer than copper; it's not even
close. Silicon is >50 cents a gram, copper is less than a cent per gram.

~~~
tow21
That's what the _significantly_ was all about. We're not really going to run
out of silicon any time soon, but we actually might with copper.

At the moment, both those prices are strongly dominated by the cost of
processing, I would imagine. At some point, it could well be the scarcity
you're paying for with copper, though.

Scarcity increases price, and two orders of magnitude is hardly inconceivable.

------
jwhitlark
Use the right tool for the job. C is the right tool for some jobs; if you jam
it into a place where it doesn't belong, you probably don't have a deep enough
understanding of it to use it safely.

There are two groups you find misusing something. Those that really know what
they are doing, have weighed the risks/rewards, and have decided that misusing
the tool to get the job done is worth the associated risk. Then you have
people who don't know what they are doing. They are going to have problems,
but don't blame the tool.

------
zandorg
I wrote my own sprintf handler which checks string length and truncates if
necessary.

unsigned long lsprintf(unsigned long max_length, char *dest, char *fmt, ...)

char buffer[1024];

lsprintf(1024, buffer, format);

Slightly overkill.

~~~
parenthesis
C99 has snprintf() for this.
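
A minimal sketch of the same truncating behaviour built on the standard vsnprintf(). The wrapper name format_checked and its return convention are invented here for illustration.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper: formats into dest, which holds max_length bytes.
   vsnprintf never writes more than max_length bytes and NUL-terminates
   when max_length > 0. Returns 0 if the output fit, 1 if it was
   truncated, and -1 on a formatting error. */
static int format_checked(char *dest, size_t max_length, const char *fmt, ...)
{
    va_list ap;
    int needed;

    va_start(ap, fmt);
    needed = vsnprintf(dest, max_length, fmt, ap);
    va_end(ap);

    if (needed < 0)
        return -1;
    /* needed is the length the full output would have had. */
    return (size_t)needed >= max_length ? 1 : 0;
}
```

The return value of vsnprintf is what makes truncation detectable: it reports the length the full string would have needed, not the number of bytes actually written.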

------
tarkin2
Anyone care to guess at what he means by a strong and expressive type system?

~~~
jonshea
I’m pretty sure he’d point to Haskell as the best example.

~~~
jonshea
Common Lisp, {,S,O?CA}ML, and even C# also make the list.

