
NASA JPL C Coding Standard [pdf] - m0nastic
http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf
======
DiabloD3
I agree with almost all of it except for one thing: gotos have pretty much one
legitimate use, as C's equiv to finally {} as part of a try {} block, ie, that
specific form of cleanup after error management. NASA implies they have no
legitimate use.
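
For readers unfamiliar with the idiom, here is a minimal sketch of goto used as C's stand-in for finally {}. The function name `read_first_line` and its signature are hypothetical, chosen only for illustration; the point is that every failure path jumps to a single cleanup label.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies the first line of a file (at most cap - 1
 * bytes) into out. Returns 0 on success, -1 on any failure. */
int read_first_line(const char *path, char *out, size_t cap)
{
    int rc = -1;               /* assume failure until every step succeeds */
    FILE *fp = NULL;
    char *buf = NULL;

    assert(path != NULL && out != NULL && cap > 0);

    fp = fopen(path, "r");
    if (fp == NULL) {
        goto cleanup;
    }
    buf = malloc(cap);
    if (buf == NULL) {
        goto cleanup;
    }
    if (fgets(buf, (int)cap, fp) == NULL) {
        goto cleanup;
    }
    strcpy(out, buf);          /* safe: fgets wrote at most cap - 1 bytes + NUL */
    rc = 0;

cleanup:                       /* the "finally" block: reached on every path */
    free(buf);                 /* free(NULL) is a no-op */
    if (fp != NULL) {
        (void)fclose(fp);
    }
    return rc;
}
```

Without the label, each early return would have to repeat whichever subset of the cleanup applies at that point, which is where leaks tend to creep in.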

longjmp banning is also slightly questionable (although I can see why because
it is very easy to do wrong). I use it inside of my code as part of an STM
implementation (so begin_tx() setjmps[1], abort_tx() longjmps; it's faster than
manually unwinding with if(tx error) { return; } spam in deep call stacks.)

Using longjmp for this makes writing code much easier (no needing to error
check every single tx function call), so less chance for bugs to slip in.

1: The only ugly part of that is begin_tx() is a function macro, which I
prefer never to use in code that is executed; I tolerate it in "fancy
template-like generator" setups, though.
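
A rough sketch of the shape being described, not DiabloD3's actual code: `tx_env`, `tx_open`, `run_tx`, and the `step` functions are all hypothetical names, and the sketch is single-threaded. It also shows why begin_tx() ends up as a macro: setjmp has to execute in a stack frame that is still live when longjmp fires, so it cannot be hidden inside an ordinary function that returns.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical single-threaded transaction state. */
static jmp_buf tx_env;
static int tx_open = 0;

/* A macro, so that setjmp runs in the caller's frame.
 * Evaluates to 0 on first entry and nonzero after abort_tx(). */
#define begin_tx() setjmp(tx_env)

static void abort_tx(void)
{
    assert(tx_open);           /* aborting outside a transaction is a bug */
    longjmp(tx_env, 1);
}

/* A deep call chain with no error-return plumbing at any level. */
static void step3(int fail) { if (fail) { abort_tx(); } }
static void step2(int fail) { step3(fail); }
static void step1(int fail) { step2(fail); }

/* Returns 1 if the transaction committed, 0 if it was aborted.
 * Note: automatic variables modified between setjmp and longjmp would
 * need to be volatile; this sketch only touches static storage. */
int run_tx(int fail)
{
    if (begin_tx() == 0) {
        tx_open = 1;
        step1(fail);
        tx_open = 0;
        return 1;              /* committed */
    }
    tx_open = 0;
    return 0;                  /* arrived here via longjmp: aborted */
}
```

The trade-off raised in the replies is visible here too: control leaves step3 without step2 or step1 getting any chance to clean up, which is exactly the kind of hidden control flow the standard is trying to rule out.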

~~~
mcpherrinm
You mention longjmp as useful in an STM implementation.

But you'd never see something like that in flight control software. Simplicity
begets correctness. Correctness begets safety.

When the software is flying a rocket ship, I'm okay with it being 10% longer
but 10% safer.

~~~
DiabloD3
I wouldn't _use threads_ in mission critical software such as that, so that
solves the problem.

------
mvanveen
A few summers ago I was an intern at JPL working on a static analysis suite
for this exact standard.

Writing code checkers for these sorts of rules is a really interesting
exercise and it helped me grow a lot as a programmer! I went from having no
exposure to formal languages, parsing, and grammars to actively playing around
with these concepts to try and help build more reliable software. It was a
humbling, challenging, and incredibly rewarding experience.

Sometimes, a rule is extremely simple to implement. For example, checking a
rule that requires that an assert is raised after every so many lines within a
given scope is just a matter of picking the right sed expression. Other times,
you really need an AST to be able to do anything at all.

A rule like "In compound expressions with multiple sub-expressions the
intended order of evaluation shall be made explicit with parentheses" is
particularly challenging. I spent a few weeks on this rule! I was banging my
head, trying to learn the fundamentals of parsing languages, spending my hours
diving into wikipedia articles and learning lex and yacc. The grad students at
LaRS were always extremely helpful and were always willing to help tutor me
and teach me what I needed to learn (hi mihai and cheng if you're reading!).
After consulting them and scratching our heads for a while, we figured we
might be able to do it with a shift-reduce parser when a shift or reduce
ambiguity is introduced during the course of parsing a source code file. This
proved beyond the scope of what I'd be able to do within an internship, but it
helped me appreciate the nuance and complexity hidden within even seemingly
simple statements about language properties.

Automated analysis of these rules gives you a really good appreciation of the
Chomsky language hierarchy because the goal is always to create the simplest
possible checker you can reliably show is able to accurately cover all the
possible cases. Sometimes that is as simple as a regular language, but the next
rule might require you to have a parser for the language.

For what it's worth, this is only one of the ways the guys at LaRS
(<http://lars-lab.jpl.nasa.gov/>) help improve software reliability on-lab.
Most of the members are world-class experts in formal verification
analysis and try to integrate their knowledge with missions as effectively as
possible. Sometimes, this means riding the dual responsibility of functioning
as a researcher and an embedded flight software engineer, working alongside the
rest of the team.

If anyone's interested in trying out static analysis of C on your own, I
highly recommend checking out Eli Bendersky's awesome C parser for Python
(<http://code.google.com/p/pycparser/>). I found it leaps and bounds better
than the existing closed-source toolsets we had licenses for, like Coverity
Extend. At the time, it had the extremely horrible limitation of only parsing
ANSI 89, but Eli has since improved the parser to have ANSI 99 compliance.
Analyzing C in Python is a dream.

~~~
tzs
> A rule like "In compound expressions with multiple sub-expressions the
> intended order of evaluation shall be made explicit with parentheses" is
> particularly challenging. I spent a few weeks on this rule! I was banging my
> head, trying to learn the fundamentals of parsing languages, spending my
> hours diving into wikipedia articles and learning lex and yacc.

Hmmm....how about this? If the code is parenthesized enough, then the
precedence and associativity of the operators has no effect on the shape of
the parse tree. So, if you take the expression and repeatedly make random
changes to the operators and parse it, and you keep getting the same shape for
the parse tree, it is sufficiently parenthesized.

~~~
radicalbyte
Reverse the order of precedence and then compare the ASTs. If they're the
same, then pass.

Smart.
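
A toy sketch of the idea from this subthread, under heavy assumptions: operands are single lowercase letters, the only operators are + - * /, and "reversing" means swapping the two precedence levels and flipping associativity from left to right. The expression is parsed once under each rule set and the two trees are compared. All names are hypothetical, and the checker itself uses recursion, which is fine for a desktop tool but would not pass the standard it checks.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MAX_NODES 64                 /* hypothetical cap on tree size */

typedef struct Node {
    char sym;                        /* operand letter or operator */
    struct Node *lhs, *rhs;          /* both NULL for an operand */
} Node;

typedef struct {
    const char *p;                   /* read cursor */
    int flip;                        /* 1 = reversed precedence and associativity */
    int error;
    size_t used;
    Node pool[MAX_NODES];            /* fixed pool, no malloc */
} Parser;

static Node *parse_expr(Parser *ps, int min_prec);

/* Precedence of a binary operator, or 0 if c is not one. */
static int prec(char c, int flip)
{
    int p = (c == '+' || c == '-') ? 1 : (c == '*' || c == '/') ? 2 : 0;
    return (p != 0 && flip) ? 3 - p : p;
}

static char peek(Parser *ps)
{
    while (*ps->p == ' ') { ps->p++; }
    return *ps->p;
}

static Node *new_node(Parser *ps, char sym, Node *lhs, Node *rhs)
{
    if (ps->used >= MAX_NODES) { ps->error = 1; return NULL; }
    Node *n = &ps->pool[ps->used++];
    n->sym = sym; n->lhs = lhs; n->rhs = rhs;
    return n;
}

static Node *parse_primary(Parser *ps)
{
    char c = peek(ps);
    if (c == '(') {                  /* parentheses shape the tree but add no node */
        ps->p++;
        Node *inner = parse_expr(ps, 1);
        if (peek(ps) == ')') { ps->p++; } else { ps->error = 1; }
        return inner;
    }
    if (islower((unsigned char)c)) { ps->p++; return new_node(ps, c, NULL, NULL); }
    ps->error = 1;
    return NULL;
}

/* Precedence climbing: left-associative normally, right-associative when flipped. */
static Node *parse_expr(Parser *ps, int min_prec)
{
    Node *lhs = parse_primary(ps);
    for (;;) {
        char op = peek(ps);
        int p = prec(op, ps->flip);
        if (ps->error || p == 0 || p < min_prec) { break; }
        ps->p++;
        Node *rhs = parse_expr(ps, ps->flip ? p : p + 1);
        lhs = new_node(ps, op, lhs, rhs);
    }
    return lhs;
}

static Node *parse(Parser *ps, const char *s, int flip)
{
    ps->p = s; ps->flip = flip; ps->error = 0; ps->used = 0;
    Node *root = parse_expr(ps, 1);
    if (peek(ps) != '\0') { ps->error = 1; }
    return root;
}

static int same_shape(const Node *a, const Node *b)
{
    if (a == NULL || b == NULL) { return a == b; }
    return a->sym == b->sym && same_shape(a->lhs, b->lhs) && same_shape(a->rhs, b->rhs);
}

/* Returns 1 if both rule sets give the same tree, 0 if they differ,
 * and -1 on a syntax error or an expression too large for the pool. */
int sufficiently_parenthesized(const char *expr)
{
    static Parser normal, flipped;   /* static: keeps the pools off the stack */
    assert(expr != NULL);
    Node *a = parse(&normal, expr, 0);
    Node *b = parse(&flipped, expr, 1);
    if (normal.error || flipped.error) { return -1; }
    return same_shape(a, b);
}
```

With these assumptions, `a + b * c` and `a - b - c` are flagged while `(a + b) * c` passes. Real C has many more precedence levels, unary and ternary operators, and casts, so a production checker would need considerably more than this.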

------
ninetax
Oh if only the project that I had been working on followed any of these rules.
Most of the code was generated from Matlab, but some had to be translated by
hand. I'm not sure any of us knew this even existed...

------
afhof
Wait.... no malloc or sbrk? That means all space has to be stack allocated?
That's a pretty serious limitation and would probably make it hard to do
anything really interesting.
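
For context, the usual alternative is static storage rather than the stack. Below is a minimal sketch of one common approach, a fixed-size node pool with a free list; it is an illustration under assumed names (`POOL_SIZE`, `pool_init`, `pool_alloc`, `pool_free`), not something taken from the JPL document.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8                  /* hypothetical worst-case node count */

typedef struct Node {
    int value;
    struct Node *next;
} Node;

static Node pool[POOL_SIZE];         /* all storage is reserved at link time */
static Node *free_list = NULL;
static int pool_ready = 0;

/* Called once during initialization: threads every node onto the free list. */
void pool_init(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        pool[i].next = (i + 1 < POOL_SIZE) ? &pool[i + 1] : NULL;
    }
    free_list = &pool[0];
    pool_ready = 1;
}

/* Returns a node from the pool, or NULL once all POOL_SIZE nodes are in use. */
Node *pool_alloc(int value)
{
    assert(pool_ready);
    Node *n = free_list;
    if (n == NULL) {
        return NULL;                 /* exhaustion is explicit and bounded */
    }
    free_list = n->next;
    n->value = value;
    n->next = NULL;
    return n;
}

/* Returns a node to the pool in constant time; no fragmentation is possible. */
void pool_free(Node *n)
{
    assert(n != NULL);
    n->next = free_list;
    free_list = n;
}
```

The worst-case memory use is known before the program runs, and allocation and release take constant time, which is the property the rule is after. The cost is that the bound has to be chosen up front.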

~~~
mvanveen
You don't want to do anything interesting in a space flight mission. Code
affecting a space mission only has one chance to get it right in many cases.
In general, memory management in these systems is extremely serious business.
Think of all the ways that manual memory management in C has damaged the
reliability of software you have written in the past.

In addition, a lot of the rules are created to make the task of reading source
code easier, both for humans and machines. I remember Dr. Gerard Holzmann once
half-joked in a meeting that he wanted to disallow any declaration of pointers
except at static initialization. I sort of thought he was joking, but then he
assured me that it was a serious consideration. He reminded me of the gravity
of the situation and explained that $2 billion of public funds were on the
line.

Disallowing pointer indirection would make the task of certain automated
analysis techniques much, much simpler to perform. Adding a pointer
indirection can really conflate matters sometimes.

~~~
afhof
But without pointer indirection and dynamic memory allocation, why even use C?
The big idea of C is pointers. Aren't there languages designed for mission
critical and embedded environments (Ada for instance) ?

~~~
mvanveen
This is a much larger discussion, akin to most religious wars in software ;-).
There is a huge argument for what you're saying, but there are other forces at
play. One aspect that plays into it is that C is a really nice layer on top of
assembly, and there are a lot of extremely talented embedded C software
engineers, (I'm assuming) more so than the number of available Ada engineers.
Also, keep in mind this is a pretty conservative domain. What has worked in
the past is trusted much more than what might work better. Up until a decade
or so ago, there was no operating system to speak of and most development used
hardware controllers.

Also, C is not the only horse in town. In fact, I hear that Ada is actually
pretty popular in space-flight. Other missions have successfully leveraged
FORTH even. Using a Lisp read-eval-print loop from many millions of miles away
once helped debug the Remote Agent software flying on Deep Space 1.

One really interesting point of view on this topic is Ron Garret's essay
called "Lisping at JPL," available at
<http://www.flownet.com/gat/jpl-lisp.html>.

~~~
vbtemp
In my experience, I have never come across Ada in the spaceflight domain.

Among the big players in spacecraft flight software, there seems to be a
divergent east-coast/west-coast preference for C and C++, respectively. In my
estimation, this is the reason: there is a wide variety of target hardware and
OSes and the need for FSW to be reused across all of them (embedded linux,
VxWorks, QNX on PowerPC, SPARC, intel architectures). In terms of development
environments and compiler toolchains, only ISO C (and to a slightly lesser
degree C++) is supported by all of them.

Edit: Various instruments on spacecraft may be programmed in Forth or other
nifty languages, for example, and there's a growing effort to make some of the
more "interesting" challenges in spaceflight (autonomy, fault management,
guidance-and-control, etc..) to be coded in custom domain-specific languages
or other scripting languages like Lua.

------
SjuulJanssen
"A recommended use of assertions is to follow the following pattern:"

  if (!c_assert(p >= 0) == true) { return ERROR; }

Why not:

  if (!c_assert(p >= 0)) { return ERROR; }

~~~
vbtemp
Because the whole point is to be maximally explicit and not implicit.
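
To make the comparison concrete, here is a sketch with a stand-in `c_assert` macro. The macro body, `ERROR`, `OK`, and the function names are assumptions for illustration, not the standard's own definitions. Because `!` binds tighter than `==`, the explicit form parses as `(!c_assert(...)) == true`, so both spellings take the same branch for every input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define ERROR (-1)                   /* hypothetical status codes */
#define OK 0

/* Stand-in macro: yields true if e holds, otherwise logs and yields false. */
#define c_assert(e) \
    ((e) ? true \
         : (fprintf(stderr, "%s:%d: assertion '%s' failed\n", \
                    __FILE__, __LINE__, #e), false))

/* The form quoted from the standard. */
int check_explicit(int p)
{
    if (!c_assert(p >= 0) == true) { /* parses as (!c_assert(...)) == true */
        return ERROR;
    }
    return OK;
}

/* The shorter form proposed above. */
int check_terse(int p)
{
    if (!c_assert(p >= 0)) {
        return ERROR;
    }
    return OK;
}

/* Both spellings must agree for any input. */
void check_forms_agree(int p)
{
    assert(check_explicit(p) == check_terse(p));
}
```

So the difference is purely stylistic: the longer form spells out that the condition is being compared against a Boolean value.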

------
mangamadaiyan
One thing that stood out for me was the vehement "no" to goto. That's a little
too harsh, one thinks?

~~~
jlgreco
That is anything but an uncommon stance.

~~~
andrewcooke
for C? it's very commonly used to jump to implement "catch" (jump to clean-up
on error).

~~~
jlgreco
goto is commonly used that way in C... and it is just about as commonly
straight up banned in C.

I am not defending the bans, but find it strange that people would be
surprised by them.

~~~
andrewcooke
> I am not defending the bans, but find it strange that people would be
> surprised by them.

yes, i think a lot of replies here are speaking past each other, making valid
and not actually contradictory points, but not really replying to each other.

if you read the entire doc it is clear that they are pushing c in a very safe,
but somewhat unusual direction. there's no dynamic memory for example, so the
main reason that most of my c code uses goto - to free memory on failure in a
"catch" - is irrelevant.

taken as a whole, it's not what i would call normal c use, and i don't think
it's very useful for most other people as a guideline, but it is internally
consistent and, for the specific use case, reasonable.

------
lottoro
Compared to other coding standards I like the brevity: 22 pages, 31 rules.

------
alexeiz
This document reinforces my opinion that most coding standard documents suck.
I’ve seen countless coding standards from different companies
(some of which I even worked for) and they all sucked. No exception.
Even though coding standards contain some common-sense advice and guidelines
that are generally helpful for producing good-quality code, the
arbitrary, irrational rules and beliefs that their writers put into
them and try to enforce actually end up hurting
the quality of the code produced by developers trying to follow those rules.

Case in point with examples from the NASA JPL coding standards for C:

* No direct or indirect recursion. What is it, FORTRAN-77? Some algorithms are way easier to implement recursively whereas the iterative algorithm can be much less straightforward and buggier. Think sorting: it’s easy to prove that the recursion is finite and that the implementation of the algorithm is correct. Do they use sorting at NASA or is it prohibited by this rule?

* No dynamic memory after initialization. FORTRAN-77 again! While dynamic memory management can be challenging in real-time systems and the generic malloc/free implementation is not acceptable, it doesn’t mean that statically pre-allocated fixed-size memory is better. It inevitably leads to brittle code rife with excessive memory use, bugs like static buffer overruns, and sometimes even an inability to use dynamic data structures like linked lists. To work around this restriction, a developer can construct a linked list structure in statically allocated memory, but doing so is essentially equivalent to creating your own dynamic memory manager, which is more likely to be poorly implemented than a good dynamic memory manager. Instead of denying the use of dynamic memory they should develop memory managers with acceptable performance characteristics.

* The return value of non-void functions shall be checked or used by each calling function, or explicitly cast to (void) if irrelevant. Given that a lot of library functions in C return an error code that is rarely useful, this rule leads to code littered with (void) casts: “(void) printf(…)”, “(void) close(…)”, etc. Along with the littering, the rule doesn’t make the code any more robust, because it encourages using (void) casts to ignore error codes, and therefore error codes will likely be ignored rather than handled correctly.

* All functions of more than 10 lines should have at least one assertion. This leads to littering code with assertions in those functions that don’t necessarily have anything to assert and that are accidentally longer than 10 lines (for example, due to mandatory parameter validation checks; I hope parameter validation checks are not assertions, are they?).

* All #else, #elif and #endif preprocessor directives shall reside in the same file as the #if or #ifdef directive to which they are related. This is just a bizarre rule. What developer puts #ifdef in one file and #endif in another? Unless of course he’s drunk or high, but I hope that’s not how NASA develops its software.

* Conversions shall not be performed between a pointer to a function and any type other than an integral type. Wait, pointers to functions should be converted to which integral type? There are a number of integral types: char, short, unsigned long long. Which one do I choose? Why not void* or intptr_t?

* Functions should be no longer than 60 lines of text and define no more than 6 parameters. Finally a good rule. But what does the explanation say? “A function should not be longer than what can be printed on a single sheet of paper in a standard reference format with one line per statement and one line per declaration.” Printed on a sheet of paper? Is this still how code is reviewed at NASA?
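
On the sorting question in the first bullet: sorting does not have to be recursive. Below is a minimal sketch of a bottom-up merge sort that uses neither recursion nor dynamic memory. `SORT_MAX`, `scratch`, and `sort_ints` are hypothetical names, and nothing here is taken from NASA code; it only shows that the two rules can be satisfied together, at the price of a fixed capacity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SORT_MAX 256                 /* hypothetical compile-time bound on input size */

static int scratch[SORT_MAX];        /* merge buffer, statically allocated */

/* Sorts a[0..n-1] in ascending order with a bottom-up (iterative) merge sort.
 * Returns 0 on success, -1 if n exceeds the fixed capacity. */
int sort_ints(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    if (n > SORT_MAX) {
        return -1;
    }
    /* Merge sorted runs of width 1, 2, 4, ... until one run covers the array. */
    for (size_t width = 1; width < n; width *= 2) {
        for (size_t lo = 0; lo < n; lo += 2 * width) {
            size_t mid = (lo + width < n) ? lo + width : n;
            size_t hi = (lo + 2 * width < n) ? lo + 2 * width : n;
            size_t i = lo, j = mid, k = lo;
            while (i < mid && j < hi) {
                /* take from the right run only when strictly smaller: stable */
                scratch[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
            }
            while (i < mid) { scratch[k++] = a[i++]; }
            while (j < hi) { scratch[k++] = a[j++]; }
        }
        memcpy(a, scratch, n * sizeof a[0]);
    }
    return 0;
}
```

Every loop has a bound that can be read off from n, which is what makes this style easy for static analyzers to reason about. Whether that is worth the loss of readability compared with a recursive version is the disagreement in this thread.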

And before you say "these coding standards are for a special kind of software
that runs on space flight control systems," embedded devices these days are
more powerful than desktop computers ten years ago. Embedded software grew
beyond draconian restrictions a long time ago and it's much closer now to
non-embedded software.

Let's not forget that NASA did use Lisp in their systems and they were able to
solve pretty difficult problems remotely with the help of a Lisp REPL
(<http://www.flownet.com/gat/jpl-lisp.html>). Lisp code certainly can't be
subject to any of the restrictions from these coding standards, which is
another indication of how irrelevant these coding standards are for producing
robust software.

