
Rules for Developing Safety Critical Code [pdf] - rrampage
http://pixelscommander.com/wp-content/uploads/2014/12/P10.pdf
======
bluedino
We just talked about this:

[https://news.ycombinator.com/item?id=8856226](https://news.ycombinator.com/item?id=8856226)

------
tgb
The most impressive thing I've read about NASA's software development isn't
mentioned here. It was the system where bugs and defects were (at least in
principle) not treated as the failing of an individual but rather as a failing
of the entire development process. So when a problem is found, the question
isn't "Who did this and how can we punish them?" it's "What can we as a team
do so that this never happens again?"

I don't know how this works in practice, but it sounds like a great system
even when dealing with code in a lower-risk setting.

~~~
benihana
Blameless postmortems work phenomenally at Etsy which is a pretty low risk
setting (after reading Sidney Dekker's book The Field Guide to Understanding
Human Error [highly recommended] I would say that blameless post-mortems are
even more important in a high-risk setting). Except "failure" isn't the correct
word: the book makes the case that these events are natural artefacts of
complex systems.

1: [https://codeascraft.com/2012/05/22/blameless-postmortems/](https://codeascraft.com/2012/05/22/blameless-postmortems/)

2: [http://www.amazon.com/Field-Guide-Understanding-Human-Error/...](http://www.amazon.com/Field-Guide-Understanding-Human-Error/dp/0754648265)

~~~
amirmc
Nice article. Something it doesn't point out, which can become one of the
elements in the cycle of name/blame/shame, is the creation of 'heroes'. This
is the figure who throws themselves at the issue and brings it back from the
brink of disaster. If the culture isn't fixed and failure made easy to talk
about, then this can play out repeatedly. Since it's usually a celebrated event ('we
survived!'), it creates poor incentives for fixing the culture. One can also
play this game by noticing a potential failure mode, figuring out how to fix
it, but only leaping in once it's struck. I'd rather find another employer
though.

------
jordanpg
> All code must be compiled, from the first day of development, with all
> compiler warnings enabled at the compiler’s most pedantic setting. All code
> must compile with these settings without any warnings. All code must be
> checked daily with at least one, but preferably more than one, state-of-the-
> art static source code analyzer and should pass the analyses with zero
> warnings.

Wow, am I ever frustrated by the summary ignoring of warnings I see going on
around me. I spend my spare time at work making warnings go away and deleting
uncommented @SuppressWarnings from our codebase.

In a word, KISS. Another engineer designing critical systems for the US
government coined that one, apparently [1].

[1]
[http://en.wikipedia.org/wiki/KISS_principle](http://en.wikipedia.org/wiki/KISS_principle)

~~~
tormeh
Then there's Clang's struct alignment warnings using -Weverything. "Oh no,
your struct will have 3 bytes of padding!" Made me turn off -Werror. Some
warnings can be ignored.

~~~
BudVVeezer
It's shocking to see how often people write code like: memcpy(somePtr,
&myStructObj, 12);

So while that warning can certainly be useless information, it does definitely
catch bugs. Unfortunately, the people who write code like the above are also
more likely to be people who ignore warnings. ;-)

~~~
RogerL
Maybe you are being too specific in your example, and thus I am being nitpicky
rather than helpful, but wouldn't that be caught by a static analyzer?

~~~
BudVVeezer
I just tried the following simple example:

    
    
      #include <string.h>
    
      struct S {
        int i;
        char c;
        int j;
      };
    
      void f(void) {
        struct S s1 = { 0, 0, 0 }, s2 = { 0, 0, 0 };
        memcpy(&s1, &s2, 9);
      }
    
      int main(void) {
        f();
        return 0;
      }
    

MSVC, Clang, and PC-Lint are all silent on the code (which is reasonable
behavior since it's impossible to glean programmer intent from that snippet --
maybe the programmer really only wants the first nine bytes!). (Btw, yes, I am
using the analyzer features for MSVC and Clang, not just relying on high
warning levels.)

------
getpost
// Of course this is only an example, but except in rare cases when the value
of a boolean is significant (e.g., interfacing code in different languages),
don't compare to booleans! Not only is it redundant, it's another potential
source of error.

if (!c_assert(p >= 0) == true) { return ERROR; }

// vs

if (!c_assert(p >= 0)) { return ERROR; }

// The following makes the error condition clearer, but it goes against the
convention of having a hopefully true condition as the argument to assert.

if (c_assert(p < 0)) { return ERROR; }

~~~
asynchronous13
I agree with you in principle, but in practice I work on a lot of C code that
does not have a built-in bool type (C89).

In your example using > or < there is no issue, because the return value is
certain to be either 0 or 1. However, when checking a specific variable used
as a boolean I prefer to see an explicit comparison.

if(variable)

versus

if(variable==TRUE)

I developed this preference after spending weeks on a particularly nasty bug.
The bug was triggered by a corrupted int that was used as a boolean variable
that passed a check because if(146134613) { kill_me_now(); } will run, even
though the value had been corrupted. (of course, tracking down the source of
the corruption was the real fix for that case, but there's no point in having
safety checks that don't work)

~~~
getpost
Interesting example, but then you're at risk of creating code paths that vary
depending on whether boolean values are true, false, or other, which is
contrary to the definition of a boolean value.

~~~
asynchronous13
You're right, but that is true because there is not a boolean type, not
because of the form of the check.

Either form has the same problem since an int is used in place of a true
boolean (true | false | other), but I prefer the form that makes it more
explicit and noticeable.

------
qwerta
This paper also explains each rule. They use code analysis tools a lot, and do
not like constructs that make analysis harder (recursion, unbounded loops, ...).

~~~
john_b
Recursion and unbounded loops are far too risky on their own, even ignoring
the problems they cause for static analysis. The code that goes on NASA
spacecraft is bombarded by cosmic rays on a daily basis, something a server
sitting in a building at the bottom of the atmosphere doesn't have to worry
about nearly as much. Even if the code is perfect, if a ray hits the right
spot it's possible for it to flip a bit and then your perfect recursive
function will do who knows what. [1]

They use error-correcting memory to combat this, but it's still better to be
safe than sorry when your software controls a multimillion dollar space oasis
with human beings inside who have no means of escape.

[1] [http://www.statemaster.com/encyclopedia/Single_event-upset](http://www.statemaster.com/encyclopedia/Single_event-upset)

------
kikki
So basically keep code simple, concise, and predictable.

------
kibwen
I'm interested by the rule that says that loops must either be statically
verified to terminate or statically verified to never terminate. Combined with
the forbidding of recursion, it's an interesting approximation of a non-
Turing-complete language. Or are guaranteed-to-be-infinite loops admissible in
non-Turing-complete languages?

~~~
wyager
You are correct. Total languages are not Turing complete.

However, in the case of safety-critical software, we don't actually follow
that rule perfectly. We usually have one or more main loops that never
terminate, but everything called from there _does_ terminate.

In languages that _enforce_ totality, like Agda (which is not suitable for
most safety-critical programming), you can either selectively disable
termination checking on the main function, or you can recurse over a large
number. I'm partial to 99999999999999999.

------
volent
The rule about pointers is pretty limiting (no function pointers, no more than
one level of dereferencing).

The reason is also pretty sad: "Pointers are easily misused, even by
experienced programmers". I don't see why an experienced programmer is more
likely to misuse pointers than anything else.

~~~
TeMPOraL
I believe this is because of the type of code they write. When writing real
time systems you care about predictability of execution. When you allow
function pointers, you (as a programmer / static analysis tool) suddenly have
no idea what code will be executed, thus you can't reason about the time it
takes. You could scour the codebase for all instances of code that set a
particular function pointer, but then again it might be done indirectly, or -
heaven forbid - based directly on external input. It makes code much harder to
reason about.

~~~
pjc50
This. There are categories of embedded system that don't allow _reentrancy_ ;
it must be possible to represent the program's callgraph as a DAG with each
function appearing exactly once. In some systems (Ada?) this allows for the
preallocation of all local variables, too.

Some hardware (e.g. the smaller PICs) has poor support for pointers which can
make indirection cost quite a lot of instructions.

Then there's the general principle of cutting your coat according to your
cloth. You don't use a generalised pointer-to-rocket-engine system because you
aren't going to add more engines at runtime.

~~~
termain
"You don't use a generalised pointer-to-rocket-engine system because you
aren't going to add more engines at runtime"

Well, we might. Particularly if the vehicle you're modeling for a
hardware-in-the-loop sim isn't entirely specified at compile time.

------
laurencei
> Typically, this means no more than about 60 lines of code per function.

Uncle Bob would be very upset by this statement. He advocates a maximum of 4-5
lines per function to ensure readable code in his Clean Coders video series.
Anything more, and you need to refactor into smaller functions.

~~~
anton_gogolev
> He advocates a maximum of 4-5 lines per function to ensure readable code

Yeah, and you end up with a myriad of functions which are used once and only
once, and it becomes impossible to actually find the code which does something
useful.

~~~
nly
... and good luck naming all those functions.

~~~
pjc50
See sibling comment about domain-specific language; there's nothing wrong with
long function names if your autocomplete is working. Codebase I'm working with
has names like "auditNullRecipeCursorAtEndOfOrderLineAdditionsIfRequired" and
"getUnselectedSiblingWithSametabIndexOrCreateRepeatedChildIfLessThanMax"

~~~
im2w1l
It's hard to notice typos in names that long. Can you immediately tell where
the typo is in auditNullRecipeCursorAtStartOfOrderLineAdditionsIfRequired ?

~~~
pjc50
I can't tell at all. Fortunately, as long as someone hasn't declared another
function with the typoed name, either Intellisense or the compiler will tell
me.

I have a much harder time with "_tcscpy_s" and similar suites of functions
where a single-letter typo is likely to be a different function with the same
signature.

~~~
im2w1l
I changed ...End... to ...Start..., which could be a plausible real world bug.

------
dimman
A bit surprised it doesn't mention the use of curly braces. Perhaps they
assume their "state of the art" static code analysis tools will find potential
issues (like Apple's goto fail failure).

~~~
TorKlingberg
This document is a summary. There is much more detail in the JPL Institutional
Coding Standard for the C Programming Language: [http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf](http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf)

------
tempodox
The PDF file seems to be timestamped 2014-12-26. Does anyone happen to know
the exact authoring date? Sadly, there is no date given in the document
itself.

~~~
JNRowe
atsaloli already gave a good answer, but I'll add that the date is in the
PDF's metadata. For example, poppler provides pdfinfo, which shows a creation
date of 2007-01-15. Your viewer probably has an option to show it too.

~~~
tempodox
Indeed, you're right. Even Apple's Preview app on my box can do that :)

------
jokoon
I would gladly print that PDF and stick it in every office. Maybe an
alternative would be to make it work, and then to make it readable.

