
Early in my career I worked on military flight data recorders, including the development of the software for the F-22's "black box". Those systems have SBIT, IBIT, PBIT and MBIT sub-systems, where BIT is "built-in test" and S = startup, I = initiated, P = periodic and M = maintenance. I remember making the Star Trek diagnostic joke myself when I was assigned the SBIT work.

Each BIT does a varying level of testing based on its runtime budget, but there are a lot of very basic tests that don't make much sense until you see your first field report of some register bit "sticking". It's much better to ground a plane that can't add 1+1 than to find that out in mid-flight.
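A rough sketch of what a "stuck bit" check can look like, written against a simulated register. `FaultyRegister` and `find_stuck_bits` are hypothetical names invented for illustration; a real BIT would exercise actual hardware and would likely use more patterns than all-zeros and all-ones.

```python
class FaultyRegister:
    """Simulated 8-bit register with optional stuck bits (hypothetical fault model)."""

    def __init__(self, stuck_high=0, stuck_low=0):
        self.stuck_high = stuck_high  # mask of bits that always read 1
        self.stuck_low = stuck_low    # mask of bits that always read 0
        self._value = 0

    def write(self, value):
        self._value = value & 0xFF

    def read(self):
        # Apply the simulated faults on every read.
        return (self._value | self.stuck_high) & ~self.stuck_low & 0xFF


def find_stuck_bits(reg, width=8):
    """Return (stuck_high_mask, stuck_low_mask) for the given register."""
    full = (1 << width) - 1

    # Write all zeros: any bit that still reads 1 is stuck high.
    reg.write(0)
    stuck_high = reg.read() & full

    # Write all ones: any bit that reads 0 is stuck low.
    reg.write(full)
    stuck_low = ~reg.read() & full

    return stuck_high, stuck_low
```

A healthy register returns `(0, 0)`; anything else is a reason to fail the test and report the affected bits.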

An F-22 had an FCS failure and crashed on takeoff. The pilot didn't do an IBIT on the FCS as required, and therefore wasn't aware that all 3 rate sensors had latched up. The jet was uncontrollable once it left the ground, and the pilot ejected safely. [0]

As I recall, they modified the sensors to avoid latch-up (an extra pull-up resistor) and updated the FCS software to provide a warning if all 3 sensors return zero output. Even though this wasn't really an error in the FCS software, it could be argued that it failed to detect an erroneous input from 3 redundant sensors that latched up for the same reason.
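To illustrate why a common-mode failure slips past redundancy, here is a minimal sketch. It is not the actual FCS logic; the function names, the mid-value voter, and the `tolerance` parameter are all assumptions made for the example. The point is that three sensors that agree on zero look perfectly healthy to a voter, so an explicit "all zero" check is needed to raise the warning.

```python
def vote(readings):
    """Mid-value select: return the median of the redundant readings."""
    return sorted(readings)[len(readings) // 2]


def all_sensors_zero(readings, tolerance=0.0):
    """True if every reading is (near) zero, which suggests a shared fault."""
    return len(readings) > 0 and all(abs(r) <= tolerance for r in readings)


def evaluate(readings):
    """Return (voted_value, warning) for a set of redundant rate readings."""
    return vote(readings), all_sensors_zero(readings)
```

With readings of `[0.0, 0.0, 0.0]` the voter happily returns `0.0`, and only the second element of the result flags that something may be wrong.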

This is a classic example of a pilot who misunderstood the pre-takeoff checklist procedure and cost the USAF a $325m aircraft.

[0] http://usaf.aib.law.af.mil/ExecSum2005/F-22A_20Dec04.pdf

To be fair this seems to be more of a classic example of a documentation or training problem. Right from your link:

> During the mishap sequence, the MP started engines, performed an IBIT, and had a fully functioning Flight Control System. Subsequently, the MP shut down engines to allow maintenance personnel to service the Stored Energy System. During engine shut down, the MA's Auxiliary Power System (APU) was running. The MP believed the APU provided continuous power to the Flight Control System, and therefore another IBIT after engine restart was unnecessary. This belief was based on academic training, technical data system description, and was shared by most F/A-22 personnel interviewed during the investigation.

  To be fair this seems to be more of a classic 
  example of a documentation or training problem.
Maybe - but if you were designing a consumer product, you wouldn't rely on the user following a checklist; the IBIT would run automatically when they turned on the ignition and sound an alarm (or even prevent takeoff) if the vehicle would be uncontrollable.

So you could also classify this as a user interface / design problem.

SBIT and PBIT are the ones that run automatically, with SBIT running at startup and PBIT running on a watchdog. The SBIT time budget and scope are usually much smaller than IBIT's, so time-intensive tests, like ones that talk to sensors on the bus, aren't present in SBIT. You can think of the stages as SBIT: can I run? IBIT: should I run? PBIT: am I running right?
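One way to picture the split between modes is a table of tests tagged with the modes they belong to. This is only a sketch: `BitMode`, `run_bit`, the two example tests, and their assignment to modes are all hypothetical, chosen to mirror the description above (cheap checks everywhere, bus-level sensor checks only in IBIT).

```python
from enum import Enum


class BitMode(Enum):
    SBIT = "startup"
    IBIT = "initiated"
    PBIT = "periodic"
    MBIT = "maintenance"


def alu_add_ok():
    """Cheap sanity check that fits any time budget."""
    return 1 + 1 == 2


def sensor_bus_ok():
    """Stand-in for a slow bus-level sensor check; simulates a latched-up sensor."""
    return False


# (name, modes the test runs in, test function) -- an assumed assignment.
EXAMPLE_TESTS = [
    ("alu_add", {BitMode.SBIT, BitMode.IBIT, BitMode.PBIT}, alu_add_ok),
    ("sensor_bus", {BitMode.IBIT}, sensor_bus_ok),
]


def run_bit(mode, tests):
    """Run every test registered for `mode`; return the names of failed tests."""
    return [name for name, modes, fn in tests if mode in modes and not fn()]
```

In this toy setup `run_bit(BitMode.SBIT, EXAMPLE_TESTS)` reports nothing, while `run_bit(BitMode.IBIT, EXAMPLE_TESTS)` reports the sensor fault, which is the gap a skipped IBIT leaves open.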

True. The pilot wasn't the only one who misunderstood the details.

What types of errors would cause one of the tests to fail? Is it mostly testing for hardware errors, or are there any software logic errors that could make it to production, but be caught by one of the tests several months down the road? The only software-related items I can think of are edge cases where a built-in test is based on real-time input. Kind of like running the calculations through multiple independent implementations of the software.

There's actually a decent probability of memory corruption in space applications due to radiation. So in addition to checking communication across busses, application checksums are typically run continuously.
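A minimal sketch of the checksum idea, assuming the application image is available as bytes and a reference CRC was recorded at build or load time. The function names are made up for the example, and CRC-32 via `zlib.crc32` is just a convenient stand-in; real systems may use a different checksum and often pair it with ECC memory or scrubbing.

```python
import zlib


def image_checksum(image: bytes) -> int:
    """CRC-32 of the application image, as an unsigned 32-bit value."""
    return zlib.crc32(image) & 0xFFFFFFFF


def verify_image(image: bytes, expected: int) -> bool:
    """True if the image still matches the reference checksum."""
    return image_checksum(image) == expected


def flip_bit(image: bytes, byte_index: int, bit: int) -> bytes:
    """Simulate a radiation-induced single-bit upset in the image."""
    corrupted = bytearray(image)
    corrupted[byte_index] ^= 1 << bit
    return bytes(corrupted)
```

A periodic task would call `verify_image` on the live image and treat a mismatch as a fault; a single flipped bit is always caught by CRC-32.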

Another thing that's software-related is if you've got a (rare) race condition. For example, a data structure that gets corrupted if you have a particular series of nested interrupts.

Now, hopefully your system is set up such that race conditions cannot happen, but good luck with that.

Yup, as far as I remember all the avionics hardware I saw had these types of BITs built in.
