
A400M Airbus Flier crashed because of software issues - ExpiredLink
http://translate.google.com/translate?hl=en&sl=de&u=http://www.spiegel.de/politik/ausland/airbus-a400m-militaermaschine-stuerzte-wegen-software-problemen-ab-a-1034421.html
======
alandarev
Software contractor for Airbus and Rolls-Royce here.

All safety-critical software in aerospace (and every piece of code run on board
is safety-critical at the very least) needs to comply with the DO-178 standard [1].

That is far more serious than the standard unit tests you are used to in node.js
applications. Generally speaking, developing a piece of code under that
standard takes 20% of the time to write the code and 80% for testing, plus an
enormous amount of documentation (and that is an optimistic estimate; it is
usually worse).

Quoting a speaker from a DO-178 training course I attended:

    
    
       People often ask us: "How do we know the standard works?"
       We give this answer: "We do not know. But there have been zero crashes due to software issues since its introduction."
    

If this crash is confirmed to have been caused by a software bug, that is
something much bigger than an airplane crash - a huge blow to the whole federal
aviation administration.

[1] -
[http://en.wikipedia.org/wiki/DO-178B](http://en.wikipedia.org/wiki/DO-178B)

~~~
castell
Which programming language is commonly used there? Ada, C, C++, JOVIAL, Asm?

~~~
lamby
With that kind of coding-to-testing-and-documentation ratio, does it even
matter?

~~~
TheLoneWolfling
Yes, definitely.

For a quick example, there are many languages where you _cannot_ accidentally
run off the end/start of an array, barring a compiler error.

With a language like C / C++, it's _possible_. Not _probable_, given that
sort of testing. But _possible_ nonetheless.

Some languages are also easier to test than others, partially because of this,
partially because of other issues. For instance, in some languages you can
guarantee at the language level (again, barring compiler errors) that
something won't be modified. (Like const, but actually working.)
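
To make those two points concrete, here is a minimal sketch in Python. Python is chosen only for brevity: it is not an avionics language, and its checks happen at run time rather than at compile time. The names `samples`, `read_sample`, `Limits` and `try_to_modify` are invented for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError

samples = [10, 20, 30]  # hypothetical sensor readings


def read_sample(index):
    """Return samples[index], or None when the index is outside the list."""
    try:
        return samples[index]
    except IndexError:
        # Python refuses the access instead of reading neighbouring memory.
        return None


@dataclass(frozen=True)
class Limits:
    """Hypothetical configuration that must not change after creation."""
    max_rpm: int


def try_to_modify(limits):
    """Attempt a write to a frozen object; report whether it succeeded."""
    try:
        limits.max_rpm = 99999
    except FrozenInstanceError:
        return False
    return True
```

Note that Python treats negative indexes as counting from the end, so only indexes beyond either end of the list are rejected. In C, the equivalent out-of-range read compiles and may silently return whatever sits next to the array.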

------
mhandley
Here's Airbus's statement on the Alert they sent to operators:

 _Airbus Defence and Space has today (Tuesday 19 May) sent an Alert Operator
Transmission (AOT) to all operators of the A400M informing them about specific
checks to be performed on the fleet._

 _To avoid potential risks in any future flights, Airbus Defence and Space has
informed the operators about necessary actions to take. In addition, these
results have immediately been shared with the official investigation team._

 _The AOT requires Operators to perform one-time specific checks of the
Electronic Control Units (ECU) on each of the aircraft's engines before next
flight and introduces additional detailed checks to be carried out in the
event of any subsequent engine or ECU replacement._

 _This AOT results from Airbus Defence and Space's internal analysis and is
issued as part of the Continued Airworthiness activities, independently from
the on-going Official investigation._

They're asking for a one-time check to be performed on the ECU. If it's just a
software bug, normally a one-time check wouldn't reveal whether or not that
bug could trigger. So, obviously they've found something they're concerned
about, but it seems to me to be a bit early to say as Spiegel Online do that
software caused the crash.

In any event, one of the flight recorders was only just sent off to the
manufacturer:
[http://economictimes.indiatimes.com/news/international/world...](http://economictimes.indiatimes.com/news/international/world-news/a400m-black-box-sent-to-us-to-seek-crash-clues-sources/articleshow/47331926.cms)

~~~
JshWright
Or it seems like the problem may have been a known (or suspected) issue with
an old firmware, and the check is to make sure the firmware is above a certain
version (which would also explain why the check would be necessary on any
replacement ECUs).
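
Purely as an illustration of the kind of check I'm guessing at, here is a rough Python sketch. Everything in it is an assumption: the minimum version, the ECU identifiers and the dotted version format are invented, and nothing is publicly known about what the AOT check actually involves.

```python
MIN_FIRMWARE = (2, 1, 0)  # hypothetical minimum acceptable version


def parse_version(text):
    """Turn a dotted version string such as "2.0.9" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def ecus_needing_attention(fleet):
    """Return the ids of ECUs whose firmware is older than MIN_FIRMWARE.

    `fleet` maps a (made-up) ECU id to its reported version string.
    """
    return [
        ecu_id
        for ecu_id, version in fleet.items()
        if parse_version(version) < MIN_FIRMWARE
    ]
```

Comparing tuples of integers rather than raw strings matters here: as strings, "10.0.0" would sort before "2.1.0".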

~~~
mhandley
The A400M that crashed was on its first test flight, so unless they've done
something very odd with versioning, it's unlikely that all its ECUs had older
firmware than planes that already shipped. Besides, with aircraft, all changes
are logged with a ton of paperwork, so they shouldn't need to check the
aircraft to know what firmware they're running.

~~~
DougWebb
It doesn't seem so unlikely to me. Supply chains are long and parts are bought
in bulk. I think it's likely that the parts for the ECUs, including boards
with chips containing the older firmware, are warehoused where the ECUs are
assembled. The ECUs are probably then bought in bulk and warehoused where the
planes are assembled. The paperwork for the plane probably includes the ECU
serial numbers, but it probably doesn't include serial numbers for all of the
components installed on all of the boards inside the ECU, especially if they
didn't have the foresight to think that those numbers would matter. After all,
it seems there's a way to get the firmware version by querying the ECU.

~~~
JoeAltmaier
I think you underestimate the tracking done in commercial airplane
manufacture. I believe the source of every rivet is well-known. And how many
planes do they make? Many parts are likely manufactured as needed, one at a
time.

~~~
DougWebb
I'm sure the tracking is a lot more detailed than, say, consumer products, but
it's probably not as detailed as it is for space hardware. There's a cost to
that tracking, and the commercial airplane industry needs to be cost-
competitive. So I'm assuming they track well enough to meet their own and
government-imposed safety standards, but perhaps not well enough to be able to
look up the chip firmware for each circuit board on every in-service airplane
in a database.

As far as JIT manufacturing goes, that's certainly the case for bigger parts and
systems, but it sounds like the ECUs are replaceable so they probably have
backup replacements stocked near most major airports. (Assuming it's a part
that can be replaced during routine pre-flight maintenance.) And the
components that go into the ECUs are almost certainly produced in large
batches rather than continuously.

------
rurounijones
Watching too many air-crash investigation episodes has led me to believe 0%
of media-reported "facts" surrounding plane crashes.

I will wait for the official accident investigation report.

~~~
madez
This is a very sound decision.

The media is infamous for not getting facts straight. They would much rather
write an opinion based on suspicion.

Well, no, thank you. Give me facts, get them straight, then _I_ will form my
own opinion.

~~~
minwcnt5
Related: the Gell-Mann Amnesia effect.

~~~
madez
The Gell-Mann Amnesia effect seems to be a specific version of something more
general: cognitive dissonance. The fact that the media is not to be trusted is
not forgotten, it is simply not evaluated and acted upon. It sits there in your
brain until you sceptically reflect on what you think you know and how you act.
Some people do that to some degree, most less so.

We have had a long time to recognize that our brains don't work well. It is
time to accept the facts.

------
inetsee
I worked on software for the C-130J military cargo plane. It was before my
time, but an earlier model aircraft crashed during a test flight. The crash
occurred shortly after take off, and the entire crew was lost.

There is a critical time period during a take off when the aircraft is at
maximum risk. If an engine fails before rotation (i.e. before the wheels leave
the ground) an alert crew can stand on the brakes and use thrust reversers.
The aircraft may get dinged up, but there is a reasonable chance the crew (and
passengers) will survive.

If there is an engine failure after rotation but before the aircraft has
gained sufficient altitude, unless there's a big, flat field next to the
runway a crash is almost inevitable.

When an aircraft turns, it will lose altitude unless the crew compensates by
adding power. An aircraft without power and sufficient altitude cannot make
the turns necessary to go all the way around to land on the runway they just
left.

~~~
sokoloff
Your post is substantially correct, with a clarification on rotation speed (Vr)
vs takeoff decision speed (V1).

There are three relevant speeds for large aircraft. (I'm going to generalize
very slightly to keep this short and readable.)

V1, Vr, V2.

V1 is the takeoff decision speed. An engine failure recognized before reaching
V1 is handled by aborting the takeoff. An engine failure recognized after
reaching V1 is handled by continuing the takeoff. At the V1 callout, the pilot
flying removes their hands from the top of the throttles (as a physical
reminder that aborting/rejecting the takeoff (RTO) is not happening for a
simple engine failure).

Vr is the rotation speed, where the nose wheel is lifted from the ground.

V2 is the speed at which the airplane will climb safely with one engine INOP.

In most cases, V1 is the lowest speed, meaning there are cases (between V1 and
Vr) where an engine out with the nosewheel on the ground results in continued
acceleration, then rotation, and flight.

It's a checkride bust to RTO above V1 for a simple engine failure.
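
For the software folks, the decision rule above can be sketched in a few lines of Python. This is only a simplified model of what I described, not an operational procedure; the function name and the example speeds are made up.

```python
def engine_failure_response(recognized_at_kt, v1_kt, vr_kt):
    """Simplified response to a simple engine failure during the takeoff roll.

    recognized_at_kt: airspeed (knots) at which the failure is recognized.
    v1_kt: takeoff decision speed. vr_kt: rotation speed (assumed >= v1_kt).
    """
    if recognized_at_kt < v1_kt:
        # Recognized before reaching V1: abort on the runway.
        return "reject takeoff"
    if recognized_at_kt < vr_kt:
        # Between V1 and Vr: nosewheel still down, but we keep going.
        return "continue: accelerate, rotate at Vr, fly"
    # At or above Vr: already rotating or airborne.
    return "continue: fly"
```

With invented speeds V1 = 120 kt and Vr = 125 kt, a failure recognized at 122 kt falls in the middle branch: the airplane is still on the ground, yet the takeoff continues.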

~~~
inetsee
Thank you for clarifying my post. I am not a pilot. I am a software engineer
(retired). I worked briefly on the C-130J mission computer operating system,
then on the maintenance software the ground crews used to maintain the
aircraft.

I did not know that there was a situation when a flight crew would continue a
take off after an engine failure but before rotation.

------
tirant
I am not surprised at all. There are a lot of contractors involved in the
development of the software for the A400M, and they are basically competing
on price and employing undergraduates making below €18K/year, whom they
replace every few months due to burnout and bad working conditions. Projects
get continuously delayed, and key people barely stay more than a couple of
years.

~~~
bottled_poe
Have you been involved with the engineering of aircraft control systems, or
are you just speculating?

~~~
mdaverveldt
I can confirm similar situations for other EU IT projects. Currently, I am a
contractor working on the IT side for the Galileo Satellite System. Although
the IT consulting company that I work for pays significantly better than the
18K mentioned (25K-30K range), the quality of the work that is delivered is
terrible. The mentioned salary is for graduate engineers.

The attrition rate is terrible and it's impossible to keep people who have
knowledge of the program on board.

I would guess that 30-50% of our team are inexperienced Java software
engineers with less than 5 years of working experience. Around 100 people work
on our project but there are probably 3-5 people who still have a clear
picture of how everything is working / supposed to work.

~~~
scrollaway
Uh, there's a massive difference between contracting on satellite systems and
contracting on civil aviation systems.

~~~
colechristensen
In America such things are more often done by DoD contractors, who are paid
well, and space systems are more tightly controlled than civil aviation.

------
lexy0202
I have done a quick and rough translation of the German article into English.
Hopefully this is better than the Google Translate version:
[https://gist.github.com/alexcoplan/0018e3320f99a612c737](https://gist.github.com/alexcoplan/0018e3320f99a612c737)

~~~
BuildTheRobots
Much more readable than the Glenglish version, thank you.

I still don't understand "The crash is the worst accident since the
development of the A400M", though. To me that implies there was a pretty
devastating accident during the development.

~~~
bhaak
This is not a translation error. This sentence has the same puzzling meaning
in the German version.

I think it's sloppy journalistic writing, meaning the worst accident that has
happened in the development of the A400M.

------
bhaak
There are not many details about why exactly the three engines stopped working
and it's not yet officially announced. This article has been written with
"information Spiegel Online received".

Two translated quotes: "The investigation yielded a clear result: Shortly after
the lift-off of the test machine, the computers sent conflicting commands to
the three engines which then powered off."

"Soon after the crash, experts of the German Air Force suspected a software
issue with the fuel supply unit because such a fatal drop of power so soon
after takeoff could hardly be explained differently."

So, not much information why the computers sent conflicting commands and also
why the engines power down in such a situation.

~~~
lucaspiller
> So, not much information why the computers sent conflicting commands and
> also why the engines power down in such a situation.

I think shutting down the engines is probably the safest option when this sort
of thing happens. You could argue they should stay in the present setting, but
what would happen if one engine were at 0% and another 100%?

Most aircraft are pretty good at gliding even without power, and I'd assume a
deadstick landing is part of the pilots' training. In 2001, TS236 flew
unpowered for 19 minutes before making an emergency landing (on a runway) with
only minor injuries:

[http://en.wikipedia.org/wiki/Air_Transat_Flight_236](http://en.wikipedia.org/wiki/Air_Transat_Flight_236)

~~~
tim333
>what would happen if one engine were at 0% and another 100%

Planes are designed to fly fine in that situation - it's what you get if one
engine breaks down. Now landing the thing with one engine stuck on 100% would
be interesting. I guess you could kill the engine somehow - turn off the fuel
or pull the fuses.

~~~
jonnycowboy
Planes with four engines are certainly not designed to fly with 3 of 4 engines
out...

~~~
Piskvorrr
Indeed, planes are not designed to be flown in that configuration normally,
but they are certainly _capable_ of single-engine flight in emergency
situations (where the alternative would be "flying like a ton of bricks").
Exhibit A:
[http://en.wikipedia.org/wiki/British_Airways_Flight_9](http://en.wikipedia.org/wiki/British_Airways_Flight_9)

~~~
jonnycowboy
No, actually "capable of flying" means capable of performing all flight
maneuvers required for safe flight and landing, including takeoff, go-around
and approach/landing. BA Flight 9 failed all four engines, normally not
survivable - but managed to restart all four engines by windmilling. So they
recovered in time for landing.

~~~
fransan
Just curious, where did you find that definition of "capable of flying"?
Because under that definition a glider (unless it is a motorized glider) is
not "capable of flying".

------
traufetterg
Dear Engup, I have read your thread, which has unfortunately been deleted. I am
the author of the Spiegel story on the A400M
[http://www.spiegel.de/politik/ausland/airbus-a400m-militaerm...](http://www.spiegel.de/politik/ausland/airbus-a400m-militaermaschine-stuerzte-wegen-software-problemen-ab-a-1034421.html)
I would like to get in touch with you. gerald_traufetter@spiegel.de Best, Gerald

------
mrmondo
I wonder if this is the time to argue that it may be worth open sourcing the
controlling software for hackers to start criticising and contributing pull
requests to. I'm willing to bet that the competence of the collective
community far outweighs that of those specially trained to write the software
at present.

What is there to lose by opening up the software to criticism other than
better aviation safety? We know that obfuscating / hiding source code does not
make applications / platforms safer or less at risk to malicious behaviour so
I'd like to challenge the manufacturers to do so.

~~~
sleeping_pills
While I agree with your statement that open sourcing code can help with
improving its quality, how exactly do you envision (paraphrasing) "hackers
contributing pull requests" to code that controls engines on an airplane? Here
you have an extremely specialized codebase which can perhaps be understood by
a tiny group of professionals and it can actually be tested by an absolutely
vanishingly small group of individuals under special circumstances. I don't
think "hackers" could even begin to make useful critique of this kind of
software, let alone contribute pull requests to it.

~~~
madez
You sound exactly like my current boss, who says that Linux is just a hobby
project that you can put no trust in.

I mean, who would fix problems if it's just a hobby? And if it's open, it must
be a hobby. Surely, that can't possibly work!

~~~
sleeping_pills
Did you actually read my post? If yes, can you seriously not see the
difference between writing/testing an OS and writing/testing the software that
controls jet engines?

Open sourcing something like Linux works very well _precisely_ because it has
a very large audience and is (relatively) approachable by hobbyists too.

On the other hand, aerospace engineering and software is narrowly specialized
with a (relatively) small group of experts and code used in
commercial/military aircraft is anything but approachable to hobbyists.

Then there is the fact that unit testing this kind of code requires
engineering knowledge of the specific hardware involved (e.g. not just any jet
engine, but one very specific model). Finally, let us not even mention the
huge pink elephant in the room, namely that the vast majority of
"hackers" do not have access to jet engines used in commercial (or military)
airplanes, and even fewer have the ability to conduct test flights.

------
PhantomGremlin
Aircraft manufacturers and operators go through great amounts of effort to
avoid single points of failure. E.g. on a twin they overhaul the two engines
at different times.

But this is different. I wonder if they need to rethink their approach to
software? Four engines, running the same software --> single point of failure.
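
A back-of-the-envelope sketch of why that matters, in Python. The model is my own simplification: each engine fails independently with probability `p_independent`, and separately there is a probability `p_common` of a shared fault (for example identical software) that stops all of them at once. The function name and any numbers you plug in are purely illustrative.

```python
def p_all_engines_fail(p_independent, p_common, engines=4):
    """Probability that every engine is lost, under the toy model above.

    Either the shared fault occurs (all engines stop together), or it does
    not and every engine has to fail on its own.
    """
    return p_common + (1 - p_common) * p_independent ** engines
```

With four engines, the independent term shrinks as the fourth power of the per-engine failure probability, so even a very small `p_common` quickly dominates the result. Redundancy only buys safety against failures that are actually independent.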

------
sean-duffy
Wow, this comes as a shock! Last year I saw the A400M appear at the Royal
International Air Tattoo and was very impressed, as were many others.
Hopefully they'll find the problem and this won't be too much of a blot on the
development of this aircraft.

------
tim333
From another report:

"Problems in developing the engines, and particularly in certifying the engine
control software, contributed to three years of delays and a new cash
injection by governments in 2010."

Seems like they had some issues. Surprising it's that difficult.

[http://www.reuters.com/article/2015/05/19/airbus-a400m-idUSL...](http://www.reuters.com/article/2015/05/19/airbus-a400m-idUSL5N0YA28V20150519)

------
userbinator
I wonder if this is what led to Airbus making a pretty vague statement
compared to Boeing about its software in this other recent related news:

[http://www.bloomberg.com/news/articles/2015-05-18/hacker-cla...](http://www.bloomberg.com/news/articles/2015-05-18/hacker-claims-of-plane-takeover-aren-t-credible-official-says)

------
tntcl
So basically the software bug happened, because there are quality problems in
the manufacturing street? wtf!?!?

~~~
madez
The original speaks of quality problems in the manufacturing plant. This is
very general and could include problems in the design phase up to the flashing
of the module after assembly of the plane.

~~~
tremon
I think the OP was reacting to the use of the term "quality problem" to
describe a catastrophic failure leading to loss of human life. At least, that
was my response when reading that quote.

------
zurn
Anyone know how much effort Google is putting into improving Translate? It
feels like it hasn't gotten much better over the years even with these first-
tier language pairs.

~~~
crististm
Effort has little to do with it. If you don't have an angle of attack, you
can't solve the problem. Humans have years of experience to put a text into
context and interpret it.

~~~
zurn
That would be a bad excuse for not improving even if GT matched the state of
the art. But it doesn't for many/most languages.

