
F-35 radar system has bug that requires hard reboot in flight - pavornyoh
http://arstechnica.com/information-technology/2016/03/f-35-radar-system-has-bug-that-requires-hard-reboot-in-flight/
======
hackuser
I read elsewhere, from a reliable source that I can't find right now, that the
mean time between reboots on 'mature' fighter planes is 8 hours and the F-35
is at around 4 hours so far with plans to improve it.

It was the first number that shocked me the most.

EDIT: I found it:

[http://breakingdefense.com/2016/02/bogdan-predicts-f-35s-for...](http://breakingdefense.com/2016/02/bogdan-predicts-f-35s-for-less-than-80m-engines-included/)

 _The toughest problem the program is having is matching the timing of the
aircraft’s fusion software with its sensors’ software. “As we add different
radar modes and as we add different and capabilities to the DAS system and to
the EOTS system, the timing is misaligned,” and then you have to reboot it.
Bogdan said he’s aiming for eight to nine hours between such software failures
when a radar or DAS or EOTS needs to be rebooted, which is what legacy
aircraft boast. Right now they are at four to five hours between such events.
“That’s not a good metric.”_

I don't understand a lot of that, but I think the fusion software refers to a
unified UI for the pilot that merges information from the plane's very many
sensors and presents it in a comprehensible way on one screen. Apparently
pilots in older planes suffer from data overload and from watching many
different screens at once - all while trying to fly a plane in combat. UI is
reported to be a highly consequential improvement for the new plane.
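
For a rough sense of what four versus eight hours between reboots means per
flight, here is a back-of-the-envelope sketch. It assumes reboots arrive as a
Poisson process (an assumption, not something the article states) and uses a
hypothetical 2-hour sortie; the function name is made up for illustration.

```python
import math


def p_at_least_one_reboot(mission_hours, mean_hours_between_reboots):
    """Probability of one or more reboots during a mission.

    Assumes reboots follow a Poisson process, so the time between them is
    exponentially distributed. That model is an assumption, not from the article.
    """
    return 1.0 - math.exp(-mission_hours / mean_hours_between_reboots)


MISSION_HOURS = 2.0  # hypothetical sortie length, chosen only for illustration

for label, hours_between in [("F-35 today (~4 h)", 4.0), ("legacy (~8 h)", 8.0)]:
    p = p_at_least_one_reboot(MISSION_HOURS, hours_between)
    print(f"{label}: {p:.0%} chance of at least one reboot")
```

Under those assumptions it works out to roughly a 39% chance of at least one
reboot per sortie at 4 hours, versus about 22% at 8 hours.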

~~~
mbell
> The toughest problem the program is having is matching the timing of the
> aircraft’s fusion software with its sensors’ software

I don't know if I'm interpreting that correctly but what I'm getting out of it
is clock skew between sensors and the fusion system, which I assume to be
referring to sensor fusion.

That makes sense, analogous to a typical distributed systems issue. I would
imagine the skew needed for the type of sensing they do is quite small. I
doubt they can just toss an atomic clock on the plane :)

~~~
gherkin0
> I doubt they can just toss an atomic clock on the plane :)

They totally could. Some atomic clocks are quite small. The one pictured on
the below wiki page doesn't look that much bigger than a 3.5" hard drive.

[https://en.wikipedia.org/wiki/Rubidium_standard](https://en.wikipedia.org/wiki/Rubidium_standard)

~~~
mbell
> They totally could. Some atomic clocks are quite small.

Are they mil spec? In other words do they operate correctly over a very large
temperature range, with large input voltage variance, under high G-forces, in
high vibration environments, etc?

I don't know anything about atomic clocks beyond what I can read on Wikipedia,
but I can say there is generally a large difference between mil spec ICs and
commercial or industrial grade ICs. I would assume this extends to atomic
clock scale devices as well, but that's just a guess.

~~~
engi_nerd
Who needs atomic clocks on individual planes when the US government has spent
billions of dollars to place a fleet of atomic clocks in orbit?

~~~
mbell
Someone who wants their planes to work when the enemy jams GPS and/or shoots
the satellites down. We're talking about war here after all.

~~~
engi_nerd
So you freewheel once GPS goes down. You might drift a few microseconds. This
is not necessarily a problem.

~~~
NeutronBoy
If you're traveling at high speeds and relying on this timing to adjust flight
control surfaces, or trying to use radar to get position of combatants and/or
the ground, a few microseconds clock skew would be a major issue.

~~~
engi_nerd
I didn't say clock skew. I said drift. As in, you lose GPS timing signals so
your own internal master clock freewheels. That has nothing to do with clock
skew (different components within the jet 'seeing' different time due to
delays or other issues). The jets are designed to function with degraded or no
GPS functionality. But, if it's there, they will take full advantage of it.
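
The distinction can be made concrete with a small sketch. All numbers are
hypothetical: the 1 ppb oscillator error and the per-sensor distribution
delays are illustration values, not figures for any real aircraft or clock,
and the function names are invented here.

```python
def freewheel_drift_us(elapsed_s, oscillator_error_ppb):
    """Error accumulated by a free-running master clock, in microseconds.

    1 ppb means 1e-9 s of error per second, i.e. 1e-3 microseconds per second.
    """
    return elapsed_s * oscillator_error_ppb * 1e-3


def skew_us(delays_us):
    """Largest disagreement between components that all read the same
    master clock, given each component's distribution delay in microseconds."""
    return max(delays_us.values()) - min(delays_us.values())


# Hypothetical delays between the master clock and each sensor.
delays = {"radar": 0.2, "DAS": 1.5, "EOTS": 0.9}

# One hour of freewheeling after losing the GPS timing signal.
drift = freewheel_drift_us(elapsed_s=3600, oscillator_error_ppb=1.0)

# Drift shifts every component's view of time by the same amount,
# so the skew between components is unchanged by it.
before = skew_us(delays)
after = skew_us({name: d + drift for name, d in delays.items()})

print(f"drift after 1 h without GPS: {drift:.1f} us")
print(f"skew before: {before:.1f} us, after: {after:.1f} us")
```

With these made-up inputs the master clock drifts about 3.6 microseconds in an
hour, while the skew between the sensors stays at about 1.3 microseconds
either way.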

------
maxxxxx
I am getting a little tired of the "F-35 is bad" headlines. There should be
more context that tells us how common such problems are in general. I remember
listening to a podcast where they said that SR-71 pilots had to restart their
engine every so often. Same for the Concorde. Sounds really bad but neither of
these problems hurt the airplanes' usability and were not really a problem.

I firmly believe that military procurement is a mess but the discussion
reminds me of the political discussion where Republicans jump on anything that
goes wrong under Democrats and vice versa.

~~~
Jtsummers
To be fair, the SR-71 was a highly specialized and, ultimately, highly
experimental aircraft for its time.

The Concorde was similar. Only a few were ever made, and they were never
expected to be the ubiquitous aircraft of their time or even their type
(reconnaissance and passenger transport).

The F-35 is intended to replace several aircraft models which currently
satisfy more specialized roles. It _was_ intended to be the ubiquitous (in
various configurations) fighter aircraft for the US military (and some
allies). This problem may not be the worst of its issues, it may not even be a
major issue or really (from the pilots' perspective) an issue at all since
similar reboots (but less frequent) are the norm for them in other aircraft.

But many other issues, along with the reduced production numbers and the
increased production and development costs, show that there are numerous
problems with this project in particular. They reveal failings within the US
political structure (that's essentially forced this project to continue), DoD
procurement (that allowed many of these issues to occur initially), and
defense contractor procedures (LM, IIRC, doubling or tripling the number of
engineers on the software side to "speed things up").

Really, it's going to be a classic for future engineers, computer scientists,
MBAs and others to study. So I guess that's a good thing.

~~~
maxxxxx
I totally agree that the F-35 project is a mess. But I would like to see a
real analysis of the problems and not just jumping on any problem that sounds
bad on the surface. Maybe a hard reboot is a problem, maybe not. I don't know.

~~~
Jtsummers
[http://www.dote.osd.mil/pub/reports/FY2015/pdf/dod/2015f35js...](http://www.dote.osd.mil/pub/reports/FY2015/pdf/dod/2015f35jsf.pdf)

Haven't read it all, but started reading it. At least this is more "from the
horse's mouth" rather than outside commentary like the Ars article.

------
sean-duffy
According to a lot of people in this thread[0], it is pretty much expected
that the radar in any fighter jet will need rebooting from time to time.

[0]
[https://www.reddit.com/r/technology/comments/49iegx/radar_gl...](https://www.reddit.com/r/technology/comments/49iegx/radar_glitch_requires_f35_fighter_jet_pilots_to/)

~~~
neurotech1
This is correct. A radar or mission avionics system reset isn't a critical
emergency. It may force a mission abort, but the jet is still flyable.

A reset event would be more worrying if it were an FCS (Flight Control System)
computer reset in flight. Air Asia 8501 did something similar and the jet
crashed into the sea. An F-22 pilot failed to properly reset his FCS before
takeoff, and destroyed a $150m+ aircraft at Nellis AFB.

------
twoodfin
I assume that resolving bugs by rebooting is part of the design of the
failure model. When a software fault is detected, its consequences are
unknowable. Those consequences could include killing the pilot and destroying
his $100M aircraft. Therefore, the default response is to reset everything and
get to a known-good state ASAP.

This is not a bad general-purpose failure strategy: It's been believed for
decades that most software faults encountered in production are Heisenbugs[1],
and disappear rather than reproduce.

[1] [http://research.microsoft.com/en-us/um/people/gray/papers/Ta...](http://research.microsoft.com/en-us/um/people/gray/papers/TandemTR85.7_WhyDoComputersStop.pdf)

~~~
macintux
Indeed. Hence Tandem's success with hardware and Erlang's with software.

Crashing doesn't have to be crashing.
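
As a minimal sketch of that reset-to-known-good idea (in Python rather than
Erlang, and not how any avionics or Tandem system is actually built): a
supervisor runs a task and, on a transient fault, throws away the task's state
and restarts it from a known-good initial state, giving up once a restart
budget is spent. `TransientFault`, `make_sensor_task`, and `supervise` are
hypothetical names.

```python
import random


class TransientFault(Exception):
    """Stands in for a fault that only shows up under rare timing (a Heisenbug)."""


def make_sensor_task(rng, fault_rate):
    """Build a task that processes one frame per call and occasionally faults."""
    def task(state):
        if rng.random() < fault_rate:
            raise TransientFault("timing misaligned")
        return {"frames": state["frames"] + 1}
    return task


def supervise(task, steps, max_restarts):
    """Run task for `steps` calls, resetting to a known-good state on each fault.

    Re-raises the fault once the restart budget is exhausted, so a
    persistent (non-transient) bug is escalated instead of looping forever.
    """
    state = {"frames": 0}  # known-good initial state
    restarts = 0
    for _ in range(steps):
        try:
            state = task(state)
        except TransientFault:
            if restarts >= max_restarts:
                raise
            restarts += 1
            state = {"frames": 0}  # discard possibly corrupt state, start over
    return state, restarts


if __name__ == "__main__":
    # Restart budget equals the step count so this demo never escalates.
    task = make_sensor_task(random.Random(42), fault_rate=0.05)
    final_state, restarts = supervise(task, steps=100, max_restarts=100)
    print(final_state, restarts)
```

The restart budget is the part that matters: a fault that keeps coming back is
not a Heisenbug, and at that point the supervisor passes it up instead of
resetting again.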

------
awinter-py
Though to be fair, the software isn't due to reach 1.0 until 2050 (assuming
the pentagon can scrape up another billion dollars to train ada programmers).
No surprise that alpha software is bug-ridden.

~~~
Someone1234
According to this Stack Overflow answer the aircraft is mostly C/C++, no wonder
they're having so many issues:

[https://stackoverflow.com/questions/9827176/what-is-the-pred...](https://stackoverflow.com/questions/9827176/what-is-the-predominant-programming-language-used-for-the-f35-lightning-ii-aircr)

I'm not sure there is anything I'd build with such insecure languages in 2016,
let alone an aircraft or flight system.

~~~
jasonwatkinspdx
Developing avionics software is quite different from what almost everyone on
this site does. Check out JPL's 10 rules to get a taste of how it works, then
consider coding within a bureaucratic process where not a single character of
the source or supporting documents changes without multiple approval and
confirmation steps.

It's a painful and expensive process, but it _can_ work, as demonstrated by
the handful of teams that have reached the ~1 bug per 1e6 opportunities defect
rate _all_ using variations of the above (tho some of them use Ada).

Let's check ourselves a bit and realize developing flight software is a
different world from web services and mobile apps.

~~~
Someone1234
Smug condescension aside you didn't actually say why using a less
secure/quantifiable language is a benefit to avionics software? I read JPL's
10 rules, they're pretty common sense stuff that I'd hope any C/C++ project
would take advantage of, but they too don't address the reason why C/C++ would
be better than Ada for this specific use-case? Or functional for that matter?

> Let's check ourselves a bit and realize developing flight software is a
> different world from web services and mobile apps.

I agree, it is far more important in avionics to be provably correct and to
eliminate as many bugs as possible. It is exactly BECAUSE these aren't web
services or mobile apps that C/C++ shouldn't even be a candidate.

You've made exactly zero arguments to support C/C++ here. Just acted smug and
condescending to all of us lowly other developers. Do you yourself work in
avionics?

------
Raphmedia
> F-35 radar system has bug that requires hard reboot [of the radar] in flight

~~~
pc86
To be fair, a 4-hour reboot on the radar is pretty bad to have in-flight.

~~~
Demiurge
I don't understand, why are people downvoting comments like yours without any
explanation for those without the background on this :\

~~~
Sharlin
Because it was a simple reading comprehension fail, not specifically related
to the subject at hand.

~~~
solipsism
If downvotes only affected the specific comment, that would make complete
sense, but they also affect karma. I guess it's not irrational to punish
people who consistently misunderstand things, but is that a purposeful
decision or an unfortunate byproduct of the karma system?

~~~
hackuser
The idea isn't to punish people, IMHO, but to improve content for others
reading the conversation. That comment doesn't contribute to the conversation,
and in fact it confuses it, so it gets downvoted to where few will read it.

Voting is about the readers, not the commenters. Personally, other than for
determining specific capabilities such as downvoting, I think tracking karma
per user does little good. Karma per comment - a bit of feedback on your
comment - is useful.

------
BinaryIdiot
Had a bug. Everywhere I read said they already delivered a fix for this but
I've noticed some news outlets not mentioning it.

Though I gotta say this damn jet has been a nightmare. Since they'll never
stop working towards newer and newer jets I'm curious if lessons learned here
will apply to the next one. After doing government contracting for many years
where we don't get the "lessons learned" from the previous contractors or much
of anything (I've had to FIGHT to get source code for a product we were
supposed to update; ended up having to rewrite the damn thing) I'm curious how
it works when they contract out to have a jet built.

------
hacker_9
'Have you tried turning it off and on?'

~~~
engi_nerd
You joke, but for so many things in aviation, this is the answer. There are
also a lot of problems where the answer is "re-poll all your interrupts, that
problem you (the computer) think is a problem shouldn't be anymore".

------
Havoc
Judging by reddit comments this type of thing is pretty common with fighter
jets which is kinda surprising to me. I would have thought that this is one
area where reliability is highly prized. Along with space shuttle software &
other life/death stuff etc.

Instead "bash it with a hammer and maybe reset it" seems widely accepted.
Weird.

------
Someone
So, how fast does it reboot, and how much does this affect the other software
components and operating the plane?

Not that I expect it, but this could be a minor issue, if this 'radar system'
is as good as stateless (and not, for example, a component that tracks friends
and foes over time) and if it reboots in milliseconds.

------
sliverstorm
So? The SR-71 has (had?) a bug that required both engines to restart in
flight. Frequently.

~~~
alayne
Do you have a reference for this?

~~~
dogma1138
It wasn't really a "bug". The computers could not keep up with changes in the
flight conditions, so if the engine spikes were not in the correct position
for the speed and altitude, it could cause a blowback that would extinguish
the engines, which the pilots would then have to restart.

[https://en.wikipedia.org/wiki/Lockheed_SR-71_Blackbird#Air_i...](https://en.wikipedia.org/wiki/Lockheed_SR-71_Blackbird#Air_inlets)

~~~
gherkin0
It also looks like they later mitigated it as technology improved:

> Lockheed later installed an electronic control to detect unstart conditions
> and perform this reset action without pilot intervention. Beginning in 1980,
> the analog inlet control system was replaced by a digital system, which
> reduced unstart instances.

------
dboshardy
Looks like they need to add a "Turn Radar Off and On Again" to their FENCE IN.

------
iblaine
> Lockheed Martin has discovered the cause of the problem and has diverted
> developers who were working on the next increment of the F-35's code to fix
> it. A patch is expected by the end of March.

Much ado about nothing.

~~~
s_m_t
Seriously, isn't this to be expected? I know a lot of people have a massive
hate boner for the f-35 because they read a lot of shitty war blogs but for a
bunch of coders to act incredulous towards this information is absolutely
ridiculous.

How many bugs have any of you introduced and fixed in the past week?

~~~
solipsism
Most of us are working on things that are several orders of magnitude _less_
likely to get someone killed than the F-35 radar code. The goal for safety-
critical embedded flight code is 0 hard reboots mid-flight. Is that
controversial, in your opinion?

Edit: "standard for safety-critical embedded flight" was a bad word choice and
wasn't what I meant. s/standard/goal

~~~
icegreentea
They found a bug in testing and are fixing it. That's literally why you test
critical code, in case you didn't get it right.

~~~
solipsism
I realize this. But bugs are sometimes worth talking about. Many people feel
that a bug requiring a hard reboot mid-flight in a 5th generation fighter, a
problem apparently not able to be overcome through multiple generations of
fighters, is worth talking about. If you, or the original complainer,
disagree, you're free to not participate in the conversation, but complaining
about the rest of us talking about this bug is annoying.

Quoting from Lockheed Martin: _The F-35 has the most robust communications suite of any
fighter aircraft built, to date. Components include the AESA radar, EOTS
targeting system, Distributed Aperture System (DAS), Helmet Mounted Display
(HMD), and the Communications, Navigation and Identification (CNI) Avionics._

If by _robust_ they mean "you can go 8 hours instead of 4 hours between hard
reboots", I think that's worth talking about.

------
crb002
How much you want to bet that simply running Valgrind would have caught it?

~~~
officialchicken
Everything... valgrind can not and does not fix GIGO.

------
themodelplumber
Anybody know how long the reboot takes? That seems like important information.
Especially if e.g. your AAMs need directions from the master system (?) in
order to take over tracking by themselves.

~~~
bra-ket
3hr and 55 min

------
matt_wulfeck
I bet the Chinese read about the F-35 fiasco and just laugh at us. The amount
of money we're throwing down the hole here is staggering.

------
btbuildem
"Have you tried turning it off and on again?"

------
bsamuels
this doesn't surprise me one bit. I interviewed at one of the major defense
contractors for a programming job last year and the technical interviewer had
never heard of C#

this was for a java position

~~~
tclmeelmo
What use would a major defense contractor have for C#? And on what basis do
you assume their code base is poor?

~~~
simonw
It doesn't matter if they're using it or not. If you haven't _heard_ of C#,
you are very unlikely to be a competent software engineer.

~~~
tclmeelmo
There is so much more to software engineering than choice of language. I think
it's entirely possible that defense contractors have more knowledge of the
software development lifecycle than you are giving them credit for.

~~~
mikestew
I believe parent's point is that one should _know_ that C# is a choice, not
that they have to use it. How can one make an informed choice if one doesn't
know what the choices are? If all you have is a hammer, and all that.

~~~
Jtsummers
One of the best programmers I ever worked with would probably not know of C#
beyond that it exists, maybe that MS makes it.

But I'd hire him over nearly anyone else for embedded work. He did absolutely
no programming or engineering work after hours. He played his guitar, went to
music festivals, and ran a club (a couple different times in his life, don't
know what he's doing these days or if he's still alive, he'd be past 70 now).
C# doesn't enter into the discussion when developing embedded software, and
his lack of knowledge of it indicates nothing about his character or ability.

------
l3m0ndr0p
Do they initiate the reboot with CTRL-ALT-DEL sequence?

