Our standards as developers & testers have gotten pretty low.
I think the only way you'd see outrage here at HN is if the restart involved a real physical crash (as opposed to a software one) and the loss of human lives. Otherwise, we're all, "meh."
The problem is when "have you tried rebooting" becomes the first troubleshooting step and you stop investigating.
But if you realise after the investigation that a reboot reliably prevents the issue, it's a worthwhile approach, especially if fixing the error would require many complicated changes that may lead to further errors.
It's just an unsolved problem. No one has ever known how to write complex, robust software. Even with memory-safe languages. Even with advanced type systems, generative tests, immutability and advanced concurrency primitives.
And it's not a discipline problem.
I can't say that my car has ever gone 149 hours without a "reboot" of some kind or another.
Or if it were an American company. There wasn’t any outrage over Germanwings’s lack of attention to their suicidal pilot’s mental health, even though there were significant warning signs before he crashed a plane into the Alps. Little outrage over Malaysian Air despite their pilot practicing his suicidal crash in the simulator months before. But for an American company, out come the pitchforks.
The fact that Airbus hasn’t yet had a 350 crash is just happy luck, but the issues are still just as outrage worthy. Where is the rage against the EASA that allowed such a glitch to pass certification? Where are the complaints against EASA outsourcing certification tasks as the FAA was criticized about?
If this were an American company, you’d see plenty of outrage here and it would be related to some flavor of the United States not being leftist enough (i.e. having lower regulatory strictness, lower taxes and thus an “undersized” bureaucracy). There is an undeniable and distinct anti-American sentiment on HN when it comes to issues like this. On everything from climate, economic policies, consumer issues and defense to even the food and housing choices of Americans, the US always seems to be held in a persistent, low-level contempt, while almost anything European or Japanese gets a positive initial reaction. This might be a controversial opinion around here, but it’s my perception after being around the HN community for almost 7 years and watching it change from a place that celebrates entrepreneurship, business and capitalism to little more than a better moderated subreddit for those that view American startups and technology with derision and contempt. Actual Marxist ideas get upvoted and free market ideas get downvoted. Which is ironic because Marxism didn’t create anything that keeps many of us highly compensated while working on cool technology. This forum itself wouldn’t even exist were it not for a venture capital firm. Most of the tech stack we all use comes from entrepreneurship and innovation from the free market.
Since Airbus is essentially a state-owned enterprise in the practical sense, of course it isn’t going to result in any outrage — or at least not more than one of those dirty, evil, profit-oriented American corporations.
Your claim that American companies are getting the shaft is complete nonsense. It's classic US conservative insanity. It is not a shock, nor is it anti-American bias, that conscious engineering decisions which result in loss of life get more scrutiny and outrage. There was plenty of outrage for the VW emissions scandal. Here, not only have there not been any deaths, but an update's been provided. If you seriously think there will be no outrage for a European engineering disaster here, you're crazy.
Otherwise, I agree. Specifically, the criticism of our health care system. How people can see no issue with it is beyond me. There's conservatives that literally think the problem is us treating people in the ER, instead of letting them die. Disgusting.
OP is completely off base on this issue. The examples he gave are not an apples to apples comparison. The idea that we wouldn't criticize an EU company for making boneheaded business/engineering decisions that cost lives is ludicrous.
Fact check: Harsh but mostly true. This used to be the forum for the rationalist and capitalist community in tech, or at least people that tolerated them or were willing to engage without downvotes. Now it attracts those with such distaste for rationalist viewpoints that pg himself is altering his essays. His original point was much clearer and more memorable than his revision. Are there any alternatives to this site that look more like the news.ycombinator of 5+ years ago? Maybe a Discord or subreddit or Telegram group?
EDIT: the replies to your post only reinforce your point, you are being called a kook (“completely insane”) for your post, which is antithetical to civil discussion.
The subhed and the above graf say it's as straightforward to fix as installing Airbus software patches. But I assume this process is not as streamlined or convenient as iOS or Tesla OTA auto-updates. Anyone have insight into what installing patches on an airliner entails? I'm assuming it involves more downtime than powering down and restarting, if that's the current status quo.
Which is a horribly long process, around 6,000 man-hours and puts the aircraft out of commission for a few weeks.
As long as the hardware has been tested, and the software update tested on different hardware, then as long as the test hardware and my hardware are nominally the same, and as long as the software has basic "self test the basics of every component on startup", then I don't see a reason to do more tests.
Certifying an upgrade is the hard part.
But the point is, this reboot process is very well managed and known. So I won't call it scary.
> The remedy for the A350-941 problem is straightforward according to the AD: install Airbus software updates for a permanent cure, or switch the aeroplane off and on again.
Maybe the update process isn't streamlined enough, or maybe there are other reasons not to upgrade, but in general, the airlines need to install the updates.
The issue here is overflow due to time. The time is saved in a variable (I don't know how many bits), which overflows after the given period. Now there are two options:
1. Upgrade circuits of every plane. These planes were designed/built a long time back. Bigger registers were not practical due to costs.
2. Document it and have a process for it.
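For what it's worth, here is a minimal sketch of how a time counter overflow bites, plus the usual software-side way of living with it. Everything in it is an assumption for illustration: the 32-bit width, the tick values, and the names tick_add, expired_naive and expired_wrap_safe are hypothetical, and nothing quoted from the AD in this thread says what the A350's counter actually looks like.

```python
MASK32 = 0xFFFFFFFF  # assumed counter width: 32 bits (hypothetical)

def tick_add(ticks, delta):
    """Advance a tick counter, wrapping like a fixed-width register."""
    return (ticks + delta) & MASK32

def expired_naive(now, deadline):
    # Plain comparison: gives the wrong answer once the deadline has
    # wrapped past zero while `now` has not wrapped yet.
    return now >= deadline

def expired_wrap_safe(now, deadline):
    # Treat (now - deadline) mod 2^32 as a signed 32-bit difference.
    # Only valid if deadlines are less than 2^31 ticks in the future.
    diff = (now - deadline) & MASK32
    return diff < 0x80000000

start = 0xFFFFFF00                  # counter is close to wrapping
deadline = tick_add(start, 0x200)   # wraps around to 0x100

print(expired_naive(start, deadline))      # True: timer "fires" immediately
print(expired_wrap_safe(start, deadline))  # False: deadline not reached yet
print(expired_wrap_safe(tick_add(start, 0x200), deadline))  # True
```

The wrap-safe comparison is not free: it only holds while every deadline is less than half the counter range away, which is itself the kind of limit that has to be documented and tested.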
In plane investigations I've looked at (not many) the issue has always been a compounding of several errors or shortcomings. To me, that strongly suggests you shouldn't let small errors build up in different systems.
If it's a register which takes down the whole system then surely they'd know that (and could fix it with a watchdog that returned the affected systems to the boot state without a reboot). Other comments seem to be saying "meh, it's complex, doesn't matter what the error is as long as reboot fixes it"; that seems really dangerous in safety-critical systems.
But I acknowledge the "better the devil you know" issue and that pragmatism and cost take over at some point.
I agree with that. But once you know about a specific bug, there is always a solution.
How is this news?
Place- Small financial institution in the NE.
Having just finished two weeks on the job, one Friday evening before heading out I decided to reboot my Sun Sparcstation. I of course did not have root but there was L1-A which put you into the BIOS. Then: sync;sync;reboot
Workstation starts rebooting.
30 seconds later the sysadmin is standing over my shoulder.
SA - "What did you do?"
me - "I rebooted it"
SA - Incredulous. Like I just set my hair on fire. "Why?"
me - "It's been two weeks, you know, defrag memory, free up the page tables"
(some vague pseudo-CS BS)
SA - "This is a Solaris X.Y/SunOS Q.Y machine.
It had an uptime of 180+days.
I have machines here with an uptime of 2+ years."
me - "Really?"
SA - "These machines do not need a reboot. Ever. Please do not do this again."
How far we have (pro|re)gressed in 20 years!
;) (edits - typo)
This was a bug that was known; what if there are others?
From that point on, I made sure tests covered counter wrap-around at 2^31, 2^32, etc. Until one day, someone told me the system I worked on, deployed in a Comcast data center, had an issue after ~250 days...
I used to hear news about Newton, Win98, 777, etc. having issues after 25 days, 49 days, etc. It is very easy to guess what the potential issues were.
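The guessing is just arithmetic. A quick sketch, assuming signed or unsigned 32-bit counters ticking in milliseconds or hundredths of a second (the pairing of counter width and tick rate with any particular product is my assumption, and days_until_overflow is a made-up helper name):

```python
SECONDS_PER_DAY = 86400

def days_until_overflow(bits, ticks_per_second):
    """Days until a counter with `bits` usable bits wraps at the given tick rate."""
    return 2 ** bits / ticks_per_second / SECONDS_PER_DAY

# 31 usable bits (signed 32-bit) counting milliseconds
print(round(days_until_overflow(31, 1000), 2))  # 24.86
# 32 bits (unsigned) counting milliseconds
print(round(days_until_overflow(32, 1000), 2))  # 49.71
# 31 usable bits counting hundredths of a second
print(round(days_until_overflow(31, 100), 2))   # 248.55
```

Those come out close to the 25 days, 49 days and ~250 days mentioned above, which is why the numbers alone usually point at a wrapped tick counter.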
E.g. you might want to set a certain bit if the value went outside the defined input range of a function (think for example far from the accepted numerical window of a Taylor series expansion). Instead of dealing with error conditions at each and every step (thereby making the timing properties of the code unpredictable) you just collect all error conditions and at the end decide if you discard the value, or just use it partially (e.g. it might still be "good enough" to be used as a parameterizing input of an adaptive filter, where it averages out with the rest).
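A rough sketch of that pattern, with sticky error bits carried alongside the value and a single decision at the end. The flag name ERR_RANGE, the accepted window of |x| <= 1, and the functions sin_approx, pipeline and accept are all hypothetical choices for illustration, not taken from any real avionics code:

```python
ERR_RANGE = 0x1  # hypothetical flag: input was outside the accepted window

def sin_approx(x, flags):
    """Third-order Taylor approximation of sin(x), assumed valid for |x| <= 1.

    Never branches into error handling: it only sets a sticky bit and
    still returns a value, so every call does the same amount of work.
    """
    if abs(x) > 1.0:
        flags |= ERR_RANGE
    return x - x ** 3 / 6.0, flags

def pipeline(x):
    """Run two stages and return the result with all accumulated flags."""
    flags = 0
    y, flags = sin_approx(x, flags)
    z, flags = sin_approx(0.5 * y, flags)
    return z, flags

def accept(flags, tolerated=0):
    """Decide once, at the end, whether the value is usable."""
    return (flags & ~tolerated) == 0

value, flags = pipeline(2.0)
print((flags & ERR_RANGE) != 0)            # True: stage one was out of range
print(accept(flags))                       # False: strict consumer discards it
print(accept(flags, tolerated=ERR_RANGE))  # True: a tolerant consumer may use it
```

The tolerant case corresponds to the adaptive filter example: the consumer knows the value is degraded and chooses to use it anyway.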
"The remedy for the A350-941 problem is straightforward according to the AD: install Airbus software updates for a permanent cure ..."
And it has the known workaround.
So this has almost nothing to do with Airbus at this point, the directive and the "sighs" uttered by the EU aviation agency are directed at the airlines that won't install the update.
But good distraction from Boeing's woes as long as you only read the headline...
Why won’t EASA ground the un-updated airplanes?
I know, the A350 bug hasn't killed anyone (yet). But I see the parallels in the issue, and yet the reactions here are completely opposite.
“Hello, IT, have you tried turning it off and on!?”
AFAIK that is what happens here.
"The CPIOM is effectively a mini computer; in the A350 CPIOMs run discrete avionics "applications", in the sense of apps. CRDCs themselves do not host or run applications, suggesting that the failure condition detailed in the EASA AD may mean loss of a particular app on a CPIOM after a buffer overflow."
Or do you have any objective basis for suspecting that this is a buffer overflow?
But none of these kinds of observations are fundamental. There's no general conclusion regarding software engineering to be drawn.