
The Bug Nobody Is Allowed to Understand - lisper
https://www.gnu.org/philosophy/bug-nobody-allowed-to-understand.html
======
sgentle
Yes and no. I think Stallman is right in that it's important to realise the
_relationships between_ components are often a bigger and more pernicious
source of bugs than the components themselves. It's not a problem when you can
pretend the components are actually one single component, which is obviously
not possible with service architectures or proprietary code.

I'd expand that to say there's a kind of CAP theorem analogue for code, where
if you want it to be sufficiently modular (partition-tolerant) you'll need to
either sacrifice availability (before you know if it works you have to check
its interaction with every other module) or consistency (you write code
assuming the other modules work the way you expect and maybe they don't).

Unfortunately, beyond that core idea this article is just sort of befuddled.
It _is_ possible to understand the behaviour of a component without seeing its
code (contracts, APIs, specs and tests are all examples of doing this).
Further, it's a problem not at all specific to proprietary software. Even if
you have access to every line of code ever written you're still not going to
be able to understand the million interactions that could exist between all
the things currently running on your computer in their various languages,
frameworks, architectures and programming styles. And, to the extent that you
can, it's probably not a very good use of your time compared with just
emailing whoever wrote it and saying "hey, your software's not doing what I
think it should do".

~~~
qwerta
I upvote, but disagree.

Studying source code is the first step. There is a difference between "full
understanding", and fixing a bug when software crashes under very specific
conditions. Modern IDEs (and hopefully unit tests with VCS history) are great,
1+ MLOC is not really a problem.

Good luck with contacting the original author, I usually give up after a few
weeks.

And even proprietary software can expose enough "source" without compromising
its monopoly. Microsoft bundles debugging symbols, so you can debug their
software and roughly understand what's going on.

------
kcorbitt
If multiple proprietary software packages are talking to each other at all,
there must be either an implicit or explicit specification they're talking
over. And if that interaction is broken, that implies that either (1) the spec
is ambiguous/wrong or (2) one or both parties are implementing the spec
wrongly.

It seems to me that engineers from the relevant companies ought to be able to
get together, talk over the problem and figure out which of those is the case,
even if they're not looking at the same source code.

In any case, a well-defined spec/API is critical to effective integrations
between pieces of software maintained by different teams, even if both
components are open source.
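
As a rough illustration of the point above (a sketch only, not anything from the
article): if the spec is written down as an executable check, both sides can
find out who is breaking it while treating each other's code as a black box.
The wire format, the function names and the stand-in components below are all
hypothetical.

```python
import re
from typing import Callable, Tuple

# Hypothetical wire format agreed by both teams: "<name>:<non-negative integer>"
SPEC = re.compile(r"[a-z]+:[0-9]+")


def conforms(message: str) -> bool:
    """Return True if the whole message matches the shared spec."""
    return SPEC.fullmatch(message) is not None


def diagnose(producer: Callable[[str, int], str],
             consumer: Callable[[str], Tuple[str, int]],
             name: str, count: int) -> str:
    """Treat both sides as black boxes and report where the contract breaks."""
    message = producer(name, count)
    if not conforms(message):
        return "producer violates spec"
    try:
        result = consumer(message)
    except Exception:
        return "consumer rejects a conforming message"
    if result != (name, count):
        return "consumer misreads a conforming message"
    return "ok"


# Stand-ins for two closed-source components (assumptions for the example).
def good_producer(name: str, count: int) -> str:
    return f"{name}:{count}"


def bad_producer(name: str, count: int) -> str:
    return f"{name}={count}"  # uses "=" where the spec says ":"


def good_consumer(message: str) -> Tuple[str, int]:
    name, count = message.split(":")
    return name, int(count)


print(diagnose(good_producer, good_consumer, "disk", 3))
print(diagnose(bad_producer, good_consumer, "disk", 3))
```

This only covers case (2), a party implementing the spec wrongly. If both sides
pass and the integration still breaks, that points back at case (1), the spec
itself being ambiguous or wrong.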

~~~
percept
"...you have teams as well as individual employees in companies able to manage
the technology, but these tech-savvy people are never high enough in the
company's hierarchy. What happens time and again is that those who have the
expertise get overruled by those who don't."

~~~
Aldo_MX
Even when tech-savvy people are high enough in the company's hierarchy, the
business culture usually is to avoid fixing a bug until the bug costs more
money than its fix.

In my honest opinion this sucks, but oh god, most companies live with an "only
money matters" mindset...

~~~
yourad_io
> Even when tech-savvy people are high enough in the company's hierarchy, the
> business culture usually is to avoid fixing a bug until the bug costs more
> money than its fix.

That's not quite 100% correct.

> In my honest opinion this sucks, but oh god, most companies live with an
> "only money matters" mindset...

That's quite 100% incorrect.

Good businesses will do risk assessment. If the risks are communicated
correctly, the choice of mitigating them for a few man-hours is obvious.

There are bad businesses, but I don't think that's most businesses.

~~~
Aldo_MX
I completely agree with you... in theory.

But in my experience, companies tend to look down on issues which affect
customers but don't affect the company directly. The risks are understood
perfectly, but the choice almost always is "we will be as negligent with our
customers as we can be, no matter if our customers die".

An old example which features "no matter if our customers die" is the Ford
Pinto[1].

Fortunately, there is no current example as fatal as the Ford Pinto, but
the practice of overlooking what affects customers remains in effect. At the end
of the day, Risk Assessment is just a way to say "what matters to partners and
investors", which is what I tried to summarize with my previous point of
"only <s>money</s> profits matters".

[1] http://myweb.whitman.syr.edu/pjcihon/LPP%20255/Ford.Pinto.Memo.pdf

~~~
sokoloff
The Takata airbag and GM ignition switch issues are comparable, IMO.

"If the air bag housing ruptures in a crash, metal shards from the air bag can
be sprayed throughout the passenger cabin—a potentially disastrous outcome
from a supposedly life-saving device."

------
cmdkeen
The Guardian like to run these stories about how terrible competition is
without acknowledging that no other system in reality actually fixes the
problem; no-one is actually ruled by philosopher kings. IT security is
notoriously bad in many governmental areas, non-profits, universities etc as
Manning showed us. The Soviet Union after all gave us the Stakhanovite
movement.

The difference is that in a competitive environment you get creative
destruction, where things that go wrong benefit competitors who then learn
from the problem. Yes, there are all sorts of problems when banks become too
big to fail, but it's far worse when there is only one bank.

Competition is an amazing thing - this salesman who moaned to the Guardian
has an opportunity to start selling better encryption software, disaster
planning and testing consultancy services, virtualisation software or toolkits
that stop random people messing with production servers they don't understand.

~~~
Retra
Competition is only valuable as a means to attain greater cooperation. Once
you start sacrificing cooperative results by taking shortcuts, worshipping
selfishness, or keeping your failures secret, then your competitive
environment is doing more harm than good.

It is good when a team _gets together_ to beat an opponent. It is good when a
company _works with its clients_ to build a better product.

So cooperation is the other system that, in reality, actually fixes the
problem. Almost every significant human achievement has been a result of our
mastery of language, which is clearly a tool of great cooperation, not of
competition.

~~~
cmdkeen
Why do we cooperate though? Internet Explorer didn't get better through
cooperation, despite having a big team, until there was competition. The Space
Race was competitive with the Soviets.

Cooperation isn't the inverse of shortcuts, selfishness or keeping failures
secret. Cooperation between companies that are supposed to compete, the
forming of cartels, is a really bad thing after all.

Humans are competitive, social creatures. Cooperation starts at the family
and tribal level which morphed into nation states (eventually), but it is
fundamentally driven by competition.

------
geographomics
One doesn't need access to source code to fix bugs. However, it does make it a
great deal easier, and saves time that would otherwise be wasted reverse
engineering the system to understand exactly what is going on. But it's not a
necessity.

~~~
userbinator
I think this is something that should be understood far more - I've seen far
too many developers basically give up when the execution of the code they're
debugging goes into something they didn't write or an exception happens in
there, and are shocked when I'll just keep on tracing in Asm. In fact for
languages like C++ where a single statement can do a lot of implicit things, I
prefer to debug in Asm since I can see exactly where things went wrong.

Maybe education is partly to blame, as many students are taught that the only
thing they interact with is the source code, the binary being an opaque blob
that comes out of a one-way process, and trying to go the opposite way is
somehow seen as "wrong" (legal issues notwithstanding); the emphasis on low-
level architecture and Asm as "you're not supposed to know this" further adds
to the notion, along with the increasingly closed nature of software and
hardware. Contrast this with those who grew up with early computers of the
late 70s/early 80s, where it was almost second-nature to use a disassembler
and analyse the firmware even without its source code (the source was usually
Asm in those days, so you didn't miss much, but my point still stands.)

~~~
name_censored_
> I think this is something that should be understood far more - I've seen far
> too many developers basically give up when the execution of the code they're
> debugging goes into something they didn't write or an exception happens in
> there

Those other devs may be suffering from a kind of "appeal to authority" fallacy
- they assume that this release-grade third-party code section must be
correct.

Or, it may be that they know that even if the problem _does_ lie in the third-
party binaries, the only fix is to submit a bug report. Since a bug report
with a comprehensive steps-to-reproduce is more likely to get addressed than
one with an ASM trace, it makes sense to focus effort there.

Or, it may be that they're clock punchers/overworked, and tracing someone
else's code jeopardises them leaving on time/shortening their "in" tray.

------
yourad_io
Are we discussing this[1] or did I somehow get lost?

> Then he tries to copy the contents back, which is impossible with encrypted
> files and this is how he discovers what he's done [...]

 _facepalm_

> To unlock the encryption you need special keys, which are stored in one
> central place [...] They went through the system and thank God, the switches
> had not yet been reset, meaning the keys could be retrieved

Thankfully, God duplicated the keys onto... switches?

After a while, my eyes rolled too much and I stopped for fear of epilepsy.

[1] http://www.theguardian.com/commentisfree/joris-luyendijk-banking-blog/2012/may/30/former-it-salesman-voices-of-finance

~~~
lisper
I actually posted both this and the non-link-baity version at the same time:

https://news.ycombinator.com/item?id=8799505

It was kind of an experiment to see which one would be upvoted more. Can't say
I'm surprised at the result.

------
chrismcb
The premise of the redirectable blog post is interactions of a variety of
systems. The original post talks about new technology being a problem and no
one having procedures for it. YET the anecdote that is being solved is an
uninformed user doing something he knows little about. Yes, new technology
should make it harder to break systems. But a user mixing around in something
he knows little about is an age-old problem. And it had nothing to do with
proprietary systems, interoperability, or new technology.

------
andybak
And how does OSS fix this? It's still possible to create impenetrable and
poorly understood systems even if you take proprietary software and SaaS
completely out of the equation.

~~~
aceperry
The working assumption is that anyone who has a problem somewhere along the
software chain will have access to the source and have the ability to fix
problems when they need to. They're not dependent on one company to make fixes
when there is a problem.

------
SixSigma
My biggest shock in this story is the people in the comment section who find
the premise of the story incredible; that many institutions - small, big,
large and very large - have inadequate disaster recovery plans and it is only
a matter of time until one makes it into the national news.

------
jamesaguilar
In reality, when this happens, the companies in question call each other up
and fix it, generally speaking.

------
lotsofmangos
After reading the linked article, I am thinking of writing a bash script
called 'lookBusy' that makes a machine look as though it is doing vital work.

This can then be run on any machines that are idling but important, to
reduce the chance of idiots appropriating them.

~~~
click170
You need to put more effort into being lazy. Google, my good man, Google.
Those scripts already exist, all you have to do is download and run them.

~~~
Karunamon
Is that more or less effort?

~~~
lotsofmangos
Much less fun.

