
Scientists Explain Why Computers Crash But We Don't - jaybol
http://www.physorg.com/news192128818.html
======
jerf
I'm underwhelmed. The reason we build our programs that way is that it
actually makes them _less_ likely to crash. We actually have access to
programs structured more like the E. coli graph. They crash a lot. The fact
that one error takes down the whole system has less to do with the
organizational structure of the code, and more to do with the fact that we've
deliberately built the system so that one error will crash the entire
system... because the alternative is worse. There's no recovering from a
segfault not because there aren't enough different implementations of core
functionality, but because there is no sensible way to recover from a
segfault.

Cells have a type of redundancy programs do not have, and may never have, and
it actually hasn't got anything to do with the source code at all. They have
numerous independent copies of things that mostly work, most of the time. They
perform tasks that transform lots of things into lots of other things, most of
the time. The whole system is built so that everything mostly works, most of
the time, usually recovers, and it doesn't matter much whether this cell dies
or that mitochondrion malfunctions. It's an entirely different set of
primitives brought on by massive parallelism. (And not the usual biological
parallelism you hear about in the brain, but physically, everywhere.)

It's the difference in primitives that brings about the difference in result.
Difference in optimal layout is an _effect_, not a _cause_. Cells don't have
the option to work like programs, and programs really don't have the option to
work like cells. (At least not without a lot more computational resources.)

~~~
vsingh
>The whole system is built so that everything mostly works, most of the time,
usually recovers, and it doesn't matter much whether this cell dies or that
mitochondria malfunctions.

What makes you think we can't design reliable software systems that way? In
fact, I think it has already proven to be a remarkably good idea:
<http://erlang.org/>

~~~
jerf
Heh, I actually cut out a segment where I started describing a system I could
build that would work like that, starting with Erlang. (I actually program
professionally in Erlang, though not exclusively.) It got too long and
parenthetical.

It's still not the same. There's just no biological equivalent to a bit of
code that is dividing by zero or referencing a file that doesn't exist, or any
of several other errors I've made that have brought enormous swathes of the
supervision tree down because the supervisors keep restarting crappy code.
Erlang adds this sort of biology-style robustness at the top of the stack;
biology itself works with it at the bottom. That changes everything.
Programming with
massive unreliable parallelism may indeed someday happen, but it's a long road
between here and there.
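That restart-escalation behavior can be sketched in a few lines (in Python rather than Erlang, purely for illustration; the class and parameter names are invented, and this is a drastic simplification of a real OTP one_for_one supervisor): a supervisor restarts a crashed child, but once the restart intensity is exceeded it gives up and takes its whole subtree down with it.

```python
import time

class SupervisorGaveUp(Exception):
    """Raised when restart intensity is exceeded; the failure propagates upward."""

class Supervisor:
    # Hypothetical, simplified model of a supervisor: restart a crashed
    # child, but escalate if it crashes too often within a time window.
    def __init__(self, child, max_restarts=3, window=5.0):
        self.child = child
        self.max_restarts = max_restarts
        self.window = window
        self.crash_times = []

    def run_once(self):
        try:
            return self.child()
        except Exception:
            now = time.monotonic()
            # Only crashes inside the window count toward the limit.
            self.crash_times = [t for t in self.crash_times
                                if now - t < self.window]
            self.crash_times.append(now)
            if len(self.crash_times) > self.max_restarts:
                # Too many restarts: escalate, bringing this whole
                # subtree down instead of restarting forever.
                raise SupervisorGaveUp("restart intensity exceeded")
            return None  # child was "restarted"; caller may retry

def buggy_child():
    return 1 / 0  # the kind of error biology has no equivalent of

sup = Supervisor(buggy_child, max_restarts=3, window=5.0)
gave_up = False
for _ in range(10):
    try:
        sup.run_once()
    except SupervisorGaveUp:
        gave_up = True
        break
print(gave_up)  # True: the supervisor escalates rather than looping
```

The point of the sketch is the escalation: restarting is only robustness if the code eventually works, and crappy code defeats it all the way up the tree.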

~~~
j-g-faustus
Well, we can find or create equivalences.

Someone I knew wrote a master's thesis on the male reproductive system. He
told me that males have 10-15 largely independent biological pathways for
producing sperm. From nature's perspective the point is presumably that if you
can still breathe, you should be able to reproduce :)

If we were to put the same level of effort into a file reading module, we
would have the files replicated over 10-15 different systems with file reading
code written by a dozen different people in a dozen different languages, all
using different heuristics to locate a similar file (or a backup) if the
original wasn't found. Add some sort of selection mechanism to pick the best
result from the 10-15 return values, and you would have a very resilient file
reader :)
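As a toy sketch of that idea (every function name and heuristic here is invented for illustration; in the analogy there would be a dozen implementations in a dozen languages, not three Python functions): several independently written readers each try their own strategy, and a crude selection step picks the most common result.

```python
import os

# Hypothetical independent "implementations", each with its own heuristic
# for locating the requested file.
def read_exact(path):
    with open(path, "rb") as f:
        return f.read()

def read_backup(path):
    # Fall back to a sibling backup copy, if one exists.
    return read_exact(path + ".bak")

def read_case_insensitive(path):
    # Heuristic: look for a similarly named file in the same directory.
    d = os.path.dirname(path) or "."
    want = os.path.basename(path).lower()
    for name in os.listdir(d):
        if name.lower() == want:
            return read_exact(os.path.join(d, name))
    raise FileNotFoundError(path)

def resilient_read(path,
                   readers=(read_exact, read_backup, read_case_insensitive)):
    # Collect every result any implementation can produce...
    results = []
    for reader in readers:
        try:
            results.append(reader(path))
        except OSError:
            continue
    if not results:
        raise FileNotFoundError(path)
    # ...then "select the best": here, naively, the most common answer.
    return max(results, key=results.count)
```

Even this toy version shows the maintenance cost: three redundant code paths, three sets of failure modes, one result.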

I can't imagine we will ever want to write code like that by hand - drawbacks
include development cost and maintenance hassle, and the system becomes very
hard to understand and debug.

But it's still an interesting approach. In computer systems I suppose we could
bring it about by some variant of evolutionary programming
(<http://en.wikipedia.org/wiki/Evolutionary_programming>).

~~~
jerf
Yeah, that's pretty much how I was thinking of it. Enormous effort, even if
you do get to use evolution. And consider the whole cycle of building a web
page; organically retrieve a file, organically open a connection to a database
(with an organic protocol, of course) to organically retrieve some organic
data, organically convert it to some sort of organic representation (HTML is
too rigid, we'd need some sort of probabilistic representation or something)
and organically render it in a browser; the complication is simply enormous
and the payoff? For enormously more computational power, you have a net
decrease in the reliability of the whole process.

I could see how AI could use such a thing, especially since the best
intelligence we know works that way. But in general? It seems less than
awesome.

------
yaroslavvb
Note that the original paper doesn't say anything about crashing. Their point
is that the E. coli regulatory network has a different topology than the Linux
call graph: in particular, a much lower overlap between modules and little
reuse of low-level workhorses. They postulate that higher-level eukaryotic
networks would have more reuse and be more similar to the Linux call graph.
<http://www.pnas.org/content/early/2010/04/28/0914771107.full.pdf+html>

------
vsingh
I'm trying to understand the lesson behind this result.

I don't think it's the obvious, well-understood fact that biological systems
have massive, redundant parallelism, whereas our software systems do not.

I believe it in fact says something very specific and fairly non-intuitive:
that biological systems have many slightly different copies of key routines,
whereas our software systems as they are designed today do not.

"That’s why E. coli cannot afford generic components and has preserved an
organization with highly specialized modules, said Gerstein, adding that over
billions of years of evolution, such an organization has proven robust,
protecting the organism from random damaging mutations."

For example, imagine instead of having one 'sort' function, you had different
sort functions dispersed throughout every area of your code that performs
sorting, and each one was very slightly optimized (through design or some
unspecified evolutionary process) for the particular characteristics of the
data being sorted at that program location.

Thus, 'sort' is no longer a single point of failure. If one of your sort
routines has an exploitable buffer overflow, then it's probably the only one
that does, which limits the potential damage to the system as a whole --
especially if you've designed your entire system this way.

Could it be better in some cases to copy and slightly modify a software
component, than to simply reuse it?
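A toy Python sketch of the idea (the "specializations" below are invented for illustration): each call site gets its own slightly different sort, so a bug in any one of them stays local to that site.

```python
# Three independently written sorts, each (hypothetically) tuned to the
# data its own call site sees.

def sort_small_ints(xs):
    # Counting sort: suited to the small non-negative ints this site sees.
    # Note it has its own local bug: it crashes on empty input. That bug
    # does not exist at the other two call sites.
    counts = [0] * (max(xs) + 1)
    for x in xs:
        counts[x] += 1
    return [i for i, c in enumerate(counts) for _ in range(c)]

def sort_nearly_sorted(xs):
    # Insertion sort: near-linear on the almost-ordered data at this site.
    xs = list(xs)
    for i in range(1, len(xs)):
        j = i
        while j > 0 and xs[j - 1] > xs[j]:
            xs[j - 1], xs[j] = xs[j], xs[j - 1]
            j -= 1
    return xs

def sort_general(xs):
    # Plain mergesort for everything else.
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    a, b = sort_general(xs[:mid]), sort_general(xs[mid:])
    out = []
    while a and b:
        out.append(a.pop(0) if a[0] <= b[0] else b.pop(0))
    return out + a + b
```

Three copies means three places to patch when the *specification* of sorting changes, which is exactly the maintenance cost reuse avoids.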

~~~
sesqu
It absolutely makes sense to make your fallbacks independent from one another,
but implementing fallbacks at all is expensive.

In evolution, you're pretty much guaranteed a breakage at some point that
can't be fixed. That's not quite as true with software - you're still
guaranteed breakages, but you get to fix them, and fixing dependent systems is
a whole lot more economical.

I think the best analogy is interfaces. You code to an interface with multiple
implementations, and if a problem occurs, you switch implementations. Next, if
you have vulnerable and complex components, you make sure they each have a
custom implementation, so the inevitable bugs can't be widely exploited.
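That interface pattern can be sketched in Python (the class names and the `Store` interface here are invented examples, not any real library): callers code against the interface, and when one implementation fails, a wrapper switches to the next.

```python
from typing import Protocol

class Store(Protocol):
    # The interface: callers program against this, not any implementation.
    def get(self, key: str) -> str: ...

class PrimaryStore:
    # One implementation; presumably complex and occasionally buggy.
    def __init__(self, data):
        self.data = data
    def get(self, key):
        return self.data[key]  # raises KeyError on a miss

class FallbackStore:
    # A separately written implementation, unlikely to share the
    # primary's bugs.
    def __init__(self, data):
        self.data = data
    def get(self, key):
        if key not in self.data:
            raise KeyError(key)
        return self.data[key]

def get_with_fallback(key, impls):
    # "If a problem occurs, you switch implementations."
    last_err = None
    for impl in impls:
        try:
            return impl.get(key)
        except Exception as e:
            last_err = e
    raise last_err
```

The independence is the point: two implementations of the same interface written by the same person from the same code tend to share the same bugs.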

------
thisisnotmyname
Couldn't getting sick be considered crashing? We obviously have a lot of
redundancy (10^12 cells) so we don't "crash" unless a lot of those cells get
sick. In E. coli, on the other hand, there are plenty of genes in which a
single mutation kills the whole organism. (302 of them according to Japan's
National Institute of Genetics:
<http://www.shigen.nig.ac.jp/ecoli/pec/index.jsp>)

------
goodside
Counterexamples include SIDS, stroke, epilepsy, psychosis, and suicidal
depression.

~~~
thristian
Ever since I learned about it, I've always thought epilepsy was an astounding
example of the brain's ability to recover from cascading system failures.
After an episode of epilepsy, not only can the brain resume its ordinary
functioning, but it hasn't even rebooted — non-volatile storage (memories and
learned skills) is not corrupted, and even higher-level state like personality
is unscathed.

It's unfair to criticise a system for having a failure-state, because all
systems have failure states. Different systems have different ways of handling
failure-states, though, and the brain's ability to cope with failures is nigh
amazing.

~~~
goodside
Just to be pedantic, non-volatile storage is what survives a reboot on a
computer. For the brain to have "not even rebooted" you'd have to come out of
the seizure with your short-term working memory intact. I'm not an expert, but
I strongly suspect it's not possible to start dialing a phone number, enter a
petit mal seizure for five seconds, and then finish dialing without having
lost your place.

~~~
ahoyhere
I can't weigh in on your statement with any factual additions but...

If you start dialing a number, and somebody calls your name and you turn
around and talk to them for 10 seconds, you probably can't resume where you
left off, either. :)

------
blhack
Uhh? Computers are houses of cards... _everything_ in them is interdependent.
Bodies, on the other hand, are like cities...a few major parts that, if they
fail, kill the entire system, but lots of redundancy, and lots of things that
aren't completely necessary.

------
caf
It reminds me of the way that aircraft avionics are sometimes made more
reliable - they build N complete controllers from scratch, to the same
requirements, then use a voting system to discard the output from a controller
that is malfunctioning (and therefore producing different output from the N-1
others).
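The voting step can be sketched like this (a simplification: it assumes the N controllers' outputs can be compared for exact equality, which real avionics voters handle with tolerances):

```python
from collections import Counter

def vote(outputs):
    # Take the outputs of N independently built controllers and return
    # the majority value, implicitly discarding any controller whose
    # output disagrees with the rest.
    counts = Counter(outputs)
    value, n = counts.most_common(1)[0]
    if n <= len(outputs) // 2:
        raise RuntimeError("no majority; controllers disagree too much")
    return value

# e.g. three controllers, one malfunctioning:
print(vote([42.0, 42.0, 41.7]))  # prints 42.0
```

Note the voter itself becomes the single point of failure, which is why it is kept as simple as possible.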

It seems to me though that there is a big difference in the problem domains.
Biological systems can produce a range of outputs that are on a sliding scale
of more or less desirable - but software systems are generally specified to
produce a single correct output, and any deviation from that is regarded as a
complete failure.

------
klodolph
"Gerstein said that this organization arises because software engineers tend
to save money and time by building upon existing routines rather than starting
systems from scratch."

Bull. There are two reasons why the shapes are different.

1. The bottom of the Linux graph is smaller because there are not very many
primitive operations on a computer. Data is homogeneous. There are 256
interchangeable values for a byte. By comparison, a bacterium has to use
separate pieces of machinery to handle chemicals made up of the dozens of non-
interchangeable elements that it works with. On a computer, you might use one
single function for searching binary search trees all over the place in
different systems, regardless of the data. In a bacterium, when code gets
copied, the copy is modified. Every new application is a fork of some other
application - welcome to a developer's nightmare (and a good argument against
ID: a designer would never duplicate so much code). Developers reuse code
because they can; bacteria don't reuse code because they can't.

2. The top of the Linux graph is larger because computers have to do more.
Computers get selected for features. Bacteria get selected for survival. The
bacteria have to "just work", whereas people expect to be able to configure
computers. I can plug 100 different network cards into my computer, but don't
expect to plug a different flagellum into a bacterium. Maybe it's just a
matter of level; if you picked a higher point in the call graph you'd
eventually get to "main", no?

------
zemaj
That headline is terrible.

My take away: If there is an intelligent designer, they'd be an awful
programmer.

~~~
askar_yu
Agree with your first statement.

------
fun2have
Normally in software there are two approaches: a single software solution, or
best of breed. I.e. Office vs. Lotus 1-2-3, WordPerfect, etc. Or in the world
of ERP, having HR, accounting, and planning software from different vendors.
The interesting bit from this article is that it implies that cobbling bits
together is a more reliable approach.

I have always been amazed at how the Unix approach of cobbling bits together
is often more reliable than trying to write one large program.

------
troels
TL;DR Redundancy increases stability.

True, but it also makes it nigh impossible to make any changes to the system,
and changeability is generally a desirable trait in software.

~~~
meric
That reminds me of what my lecturer said... "Do not start writing functions
before knowing what they're going to do, hoping you can massage them into
doing something useful." It was `duh!` advice, but now the converse seems
appropriate if you're going to design a living organism.

