
Linux Versus E. coli - robg
http://blogs.discovermagazine.com/loom/2010/05/03/linux-versus-e-coli/
======
phaedrus
This is certainly interesting; lately I've been wondering something similar:
what software engineering techniques might also be found in genes or brains in
nature. This could be summed up as "Does the brain have subroutines?"

I'm not sure you can take the conclusions in the article at face value,
though: they seem to be hinting that Linux (or software in general) would be
"more robust" if it were organized more like the control structure in the E.
coli genes, where there are many more "low-level" instructions with less
"code reuse". But this is just a trade-off that is well known to any
programmer: when you have a new situation, do you try to make existing
subroutines handle the extra case, or do you cut/paste code into a new
subroutine so that changes to it don't break existing things? One has to
wonder how much of the difference here is due to the different ways changes
must be made: in the genes, random changes, good and bad, are sorted out by
natural selection, so there is extra pressure to keep things from affecting
each other needlessly. (To be fair, they do touch on this in the article.) In a
program, a human has to make changes manually, so there is a reason not to
spread things out needlessly. What I'm getting at is that this may not reflect
whether one is "better design" than the other, but simply that they are the
product of different circumstances.
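
To make the trade-off concrete, here is a toy sketch in Python. All of the function names are made up for illustration; nothing here comes from the article or from Linux.

```python
# Hypothetical example: two ways to support a new "rect" case.

def area_shared(shape, a, b=None):
    """Option 1: extend the one shared routine. Every caller now depends
    on this function, so a bad edit here can break all of them."""
    if shape == "square":
        return a * a
    if shape == "rect":  # the new case, added to the existing routine
        return a * b
    raise ValueError(f"unknown shape: {shape}")


def area_square(a):
    """Option 2, part one: leave the original routine untouched."""
    return a * a


def area_rect(a, b):
    """Option 2, part two: a separate routine for the new case.
    A bug introduced here cannot affect callers of area_square."""
    return a * b
```

Option 1 gives one heavily reused routine; option 2 gives more low-level routines with fewer callers each, which is closer to the shape the article describes for the genes.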

But another concern I have is this: how do we know the design of the E. coli
control structure doesn't simply reflect what is physically practical, rather
than what the genome would logically prefer to have? Maybe it duplicates the
lowest-level worker subroutines not because that is a better design for it, but
simply because further reuse is _not possible_ in this setting, lacking the
range of abstractions available to a programmer. (For instance, imagine the
genes represent a language that has only goto for control and no stack, and
therefore no recursion: this may be the best that can be done under those
limits, which we don't have in high-level languages.)

On the Linux kernel side, I think the level of abstraction at which the code
is viewed has a big effect. I think we are viewing the "machine code" when we
view DNA, but the Linux code was analyzed at a higher level. In particular,
wherever an inline function or macro is used, it will look like one low-level
routine reused many times from a high-level view, but if you looked at the
machine code it would not be reuse of a low-level routine - the instructions
would be duplicated instead. Perhaps if the compiled binary machine code of
the Linux kernel were analyzed instead, it would show an almost identical
structure to the E. coli genes, with far more bottom-level "worker
subroutines"!
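
A rough way to picture the inlining point is with a toy call graph in Python. The graph and its routine names are invented for illustration (they are not taken from the kernel or the article), and this is not how the article's authors did their analysis.

```python
from collections import Counter

# Hypothetical source-level call graph: caller -> list of callees.
# "min_t" stands in for an inline helper or macro used in many places.
source_graph = {
    "sched": ["min_t", "lock"],
    "net_rx": ["min_t", "lock"],
    "fs_read": ["min_t"],
    "min_t": [],
    "lock": [],
}
inlined = {"min_t"}


def in_degree(graph):
    """Count how many call sites point at each routine."""
    return Counter(callee for callees in graph.values() for callee in callees)


def inline(graph, inlined):
    """Model inlining: each caller gets a private copy of an inlined routine.

    Only one level of inlining is handled, to keep the sketch short.
    """
    result = {}
    for caller, callees in graph.items():
        if caller in inlined:
            continue  # the shared node no longer exists after inlining
        new_callees = []
        for callee in callees:
            if callee in inlined:
                copy = f"{callee}@{caller}"
                result[copy] = list(graph[callee])
                new_callees.append(copy)
            else:
                new_callees.append(callee)
        result[caller] = new_callees
    return result


binary_graph = inline(source_graph, inlined)
print("source view:", dict(in_degree(source_graph)))
print("binary view:", dict(in_degree(binary_graph)))
```

In the source view, `min_t` is one routine with three callers. In the "binary" view it becomes three separate bottom-level copies with one caller each, so the same program looks much less reuse-heavy.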

