
Mathematical Simplicity May Drive Evolution's Speed - beefman
https://www.quantamagazine.org/computer-science-and-biology-explore-algorithmic-evolution-20181129/
======
yetihehe
> DNA is extremely important, but it makes no sense if [it is] not in a cell,
> in an organism, in an ecosystem. Other interactions are at play, and this
> application of algorithmic information cannot capture the extent of that
> complexity.

It reminds me of the demoscene, specifically "A Mind Is Born" [1], a 256-byte
demo that "looks" complicated and rich in structure, but is really just
nudged/selected random data. The author said he played around with randomly
generated content until he found something nice. By itself, those 256 bytes
are still not enough to replicate the behaviour; they must be run on hardware,
which is like the "environment" for DNA. Without several supporting chips that
are easy to access, that code would mean nothing and would not generate such
complexity. Moreover, those "patterns" in the demo are patterns only because
WE interpret them as meaningful.

[1]
[https://www.youtube.com/watch?v=sWblpsLZ-O8](https://www.youtube.com/watch?v=sWblpsLZ-O8)

------
paraschopra
I recently did an online course on algorithmic information dynamics where
Hector (one of the scientists quoted) was the instructor. The entire course is
available online
[https://www.youtube.com/playlist?list=PLbvAAv4mbxdgp3qZRr34z...](https://www.youtube.com/playlist?list=PLbvAAv4mbxdgp3qZRr34zUTAdaq9Qycuu)

I really, really enjoyed the course. It was very interactive, and Hector was
actively involved on the forums with the students. After the course, Hector
suggested a couple of research directions, and groups of students (myself
included) are pursuing them. Here’s the Google Group
[https://groups.google.com/forum/m/#!forum/algodyn](https://groups.google.com/forum/m/#!forum/algodyn)
(We have a Slack group too.)

I’m myself part of a group that is trying to create a crypto-currency approach
to exploring Turing machines and their outputs. (The hope is to fix the fact
that proof of work wastes computation: what if it could instead do the useful
work of computing the universal distribution over algorithmically generated
strings?) In case anyone is interested in contributing to the project, our
project meeting notes and goals can be accessed in this doc
[https://docs.google.com/document/d/1np8cuFgfdrtRIaPL3oJPmmf_...](https://docs.google.com/document/d/1np8cuFgfdrtRIaPL3oJPmmf_6jln3DD5LvVh9Gq_AJs/edit#heading=h.2sbe42210bfk)

Overall, I’m very excited that researchers like Hector are making the field of
algorithmic information theory more popular.

------
Xcelerate
Holy cow. It’s amazing to see this in Quanta Magazine, because I didn’t think
that many people had heard of or were interested in algorithmic information
theory. I’ve been following Hector’s work for a few years and for fun
implemented his paper on (brute force) iterating over small Turing machines.
This sort of approach is top-down — start with the best theory of computation
(and, in some ways, AGI) and then work backwards to develop computable
approximations. Contrast this to the bottom-up approach of neural networks,
which is more like “we’ve found these things that work well in specific cases;
can we tweak them to do more?” I think ultimately the top down approach is
going to be more productive, because there’s a strong theoretical framework
already in place that basically sets the upper bound on what you can do.

A lot of papers showing progress on computable approximations have been
published just within the last year. There was one recently showing that the
halting problem is in fact solvable for “almost all” Turing machines, which is
a bit surprising; I would not have thought this was the case. Ultimately
though, we don’t need to find the absolute shortest Turing machine that
produces a given output string — we just need to find a decently short one
that performs better than our existing models. Any sort of algorithmic
approach is bound to be vastly superior to a purely statistical approach.
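The "brute force iterating over small Turing machines" approach (the idea
behind Zenil's Coding Theorem Method) can be sketched in a few lines:
enumerate every machine in a tiny rule space, tally the outputs of the ones
that halt, and estimate K(s) ≈ -log2 m(s) from the resulting frequency
distribution. A minimal sketch, assuming a 2-state, 2-symbol machine format
with a finite tape and step cap (illustrative choices of mine, not the exact
enumeration in the paper):

```python
import itertools
from collections import Counter
from math import log2

def run_tm(rules, steps=100, tape_len=9):
    """Run a 2-state, 2-symbol Turing machine on a blank tape.

    rules maps (state, symbol) -> (write, move, next_state); a
    next_state of -1 halts. Returns the final tape as a string, or
    None if the machine does not halt within `steps` (or wanders
    off the finite tape)."""
    tape = [0] * tape_len
    pos, state = tape_len // 2, 0
    for _ in range(steps):
        write, move, nxt = rules[(state, tape[pos])]
        tape[pos] = write
        pos += move
        if nxt == -1:
            return ''.join(map(str, tape))
        if not (0 <= pos < tape_len):
            return None
        state = nxt
    return None

def ctm_estimates():
    """Enumerate all machines in the rule space, tally outputs of the
    halting ones, and return CTM-style estimates K(s) ~= -log2 m(s),
    where m(s) is the empirical output frequency."""
    counts = Counter()
    # Each of the 4 (state, symbol) pairs gets one (write, move,
    # next_state) action; next_state -1 means halt.
    actions = [(w, m, s) for w in (0, 1) for m in (-1, 1) for s in (0, 1, -1)]
    for combo in itertools.product(actions, repeat=4):
        rules = {(st, sy): combo[2 * st + sy] for st in (0, 1) for sy in (0, 1)}
        out = run_tm(rules)
        if out is not None:
            counts[out] += 1
    total = sum(counts.values())
    return {s: -log2(c / total) for s, c in counts.items()}

K = ctm_estimates()
# Simpler (more compressible) tapes tend to get lower estimated complexity:
for s in sorted(K, key=K.get)[:5]:
    print(f'{s}  K ~ {K[s]:.2f}')
```

Even this toy rule space (12^4 = 20,736 machines) produces a distribution in
which simple tapes are far more frequent than structured ones, which is the
whole point of approximating the universal distribution by enumeration.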

~~~
amelius
> There was one recently showing that the halting problem is in fact solvable
> for “almost all” Turing machines, which is a bit surprising

Is this in the same sense as "almost all numbers are not natural numbers", and
if yes, then what good does this bring?

~~~
Xcelerate
Yes, unfortunately. And not much, other than that it just seems unintuitive
to me that the halting problem is solvable for "most" programs.

The real challenge would be to show that it is solvable for "interesting"
programs (i.e. ones that generate some sort of real-world data). There have
been similar efforts in the field of complexity theory that aim to show
certain NP-hard problems are tractable on real-world data (as opposed to
pathologically selected worst-case instances).

------
hyperpallium
I believe algorithmic information theory will turn out to be key to physics,
evolution, life, mind - and, already, language. But I don't yet see _how_ to
do it.

> approximate the algorithmic information content of a network

Haven't read the paper, but I'm not seeing how you could approximate it
without throwing the baby out with the bathwater.

> bias evolving programs toward simplicity [...] Rewarding something like
> shorter program lengths, for example

Seems the simplest way to incorporate simplicity.

However, e.g. some simple organisms have far longer DNA than we do. And as
for larger building blocks, e.g. cells: slime molds are unicellular yet have
many features of higher life.

The structure observed in nature is a byproduct of structures acting as
larger units for navigating the search space: like taking a highway, then a
main road, then a side street to reach your destination. Or homing in on a
decimal integer by varying any individual digit, instead of only adding +1 or
-1 to the entire number.
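The decimal-integer analogy can be made concrete: count the moves each
strategy needs to reach a target, where a move is either ±1 on the whole
number or ±1 on a single digit. A tiny sketch (function names are my own):

```python
def unit_steps(start: int, target: int) -> int:
    """Moves needed when the only operation is +1/-1 on the whole number."""
    return abs(target - start)

def digit_steps(start: int, target: int, width: int = 6) -> int:
    """Moves needed when one decimal digit can be nudged +-1 per move:
    each digit is homed in on independently."""
    a, b = f'{start:0{width}d}', f'{target:0{width}d}'
    return sum(abs(int(x) - int(y)) for x, y in zip(a, b))

print(unit_steps(0, 999999))   # 999999 moves
print(digit_steps(0, 999999))  # 54 moves (at most 9 per digit * 6 digits)
```

The structured move set shrinks the worst-case path from O(10^n) to O(n),
which is the sense in which larger units make the search space navigable.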

The search strategy modifies the probability density over the search space
rather than restricting it (or explicitly modifying the fitness function over
the search space), yielding emergent structure.

------
im3w1l
But evolution doesn't find optimal solutions in the wild. Invasive species
prove that: they are solutions that would have worked, but that local
evolution never managed to create.

~~~
GuiA
Can you elaborate a bit on your comment? I am not sure I quite understand the
distinction you are trying to make.

Much like “weeds”, when referring to plants, is a distinction that is only
relevant in the context of human agriculture, so it is for “invasive species”.
All life is, by definition (reproduction), invasive.

~~~
pif
I think he's talking about animals and plants that prosper in a region they
could reach only due to human intervention (think: trans-oceanic or trans-
continental travel).

~~~
faceplanted
No, what he's saying is that evolution doesn't always produce the optimal
species for an environment: if a species is so invasive as to outcompete the
native species, then clearly _it_ is more optimal than the one that literally
evolved there. Meaning evolution doesn't find optimal solutions unless it has
the right starting configuration.

~~~
empath75
Optimal for _what_? It makes no sense in the context of biological evolution.

~~~
Drdrdrq
Optimal for survival in the given environment. However, the worth of a
solution depends on the other solutions (organisms) around it... An
interesting way of looking at the problem.

------
jfarlow
So, coming from the science side of things, where we 'program' DNA in order
to design genes not found in nature: it's clear evolution works at many
scales simultaneously, and each of those scales needs to be accounted for in
order to traverse the large search space. With regard to this particular
problem, evolution did _not_ arrive at the complexity of multi-cellular life
by doing any kind of random walk across a sequence. Life had a few billion
years to come up with 'useful components', as well as 'good' heuristics and
tools that commonly arrive at useful components. It then reused that code and
started running different genetic algorithms on the same underlying code, but
one abstraction level up.

It allowed components that proved useful to be duplicated, redone, and
rearranged to form higher-order tools, where the components remained
relatively stable but were remixed, reordered, and otherwise reused, with
only slight modification to lock them into their new tool.

We see this with the results of evolutionary modifications to DNA. One of the
proteins responsible for multi-cellularity is itself a mashup of 4 or 5
proteins found in single-celled organisms, but where those smaller proteins
are repeated, smashed together, and otherwise remixed to form a protein with a
new function that was synergistic with the original components. And that
synergy allowed for the higher-order complexity associated with multi-
cellularity [1]. And once that took off and proteins started to get large, the
proteins would themselves be split up into pathways - and then those pathways
would get duplicated. Etc. And this is just one mechanism by which real life
has used evolution to scan the underlying space more rapidly than random
walks.

This is actually what we do at Serotiny [2] - we don't design novel proteins
by sequence (in the 20^300 landscape) but rather by functional domain (in the
20000^5 landscape). And I think there are mechanisms to incorporate that kind
of scaling layer into a good genetic algorithm, letting you traverse short
paths by going up into the coarser space and then coming back down when you
need granularity.
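The gap between those two landscapes is easy to understate. Taking the
numbers 20^300 and 20000^5 directly from the comment (a 300-residue protein
with 20 amino acids per position, versus 5 functional domains drawn from a
library of ~20,000 parts), two lines make the difference vivid:

```python
from math import log10

# log10 of the number of candidate designs in each search space
seq_space_log10 = 300 * log10(20)      # per-residue design: 20^300
domain_space_log10 = 5 * log10(20000)  # per-domain design: 20000^5

print(f'sequence space: ~10^{seq_space_log10:.0f} designs')
print(f'domain space:   ~10^{domain_space_log10:.0f} designs')
```

Roughly 10^390 versus 10^22 candidates: the domain-level search space is
astronomically smaller, which is exactly the point of searching at a higher
abstraction level.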

And so, I think when going back to the algorithms, if you want to make them
pragmatic rather than theoretical, many scales should be addressed
simultaneously.

> “DNA is extremely important, but it makes no sense if [it is] not in a cell,
> in an organism, in an ecosystem.” Other interactions are at play, and this
> application of algorithmic information cannot capture the extent of that
> complexity.

Yep.

[1]
[https://www.ncbi.nlm.nih.gov/pubmed/11466531](https://www.ncbi.nlm.nih.gov/pubmed/11466531)

[2] [https://serotiny.bio](https://serotiny.bio)

~~~
e_gal
Notice that the claim:

"DNA is extremely important, but it makes no sense if [it is] not in a cell,
in an organism, in an ecosystem. Other interactions are at play, and this
application of algorithmic information cannot capture the extent of that
complexity."

Completely mischaracterizes the research (Zenil et al.). They are looking at
networks of interacting elements evolving over time. Those elements can be
genes, but the networks can also be metabolic or signaling networks, and
everything they say still holds.

The subsection about chasing dynamic, evolving targets even addresses the
question of ever-changing environments that some have raised here. So the
approach goes well beyond treating life as 'DNA'. But even so, I doubt anyone
would object to the power of CRISPR-Cas9 only because it targets DNA... DNA
is still the most fundamental element of life, and others tend to embrace
mysticism and be quite cartoonish about life.

Anyway, my recommendation is: do read the paper.

~~~
jfarlow
You're right! The paper itself is strongly agnostic to its analysis scale,
while the Quanta article focuses entirely on DNA sequence.

> Our embedded computational network architecture can be interpreted either at
> the levels of cells (internal network of molecules such as genes, proteins,
> metabolites) or alternatively as a group or module of cells effectively
> performing input-output transformation of external signals relative to the
> module. Naturally, this does not exclude a nested architecture ranging from
> cells encapsulating molecular networks to interacting modules composed of
> cells

And in fact the study actually meshes quite nicely with the paper cited before
showing the evolution of signaling networks in multi-cellular systems.

------
carapace
Cf. "Life as Evolving Software, Greg Chaitin at PPGC UFRGS"

[https://www.youtube.com/watch?v=RlYS_GiAnK8](https://www.youtube.com/watch?v=RlYS_GiAnK8)

> Few people remember Turing's work on pattern formation in biology
> (morphogenesis), but Turing's famous 1936 paper On Computable Numbers
> exerted an immense influence on the birth of molecular biology indirectly,
> through the work of John von Neumann on self-reproducing automata, which
> influenced Sydney Brenner who in turn influenced Francis Crick, the Crick of
> Watson and Crick, the discoverers of the molecular structure of DNA.
> Furthermore, von Neumann's application of Turing's ideas to biology is
> beautifully supported by recent work on evo-devo (evolutionary developmental
> biology). The crucial idea: DNA is multi-billion year old software, but we
> could not recognize it as such before Turing's 1936 paper, which according
> to von Neumann creates the idea of computer hardware and software.

\- - - -

Also "Algorithmically probable mutations reproduce aspects of evolution such
as convergence rate, genetic memory, and modularity"

[https://arxiv.org/abs/1709.00268v8](https://arxiv.org/abs/1709.00268v8)

> Natural selection explains how life has evolved over millions of years from
> more primitive forms. The speed at which this happens, however, has
> sometimes defied formal explanations when based on random (uniformly
> distributed) mutations. Here we investigate the application of a simplicity
> bias based on a natural but algorithmic distribution of mutations (no
> recombination) in various examples, particularly binary matrices in order to
> compare evolutionary convergence rates. Results both on synthetic and on
> small biological examples indicate an accelerated rate when mutations are
> not statistical uniform but _algorithmic uniform_. We show that
> algorithmic distributions can evolve modularity and genetic memory by
> preservation of structures when they first occur sometimes leading to an
> accelerated production of diversity but also population extinctions,
> possibly explaining naturally occurring phenomena such as diversity
> explosions (e.g. the Cambrian) and massive extinctions (e.g. the End
> Triassic) whose causes are currently a cause for debate. The natural
> approach introduced here appears to be a better approximation to biological
> evolution than models based exclusively upon random uniform mutations, and
> it also approaches a formal version of open-ended evolution based on
> previous formal results. These results validate some suggestions in the
> direction that computation may be an equally important driver of evolution.
> We also show that inducing the method on problems of optimization, such as
> genetic algorithms, has the potential to accelerate convergence of
> artificial evolutionary algorithms.
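The "simplicity bias" in that abstract can be sketched, very loosely, with a
compressor standing in for algorithmic complexity: candidate mutations whose
results compress better are sampled more often, mimicking a 2^-K
distribution. This is my own toy illustration using zlib as a crude K
estimator; it is not the estimator the paper actually uses:

```python
import random
import zlib

def k_approx(bits):
    """Crude stand-in for algorithmic complexity: the zlib-compressed
    length of the bitstring (a compressed length upper-bounds K)."""
    return len(zlib.compress(bytes(bits), 9))

def simplicity_biased_mutation(bits, rng, candidates=16):
    """Propose several single-bit flips and sample one with weight
    2^-K(result), so more compressible offspring are favoured."""
    pool = []
    for _ in range(candidates):
        child = list(bits)
        child[rng.randrange(len(child))] ^= 1  # flip one random bit
        pool.append(child)
    weights = [2.0 ** -k_approx(c) for c in pool]
    return rng.choices(pool, weights=weights, k=1)[0]

rng = random.Random(0)
genome = [rng.randint(0, 1) for _ in range(256)]
before = k_approx(genome)
for _ in range(300):
    genome = simplicity_biased_mutation(genome, rng)
after = k_approx(genome)
print(before, '->', after)  # the bias tends to drive estimated K down
```

The interesting behaviour the paper reports — structures being preserved once
they first occur ("genetic memory") — falls out of exactly this kind of
weighting: once a compressible pattern appears, mutations that destroy it are
penalized.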

