
Is Most of Our DNA Garbage? - r721
http://www.nytimes.com/2015/03/08/magazine/is-most-of-our-dna-garbage.html
======
fuligo
As programmers we're in a unique position to get a sense of this, actually.
It's very very old spaghetti code. Some of it essential, there are even some
cleverly re-used common routines, some of it is just ballast, some of it can't
be taken out but isn't getting executed either, some of it now serves a
different purpose, and none of it comes with annotations that allow us to
easily find out which is which.

~~~
charonn0
This is exactly how I like to think of DNA: a twisted mass of legacy code.
Rife with Heisenbugs, the slightest change can have profound side-effects in
seemingly unrelated subsystems. Even sections that are literal "junk" must be
retained since everything uses GOTOs with hard-coded line numbers.

~~~
maratd
Well, let's refine that analogy a bit. Legacy code being used by BILLIONS of
users, where even a single critical error is viciously excised. Non-working
code = death. So while it isn't clean, it is functional and relatively
bug-free. I do stress the relatively part.

~~~
charonn0
Evolution by natural selection is the original genetic algorithm.

------
damon_c
When you open a binary file in a text editor it looks like garbage.

~~~
msandford
I heard about this "98% of DNA is junk" claim a while ago.
[http://www.livescience.com/31939-junk-dna-mystery-solved.htm...](http://www.livescience.com/31939-junk-dna-mystery-solved.html)

Right, because a single cell transforms into a whole animal (horse, giraffe,
whatever) and that takes zero information. Really? Somehow I doubt it.

A multicellular organism is a really big, complicated state machine and the
"junk" DNA is what's needed in order to go from a single cell to many billions
of cells organized in the particular fashion that they are.

To suppose that everything that doesn't directly code for a protein is "junk"
is incredibly arrogant; just because you don't _currently_ understand
something doesn't mean that _it cannot be_.

As an engineer I'm guilty of this all the time; something I don't know how to
do is impossible but once someone explains to me how to do it, it's trivial.
I've had this happen enough times that I can now recognize that my
"impossible" isn't necessarily _actually_ impossible, just that it might be
impossible to me _right now_.

~~~
jhallenworld
Don't underestimate the creative capacity of small (to encode) algorithms.
Perhaps development uses something like simulated annealing to create
structures which match required properties. If it works like this, an exact
template is not required to be encoded in the genes. In other words, the
ultimate source of the design is thermal randomness, but guided by constraints
encoded in the genes.
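
To make the idea concrete, here is a minimal simulated annealing sketch in Python. Everything in it is a toy assumption for illustration, not a model of real development: the "cells" are a row of 0/1 values, the only encoded constraint is "neighbours should differ", and the names `cost` and `anneal` are made up for this example. The random flips play the role of thermal randomness, and the small `cost` function plays the role of the constraints encoded in the genes.

```python
import math
import random


def cost(cells):
    """Count adjacent pairs with the same value (the constraint says neighbours differ)."""
    return sum(1 for a, b in zip(cells, cells[1:]) if a == b)


def anneal(n_cells=40, steps=20000, t_start=2.0, t_end=0.01, seed=0):
    """Search for a low-cost arrangement by random flips under a cooling schedule."""
    rng = random.Random(seed)
    cells = [rng.randint(0, 1) for _ in range(n_cells)]
    current = cost(cells)
    for step in range(steps):
        # Geometric cooling: the temperature decays from t_start towards t_end.
        t = t_start * (t_end / t_start) ** (step / steps)
        i = rng.randrange(n_cells)
        cells[i] ^= 1  # propose flipping one randomly chosen cell
        delta = cost(cells) - current
        # Always accept improvements; accept worse states with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current += delta
        else:
            cells[i] ^= 1  # rejected: undo the flip
    return cells, current
```

Note that no template of the final pattern is stored anywhere: the program only encodes a rule for scoring arrangements, which is the point being made above.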

~~~
msandford
Well we're starting to see Turing vindicated on the way that a lot of
biological stuff happens.
[http://genomebiology.com/2013/14/1/101](http://genomebiology.com/2013/14/1/101)

Sure it doesn't have an exact template of _precisely_ how to do everything,
but there's a lot of information there. Maybe it's not huge relative to the
amount of theoretical information DNA contains, but I strongly suspect that
the 98% is junk claim will eventually be found to be laughably wrong.

------
Animats
Flowering plants have huge amounts of DNA, 1-2 orders of magnitude more than
mammals. It's like they haven't evolved subroutines or macros or something
comparable.

~~~
damon_c
The idea that nature may have evolved coding design patterns to make "cleaner
more efficient" code is amazing.

Are flowering plants just written in a language that doesn't have "for loops"
so everything is just written out line by line?

~~~
acqq
No. The compactness of the DNA "program" is an evolutionary trade-off, just
like in engineering. For a species to have significantly more compact code
now, compactness must have given its ancestors some evolutionary benefit.

[http://en.wikipedia.org/wiki/Genome_size#Genome_reduction_in...](http://en.wikipedia.org/wiki/Genome_size#Genome_reduction_in_obligate_endosymbiotic_species)

~~~
damon_c
Interesting, and from the same page, Drake's rule says that similarly to
programming, the larger the "codebase", the slower the mutation (new feature
launch) rate.

[http://en.wikipedia.org/wiki/Genome_size#Drake.27s_rule](http://en.wikipedia.org/wiki/Genome_size#Drake.27s_rule)

------
jonchang
There's an interesting recent manuscript that subdivides the obsolete "junk
DNA" classification into things like "garbage DNA", "rubbish DNA", "Lazarus
DNA", "zombie DNA", etc.

[http://gbe.oxfordjournals.org/content/early/2015/01/28/gbe.e...](http://gbe.oxfordjournals.org/content/early/2015/01/28/gbe.evv021.short)

------
coderzach
This reminds me of a C program I once saw. Certain functions in the code were
dependent on the stack order. If you moved anything around the (10M+ LoC)
app would fail to compile. Therefore any time anyone wanted to change these
functions they would just append more code to the bottom, never touching any
of the existing code. Most of it was garbage, but if you touched it at all,
everything would break.

------
MollyR
While there might be some signal-to-noise ratio, I highly doubt we have any
junk DNA. I imagine a good modern analogy here would be
[http://en.wikipedia.org/wiki/Evolvable_hardware](http://en.wikipedia.org/wiki/Evolvable_hardware).
[http://www.damninteresting.com/on-the-origin-of-circuits/](http://www.damninteresting.com/on-the-origin-of-circuits/)

In the two articles I mentioned, the "offspring" circuit worked in an
extremely baffling (complex) way, relying on physics in ways beyond human
design (for example, the disconnected logic gates were still necessary for
functionality). I think that offers insight into some of the possible complex
interactions of DNA. They look like unnecessary garbage until you try to
remove them and break a ton of stuff.

~~~
NathanKP
According to the article:

    
    
        On average, each baby is born with roughly 100 new mutations.
        If every piece of the genome were essential, then many of those
        mutations would lead to significant birth defects, with the defects
        only multiplying over the course of generations; in less than a
        century, the species would become extinct.
    

The point this article is making is that new techniques are finding that some
portions of DNA assumed to be junk might have a use, and removing or changing
them certainly does have an effect. But that doesn't mean we should jump ahead
too far and say that none of the DNA is junk, because there are obviously some
things that have no effect when changed.

~~~
simcop2387
I wouldn't be surprised if some of it really has no effect and that's the
whole point. As you mentioned, if there are 100 new mutations per baby then if
the coding is as dense as possible that means each mutation will cause some
significant change. If it's got a lot of coding that doesn't do anything, by
probabilities it'll likely not change anything important and cause no ill
effects. Having that extra unused DNA could pose an evolutionary advantage in
the face of an imperfect copying mechanism.

~~~
ajuc
Assuming the per-base mutation probability doesn't depend on the length of
the copied DNA, copying 40 MB will probably introduce 10 times more mutations
than copying 4 MB. Only 10% of them will, on average, change the important
code, so the end result is the same.

So I don't see how NOP padding helps.
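
The arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not a population-genetics model; the function names and the example numbers (a per-base rate of 1e-6, 4 million functional bases) are assumptions chosen only to mirror the 4 MB vs 40 MB figures. The second function shows the other reading from upthread, where the number of mutations per copy is fixed (the "100 new mutations" figure) rather than the per-base rate; under that assumption padding does dilute the hits.

```python
def hits_with_per_base_rate(per_base_rate, functional_bases, junk_bases):
    """Expected mutations landing in functional code when every base mutates at the same rate."""
    genome = functional_bases + junk_bases
    total_mutations = per_base_rate * genome       # grows with genome size
    fraction_functional = functional_bases / genome  # shrinks with genome size
    return total_mutations * fraction_functional   # the two effects cancel


def hits_with_fixed_count(total_mutations, functional_bases, junk_bases):
    """Expected functional hits when the total number of mutations per copy is held fixed."""
    genome = functional_bases + junk_bases
    return total_mutations * functional_bases / genome


# Per-base rate model: padding makes no difference.
no_padding = hits_with_per_base_rate(1e-6, 4_000_000, 0)
padded = hits_with_per_base_rate(1e-6, 4_000_000, 36_000_000)

# Fixed-count model: 100 mutations spread over a 10x larger genome.
diluted = hits_with_fixed_count(100, 4_000_000, 36_000_000)
```

Which model applies is the real question: the padding argument only works if the mutation count per generation does not grow with genome length.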

------
jostmey
Couldn't agree more. Evolution as a process has no regard for simplicity or
elegance. The solutions that Natural Selection "comes" up with are random. I
guess you could say that all organisms have a design, but that the design is
irrational.

A lot of people expect that at the molecular level life is organized and
sensible. But why should that be the case? On the macroscopic level, living
organisms are a mess. It is my belief that at the biochemical level the
chemical reactions are as organized as the branches on a tree---that the
chaotic design of life at the microscopic level mirrors its design at the
macroscopic level.

~~~
api
It's best to throw out all these value judgements.

Evolution isn't purely random. There's nothing random about what works and
what doesn't. Over time, differential selection and other effects (e.g. sexual
selection) work to transfer information about the environment into the genome.
Evolution is a learning process. It's just that the representation that it
generates is difficult for us linear thinkers to interpret.

It only _appears_ irrational from an anthropocentric point of view. It's not
rational to us because we, having brains that work in a certain way, are
biased toward seeing things as linear chains of cause and effect. That's how
we think and that's how we like to build stuff, but obviously that's not the
only way stuff can be built.

For all we know elsewhere in the universe there are beings that think in
massively parallel super-holistic causality-matrix terms. To them evolution's
designs would seem perfectly rational, while a UML block diagram would seem
insane. "Nothing has only one cause or one effect," they would mutter... in a
language in which every word implies everything to varying degrees and each
syllable of a sentence must be parsed in parallel with all others.

~~~
sumitviii
I think parent meant that new mutations are random. Or at least look random.

------
rogerbinns
I highly recommend reading "DNA seen through the eyes of a coder" at
[http://ds9a.nl/amazing-dna/](http://ds9a.nl/amazing-dna/)

It is a very nice explanation, shows many similarities, while also making
Intercal seem sane.

------
disjointrevelry
The way I see it is DNA is much like a very long running program meant to run
for millions to billions of years. Our current life is an instance of some
code in the DNA. Due to the longevity of the purpose of DNA, and our short
span in an instance, it is very, very difficult to ascertain what some parts
may be about, or what it's for. Not to mention the high compressibility of
information in DNA, some aspects might never really reveal themselves unless
met with an instance in an environment it is "meant to run in".

------
finnh
Didn't they recently realize that huge swaths of "junk" DNA are actually used
to regulate RNA expression? So, uh, not junk?

Not the original article I'm thinking of, but related:

[https://www.uam.es/personal_pdi/ciencias/genhum/bibliogenoma...](https://www.uam.es/personal_pdi/ciencias/genhum/bibliogenoma/RNAregulationnewgenetics.pdf)

~~~
jostmey
You state it as fact but it remains a matter of heated debate. If all the DNA
were important then why are large swaths of DNA allowed to undergo genetic
drift?

------
OldSchoolJohnny
Perhaps when the right conditions arise the "junk" turns out to be useful
after all?

------
jhallenworld
I thought the introns were not well preserved between generations and that
this is strong evidence that they really are junk.

------
nashashmi
I am surprised at the volume of agnostic comments here. Before I begin, let me
explain my position. I believe in God, and believe in evolution as coming from
God himself. Second, I distrust any behavior of arrogance or pride that comes
from an I-know-it-all person. Third, the very people who often do create, or
innovate, or discover are far more humble and more curious than the previous
people I described.

Having said that, there is some fanboy culture and attachment with science and
DNA tinkering. Some "discoveries" and claims sometimes come from such fanboys.
People on HN are no different. In fact, the majority of the comments here
read as coming from fanboys.

Now to the topic of DNA: DNA, if a product of evolution, could not be mostly
composed of garbage because nature has its own "garbage collector." Much of
the DNA is important and significant in some shape or form. It won't be
apparent in a simple experiment, but it may be apparent over a lifetime or
maybe over generations.

There is evidence to back this up: a recent NPR show, I forget which, claimed
that your grandfather's hunger at the age of 9 influenced your chances of
heart attack. This brings up DNA's influence across lifetimes. And the
complications are so much deeper.

Our fight to try to understand DNA is incredible. We are not there completely.
And maybe this is the longtail part of science. But we will understand it more
and more, and imagine the information we extract then. I bet much of it will
debunk everything we so arrogantly claim today.

~~~
taco_emoji
> nature has its own "garbage collector."

Ah, no, it doesn't. If you think "survival of the fittest" will somehow
magically cull non-functional, non-deleterious DNA out of the genome (except
by accident), then you don't understand evolution.

It's kind of hypocritical to call others "arrogant" when you're opining on a
topic you clearly don't grasp yourself.

~~~
nashashmi
Nope. Not being a hypocrite here. DNA that is not needed will wither away. And
I do understand evolution. Quite well, actually, simply because I have debated
it far more rationally than those who have outright rejected it and those who
have unquestioningly accepted it. And there happen to be a lot fewer mysteries
with my explanation.

Think of DNA as computer code. Think of each creature as a robot with computer
code built in. Now think of DNA as also having the code to recreate the
creature. Overloaded DNA will collapse. Or incredibly sophisticated biology
will be able to handle overloaded DNA. Either way, too much complexity breaks.
You may call it "Survival of the fittest" implying chaos theory but I call it
harmonious morphology implying benign force.

~~~
taco_emoji
> DNA that is not needed will wither away.

You are 100% wrong about this. DNA that is not needed does not "wither" away.
If it's benign, it'll most likely stick around.

> Overloaded DNA will collapse.

This sentence doesn't make any sense. How do you "overload" DNA? What does
"collapsing" look like?

> Either way, too much complexity breaks.

What is "complexity"? Can you measure it? How much is too much? What "breaks",
and how?

> You may call it "Survival of the fittest" implying chaos theory

What does chaos theory (roughly, the idea that small changes in initial
conditions have dramatic effects) have to do with anything we're discussing?

> but I call it harmonious morphology...

Please define "harmonious" and describe how to measure it.

> ...implying benign force

I suspect you started with benign force and worked your way backwards,
ignoring any inconvenient evidence.

~~~
nashashmi
> You are 100% wrong about this. DNA that is not needed does not "wither"
> away. If it's benign, it'll most likely stick around.

Why would it stick around if it is not needed? Why wouldn't it be free to
morph if it did not matter? Why wouldn't it change if it did not affect the
being? Evidence exists that DNA changes due to stresses in environments, so
heredity (e.g. the parent has it, so the child must) is not sufficient to
explain why garbage DNA is still being passed on generations later.

> This sentence doesn't make any sense. How do you "overload" DNA? What does
> "collapsing" look like?

> What is "complexity"?

If a creation of an object relies solely on DNA, then there must be some sort
of ignore protocol built into the creation factory for the DNA part that is
meaningless. Overloaded DNA has excessive meaningless data. An enormous amount
of complexity would be required within the creation factory to handle this
overload.

> Chaos theory ...

The idea that random changes occur and the only changes that survive are the
ones that can still adapt to their environment despite the change.

> Please define "harmonious"

Harmonious would mean the very stable and smooth transitions of creatures to
change from one to another without there being some sort of "noise" e.g. rapid
fluctuations in genes. If creatures can transition uniformly to another
creature, why not several different creatures across several different
environments (sort of like man-made innovations in isolated environments).
Failures and successes would both appear on graphs.

> how to measure harmonious? how to measure complexity?

Measuring harmonious is akin to measuring cold energy. You cannot measure
something that does not exist, but instead you measure it by the lack of its
opposite. (you measure cold by the lack of heat). Complexity is only apparent
when things break. If you don't see things breaking, you cannot witness
complexity.

> you started with benign force and worked your way backwards, ignoring any
> inconvenient evidence.

Actually, it's the other way around. A researcher must constantly battle the
concept of unknown/benign/hidden/dark energy/force/matter to figure out why
things work the way they do. A researcher is always digging, always denying,
the concept of an unknown force, in finality to achieve the phrase "That's
how!" I call it peace at a certain point. But for a person with an agenda, the
inconvenient fact may be that there is an unknown force. Sometimes, these
things are cloaked as laws of physics, that some events other than this event
just cannot happen. Other times, they are called theories, until they are met
with an enigma, where then it is broken.

~~~
taco_emoji
Your thinking on this matter is incredibly messy and not guided by rigorous
investigation of the facts.

> Why would it stick around if it is not needed?

WHY WOULDN'T IT??? It's not expensive to carry junk DNA around - it just sits
there, not doing anything, not creating proteins, not affecting anything.
You're proposing that there's some sort of janitor going through and cleaning
up the nonsense and it's _just not true_. If there is such a mechanism, point
to it. Name it. Tell me what it is. Tell me what the evidence of it is. Just
link to a wiki page!

I think the problem is that you're stuck on thinking about DNA as procedural
program code instead of on its own terms, how it actually operates.

I'll try to explain a few things, in a last ditch attempt to appeal to your
intellect:

DNA gets transcribed to mRNA based on START and STOP codons (codons being
sequences of three bases). Then the codons in-between START and STOP on the
mRNA strand get translated into amino acid chains, also known as proteins. The
"ignore protocol", such as it is, is then twofold: 1) if a sequence doesn't
occur between START and STOP, it's highly unlikely to ever get transcribed
into mRNA, and 2) if an mRNA contains nonsense codons, or its number of base
pairs is not divisible by three, then the resulting protein is most likely
benign, or at least not harmful enough to prevent the organism from producing
offspring and therefore propagating the harmful sequence. (If it _is_ harmful
enough to affect reproductive viability, then it doesn't tend to stick around
very long.)
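
A toy Python sketch of that reading process, under heavy simplifying assumptions: it works directly on a DNA coding strand and skips transcription, promoters, splicing and everything else a real cell does. Strictly speaking, START and STOP codons delimit translation of the mRNA rather than transcription itself. The function name `translate_first_orf` and the example sequences are made up for illustration; the codon table is the standard genetic code.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate_first_orf(dna):
    """Translate from the first ATG until a stop codon or the end of the sequence."""
    start = dna.find("ATG")
    if start == -1:
        return ""  # no START codon: nothing here gets read at all
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[i:i + 3]]
        if amino == "*":  # TAA, TAG or TGA: stop reading
            break
        protein.append(amino)
    return "".join(protein)


normal = translate_first_orf("CCATGGCTTAAGG")     # ATG GCT TAA
frameshift = translate_first_orf("CCATGCTTAAGG")  # one base deleted after ATG
unread = translate_first_orf("CCCCCC")            # no START codon
```

The three cases line up with the points above: a sequence with no START is never read, and a single deleted base shifts the reading frame so that everything downstream is read as different codons.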

Not all DNA _does_ anything. Plenty of it is just hanging on for the ride, not
harming anyone, and so not selected against by natural selection.

> I call it peace at a certain point.

I call it _giving up the search_. Just because you don't understand something
doesn't mean you give up _trying_ to understand.

