I remember several years ago reading about "junk DNA" or "useless DNA" in sequen...

retrac · on Jan 6, 2023

I'm not sure where I encountered this hypothesis but I find it compelling. As noted by many, junk DNA, acquired from viruses and mutations and genome shuffling, is quite a puzzle. Why does it persist? It takes energy to copy, and misreading it can cause fatal or maladaptive mutations. From that perspective, it shouldn't persist (with slowly accumulating drift) for billions of years, as some shared junk sequences have across species. But it does.

Obviously, because it isn't junk; it is of value to the organism. Even if it's not of any use right now, even if it's completely biologically inactive at present. Because it is still extremely high entropy information. They're remnants of solutions other living systems once used, at some point, to solve the problem of staying alive.

If I were going to try and exploit genetic mutation to produce novel solutions to biological problems, I would start from an existing genome. In fact, I'd start with as much data, from as many organisms, as I could get my hands on and store. Perhaps we carry junk DNA because mutations in existing coded sequences, even mutated, currently useless ones, are far more likely to be functional, and so potentially a useful adaptation, than literal randomness. It's life's portfolio of solutions, badly photocopied little snippets accumulated over the years, and we all carry it around for future generations that might live in an environment where it's useful.

tedunangst · on Jan 6, 2023

We should also consider that simply copying everything, even the junk, leads to fewer errors than selectively trying to identify only the good parts.

systems_glitch · on Jan 6, 2023

Just like backups.

yAak · on Jan 6, 2023

I feel like me keeping copies of all the code I've ever written is an awkwardly good analogy here.

lowdose · on Jan 6, 2023

Git commitments to your autocomplete library?

xenadu02 · on Jan 7, 2023

You must have selective pressure on genome size for organisms to evolve mechanisms to reduce the "junk". The metabolic cost of carrying around the junk is small. The cost of cleaning up the junk comes from much more frequent accidental deletions/truncations of important sequences. Upending that equation requires massive selective pressure for a smaller genome - maybe something like a tardigrade that gets desiccated regularly? In any case no chance any vertebrate species would have that kind of pressure. You'd need insane offspring counts and short generation cycles to afford the selective pressure price.

The fact that we can tap junk at some future point is probably just an accidental side-effect... though there is another theory that claims having lots of junk provides some protection against environmentally-induced damage because most of the time it is a junk section that gets damaged. How's that for the next error protection algorithm: pad the message with mostly zeros so occasional bit corruption doesn't matter. Take that, Shannon!

If you want a specific example of this mechanism working: primate 3-color vision. In our two-color, blue-yellow-seeing ancestors the yellow pigment sequence got duplicated, then eventually slightly mutated. That's why the red and green receptors overlap so much yet blue is standing way off by itself. It is highly likely this started as a useless duplication and was carried around for a long time before one of the duplicates got mutated.
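
To make the zero-padding quip above concrete, here is a minimal Python sketch. It assumes a fixed number of damage events that each hit one uniformly random position; the function name `payload_hit_fraction` and all the sizes are made up for illustration and are not a real model of DNA damage.

```python
import random


def payload_hit_fraction(payload_bits, padding_bits, hits, seed=0):
    """Fraction of uniformly random single-position hits that land in the payload."""
    rng = random.Random(seed)
    total = payload_bits + padding_bits
    # Positions [0, payload_bits) hold the message; everything after is padding.
    in_payload = sum(1 for _ in range(hits) if rng.randrange(total) < payload_bits)
    return in_payload / hits


if __name__ == "__main__":
    # Same 1,000-bit payload, increasing amounts of padding (hypothetical sizes).
    for padding in (0, 1_000, 9_000, 99_000):
        frac = payload_hit_fraction(1_000, padding, hits=100_000)
        print(f"padding={padding:>6}  share of hits in payload={frac:.3f}")
```

Under that assumption the share of hits landing in the payload is roughly payload / (payload + padding). If damage instead happens independently per bit, the expected number of payload hits does not change with padding at all, so Shannon is probably safe.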

winter_blue · on Jan 6, 2023

> It takes energy to copy, and misreading it can cause fatal or maladaptive mutations

Can maladaptive mutations really be caused by copying DNA that's not used much (as far as we can tell, like the DNA for endogenous retroviruses in our genome)?

jean_tta · on Jan 6, 2023

From the perspective of the gene it makes sense - genes that are more successful at making offspring (aka getting copied) should be expected to prosper through natural selection.

alcover · on Jan 6, 2023

Maybe the junk is remnants of old code partly overwritten by active, reachable code? Like a fragmented disk or RAM.

dekhn · on Jan 6, 2023

There are parts that are almost certainly not under functional selection and provide no benefit whatsoever, with Alu sequences being the best candidate. Even in the case of Alu, they do seem to have some vague effect on regulation of transcription... although they're not what we would call "genes" or "regulatory regions".

In other cases, there are just lots and lots of duplicates of the same genes over and over. Other parts appear to be forges of gene creation, either through gene duplication and divergent evolution, or through some other mysterious mechanism we don't know yet.

Certainly, we've had parts that looked like they were nothing at all and ended up being very important, and other parts that looked like they were incredibly important, but were really just the side effect of some effective parasite.

It's sort of not even an interesting debate any more, as most of the initial positions everybody held changed once we interrogated more, and better, data.

gaboot · on Jan 6, 2023

There are also fairly strict limits, given human mutation and reproductive rates, on the amount of information that can be preserved in the genome. Most of the genome is therefore meaningless (although not necessarily useless). As this article points out, these regions allow for random creation of novel proteins.
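
A rough sketch of the kind of arithmetic behind such limits, using the classic mutation-load approximation in which mean fitness falls to about e^(-U) when each offspring carries on average U new deleterious mutations. Every number below (100 new mutations per generation, 40% of mutations at functional sites being harmful) is an illustrative assumption, not a figure from the comment or the article, and the function names are hypothetical.

```python
import math


def mean_fitness(mutations_per_generation, functional_fraction, deleterious_fraction):
    """Approximate equilibrium mean fitness, e^(-U), under the mutation-load model.

    U is the expected number of new deleterious mutations per offspring.
    """
    u = mutations_per_generation * functional_fraction * deleterious_fraction
    return math.exp(-u)


def offspring_needed(fitness):
    """Offspring per individual needed so that, on average, one survives selection."""
    return 1.0 / fitness


if __name__ == "__main__":
    # Assumed inputs, for illustration only.
    new_mutations = 100
    harmful_share = 0.4
    for functional in (0.01, 0.05, 0.10, 0.50, 1.00):
        w = mean_fitness(new_mutations, functional, harmful_share)
        print(f"functional fraction={functional:.2f}  "
              f"mean fitness={w:.3g}  offspring needed={offspring_needed(w):.3g}")
```

With these made-up inputs the required offspring count grows exponentially with the functional fraction, which is the intuition behind the claim: at realistic reproductive rates, only a limited share of the genome can be maintained against mutation.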

jl6 · on Jan 6, 2023

Even for the “no benefit whatsoever” parts, is it not possible that they influence (and are possibly crucial to) the rest of the system just by providing spacing between other more-apparently-functional parts?

I’m thinking by analogy of executable programs that have runs of zeros. The zeros don’t necessarily do anything, but remove them and everything else is out of alignment.

dekhn · on Jan 6, 2023

I am open to the idea that "boring duplicated regions" perform some vague function through spacing. Some folks have proposed doing experiments where the spacers are removed, or replaced with other sequences, but they are extremely hard experiments to do properly (in a way that convinces the field).

We already know that enhancers "work at a distance" and it's not clear what "distance" exactly means, and it gets into complicated 3D structure of the genome inside a cell; see https://en.wikipedia.org/wiki/Enhancer_(genetics)

Personally I think that the best way to think about the genome is to unlearn most of the preconceptions you learned in genetics and instead think about it in terms of biophysics and development and machine learning: you'll never really be able to understand the true function of every little bit, but you can probably create an approximate model that explains the vast majority of biology with relatively few variables, and some deep models that contain all the necessary statistics to model these systems accurately.

MichaelZuo · on Jan 6, 2023

It sounds like because there is a very complex 3D structure that the 'spacing' function could actually be extremely important. Far more so than zeros in machine code.

eternalban · on Jan 6, 2023

This link dives quite deep into what an Alu element is, for those interested.

Alu elements: know the SINEs [short interspersed elements]

Alu elements are primate-specific repeats and comprise 11% of the human genome. They have wide-ranging influences on gene expression. Their contribution to genome evolution, gene regulation and disease is reviewed.

https://genomebiology.biomedcentral.com/articles/10.1186/gb-...

gumby · on Jan 6, 2023

You could make the same claim for structure padding in memory. I wouldn't call that useless either.
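
For anyone who hasn't run into structure padding, a small Python sketch using the standard `struct` module shows the effect. The record layout (one char followed by one int) is just an example chosen for illustration, and the padded size depends on the platform, so no particular output is claimed.

```python
import struct

# "@" uses the platform's native sizes and alignment (padding may be inserted).
# "=" uses standard sizes with no alignment padding.
padded = struct.calcsize("@ci")  # one char followed by one int
packed = struct.calcsize("=ci")  # same fields, no padding: 1 + 4 bytes

print(f"native layout: {padded} bytes, packed layout: {packed} bytes")
print(f"padding bytes that 'do nothing': {padded - packed}")

# Build a native record, then cut out the padding between the char and the int.
record = struct.pack("@ci", b"A", 1)
int_offset = padded - struct.calcsize("@i")
stripped = record[:1] + record[int_offset:]

try:
    # A reader expecting the native layout now sees a record of the wrong size.
    struct.unpack("@ci", stripped)
    print("no padding on this platform, so nothing broke")
except struct.error as exc:
    print(f"reading the stripped record with the native layout fails: {exc}")
```

The padding bytes carry no data, but a reader that expects the native layout can no longer parse the record once they are removed, which is the sense in which the padding is not useless.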

dekhn · on Jan 6, 2023

I love the analogy. Many times I think about the genome as a bunch of machine code it's my job to reverse engineer. That was a good part of my career, probably 20 years, before I realized the problem was that it's much too hard to actually "prove" anything about systems like genomes.

gumby · on Jan 7, 2023

That object code has been heavily modified during runtime for billions of years. We have no access to the original source code for any of the patches, though at this point it would be incomprehensible anyway.

kzuberi · on Jan 6, 2023

For folks interested in understanding the subject of junk DNA a bit better, there's an upcoming book [1] that might be worth checking out. The author's blog also seems interesting on this and related subjects.

[1] https://utorontopress.com/9781487508593/whats-in-your-genome...

bell-cot · on Jan 6, 2023

Junk DNA, or near-junk DNA (active in theory, but with minimal effects) both:

- Is extremely difficult to remove, at a worthwhile scale, from the genome of any large & long-lived organism

- Can be thought of as a huge pile of tickets for the Extremely Favorable Random Mutation lottery

folex · on Jan 6, 2023

pile of tickets is a very nice metaphor

jeremiep · on Jan 6, 2023

Read "The Structure of Scientific Revolutions" recently. Science ignoring what it does not understand is far from a new phenomenon.

Science is fantastic at digging into areas it can already see, and terrible at seeing new areas from the greater unknown.

aeonik · on Jan 6, 2023

We studied "The History of Science in Society" by Andrew Ede and Lesley Cormack, which left a big impression on me.

ISBN-13: 978-1442634992, ISBN-10: 1442634995

rockinghigh · on Jan 6, 2023

Non-coding sequences have been understood as having some functions at least since the early 1990s. Because genome expression is dynamic, tracking the exact mechanisms of action of these sequences is challenging.

andai · on Jan 6, 2023

We need a few more copies of this gene before we can recognize all the patterns ;)

VeninVidiaVicii · on Jan 6, 2023

Hello, ChatGPT

nathias · on Jan 6, 2023

people just need to understand that useless can become useful and vice versa

vbezhenar · on Jan 6, 2023

Junk DNA is junk DNA. It's not used in any way. We understand it.

svnt · on Jan 6, 2023

This is entirely untrue.

taneq · on Jan 6, 2023

"Junk DNA" brought to you by the same geniuses that brought you "we only use 10% of our brain cells" and "the heart is where the spirit resides, the brain is just useless grey goo."

seydor · on Jan 6, 2023

Those two are not in the same league

User23 · on Jan 6, 2023

That was born of the widely held metaphysical position that evolution is purposeless. Given that assumption one would expect to find plenty of “junk” DNA.

Geezus-42 · on Jan 6, 2023

I don't think that follows as well as you think.

If the primary goal is survival, and survival depends primarily on efficient use of energy, then a lot of evolution is about organisms becoming more efficient by adapting to their environment. Keeping unnecessary junk around is inefficient, so we would expect organisms that lose it to benefit and outbreed the others.

User23 · on Jan 6, 2023

Having our optic nerve run right through our retina producing a blind spot in order to capture an upside down and backwards image is pretty inefficient too. Evolution doesn't maximize efficiency, it maximizes good-enough-to-reproduce-ity.

alpaca128 · on Jan 6, 2023

Better adapted organisms are just that - better. Not perfect, or free of inefficiencies. And even a perfectly adapted organism might not be as good at adapting to changes in the very long term compared to one with "junk" DNA. Also, does unused junk in the DNA really hurt energy efficiency?