
Biology's 'dark matter' hints at fourth domain of life - kingsidharth
http://www.newscientist.com/article/dn20265-biologys-dark-matter-hints-at-fourth-domain-of-life.html
======
bluekeybox
Although there may not be a fourth domain of life, I am pretty excited about
what the metagenomic studies that use next-generation DNA sequencing will fish
out from the depths, so to speak. There have been some crazy papers published
that use this approach. The title of one: "Windshield splatter analysis with
the Galaxy metagenomic pipeline." I imagine there is lots of interesting stuff
living in the oceans that is about to be revealed using the high-throughput
approach.

It is also interesting to think about what this approach will miss: if
somewhere on Earth, perhaps near thermal vents deep in the ocean, however
unlikely that may be, there exists a non-DNA-based life-form (which would
easily be the most interesting result ever), metagenomic studies will miss it
because by design they can only analyze DNA sequences. Similarly, if a probe
gets sent to Europa, for example, DNA sequencing would be useless.

~~~
alphaoverlord
Unfortunately, metagenomics is the lowest-hanging fruit of
genomics/biology right now. Sequencing has gotten exponentially cheaper, to the
point that these types of projects are feasible (the paper you cite is
actually a proof of concept for their metagenomics pipeline) and everyone is
just sequencing the shit out of everything. That said, there are quite a few
pitfalls with the current approach:

1\. Big data in biology is still lacking comparable analytical capability.

2\. Most high-throughput sequencing technologies (next-gen sequencing) are
polymerase-based and produce short reads. Polymerase (the
basis of PCR) requires a primer to start the reaction - and the choice of
primer can introduce non-trivial bias into the obtained results. Short reads
are pretty good at dealing with most bacteria, but things get pretty
hairy with hyperthermophiles (microbes with regions high in GC content) and
more complex eukaryotes (long stretches of repeated sequence are difficult to
resolve with the shotgun approach).

3\. We will probably only see DNA reads that are slightly different from what
we already know to exist. The generalization they make that 99% of species
cannot be cultured with current methods is true - but an analogous argument
can be made for DNA. The sequencing reactions will probably be optimized for
sequences similar in composition to what we already see. The task of
extracting DNA from bulk material is not trivial - this can remove large
swathes of species from view even if the sequencing itself is trivial. Complex
3D RNA and DNA structures such as hairpin loops and noncanonical pairing make
sequencing hard.

It's definitely still a difficult problem to tackle, but even more difficult
than doing the sequencing is answering the questions "What can we learn from
this?" and "What should we sequence?"
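
The point about repeats in (2) can be shown with a toy example. This is only a sketch under loud assumptions: the sequences, the repeat, and the helper name `kmer_counts` are all made up for illustration, and "reads" are idealized as every error-free substring of length k with perfect coverage. Under those assumptions, two different genomes that share a repeat longer than the read length produce exactly the same collection of reads, so nothing downstream can tell them apart.

```python
from collections import Counter

def kmer_counts(genome, k):
    """Idealized shotgun reads: every length-k substring, error-free."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))

# Hypothetical sequences, chosen only for illustration.
repeat = "ACGTACGTAC"                        # 10-base repeat, present three times
a, b, c, d = "TTGCA", "GGATC", "CCTAG", "AATGC"

genome_1 = a + repeat + b + repeat + c + repeat + d
genome_2 = a + repeat + c + repeat + b + repeat + d   # b and c swapped

# Reads shorter than the repeat: both genomes yield the same reads.
print(kmer_counts(genome_1, 8) == kmer_counts(genome_2, 8))

# Reads that span the repeat plus one base on each side: the reads differ.
print(kmer_counts(genome_1, 12) == kmer_counts(genome_2, 12))
```

The first comparison prints True and the second prints False: only reads long enough to cross the whole repeat and touch both flanks carry the information about which segment follows which.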

~~~
bluekeybox
"Sequencing has gotten exponentially cheaper, to the point that these types of
projects are feasible (the paper you cite is actually a proof of concept for
their metagenomics pipeline) and everyone is just sequencing the shit out
of everything."

Well that's exactly why I'm excited, but your phrasing makes it seem as if
this is just a matter of course. Game changers in many fields often arise
because some previously expensive resource becomes economically more
accessible (cheaper). You make it sound like "The Model T Ford is just a
low-hanging fruit of cheaper gasoline." Nothing is exciting if you pull that
angle.

Regarding (1): Yes, and that "lack" is why I currently have a job (employed as
a computer scientist helping analyze big data). If nothing were "lacking",
analysis would be as easy as a biologist pushing a few buttons in a
spreadsheet-like program, and nobody would need a computer scientist helping
them.

Regarding (2): That's only a technical issue, and it is the main contributor
to (1) -- were it not for short reads, analytics would be much easier.

Regarding (3): From some reports (can't confirm right now but can look for a
citation if you're interested), I have seen exactly the opposite -- that the
estimates of biological variation from metagenomic data are actually skewed
towards higher diversity than expected. That could be due to errors in the
pipeline, which are being worked on, but some of that variation could also
reflect legitimate biodiversity.
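
To make the "errors in the pipeline" possibility in (3) concrete, here is a rough sketch of how read errors alone can inflate a naive diversity count. Everything in it is an assumption for illustration: the two "true" sequences, the error-containing reads, and the helper names are hypothetical, and the greedy one-mismatch clustering is a simplification, not how any particular metagenomics pipeline works.

```python
from collections import Counter

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def naive_richness(reads):
    """Count every distinct read as its own 'species'."""
    return len(set(reads))

def clustered_richness(reads, max_mismatches=1):
    """Greedy clustering: the most abundant sequences become cluster centers;
    a sequence within max_mismatches of an existing center is absorbed."""
    centers = []
    for seq, _ in Counter(reads).most_common():
        if not any(hamming(seq, c) <= max_mismatches for c in centers):
            centers.append(seq)
    return len(centers)

# Two hypothetical "true" organisms, plus reads carrying single-base errors.
true_a = "ACGTACGTAC"
true_b = "TTGGCCAATT"
reads = [true_a] * 4 + [true_b] * 4 + [
    "ACGTACGTAT",  # true_a with one substitution
    "ACCTACGTAC",  # true_a with one substitution
    "TTGGCCAATA",  # true_b with one substitution
]

print(naive_richness(reads))      # distinct sequences, errors included
print(clustered_richness(reads))  # after collapsing near-identical reads
```

With this made-up data the naive count is 5 and the clustered count is 2. The catch is the one already raised above: the same collapsing step would also merge two genuinely different organisms that happen to differ by a single base, so some of the "extra" diversity may be real.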

------
pyre

      > "There is still debate [over] how to clearly distinguish the
      > three proposed domains of life, and how they are interrelated,"
      > Gupta says. "The suggestion [of] a fourth domain will only add
      > to the confusion."
    

Yes. We should keep this 'dangerous' information from them. It will only
confuse and enrage them.

------
benvanderbeek
99 percent of single-celled organisms can't be cultured in a lab, and they're
starting to get some DNA sequences from this 99%.

So a researcher says "They really are the dark matter of the biological
universe."

I thought the article was going to say that dark matter was related to some
unknown form of life.

~~~
TheEzEzz
'Ring' by Stephen Baxter explores the idea of a dark matter form of life. It's
a good hard scifi read if you're into that.

