
The band of biologists who redrew the tree of life - okket
https://www.nature.com/articles/d41586-018-05827-1
======
dekhn
I studied under a professor who worked with Woese. It was great- lots of
emphasis on RNA, using RNA to answer basic biological questions. One of the
things I've never been able to understand is how Woese, with just a
small amount of very limited sequence data, managed to carve out the
existence of an entirely new form of life. I ask because, looking at all the
effort to sequence gigabases today, it doesn't seem like we're gaining
geometrically more knowledge. Modern gains from sequencing seem highly
incremental, and generally provide only correlational data rather than causal
data, and the data they do provide is not actionable.

Was Woese's discovery just "easy"? Would any competent scientist working in
the field have made it? Where are the Woeses of sequencing today?

~~~
klmr
> Was Woese's discovery just "easy"?

In a way, yes. It’s widely established that 20th century biologists got to
pick the low-hanging fruit. This shouldn’t detract from the sheer genius they
displayed.

In the specific case of the three domains of life (bacteria, archaea and
eukaryotes), which I’m assuming is what you’re referring to, Woese et al. got
“lucky” with their study of 16S rRNAs.

In a nutshell, these RNAs are relatively easy to sequence and analyse. Their
extremely high degree of conservation means that they can be trivially aligned
to each other to study differences. But they also contain “hypervariable”
regions that made it possible to distinguish different species at extremely
high resolution, even using relatively slow computers and unsophisticated
analytical tools. In addition, modern phylogenetics mostly struggles with
inconsistencies in classical methods that arise from features those methods
don’t account for (such as lateral gene transfer, which invalidates the
typical assumption of a strict tree structure). Early studies ignored these
inconsistencies (or didn’t even notice them, due to the scarcity of data).
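
To make the conserved vs. hypervariable idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the four short sequences are invented and already aligned (they are not real 16S rRNA data), and the helper names are hypothetical. It only shows why a mostly conserved molecule with a few variable columns is easy to compare, and it is not the method Woese's group used.

```python
from collections import Counter
from itertools import combinations
from math import log2

# Invented, pre-aligned toy sequences (NOT real 16S rRNA data).
ALIGNED = {
    "bacterium_A": "GGCUACGAUCCGAA",
    "bacterium_B": "GGCUACGAUGCGAA",
    "archaeon_A":  "GGCUUGCAACCGAA",
    "archaeon_B":  "GGCUUGCAAGCGAA",
}


def column_entropy(seqs, col):
    """Shannon entropy (bits) of one alignment column; 0 means fully conserved."""
    counts = Counter(s[col] for s in seqs)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


def variable_columns(seqs):
    """Indices of columns where at least two sequences disagree."""
    return [i for i in range(len(seqs[0])) if column_entropy(seqs, i) > 0]


def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise_distances(aligned):
    """p-distance for every unordered pair of named sequences."""
    return {
        (n1, n2): p_distance(aligned[n1], aligned[n2])
        for n1, n2 in combinations(aligned, 2)
    }


if __name__ == "__main__":
    seqs = list(ALIGNED.values())
    print("variable columns:", variable_columns(seqs))
    for pair, dist in pairwise_distances(ALIGNED).items():
        print(pair, round(dist, 3))
```

In this toy data the conserved columns keep the alignment trivial, while the few variable columns are enough to put the two invented "domains" further apart from each other than their members are from one another. Real analyses use proper alignment and substitution models rather than a raw mismatch fraction.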

But since you talk about correlative vs causative data, it’s important to note
that this fundamentally hasn’t changed: from a very high-level perspective,
new evidence is either compatible with existing theories, or isn’t. Strictly
speaking, evidence never demonstrates causation — it can’t. But sometimes
there are few viable alternative models, so the available evidence trivially
allows discarding all but one or two models. This gets harder the more nuanced
your models are, and this is the natural progression we have in any scientific
endeavour.

~~~
dekhn
Based on my view of Woese's work (in which I saw the years of effort leading
to better secondary structure alignments), I don't think it was "easy", nor
was it low-hanging fruit.

And I think there's still low hanging fruit that people are missing.

~~~
klmr
… compared to subsequent phylogenetics (we’re all standing on the shoulders of
giants, after all). I agree that, at the time, the work must have been both
(a) mind-numbingly tedious and (b) intellectually challenging.

But comparing past papers (including that one [1]) to current ones does show
that the amount of data, computation, and evidence necessary to produce a
contemporary research paper is incomparably larger than it used to be. The
seminal DNA structure paper was a single A4 page, with an estimate of the
double helix height, no calculations or data shown. Good luck getting
something like this published today.

[1]
[https://www.ncbi.nlm.nih.gov/pmc/articles/PMC54159/pdf/pnas0...](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC54159/pdf/pnas01037-0173.pdf)

------
estevez
Another book on the topic that I can recommend:

The New Foundations of Evolution: On the Tree of Life, by Jan Sapp

