What strikes me about both this article and Nick Lane's book is that when I was in school the tree of life was presented as this thing that people had pretty much figured out. Since then it's been altered almost beyond recognition using genomics. I hope there are people still in school who read articles like this one and get inspired to get into this amazing area of research.
When I was reading the book on a plane, a seasoned biologist happened to be sitting next to me. When I said that it's the first book of Nick Lane's that I picked up, he said: "I'd rather suggest you pick up Lane's other book, Life Ascending, and then get back to The Vital Question."
(Edit: Oops, bad me, just noticed that there's already discussion about the book's accessibility in this thread.)
As an ex-biochemist who has read both books, this is excellent advice.
I occasionally read pop-culture articles about, e.g. the discovery of giraffe subspecies, and think about how I'd like to know more about both the human-focused taxonomy side and the evo bio speciation side, but I don't know where to start.
The statistics really only ever amount to Occam’s Razor, i.e. fewer differences in the genome means closer relationship.
Edit: actually free full-text link: http://bioinformatics.bio.uu.nl/pdf/Ciccarelli.s06-311.pdf
In any case, the process consists of basically two steps:
“Align” the DNA of all the species you have sequenced. That’s done using algorithms such as Smith-Waterman, which minimize the number of edits needed to go from one species’ version of a gene to another.
That number defines a metric that measures all the distances between the species. So, orangutan to Homo sapiens may be, say, “450”, while Homo sapiens to Ficus benjamina is, say, “12321”.
(The difference may also be measured on the level of protein sequences. The process is basically the same, only that edits may be assigned different distance scores, because some result in more functional differences, while others are functionally “silent”. Evolutionary pressure would make the former rarer than the latter.)
The metric is unitless (i.e. it doesn’t allow translation into, say, “years since species diverged”, but it should fulfill the triangle inequality).
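To make the "number of edits" idea concrete, here is a minimal sketch. It uses plain Levenshtein edit distance (every insertion, deletion, or substitution costs 1), which is a simplification I'm swapping in: real aligners such as Smith-Waterman use scoring matrices and gap penalties. The function name and the toy sequences are made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-letter insertions,
    deletions, or substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the first i-1 letters of a
    # and the first j letters of b.
    prev = list(range(len(b) + 1))
    for i, letter_a in enumerate(a, start=1):
        curr = [i]  # turning i letters into an empty string takes i deletions
        for j, letter_b in enumerate(b, start=1):
            substitution = 0 if letter_a == letter_b else 1
            curr.append(min(prev[j] + 1,                  # delete from a
                            curr[j - 1] + 1,              # insert into a
                            prev[j - 1] + substitution))  # match or substitute
        prev = curr
    return prev[-1]


# Toy "gene versions" for three hypothetical species.
species_x = "ACGTACGT"
species_y = "ACGTTCGT"
species_z = "ACGGTCG"

d_xy = edit_distance(species_x, species_y)
d_yz = edit_distance(species_y, species_z)
d_xz = edit_distance(species_x, species_z)
```

With unit costs like this, the distances obey the triangle inequality: `d_xz` can never exceed `d_xy + d_yz`.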
Once you have a (triangular) matrix with all the differences, all that’s needed is to reconstruct a binary tree that maximizes plausibility, i.e. minimizes the sum of differences at each branching point.
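As a rough illustration of that second step, here is a sketch that builds a binary tree from pairwise distances. It uses UPGMA (repeatedly merge the two closest clusters, with average distance between clusters), which is a simple stand-in and not necessarily the method used in the linked paper. The 450 and 12321 figures are the made-up numbers from above; the orangutan-to-ficus value is an extra invented number, and the function name is hypothetical.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a binary tree as nested tuples from pairwise leaf distances.

    dist maps frozenset({leaf_a, leaf_b}) to the distance between them.
    """
    # Each cluster is keyed by its (sub)tree and lists the leaves it contains.
    clusters = {name: [name] for name in names}

    def cluster_distance(x, y):
        # Average distance over all leaf pairs drawn from the two clusters.
        pairs = [(a, b) for a in clusters[x] for b in clusters[y]]
        return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of clusters and join them under a new node.
        x, y = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        leaves = clusters.pop(x) + clusters.pop(y)
        clusters[(x, y)] = leaves
    return next(iter(clusters))


names = ["human", "orangutan", "ficus"]
dist = {
    frozenset({"human", "orangutan"}): 450,
    frozenset({"human", "ficus"}): 12321,
    frozenset({"orangutan", "ficus"}): 12400,  # invented for this example
}
tree = upgma(names, dist)
```

Here the two apes get joined first and the ficus attaches at the root. Real phylogenetics software uses more careful methods (neighbor joining, maximum likelihood), but the input and output have the same shape.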
It's a little dated (the first edition came out in 2004!) but to be honest, if you're interested in a "human-focused" intro and how the concepts fit together, things haven't really changed that much.
And so I'm real reluctant to read what he has to say about other fields, because I've got to assume it's all that bad.
You, dmd, are not going to do anything with the knowledge gleaned from those books, except possibly one of two things: Pursue further study, or inspire others (usually children) to be interested in the field. The book opens up the field, the basic concepts of the field, and a bit of history of the field to you. You now have the tools to go off and either learn more or inspire others. You can even get into conversations and debates with those well-versed in the field.
The book is a gateway into a field, not a path to follow in the field.
You can see the original errata here:
I leave it to your judgement whether this makes you "really, really, really loathe" the author :-) Note the book had 20 pages of just bibliography.
I have not yet read _The Vital Question_ but I have read his previous books and I highly, highly recommend:
_Power, Sex, Suicide_
If I had to pick just one, it would be PSS.
The only confounding factor is that among bacteria, there is a process called “horizontal gene transfer”, where one bacterium inadvertently (?) acquires some genetic information from another when they’re really just out for a quick snack. Plus viruses, which sometimes just get inserted into a host’s genome and become part of it, sometimes even functional. The way animals acquired mitochondria is also a fascinating divergence from the standard “tree” model, although it’s somewhat derivative of how plants acquired photosynthesis.
(That’s how everything in biology works. It’s the worst case of spaghetti coding, ever, full of self-modifying compilers, state saved in the JIT’s buffer, and programs only working by exploiting CPU flaws depending on the Hall effect, but only on workdays (according to the Julian calendar), fan speed, and your database server’s support of OpenGL).
But it’s really just a tree.
What’s far less of a “real” concept, is that of species. That makes essentially no sense in non-sexually reproducing organisms.
For life forms that procreate by individuals splitting themselves into two copies, while also getting DNA from each other in various random or intentional ways, it gets a lot messier. We still talk about "species", but it's a quite different thing.
The closest to merging is probably mitochondria, which probably were single-cell creatures ingested by some very early (single-celled) ancestor of us and, instead of being devoured, ended up in a symbiotic relationship within its host. The mitochondria to this day have their own genome and function a lot like a cell-within-a-cell. Same for plants and photosynthesis.
It’s important to note that such viral infections generally do not infect germline cells and are thus not automatically inherited (though, as in the case of genital herpes, they are easily transmitted to the offspring during birth).
That’s a second point, and I should have corrected the start (to “three-ish”, since the point on mitochondria is yet another).
If you’re asking specifically about instances of non-treelike gene transfer, the answer is twofold:
First, and I may be mistaken here (I’ve been out of the field for a few years) I think that any metric fulfilling the triangle inequality has a corresponding, consistent tree. So you’ll get a tree, totally. It may just not be the correct one.
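A hedged footnote to that first point: as far as I know, the condition usually cited for a set of distances fitting a tree exactly is the "four-point condition", which is stricter than the triangle inequality. For every four species, take the three ways of pairing them up and sum the distances within each pairing; the two largest sums must be equal. A minimal sketch follows, with invented distances and hypothetical helper names.

```python
from itertools import combinations, permutations


def symmetric(pairs):
    """Turn {(x, y): distance} into a nested dict usable as d[x][y] and d[y][x]."""
    d = {}
    for (x, y), value in pairs.items():
        d.setdefault(x, {})[y] = value
        d.setdefault(y, {})[x] = value
    return d


def satisfies_triangle_inequality(names, d):
    return all(d[x][z] <= d[x][y] + d[y][z]
               for x, y, z in permutations(names, 3))


def satisfies_four_point_condition(names, d, tol=1e-9):
    for a, b, c, e in combinations(names, 4):
        sums = sorted([d[a][b] + d[c][e],
                       d[a][c] + d[b][e],
                       d[a][e] + d[b][c]])
        # The two largest of the three pairing sums must coincide.
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


names = ["A", "B", "C", "D"]

# Path lengths read off an actual tree: ((A:1, B:2):1, (C:1, D:3)).
from_tree = symmetric({("A", "B"): 3, ("A", "C"): 3, ("A", "D"): 5,
                       ("B", "C"): 4, ("B", "D"): 6, ("C", "D"): 4})

# Four points on a ring: a valid metric, but with a cycle in it.
from_ring = symmetric({("A", "B"): 1, ("B", "C"): 1, ("C", "D"): 1,
                       ("A", "D"): 1, ("A", "C"): 2, ("B", "D"): 2})
```

Both matrices pass the triangle inequality, but only `from_tree` passes the four-point check. How far real data miss the condition is one way to put a number on "not quite a tree".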
Second, if you are asking for our ability to quantify divergence from that model, I’ll give a short explanation of the codon usage method mentioned before as an example:
The genetic code has 4 letters (ACGT) that translate to a different sequence consisting of 21 amino acids (a chain of amino acids is a protein, such as the enzymes catalysing all the fancy chemical reactions in your body).
Every three letters of DNA code for one “letter” of the protein sequence, meaning one amino acid. Since 4^3 = 64 > 21, the code isn’t quite optimal.
There are some three-letter sequences that contain meta instructions like start/stop, and some that don’t have any meaning. But there are also instances of two or more three-letter sequences translating to the same amino acid. They are functionally identical. You can replace them in the lab, without any change in phenotype.
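To see the redundancy directly, here is a small sketch that builds the standard codon table and groups the three-letter codes by what they translate to. The compact string encoding (bases in the order T, C, A, G, one-letter amino acid codes, `*` for stop) is the usual way the standard table is written down; the variable names are mine. In this table every one of the 64 codons is assigned, either to one of 20 amino acids or to "stop", so there are 21 distinct meanings if you count stop.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, in TTT, TTC, TTA, TTG, TCT, ... order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group the synonymous codons: amino acid (or stop) -> list of codons.
synonyms = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    synonyms[amino_acid].append(codon)

leucine_codons = sorted(synonyms["L"])
stop_codons = sorted(synonyms["*"])
```

Leucine, for example, can be spelled six different ways, while methionine (`M`, which doubles as the start signal) and tryptophan (`W`) have exactly one codon each.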
For some reason, some species (or even branches in the tree of life) still show preferences for using one or the other three-letter code for a given amino acid.
If you plot which of the (functionally identical) codes are used within a bacterial genome, you can find sudden changes, where the preference switches dramatically for some length, then returns to the previous preference.
That’s indicative of a piece of DNA that jumped across the tree. It’s really not very subtle when you know to look for it.
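A toy version of that plot, as a sketch only: it looks at a single amino acid (alanine, which can be written GCT or GCG among others), walks along the sequence in fixed windows, and reports what fraction of the time the "usual" codon is used. The genome here is synthetic and the function names are made up; a real analysis would look at all codons at once and deal with reading frames and gene boundaries.

```python
def split_codons(dna):
    """Cut a DNA string into consecutive three-letter codons (frame 0)."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def preference_profile(dna, preferred, alternative, window=10):
    """Fraction of `preferred` among the two synonymous codons, per window.

    Windows are non-overlapping runs of `window` codons; a window that
    contains neither codon yields None.
    """
    codons = split_codons(dna)
    profile = []
    for start in range(0, len(codons), window):
        chunk = codons[start:start + window]
        n_preferred = chunk.count(preferred)
        n_alternative = chunk.count(alternative)
        total = n_preferred + n_alternative
        profile.append(n_preferred / total if total else None)
    return profile


# Synthetic genome: the host writes alanine as GCT, the inserted piece as GCG.
genome = "GCT" * 30 + "GCG" * 10 + "GCT" * 30
profile = preference_profile(genome, "GCT", "GCG")

# Windows where the preference flips stand out as candidates for foreign DNA.
suspicious_windows = [i for i, value in enumerate(profile)
                      if value is not None and value < 0.5]
```

In this contrived case the fourth window is the only one that flips, which is the kind of abrupt switch described above. Real data are noisier, but the principle is the same.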
Another, even more obvious, example is viral DNA: this tends to end up as mangled, non-coding (“junk”) DNA, containing (fragments of) genes that often have nothing in common with the rest of the DNA, but have long stretches nearly identical to, say, a known gene for some viral coat.
In terms of quantification I’m really at the limits of my memory (and, unfortunately, in-flight internet), but I’ll take a stab: the former mechanism can be found in 5-10% of bacterial genomes, and usually amounts to less than 8% of the genome, with maybe one or two exceptional cases with 20% or so (some bacterial species may have developed a tendency to exploit this sort of buffer overflow to cheat at evolution).
Phylogeneticists (the biologists working in trees, but not those “trees”) do consider all this (and much more). Where they encounter “known unknowns”, you will often see trees with nodes branching into more than two branches. That’s essentially what cartographers would label “here be dragons”. Or, prosaically: it isn’t quite clear which split happened first in evolutionary history.
Have we done the same sort of experiment with genetic code to verify whether it actually does form a tree, or if that's just an assumption we are forcing on the data?
Besides, every other model lets students develop incorrect intuitions. Incorrect intuitions are still useful, and by letting students develop intuitions based on more obviously incorrect models like the Bohr model, you are teaching them the same skills they will need to use when they discard Lewis dot diagrams for molecular orbitals, or any other point in the future when established models are replaced. If you can't do that, you’re not doing science, and if you can’t teach people how to discard theories, you’re not doing a good job at teaching science.
There is a tendency among the older generation (any older generation, at any given time) to complain about how their children are being taught wrong. Parents will pass onto their children a bunch of facts and skills, only to find out that schools teach something completely different. This is natural, of course. You can’t freeze a curriculum in time and still make it useful, you have to throw some parts of it out. But if you take students through the process of throwing something out, like the Bohr model, they’ll be more prepared for a changing future.
In my own field (statistics), I think that this is quite common. We commonly teach ANOVA and linear regression separately because it's always been taught that way, despite the fact that a good appreciation of linear models is far more general, useful, and intuitive.
> IMO, there is no need to explicitly teach something incorrect only to be able to un-teach the very same thing later. At best you alienate students because they feel either confused, lied to, or patronized.
This is an impossible demand for science education, because every model of the atom we could possibly teach is "incorrect". You could teach students only the most recent, most accurate theories of the world, but at this point in time that means starting with something like relativistic field theory and that approach is obviously wrong.
If there is a scientific gospel, the gospel is "test your theories", and you can accomplish that by teaching students about the Bohr model. "Here is a theory about how atoms work, the theory is useful because it opened the door to quantum mechanics, it explained phenomena like the hydrogen spectra given by Rydberg, but it was known to have certain shortcomings." This gives students a case study for the scientific method. We can put an electric current through hydrogen gas and look at the spectrum with a prism or diffraction grating. We can calculate a formula for the spacing of the spectral lines. Then, we can come up with a model for how the atom works which explains those lines. Finally, this leads to more observations which contradict the model, and the process repeats.
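For anyone who wants to see how little math that case study needs, here is a sketch of the calculation: the Rydberg formula for the visible (Balmer) lines of hydrogen, next to the Bohr energy levels that reproduce it. The constants are rounded textbook values and the function names are my own.

```python
RYDBERG_H = 1.0967758e7   # Rydberg constant for hydrogen, in 1/m (rounded)
GROUND_STATE_EV = 13.6    # hydrogen ionisation energy, in eV (rounded)
HC_EV_NM = 1239.84        # Planck constant times speed of light, in eV*nm


def line_wavelength_nm(n_lower, n_upper):
    """Rydberg formula: 1/wavelength = R * (1/n_lower^2 - 1/n_upper^2)."""
    inverse_wavelength = RYDBERG_H * (1 / n_lower**2 - 1 / n_upper**2)
    return 1e9 / inverse_wavelength  # convert metres to nanometres


def bohr_energy_ev(n):
    """Bohr model energy of level n, in electronvolts."""
    return -GROUND_STATE_EV / n**2


# Balmer series: electron drops from level 3, 4, 5, 6 down to level 2.
balmer_nm = [line_wavelength_nm(2, n) for n in (3, 4, 5, 6)]

# The same red line, obtained from the Bohr energy difference instead.
photon_energy_ev = bohr_energy_ev(3) - bohr_energy_ev(2)
red_line_from_bohr_nm = HC_EV_NM / photon_energy_ev
```

The four wavelengths come out near 656, 486, 434 and 410 nm, which are the lines you see through a diffraction grating, and the Bohr energy difference gives the same red line to within rounding of the constants.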
This is the scientific process, and if you are not teaching the scientific process, you are not teaching science.
The reason you might use the Bohr model specifically is because many of the related experiments can be replicated in a school laboratory by students of relatively modest mathematical ability... just algebra is enough.
And to be clear here, we are not teaching the Bohr model as fact. It's not going to be "un-taught" or "un-learned". Other theories you would often teach in a chemistry class include phlogiston theory, again, because the experiments disproving it are easy enough to run in a typical high school chemistry class, not because phlogiston theory is particularly useful.
The electron really is a wave-like thing confined by a Coulomb potential and thus has a discrete set of solutions. Moreover the discrete indexes of that solution map onto mechanical properties like angular momentum and energy. The Bohr model captures all of that, and even gets the right equations for the energy (at least in simple cases).
Of course the real wave equation that needs solving is much more complicated than the 1-d cartoon in the Bohr model, but that cartoon works because it captures many essential properties of the real system.
But all my peers remember the Bohr model regardless. Because it is simpler, it is easier to remember. People will develop the wrong intuition regardless of what you do. If you start with the complicated ideas, they just give up on learning instead, substituting superstition.
In any case, you can't really do anything with a model of the atom unless you're using equations, and you only get the quantum versions because those are the only ones that work. Everyone else doesn't have any need for a true model of the atom, and thus no incentive to learn it. (Most of the time, you'd rather need a model of molecules.)
Lots and lots and lots of organisms died out during each of those changes. That things have survived so long is a matter of probability not inevitability.
I'm not trying to brush off the impact of global warming, I'm just placing it in the context of 1+ billion years of Earth history.
Anything that has survived for a billion years in a form that could still be considered roughly the same organism or a close descendant must have a relatively stable biochemistry and gene pool that is capable of surviving an extreme range of environmental changes - as close as you can get to a holy grail in natural selection.
Obviously its survival would not be inevitable; not even stars can survive their own age.
Possibly some simpler and more robust organism may lurk under arctic ice for a better time.
It would have piqued my interest much more at a young age if we were taught 'this is what we have found out... so far!'. There's still so much left to discover!
I wish our education system promoted this idea - "what we know about science will change in your lifetime. It's likely that in the future, most of what you're learning today will be outdated. It doesn't mean what we're teaching you is 'wrong,' but rather, it's the best of what we know now. You're about to embark on a glorious adventure of discovery that will not end at graduation. Science is about hypothesis, iteration, and testing...and the discovery of unanticipated results."
Instead, what's taught is a static version of reality: What we know today is all that is worth knowing for the rest of your life.
We haven't got it all figured out yet, but we're less wrong than we used to be.
"Naturally, the theories we now have might be considered wrong in the simplistic sense of my English Lit correspondent, but in a much truer and subtler sense, they need only be considered incomplete."
Maybe in the past. Today science is about getting grants, publishing as much as possible even if the paper is trivial or you p-hacked your way to heaven.
It's also about finding a niche where there is no competition so that you can build a name for yourself. If any sort of competition appears, quickly run away to a more fertile ground (see the recent article about DeepMind and protein folding)
Politics is a deep vein that runs through absolutely everything and it has a big impact and oftentimes it's not obvious what the impact is.
Science is as much about the politics of science as it is about an empirical quest to find the truth.
Look at the history of Bayes' theorem as a relatively good example.
Even in science ideas don't win purely on merit. They require marketing and political support too.
Maybe the comment had its wires crossed a little but still it's a valuable thing for people to keep in mind.
>Even in science ideas don't win purely on merit. They require marketing and political support too.
Ideally the world would be a meritocratic place, while still offering both sympathy and dignity to the weakest members. Does anyone disagree with that? But still, politics and cheating always weasel their way into power structures of society. And it should be seen as morally reprehensible within science.
Evolution is interesting both up and down from this level of the tree of life, too. I also loved reading "The Beak of the Finch" to learn about evolution in macroscopic metazoans.
For a while now amongst evolutionary biologists it has been clear that the "root" of the evolutionary tree, or the so-called "LUCA" (last universal common ancestor) may not exist, mostly because horizontal gene transfer is so common amongst unicellular lifeforms that it ruins the orderly notions of descent-with-modification that we have formed following Darwin's studies of higher life.
I imagine this is disheartening to most people, and the orderly notion of a "tree of life" with a simple root persists because we want an uncomplicated picture of the world; human needs for ontological clarity trump the disgusting ball of muck and sputum that is actual life.
Answering my own question: other articles suggest it is about 15 μm long.
Actually, it’s probably the closest thing to an actual alien life form that we’ve yet encountered. This is what that kind of convergent evolution would look like.
Humor aside, it is a very interesting idea to find convergent evolution far away on the tree of life to conjecture about exobiology.