When you have a new technology with so many potential applications (adding 152 extra codons on top of the normal 64), it can be overwhelming to figure out where to start. Getting from "we can add letters to the DNA alphabet" to "we can build new amino acids into IL-2 to facilitate PEG binding, to stop IL-2 binding to the alpha unit of the IL-2 receptor while still binding the beta and gamma units, which will preserve anti-cancer effects while sparing vascular damage" seems like a leap that would require a cross-disciplinary team to make.
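The codon arithmetic behind "152 extra codons" is just combinatorics: a codon is three letters long, so going from a 4-letter to a 6-letter alphabet cubes out as follows (a trivial sketch):

```python
# A codon is 3 letters, so an alphabet of n bases yields n**3 codons.
natural = 4 ** 3   # A, C, G, T                 -> 64 codons
expanded = 6 ** 3  # plus two synthetic letters -> 216 codons
extra = expanded - natural

print(natural, expanded, extra)  # 64 216 152
```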
IL-2 is a molecule that is effective at treating a variety of diseases but has nasty side effects that make it a poor drug. The article discusses sometimes fatal side effects of high dose IL-2 in treating cancer; low-dose IL-2 is actually an effective treatment for autoimmune disease, although it is difficult to titrate the dose in such a way that you don't accidentally dose too high and make the autoimmunity worse.
Another startup, Delinia, engineered an agonist selective for the alpha-beta-gamma subtype of the IL-2 receptor to treat autoimmune disease (where Synthorx is hitting only the beta-gamma part of the IL-2 receptor to treat cancer). Delinia was acquired by Celgene last year for $775M, 3 months after its Series A. Synthorx went public this year and has a $450M market cap. All of that from making different versions of a single protein.
While applying it to humans directly is unlikely, using it as some sort of DRM for drugs, to prevent generics from working, is quite a realistic possibility.
There was probably a simpler two-nucleotide encoding, versus three, beforehand. About half of the amino acids use only the first two nucleotides and ignore the third.
Other blocks of four codons were split for some reason. We can imagine that Isoleucine was originally determined by AU?, so initially AUU, AUC, AUA, and AUG all encoded Isoleucine, but now only the first three encode Isoleucine and the last one encodes Methionine instead.
This is somewhat based on the blocks of four codons that follow this pattern, where the first two bases determine 16 blocks that are sometimes split (https://en.wikipedia.org/wiki/Genetic_code), and on the fact that the third base pairing in the tRNA is strange (https://en.wikipedia.org/wiki/Wobble_base_pair).
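The "blocks of four" claim is easy to check against the standard codon table. A quick sketch (using the usual NCBI table-1 layout, bases in U,C,A,G order with the first base varying slowest; '*' marks stop):

```python
# Standard genetic code as a flat string in NCBI table-1 codon order.
bases = "UCAG"
aas = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
table = dict(zip((a + b + c for a in bases for b in bases for c in bases), aas))

# Group codons by their first two bases and find the blocks of four
# where the third base is completely ignored.
uniform = [p for p in (a + b for a in bases for b in bases)
           if len({table[p + c] for c in bases}) == 1]

print(len(uniform), uniform)
# 8 of the 16 blocks are fully degenerate (Ser, Leu, Pro, Arg, Thr, Val, Ala, Gly)
```

So half the table really does ignore the third base, and the other half is split somewhere, which is the pattern the speculation is built on.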
Anyway, IIRC this is a reasonable speculation but it's not confirmed. So don't take this explanation too literally.
With this idea, the initial DNA could evolve for a few (zillion) years as a list of codons like AU? (two meaningful letters plus a "whatever" third), and then the "whatever" letters could later become meaningful too, via an almost backward-compatible code: in most cases the third letter still doesn't matter, but in a few cases it is important.
[Note: The official letter for whatever is "N" instead of "?"]
You start with a two-letter code, then something evolves that puts an (initially) rare third letter at a few locations on the tape. All the old "gear" that reads two-letter code can still read most of the tape.
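That backward-compatibility idea can be sketched as a toy decoder (the tables here are made-up miniature examples, with the real AUG split from above as the one exception): old gear reads only the first two letters, new gear adds a small exception list, and the two agree almost everywhere.

```python
# Toy model of an "almost backward compatible" code upgrade.
# Old gear: only the first two bases of a codon matter.
old_code = {"AU": "Ile", "GC": "Ala"}  # tiny illustrative subset

def decode_old(codon):
    return old_code[codon[:2]]

# New gear: same table, plus a few codons where the third base now
# matters (here just the real-world AUG -> Met split).
exceptions = {"AUG": "Met"}

def decode_new(codon):
    return exceptions.get(codon, old_code[codon[:2]])

for codon in ("AUU", "AUC", "AUA", "AUG"):
    print(codon, decode_old(codon), decode_new(codon))
# The decoders agree on everything except AUG.
```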
A translational reading frame consists of non-overlapping codons of three nucleotides. If one nucleotide is skipped, the entire downstream message is thus garbled. So how would the translational machinery operate if each codon arbitrarily consisted of two or three nucleotides?
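The frameshift problem is easy to demonstrate: delete a single nucleotide and every downstream codon changes, which is why a code with codons of varying length would need some extra mechanism to stay in frame. A minimal sketch:

```python
def codons(seq):
    # Split a sequence into non-overlapping triplets (one reading frame),
    # dropping any incomplete trailing codon.
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

seq = "AUGGCUUUUGGC"
print(codons(seq))               # ['AUG', 'GCU', 'UUU', 'GGC']
print(codons(seq[:3] + seq[4:])) # skip one base: ['AUG', 'CUU', 'UUG']
```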
I was thinking of how a hypothetical code might, in the abstract, have evolved from binary, through ternary, up to the current base 4.
I haven't got enough biochem knowledge to speculate on how two nucleotides per amino acid could evolve into three.
I've often thought some of that redundancy in the code could be a feature. Important (more sensitive) sequences could evolve to a coding that is more robust against mutations, while things that are less important could be more brittle in their encoding. This seems hard to prove though.
It also allows a particular triplet to have more neighbors, meaning you can go from one amino acid to more options without going through intermediates.
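The "more neighbors" point can be made concrete by counting which amino acids are reachable from a codon by a single point mutation (standard table assumed; codon choice here is just an example):

```python
bases = "UCAG"
aas = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
table = dict(zip((a + b + c for a in bases for b in bases for c in bases), aas))

def neighbors(codon):
    # Amino acids (or stop '*') reachable by changing exactly one base,
    # excluding the codon's own translation.
    out = set()
    for i in range(3):
        for b in bases:
            if b != codon[i]:
                out.add(table[codon[:i] + b + codon[i + 1:]])
    return out - {table[codon]}

print(sorted(neighbors("GGU")))  # one Gly codon: ['A', 'C', 'D', 'R', 'S', 'V']
```

Different synonymous codons for the same amino acid can have different neighbor sets, which is what lets a sequence tune how "reachable" other residues are.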
They happened to find an abnormal DNA/RNA base pair, it happened to be mapped by the ribosome to a PEG-bearing amino acid (I think the implication is NH2-PEG-COOH?), and they realized that if they swapped out an existing amino acid in IL-2 with that PEG-based amino acid, the protein would be less toxic?
Another thought: it might not be strictly an "RNA-base-X maps to amino-acid-x" type of operation; RNA-base-X might map to either amino-acid-x or amino-acid-y based on the concentrations of AA-x and AA-y, or even on the neighboring structure of the protein under construction, and they either just got lucky with IL-2 or figured they could put tons of PEG into a cell and get most IL-2 produced with the PEG-based amino acid?
So that leaves 43 codons, not 2.
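The 43 falls straight out of the table's redundancy: 64 codons encode only 21 distinct meanings (20 amino acids plus stop), so the rest are in principle spare. A quick check against the standard table:

```python
# Standard genetic code as a flat string (NCBI table-1 order, '*' = stop).
aas = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

meanings = set(aas)          # 20 amino acids + '*' -> 21 distinct meanings
spare = 64 - len(meanings)   # codons that could in principle be reassigned

print(len(meanings), spare)  # 21 43
```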
One factor is that different codons translate at different speeds, and these can affect how the protein folds.
Another is that base pairs in DNA may have functions or implications apart from their function as part of a codon in a gene.
Simply remapping a codon to a new amino acid and re-writing all the genes in a cell's DNA to avoid that codon will cause many things to "break" in a cell's function. Life is a messy, inter-connected system without the clean modularity we like to have in software.
1) My argument was simply that the limit on recodable codons is much higher than the two claimed in the article. That's hard to argue with. I didn't say the result in every case would be a neutral mutation.
2) The consequences of recoding the stop codons are also not neutral. For example, when Isaacs (eventually) did it, there were severe growth phenotypes in the resulting strain.
So I agree that "it's not that simple," which is frankly why I'm not hugely optimistic about this class of endeavors. But the point stands that there is no need in principle for a 4-base code.
It sounds astonishing, but I can't make heads or tails of it. Romesberg and his team managed to add new letters ("letters") to DNA encoding, and that allows them to make new proteins... how? Did they make new kinds of amino acids for these proteins, kinds that don't exist in nature? Is that not an even more astonishing achievement?
I'm way out of my depth here, but I'm also intensely curious about anything related to genetic engineering. Could someone explain this to me?
A group added a new DNA base pair ("X" and "Y") to a strain of E. coli.
In genes, codons are triplets of DNA letters that transcription and translation machinery converts into specific amino acids. The mechanisms by which this happens are complex, but well understood.
The new DNA letters were used to make new codons that could be translated to particular, novel, amino acids.
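A toy sketch of what that expansion buys you (the letters "X"/"Y" and the codon assignment below are made up for illustration; the real assignments are in the group's papers):

```python
# Hypothetical illustration: two synthetic bases enlarge the codon space,
# and some of the new codons can be assigned to non-canonical amino acids.
natural_bases = "UCAG"
all_bases = natural_bases + "XY"  # "X"/"Y" stand in for the synthetic pair

codons = [a + b + c for a in all_bases for b in all_bases for c in all_bases]
new_codons = [c for c in codons if "X" in c or "Y" in c]

print(len(codons), len(new_codons))  # 216 216-64 = 152 codons are new

# Made-up example assignment: one new codon -> one non-canonical amino acid.
expanded_table = {"AXC": "non-canonical amino acid (hypothetical)"}
```

Every codon containing at least one synthetic letter is new, so none of them collide with the existing code; that's why the old 64-codon machinery keeps working alongside the additions.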
Novel DNA and proteins - after billions of years of natural CI... we'd better be sure to have a damn good incident response capability! But do we?
> Adding new DNA letters make
> Adding make