Hacker News
Reverse Engineering the Source Code of the BioNTech/Pfizer SARS-CoV-2 Vaccine (berthub.eu)
549 points by ahubert on Dec 25, 2020 | 60 comments



My knowledge of modern genetic engineering is limited to some popsci literature, so I understand only parts of this story. I was surprised by this part though:

"Over many years of experimentation, it was found that if the U in RNA is replaced by a slightly modified molecule, our immune system loses interest. For real. [...] The really clever bit is that although this replacement Ψ placates (calms) our immune system, it is accepted as a normal U by relevant parts of the cell."

This sounds like a serious backdoor or attack vector to me. Would there be any risk in introducing this Ψ, allowing some naturally occurring permutations to turn wild or dangerous?


I'm not an expert in this issue. But the Wikipedia article has the following statement:

"There are 11 pseudouridines in the Escherichia coli rRNA, 30 in yeast cytoplasmic rRNA and a single modification in mitochondrial 21S rRNA and about 100 pseudouridines in human rRNA indicating that the extent of pseudouridylation increases with the complexity of an organism."

So, first of all, it shows that pseudouridine is not something completely new in nature. It has existed for a long time already. And it seems that a higher amount signals to the immune system that this RNA is more human-like and not coming from a virus. Since this mechanism seems to be established in nature, I think one can assume it is safe. If it weren't safe, viruses that make use of this "backdoor" would probably already exist.

https://en.wikipedia.org/wiki/Pseudouridine


Thank you! I came here to ask the same question.

Now that we are producing and injecting industrial quantities of Ψ, any chance this will get incorporated in actual viruses or maybe teach the immune system not to ignore it (or maybe even go crazy about it)?


They're injecting already-made Ψ, not any of the genetic instructions that would enable cells/viruses to make their own Ψ.

But the broader concern about advances in biotech being potentially very dangerous is, I think, a very good one.


I think there's no risk since cells can't make Ψ, and probably Ψ is not super durable as a chemical.


> cells can't make Ψ

Pseudouridine is the most common RNA modification in cells, actually. They produce it all the time.

> and probably Ψ is not super durable as a chemical.

Quite the opposite. Pseudouridine usually increases the stability of the RNA molecule it modifies.


Taken from the article (probably after an update):

Many people have asked, could viruses also use the Ψ technique to beat our immune systems? In short, this is extremely unlikely. Life simply does not have the machinery to build 1-methyl-3’-pseudouridylyl nucleotides. Viruses rely on the machinery of life to reproduce themselves, and this facility is simply not there. The mRNA vaccines quickly degrade in the human body, and there is no possibility of the Ψ-modified RNA replicating with the Ψ still in there. “No, Really, mRNA Vaccines Are Not Going To Affect Your DNA” is also a good read.


I should have cleared that up. Normally Ψ would stand for pseudouridine, but the vaccine has a further modified base, 1-methylpseudouridine.


Cells might not, but people can...


The article directly addresses that question, saying it's highly unlikely because the viruses don't have the machinery to manufacture it.


Is there a book or textbook that approaches genetics from a source-code hacking point of view? It's quite interesting and I'd like to learn more


Synthetic Biology: A Primer

Wetware: A Computer in Every Living Cell

(the second for systems biology)


In the footer of the blog post 'BioNTech/Pfizer SARS-CoV-2 Vaccine', the author links to many of his previous blog posts, talks, and videos that do just that.

Key Sentence

'In addition, I’ve been maintaining a page on ‘DNA for programmers’ since 2001.'

https://berthub.eu/amazing-dna


Nice article. There are more interesting things to say about RNA that make it a bit less like code. For example, it too can fold into a 3D structure, and some cases can cause problems. Hairpins are one: where the sequence is palindrome-like in the complementary letters (ACG followed by CGU, say), it can fold onto itself. This caused the problems with the initial batch of the CDC's PCR tests.
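
To make the hairpin point concrete, here is a minimal Python sketch. The function names are made up for illustration, and it only checks plain Watson-Crick pairing; real folding prediction also depends on loop length and thermodynamics, which this ignores.

```python
# Watson-Crick pairing partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that pairs with `rna` when the strand folds back."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def can_form_stem(left: str, right: str) -> bool:
    """True if `right` can pair base-by-base with `left` in a hairpin stem."""
    return right == reverse_complement(left)


print(reverse_complement("ACG"))    # CGU
print(can_form_stem("ACG", "CGU"))  # True
```

The second stretch has to be the reverse complement because the strand runs back alongside itself in the opposite direction when it folds.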


And on top of that RNA can exhibit non-canonical base pairing like G-U. [1]

[1]: https://commons.wikimedia.org/wiki/File:Wobble.svg
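
A small Python sketch of what allowing the G-U wobble pair changes, assuming a hypothetical helper `can_pair` that only looks at which letters may sit opposite each other (no energies, no loops):

```python
# Canonical Watson-Crick pairs plus the non-canonical G-U wobble pair.
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("C", "G"), ("G", "C"),
    ("G", "U"), ("U", "G"),  # wobble
}


def can_pair(left: str, right: str) -> bool:
    """True if `right` can pair antiparallel with `left`, allowing G-U wobble."""
    if len(left) != len(right):
        return False
    return all((a, b) in PAIRS for a, b in zip(left, reversed(right)))


print(can_pair("ACG", "CGU"))  # True: canonical pairs only
print(can_pair("ACG", "UGU"))  # True: the final G pairs with U via wobble
print(can_pair("ACG", "CGA"))  # False: A cannot pair with A
```

With wobble allowed, more sequences count as potential stems, which is one reason avoiding unwanted folds is harder than a strict reverse-complement check suggests.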


As I understood it from the "universal table for mapping RNA codons to amino acids", one can encode the same information differently. So is it possible to just define the information you want to encode and use a computer to calculate an encoding in which things don't get tangled up?


Yes. Codon optimization is a routine procedure when designing genes and is mentioned in the article. Methionine and tryptophan, however, cannot be optimized since they only have one codon each.
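
As a rough illustration of the freedom codon optimization has to work with, this Python sketch builds the standard genetic code and groups codons by amino acid. The variable names are my own, and it only enumerates synonymous codons; it is not an optimizer.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code with codons enumerated in UCAG order; '*' is a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to its one-letter amino acid code.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group synonymous codons by the amino acid they encode.
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMS[aa].append(codon)

# Amino acids with exactly one codon leave no room for optimization.
single_codon = sorted(
    aa for aa, codons in SYNONYMS.items() if len(codons) == 1 and aa != "*"
)

print(single_codon)   # ['M', 'W']
print(SYNONYMS["P"])  # ['CCU', 'CCC', 'CCA', 'CCG']
```

Methionine (M) and tryptophan (W) are the only entries with a single codon, while proline (P) can be written four ways without changing the protein.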


Sounds like a new field of software engineering in the near future. And I like how the author casually mentions that there's an unexplained "linker" code in the AAA sequence that controls the number of copies to be made, with an implicit assumption that everyone's acting in a good faith. Frankly, if I noticed a strange 20 byte sequence inside the tail of zeros in an executable file, I'd instantly assume it's opcode instructions, probably a function, that gets control by a "jmp" instruction hidden somewhere in the main code.


I can't say with certainty, but I learned the DNA printer makes shorter sequences that then self-assemble into the final strand. So if you have too many repeating AAA's, you can't control if there's 20 or 100 because new AAA sequences will keep latching on to the end of the strand.

So, you break it up every so often with some other nucleotides.


Correct. They are pattern builders, and in repetitive sequences they have trouble terminating, similar to recursion without an end case.


Can anybody explain why/how the resulting protein is recognized as foreign by the body? It is built using the same components and mechanisms as every other one, right? The body can't have a dictionary of 400,000 known proteins, can it?


The full response here requires a deep dive into the immune system and self-tolerance, but basically, yeah, the immune system learns what is ‘self’ and what is ‘else’. And this is specific to a person’s actual body (i.e. it is the reason someone develops organ rejection after a transplant, or can develop graft-versus-host disease after a stem cell transplant, where the new immune system attacks the body because it isn’t ‘self’).

It’s really an amazing system


One of the best reads I had in months. Please do more like these.


Thanks :-) On https://berthub.eu/articles/ there is more where this came from.


The article reads very similar to articles about reverse engineering machine code in undocumented, obfuscated, or malicious software.


"life is a 4 billion year old software project" :-)


„written by interns who kept changing random stuff until bugs disappeared”


> Over many years of experimentation, it was found that if the U in RNA is replaced by a slightly modified molecule, our immune system loses interest. For real.

Countdown to see this in vivo. For real.


The author may have added this in after publishing:

> Many people have asked, could viruses also use the Ψ technique to beat our immune systems? In short, this is extremely unlikely. Life simply does not have the machinery to build 1-methyl-3’-pseudouridylyl nucleotides. Viruses rely on the machinery of life to reproduce themselves, and this facility is simply not there. The mRNA vaccines quickly degrade in the human body, and there is no possibility of the Ψ-modified RNA replicating with the Ψ still in there.


maybe there's a reason why it can't be exploited easily in nature? It looks too obvious.


Great post!

A non-technical detail that the post misses is the human story here. Katalin Karikó, the biochemist who pioneered the idea of delivering vaccines in mRNA form, got nothing but rejections for her grant applications for this very idea and was eventually demoted from the tenure track at UPenn.

Whatever you think of this, it's a misleading understatement to describe it as:

> As with other fundamental scientific research we are now reaping the benefits of, the discoverers of this technique had to fight to get their work funded and then accepted.

From the article[0] the post itself cites:

> By 1995, after six years on the faculty at the University of Pennsylvania, Karikó got demoted. She had been on the path to full professorship, but with no money coming in to support her work on mRNA, her bosses saw no point in pressing on.

[0]: https://www.statnews.com/2020/11/10/the-story-of-mrna-how-a-...


There is a common thread in many discussions (esp. on Reddit) creating a false dichotomy between the greedy private sector and the taxpayer-funded basic research that comes with all the important new ideas.

While not an entirely false picture, the real situation is far from that clear, and the scientific struggles of Katalin Karikó, who finally left academia for the private sector (she is now a vice president at BioNTech, one of the vaccine-producing corporations), are an illustration of the perils of contemporary grant systems.

Truly revolutionary concepts are often indistinguishable from bullshit, at least from the grant committee's point of view. Your best chance to snap up a grant is to come with a project of marginal improvement that produces one or two papers in a reliable timeframe.

Marginal improvements have their indisputable value, but they mostly appeal to risk-averse people; whoever wants to work on something really outlandish, must rely on other sources of financing, often private. After all, there is a risk of utter failure = not producing even that one paper that is, these days, a basic unit of wealth in the Publish-or-Perish world. Or of a delay that breaks the original time plan.


>There is a common thread in many discussions (esp. on Reddit) creating a false dichotomy between the greedy private sector and the taxpayer-funded basic research that comes with all the important new ideas.

Well, in the modern world we have both a greedy private sector and a greedy academia...


>Whatever you think of this, it's a misleading understatement to describe it as:

It might be an "understatement" but it's hardly misleading. You make it sound like the author intentionally dissed them by downplaying the story.

The technique eventually did end up funded and accepted. And she still made millions (from the license), will get a Nobel soon (I predict; there's already pressure for that), plus her story will 99% be made into a movie sometime in the next 20-30 years (I also predict).


Great explanation and wonderful read! Kudos ahubert @PowerDNS_Bert


This was a compelling and inspiring read, thanks.

The way the CS and genetics concepts map with each other is really fascinating and thought-provoking.

How would a curious engineer go about learning more about this particular "computer architecture"?

Going further, is there a realistic career path for someone with a software engineering background to retrain, pivot and meaningfully contribute to research in this area?

By meaningful I don't just mean the obvious role of building software tools for genetic engineering & research, but actually writing those RNA bits and designing vaccines or other similar items.

Would you need to basically start from scratch as a biology or medical undergrad before working your way up to a PhD, or are there interesting jobs at the crossroads where a previous career as an engineer might be an asset?


So... it is not easy to get into this field. From the outside we tend to see DNA as some freestanding thing that fascinates us. The biology people intertwine DNA with everything they do. The route to knowing about this stuff therefore goes via learning about biology first. One of my favorite books, Molecular Biology of the Cell, covers DNA thoroughly, by spending time on it on almost every one of its 1300 pages. But it is all in between "the rest of biology". In a way I can understand this of course, but it is very, very hard work.


Hi Bert, what do you think of http://rosalind.info/problems/list-view/ ?


I watched this cool youtube video by Thought Emporium where he makes custom spider silk: https://www.youtube.com/watch?v=2hf9yN-oBV4

Just seems like he reads the manufacturer's instructions!


I really enjoyed the article, thanks for posting it.

For those with the knowledge, I have the following question: how long, in days (or hours?), does it take for this mRNA to degrade and stop functioning, restoring the normal working of the host cells?


Richard Feynman once said:

Poets say science takes away from the beauty of the stars - mere globs of gas atoms. I too can see the stars on a desert night, and feel them. But do I see less or more?

This article was fascinating to read. The deep understanding in genetics, biology, engineering, etc..., and all the work from thousands of people that ultimately culminated in the production of this vaccine, illustrates how beautiful science is.


This was a really amazing and illustrative article. As a programmer, it definitely gave me a better understanding of the vaccine (or at least I believe so).

I was curious about a part that I haven't found a lot of writing on so far - namely, what happens with the artificial spike protein after the cell assembled it.

Does it, like, keep floating around inside the cell? Does it attach to the cell membrane like it would to the virion? Does it leave the cell and enter the bloodstream, or does it do something altogether different?

I figure it can't really stay inside the cell as it has to interact with the immune system at some point.

However, if it attaches to the cell membrane, wouldn't this cause a risk that the immune system considers the whole cell a pathogen and attacks it? After all, teaching the immune system to destroy anything the spike is attached to is sort of the whole point of the exercise.

Finally, if the spike enters the bloodstream, could it wreak havoc with ACE2 receptors?


Fascinating article.

Any suggestions on where I could go to learn more specifically about this sort of genetic engineering, and more generally about biology (university level and above)?


I used:

- Vincent Racaniello's virology iTunesU course (viruses exploit all biology systems to hack the body)
- introduction to Genetics on inkling

Many people also recommend the book „Cell”.

You will also want to start with the first few chapters of Organic Chemistry from any course/book.


The article is easy to follow and is an enjoyable read!


Thanks! If anyone has questions, I am here for them!


Kudos for the fascinating article. Since you mentioned security, and the well known trick of sending an ambiguous message that is interpreted differently by different interpreters, I wonder what you might think of enlisting the hackers of the world to help in viral research. Seems as though adversarial thinking skills could come in handy?


I definitely think the computing / information security mindset is relevant and applicable to computational biology. There are sadly structural reasons why the crossover is very hard. I spent 18 months at a university doing this kind of research but it is not a natural fit. Also, I do have to tell you, I spent 10 years studying this stuff before I became useful to the field. So you don't just jump in there :-)


Thank you for the thoughtful reply. You read my mind.

I would be happy to devote 10 years (or a lifetime for that matter) to bring the fields to bear on one another, sounds super interesting.

As others have already noted, most work at the "intersection" of the domains is currently superficial, say designing support software as opposed to actually hacking on the genetic code.

Could you go into more detail about the structural reasons that prevent a clean mapping? Can you think of ways to lower the barrier so as to engage more people? Do you see problems that are relatively easy to export?

An example of exporting problems, in this case from digital pathology, is the Camelyon challenge, which brought machine learning to bear on cancer detection. Hype aside, the important aspect is that researchers working to segment images and train neural networks did not have to have a biochemistry background, nor understand cellular functions and provenance, nor understand staining protocols. A clean export.

I am basically looking for a "biocrackme" that is relatively self contained.


Has anyone come up with an answer for the question that you posed mid-text?

The question about the one base change that did _not_ lead to an additional C or G, the CCA -> CCU modification?

(Note to myself: I think the question is: Why CCA -> CCU and not CCA -> CCG)
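
For reference, a tiny Python sketch of the G/C arithmetic behind that question. It relies on the fact that CCU, CCC, CCA and CCG all encode proline in the standard genetic code; the helper name `gc_count` is made up, and this does not answer why CCU was chosen.

```python
def gc_count(codon: str) -> int:
    """Count the G and C bases in a codon."""
    return sum(base in "GC" for base in codon)


# All four codons below encode proline in the standard genetic code.
PROLINE_CODONS = ["CCU", "CCC", "CCA", "CCG"]

for codon in PROLINE_CODONS:
    print(codon, gc_count(codon))
# CCU 2
# CCC 3
# CCA 2
# CCG 3
```

So CCA -> CCU keeps the G/C count at 2, whereas CCA -> CCG (or CCC) would have raised it to 3 while still encoding the same amino acid.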


Would be curious what the differences in the Moderna vaccine are (at this level - I know the lipids are different).


yeah, I'd love to know as well. Both Moderna and BioNTech licensed technology from University of Pennsylvania, but I don't know if that is only the 1-methyl-3’-pseudouridylyl bit. If I find out I'll update the post.


Love it. The explanation is amazing and strikes the right balance when it comes to introducing the things needed by the uninitiated to grok the gist of the vaccine.

How far we have come as a species, judging by what this vaccine is, is both inspiring and scary.


Amazing read. It leaves me wondering how we make sure the sequences injected with the vaccine are exactly the same sequences that are described. Do they have some kind of verification method to ensure that there were no encoding errors and that they won't inject some random sequences? Also, what's the risk of these erroneous sequences being harmful?

The futurologist in me imagines prescriptions containing RNA sequences which you take to the nearest pharmacy to 3D print the drug. The future of medicine looks more exciting than ever.


Since the sequence is available, does this mean I could pool some money with some friends and buy a DNA synthesizer and start printing my own vaccine?


You could get a long way, but it would be extremely dangerous. The real problem is making the vaccine... and ONLY the vaccine :-) And not injecting random bacterial stuff that you used along the way.


What kind of “license” applies to this “source code”?


Patents


This is so cool!


Fascinating & inspiring that so many innovative techniques have gone into these vaccines. I wonder if any are patented?



