
Microsoft, UW demonstrate first fully automated DNA data storage - myinnerbanjo
https://news.microsoft.com/innovation-stories/hello-data-dna-storage/
======
iso1337
From the paper:

“Our system’s write-to-read latency is approximately 21 h. The majority of
this time is taken by synthesis, viz., approximately 305 s per base, or 8.4 h
to synthesize a 99-mer payload and 12 h to cleave and deprotect the
oligonucleotides at room temperature. After synthesis, preparation takes an
additional 30 min, and nanopore reading and online decoding take 6 min.”
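
The quoted figures can be cross-checked with a few lines of arithmetic. This is only a sketch of the sum; the constant names are made up for illustration and the values are the ones quoted above.

```python
# Figures quoted from the paper; the variable names are illustrative only.
SECONDS_PER_BASE = 305      # synthesis time per base
PAYLOAD_BASES = 99          # length of the 99-mer payload
CLEAVE_DEPROTECT_H = 12.0   # cleave and deprotect at room temperature
PREPARATION_H = 0.5         # 30 min of preparation
READ_DECODE_H = 0.1         # 6 min of nanopore reading and decoding

# Synthesis time in hours, then the full write-to-read latency.
synthesis_h = SECONDS_PER_BASE * PAYLOAD_BASES / 3600
total_h = synthesis_h + CLEAVE_DEPROTECT_H + PREPARATION_H + READ_DECODE_H

print(f"synthesis: {synthesis_h:.1f} h")
print(f"write-to-read: {total_h:.1f} h")
```

Synthesis plus cleavage accounts for almost all of the roughly 21 h; reading and decoding are a rounding error by comparison.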

Also the amount of data that can be written is very small in general for
oligosynthesis. The highest-throughput methods are microarrays by Agilent
and others (not used in this paper). You can buy about 32 megabits for $6000
([http://www.customarrayinc.com/oligos_main.htm](http://www.customarrayinc.com/oligos_main.htm)).
The actual cost of synthesis is maybe $2000 as a guess.
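
To put that price in more familiar units, here is a rough conversion. It assumes the 32 megabits and $6000 figures above and decimal units (1 GB = 1000 MB); the names are just for illustration.

```python
# List price and capacity as quoted above; decimal units are an assumption.
PRICE_USD = 6000
CAPACITY_MEGABITS = 32

usd_per_megabit = PRICE_USD / CAPACITY_MEGABITS
usd_per_megabyte = usd_per_megabit * 8        # 8 bits per byte
usd_per_gigabyte = usd_per_megabyte * 1000    # 1000 MB per GB

print(f"${usd_per_megabit:,.2f} per megabit")
print(f"${usd_per_megabyte:,.2f} per megabyte")
print(f"${usd_per_gigabyte:,.0f} per gigabyte")
```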

So currently DNA storage would be good for small datasets that you would like
to store for a long time. Physical density is useless most of the time if you
can’t efficiently generate a lot of data to begin with.

We would need several orders of magnitude increase in the write capacity,
which is slowly being worked on. Typically people would like to compare
synthesis costs with Moore’s law, using Carlson curves (
[http://www.synthesis.cc/synthesis/2016/03/on_dna_and_transis...](http://www.synthesis.cc/synthesis/2016/03/on_dna_and_transistors)
).

However, there hasn’t been as much progress in synthesis as in sequencing. Why?
My theory is that there’s not a very big market for DNA synthesis, so the big
investments needed haven’t really been there. Maybe storage on DNA could be
that market, but it would need to show quick and easy wins (e.g. stepping stones
of practicality like what early integrated circuits had).

~~~
toufka
From a biology perspective, the value in DNA synthesis has plateaus. (I’m
curious about analogies to other commodities?)

If you can build small runs (<50 base pairs) you can make small mutations to
DNA, or read particular sections of DNA. If you can make DNA larger than the
average protein (~2000 bp) you can invent new proteins from scratch rather than
modify existing protein sequences. If you can make DNA longer than a plasmid
(~10,000 bp) you can creatively invent a minimal viable replicable and
deliverable unit (a plasmid (bacterial virus)). If you can do millions of base
pairs you get to chromosomes and can invent eukaryote-transmissible storage.

But until you leap those plateaus, you can likely saturate the intervening
market. So even if there’s massive pent up theoretical demand for the
wholesale invention of genes, there’s no way to really demonstrate it in the
current market.

This inability to estimate demand may make it a tricky spot to invest in.

~~~
waynecochran
Can the replication process that occurs when a cell divides be used for data
copying? It seems like this, if doable, would be a huge boon for DNA storage.

~~~
iso1337
It’s already used: look up the polymerase chain reaction (PCR).

Copying isn’t as interesting as de novo synthesis, being able to specify an
arbitrary sequence and have it synthesized.

------
itchyjunk
I find a few different articles about digital data storage in DNA but they
don't seem to tell me why more data can be stored in DNA than in a classical
medium. Maybe I am not phrasing the question right.

"DNA can store digital information in a space that is orders of magnitude
smaller than datacenters use today."

Would this DNA also exist in the same conditions as the data center or does it
need more things?

~~~
mbreese
Not only is DNA chemically stable, long-lasting, and internally redundant, it
is also remarkably compact. DNA is an evolutionarily optimal data storage
medium.

To put it into context, each cell in your body contains 6 billion base pairs
of data (two copies of your genome). Each base is one of 4 bits, so that’s
4^6000000000 of data in each cell. Your body has ~37 trillion cells [1]. A
person is about the size of a rack (well, maybe 10-15U by volume), so that’s
3.7e13 * 4^6e9 bits per rack.

A petabyte is 8e15 bits.

That’s a lot of data storage capacity in a small space. Moreover, there is the
potential for introducing more synthetic bases to increase the 4 to 6.

[1] [https://www.ncbi.nlm.nih.gov/m/pubmed/23829164/](https://www.ncbi.nlm.nih.gov/m/pubmed/23829164/)

~~~
hn_throwaway_99
Minor correction: each base pair is one of 4 _values_ (A, T, G, C), so each
base pair is equivalent to 2 bits of data, which gives a number half of what
you quoted.

~~~
mbreese
Yes, I misworded it and gave the wrong numbers. I gave the number of possible
combinations in a genome, which is 4^6e9. That’s obviously not the number to
compare (and a slightly embarrassing mistake).

With 6 gigabases per cell and 2 bits per base, the storage capacity is 12
gigabits (1.5 gigabytes) per cell. And with 37 trillion cells (3.72e13) in a
human body, that’s 3.72e13 cells * 1.5e9 bytes per cell, which is 5.58e22 bytes
per person or about 56 zettabytes. This seems like a more reasonable number.
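
For anyone who wants to check the corrected arithmetic, here is a small sketch using the same inputs (6e9 bases per cell, 2 bits per base, 3.72e13 cells). The names are illustrative, and this is raw capacity that counts every cell's copy of the genome separately.

```python
# Inputs from the comment above; constant names are illustrative only.
BASES_PER_CELL = 6e9      # two copies of the genome
BITS_PER_BASE = 2         # four possible bases -> log2(4) = 2 bits
CELLS_PER_BODY = 3.72e13  # ~37 trillion cells

bytes_per_cell = BASES_PER_CELL * BITS_PER_BASE / 8
bytes_per_body = bytes_per_cell * CELLS_PER_BODY
zettabytes = bytes_per_body / 1e21   # 1 ZB = 1e21 bytes

print(f"{bytes_per_cell:.2e} bytes per cell")
print(f"{bytes_per_body:.2e} bytes per body")
print(f"{zettabytes:.1f} ZB per body")
```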

------
mikerg87
>The team from the Molecular Information Systems Lab has already demonstrated
that it can store cat photographs, great literary works

The meme is true

------
sidcool
What are the potential applications of this tech?

------
KallDrexx
Does anyone know how they ensure read order? Since it's a fluid that's moving
around I'm having a hard time grasping how they make sure they read the DNA in
the same order it was written in.

~~~
mbreese
You wouldn’t. The read order would be encoded in the “file format”. Kind of
like a Unicode byte order mark.

You’d have to deconvolute the signal first, then computationally determine the
strand you actually read.
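
A toy sketch of the idea, assuming a made-up format in which every strand starts with a fixed-width index written in bases, followed by the payload at 2 bits per base. This is not the scheme from the paper: the function names and the `INDEX_BASES` width are hypothetical, and a real system would add error correction and avoid problematic sequences.

```python
import random

BASES = "ACGT"       # each base carries 2 bits
INDEX_BASES = 4      # hypothetical index width: addresses 4**4 = 256 strands

def to_bases(value, width):
    """Write a non-negative integer as `width` base-4 digits."""
    digits = []
    for _ in range(width):
        digits.append(BASES[value % 4])
        value //= 4
    return "".join(reversed(digits))

def from_bases(seq):
    """Inverse of to_bases."""
    value = 0
    for base in seq:
        value = value * 4 + BASES.index(base)
    return value

def encode(data, chunk_size=4):
    """Split bytes into chunks; prefix each strand with its chunk index."""
    strands = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        payload = "".join(to_bases(byte, 4) for byte in chunk)
        strands.append(to_bases(start // chunk_size, INDEX_BASES) + payload)
    return strands

def decode(strands):
    """Sort strands by their index prefix, then unpack the payloads."""
    ordered = sorted(strands, key=lambda s: from_bases(s[:INDEX_BASES]))
    out = bytearray()
    for strand in ordered:
        payload = strand[INDEX_BASES:]
        for i in range(0, len(payload), 4):
            out.append(from_bases(payload[i:i + 4]))
    return bytes(out)

message = b"hello, DNA storage"
pool = encode(message)
random.shuffle(pool)             # strands float around in no particular order
recovered = decode(pool)
print(recovered == message)
```

Because the index travels inside each strand, the order in which strands come off the reader does not matter.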

------
bookofjoe
[https://www.nature.com/articles/s41598-019-41228-8](https://www.nature.com/articles/s41598-019-41228-8)

[https://www.nature.com/articles/s41598-019-41228-8.epdf?shar...](https://www.nature.com/articles/s41598-019-41228-8.epdf?shared_access_token=hKl-Uc-bfmBPLILRELu0JtRgN0jAjWel9jnR3ZoTv0NFgcYtKWpJFonOk61x1LvZXKBaDAXo0NDCb3HBoSQVUQ0TgSFJhLJ4T56BwcRtBONZHQRuFWlVpywak1IE4YFa4PeX-odE_VJn8jnt5psfSA%3D%3D)

------
jasonhansel
Wouldn't any other copolymer work equally well for this purpose? (I know
nothing about chemistry or biology, so this is probably a stupid question.)

~~~
mxwsn
In principle sure, as long as you use a different polymer with similar
biochemical stability, but a lot of money and tech development has gone into
reading and synthesizing DNA in particular. Why reinvent the wheel, so to
speak?

~~~
xvilka
In fact it might be even better with DNA, since you can use a much bigger
"alphabet" for this purpose than nature gave us. That would tremendously
increase storage capacity.

~~~
iso1337
There are already xenobases being developed, e.g. by Romesberg. Those add extra
bases beyond the 4-5 seen in nature.
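
For a sense of scale, the information per position grows with the logarithm of the alphabet size. A minimal sketch, assuming every base is equally usable; the six-letter case is the figure mentioned elsewhere in this thread, and the eight-letter case is purely hypothetical.

```python
import math

def bits_per_base(alphabet_size):
    """Maximum information per position for an alphabet of the given size."""
    return math.log2(alphabet_size)

natural = bits_per_base(4)
for size in (4, 6, 8):   # 8 is a hypothetical larger alphabet
    bits = bits_per_base(size)
    gain = (bits / natural - 1) * 100
    print(f"{size} bases: {bits:.2f} bits per base ({gain:.0f}% more than 4)")
```

Under these assumptions the gain per base is a constant factor (about 29% for six letters) rather than orders of magnitude.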

~~~
xvilka
I know and was referring to them.

------
amelius
I hope they write the data after performing encryption, so hackers can't find
biological exploits should they exist.

------
social_quotient
Is it feasible (in the future) for the human body to host these DNA storage
media/devices?

~~~
shpongled
I wouldn't put exogenous nonsense DNA into my body. That's just asking for a
bad time.

~~~
gdy
Computer viruses are about to converge with the biological ones.

------
vcdimension
This could bring a whole new meaning to the words "computer virus".

------
bsaul
There are so many science fiction novels to write using this as a starting
point...

~~~
harmful_stereo
Yeah, like owing "blood rent" to the descendants of proprietary works that
pass their copyrighted material to their offspring. Or land rights, or special
social privilege. Or the answer keys for professional and entrance
examinations. Or classified data. Suddenly you have to have your genome
sequenced to walk through the airport and documented status is conferred by
blood.

I'm not being a luddite here on purpose, but over long time scales there's a
tremendous potential for this kind of technology to push towards a kind of
class differentiated society in the way most of us would despise.

Some technologies are leveling, like roads or mass transit or vaccines or
industrially produced consumables. I don't see public institutions putting
libraries in the seeds of apple trees as a civilizational fail safe, whether
that's centrally planned economies or democracies. But maybe you could get an
ethnostate like Israel to include the Talmud in your cells or your microbiome
when you settle on occupied land. Best case scenario with body horror is that
it becomes like tattoos. I await the forthcoming Atwood book with that
slightly alarmist slant.

~~~
pm90
We were supposed to have annihilated all of human life in a nuclear firestorm.
But somehow we did manage to survive. Sure, the Great Firewall and China's
potent surveillance system do alarm me greatly. But as long as the people
who create these technologies also work in a society which places responsible
limitations on their use, we might just be able to get to being an
interplanetary species at some point.

Once humanity becomes capable of living and thriving on different planets or
in outer space without the mother planet, that is when things will start
getting really interesting.

~~~
hodgesrm
I would not rule out nuclear catastrophe just yet. The technology has only
been available for 75 years. So far most of the weapons have been in the hands
of nations with stable command and control structures. That may not be the
case in the next few decades.

