

Researchers Store 90GB of Data in 1g of Bacteria  - dfield
http://www.electronista.com/articles/10/12/22/chinese.researchers.store.data.in.bacteria

======
vehementi
They didn't actually store it; they "showed how to encrypt it" and
extrapolated that they should be able to store that much...

~~~
bhickey
Spot on. The article is garbage.

The Declaration of Independence is about 8000 bytes; without parity, at 2 bits
per base, you could encode the whole thing in about 32,000 bases. With a little
more care you could avoid
generating open reading frames (ensure that the sequence 'ATG' never appears).
There is nothing novel or exciting about storing this amount of data in
bacteria. If you grow multiple copies of it, you haven't stored 90 GB of data;
you've simply made copies of your original 8 KB.
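
As a rough check on the arithmetic, here is a minimal sketch of a plain 2-bits-per-base encoding. The mapping A=00, C=01, G=10, T=11 and the names `encode`, `decode`, and `has_start_codon` are assumptions for illustration, not anything taken from the article. Under that mapping each byte costs 4 bases, so 8000 bytes comes to 32,000 bases before any parity.

```python
# Hypothetical mapping: A=00, C=01, G=10, T=11 (an assumption, not from the article).
BASES = "ACGT"

def encode(data: bytes) -> str:
    """Encode bytes as DNA bases, 2 bits per base, most significant bits first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def decode(seq: str) -> bytes:
    """Invert encode(); the sequence length must be a multiple of 4."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

def has_start_codon(seq: str) -> bool:
    """True if 'ATG' appears anywhere in the sequence, in any frame."""
    return "ATG" in seq

if __name__ == "__main__":
    payload = bytes(8000)  # stand-in for ~8000 bytes of text
    print(len(encode(payload)))  # number of bases needed
```

A real scheme would also have to remap or pad the data so that `has_start_codon` comes back false, which this sketch only detects and does not fix.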

------
wgrover
"an expensive sequencer is needed to retrieve data"

Given how long it still takes to sequence an individual's genome ("only" 3
gigabases), this may not be practical any time soon.

Anyway, packing astronomical amounts of data into a really small space using
DNA isn't that hard: say you synthesize short pieces of DNA, each 34 bases long,
randomly selecting A, C, T or G for each base (easy to do with a commercial
DNA synthesizer). If you make about a millimole (on the order of 10^20
molecules) of this DNA,
odds are that each piece of DNA has a unique sequence. You could have your own
little jar containing 10^20 unique molecules. If the presence of a particular
piece of DNA represents a TRUE for a given bit, you have about 10 exabytes of
data storage in a jar.
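
The numbers above can be sanity-checked with a short sketch. It assumes one bit per molecule (presence = TRUE), decimal exabytes (10^18 bytes), and a Poisson approximation for the chance that a given random molecule has no duplicate in the jar; the function names are made up for illustration.

```python
import math

SEQ_LEN = 34           # bases per synthesized piece (from the comment)
N_MOLECULES = 10**20   # molecules in the jar (from the comment)

def sequence_space(length: int) -> int:
    """Number of distinct sequences of the given length over A, C, G, T."""
    return 4 ** length

def bits_to_exabytes(bits: int) -> float:
    """Convert a bit count to decimal exabytes (10^18 bytes)."""
    return bits / 8 / 1e18

def prob_molecule_unique(n_molecules: int, space: int) -> float:
    """Approximate chance that one random molecule matches none of the others.

    Poisson approximation (an assumption): exp(-(n - 1) / space).
    """
    return math.exp(-(n_molecules - 1) / space)

if __name__ == "__main__":
    space = sequence_space(SEQ_LEN)
    print(f"possible sequences: {space:.3e}")
    print(f"capacity at one bit per molecule: {bits_to_exabytes(N_MOLECULES)} EB")
    print(f"P(a given molecule is unique): {prob_molecule_unique(N_MOLECULES, space):.2f}")
```

With 34 bases the sequence space is 4^34 = 2^68, about 3 x 10^20, so a given molecule is more likely than not to be unique, but a fair number of duplicates should still be expected.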

Now, _operating on_ that data is another story...

~~~
bhickey
It doesn't take particularly long to sequence a human genome. If you want long
reads, you need to spend more time sitting on the machine. Last I checked, the
limiting steps were contention for machine time and library prep.

If you're doing a custom library prep it can be very time consuming, but I
believe the trend is moving toward whole exome sequencing. We should see
sub-$5k exomes (at scale) in 2011.

~~~
wgrover
You're definitely right. I just meant that, until you can sequence a genome
faster and cheaper than you can read 90 GB off of a hard drive, storing
information in DNA and reading it out via sequencing may not be feasible.

------
ghshephard
I'm looking forward to the follow-up post where Researchers restore 90 GB of
Data from 1g of Bacteria.

------
gregable
They stored a few KB of data replicated millions of times. That's very
different.

