
Ribosomes seem to manage just fine. :)

You just encode a long marker as a header (making sure it isn't its own reverse complement!). If you see the marker, you're reading the strand in the correct order. If you don't, you're not.

This header idea is great because then you only need to keep one strand and can toss the other, potentially quadrupling the amount of data storage (I'm assuming you can keep single strands of DNA stable).

[Left strand]
A = 00
T = 01
C = 10
G = 11

[Right strand]
T = 00
A = 01
G = 10
C = 11

Does anyone know these guys at Harvard? This might be a way to put, at most, 2800 terabytes in a gram (I don't know how long the header sequences would have to be).
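For what it's worth, here is a minimal Python sketch of the scheme described above: two bits per base, a header that marks the read direction, and either strand decoding to the same bits. The header sequence `GATTACAC` and all function names are made up for illustration; nothing here comes from the Harvard work.

```python
# Bit tables from the comment above: paired bases carry the same two bits.
LEFT = {"A": "00", "T": "01", "C": "10", "G": "11"}
RIGHT = {"T": "00", "A": "01", "G": "10", "C": "11"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
BITS_TO_LEFT = {bits: base for base, bits in LEFT.items()}

# Hypothetical header; it must not equal its own reverse complement.
HEADER = "GATTACAC"


def complement(seq):
    """Base-by-base complement, without reversing."""
    return "".join(COMPLEMENT[b] for b in seq)


def reverse_complement(seq):
    """The opposite strand, read in its own forward direction."""
    return complement(seq)[::-1]


def encode(bits):
    """Encode a bit string (even length) as a left strand with header."""
    if len(bits) % 2:
        raise ValueError("bit string must have even length")
    pairs = (bits[i:i + 2] for i in range(0, len(bits), 2))
    return HEADER + "".join(BITS_TO_LEFT[p] for p in pairs)


def decode(strand):
    """Decode either strand by looking for the header in both orientations.

    Assumption: the payload never mimics the header; a real scheme
    would have to guarantee that.
    """
    if strand.startswith(HEADER):
        table, seq = LEFT, strand
    else:
        seq = strand[::-1]  # read the right strand backwards
        if not seq.startswith(complement(HEADER)):
            raise ValueError("header not found in either orientation")
        table = RIGHT
    return "".join(table[b] for b in seq[len(HEADER):])


left = encode("0001101100")
right = reverse_complement(left)
print(left, decode(left), decode(right))
```

Both strands decode to the same bit string, so in principle you could keep whichever one you like.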

It's possible to have single-stranded DNA, but you'd have problems with error correction.

Let's say the DNA breaks, or some errors appear in the code. Thanks to the double-stranded structure it's "quite easy" to repair the code, because the intact strand serves as a template for the damaged one.
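A rough Python sketch of why the second strand helps: with both strands in hand, any position where the bases don't pair (A with T, C with G) flags damage. The function name is hypothetical, and the right strand is assumed to be already aligned base-by-base against the left one.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatched_positions(left, right):
    """Return indices where the two aligned strands fail to base-pair."""
    if len(left) != len(right):
        raise ValueError("strands must have the same length")
    return [i for i, (a, b) in enumerate(zip(left, right))
            if COMPLEMENT[a] != b]


print(mismatched_positions("ATCG", "TAGC"))  # properly paired strands
print(mismatched_positions("ATCG", "TAGG"))  # last base pair is damaged
```

Note this only locates the disagreement; deciding which strand holds the correct base needs extra information.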

Besides that, density isn't the problem right now; access speed is. The amount of data you can store in DNA is so immense that doubling the density won't give any practical improvement for decades to come, if ever.

Having said that, if I'm not mistaken, some viruses have single-stranded DNA or single-stranded RNA genomes. I'm not sure, but density might be the reason for that.

I don't think you can simply toss one of the strands. DNA is so compact because of the way it coils, and you likely lose that if you only have one strand.

I suspect that George's lab used a sensible encoding scheme. They're fairly sharp.

I can't find a copy of the original article, but it's certainly more informative than some science journalism fluff piece.

They might not have used a "coding" scheme at all, if they were interested in characterizing the frequency and types of errors.
