You just encode a big marker as a header (making sure it isn't its own reverse complement, i.e. a palindromic sequence!). If you read the marker as written, you're reading in the correct order. If you don't, you're reading the other strand.
Reading one strand:
A = 00
T = 01
C = 10
G = 11
Reading the complementary strand:
T = 00
A = 01
G = 10
C = 11
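Here's a rough Python sketch of how I read the scheme above, in case it helps. The function names, the `GATTACA` marker, and the assumption that the other strand gets read in the opposite direction (so its marker shows up reverse-complemented at the end of the read) are all mine, not anything from the Harvard work.

```python
# The two lookup tables from the post: one per strand.
FORWARD = {"A": "00", "T": "01", "C": "10", "G": "11"}
REVERSE = {"T": "00", "A": "01", "G": "10", "C": "11"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the partner strand as it would be read in its own direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_safe_marker(marker):
    """A marker is usable only if it differs from its own reverse complement."""
    return marker != reverse_complement(marker)


def encode(bits, marker):
    """Prefix the marker and pack two bits per base (bits must have even length)."""
    if len(bits) % 2:
        raise ValueError("bit string must have even length")
    to_base = {pair: base for base, pair in FORWARD.items()}
    payload = "".join(to_base[bits[i:i + 2]] for i in range(0, len(bits), 2))
    return marker + payload


def decode(read, marker):
    """Use the marker to decide which strand was read, then pick the table."""
    if read.startswith(marker):
        return "".join(FORWARD[base] for base in read[len(marker):])
    rc_marker = reverse_complement(marker)
    if read.endswith(rc_marker):
        payload = read[:len(read) - len(rc_marker)]
        # The partner strand runs backwards, so walk it in reverse.
        return "".join(REVERSE[base] for base in reversed(payload))
    raise ValueError("marker not found in either orientation")


marker = "GATTACA"                     # hypothetical marker, chosen for illustration
strand = encode("00011011", marker)    # marker followed by the payload bases
other = reverse_complement(strand)     # what you'd read off the partner strand
```

Both `decode(strand, marker)` and `decode(other, marker)` give back the same bits. A marker like `GAATTC` would fail `is_safe_marker`, because it reads the same on both strands. This doesn't handle a payload that happens to contain the marker, which is probably why the marker has to be "big".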
Does anyone know these guys at Harvard? Because this might be a way to put, at most, 2800 terabytes in a gram. (I don't know how long the header sequences would have to be.)
Let's say the DNA breaks, or some errors appear in the code. Thanks to the double-stranded structure it's "quite easy" to repair the code, because each strand can serve as a template for the other.
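To make the repair idea concrete, here's a toy Python sketch. It assumes both strands have been read as strings, with `N` standing for a missing base, and the function names are made up for illustration. Note that comparing the strands only tells you where they disagree. It doesn't tell you which strand is the wrong one, so this only "repairs" gaps, not substitutions.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the partner strand as it would be read in its own direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_mismatches(strand, partner):
    """List positions where the strand disagrees with what its partner implies."""
    expected = reverse_complement(partner)
    return [i for i, (a, b) in enumerate(zip(strand, expected)) if a != b]


def fill_gaps(strand, partner):
    """Fill missing bases (marked 'N') using the partner strand as a template."""
    expected = reverse_complement(partner)
    return "".join(e if s == "N" else s for s, e in zip(strand, expected))


partner = reverse_complement("GATTACA")      # intact partner strand
repaired = fill_gaps("GATNNCA", partner)     # two bases lost on this strand
suspect = find_mismatches("GACTACA", partner)
```

Here `repaired` comes back as the original sequence, and `suspect` flags the position of the single substituted base.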
Besides that, it's not the density that is the problem right now, but the access speed. The amount of data DNA can hold is so immense that doubling the density won't give any practical improvement for decades to come - if ever.
Having said that, if I'm not mistaken, some viruses have genomes made of single-stranded DNA or single-stranded RNA (ssDNA, ssRNA). I'm not sure, but the density might be the reason for that.
I can't find a copy of the original article, but it's certainly more informative than some science journalism fluff piece.