Ask HN: How much data can be stored in a piece of paper? - LostWanderer
======
dnace
Quite a lot, if you encode information in DNA [1] and then soak the paper in
it. Densities of 5.5 petabits per cubic millimeter of DNA have been
experimentally achieved. [2]

A typical sheet of A4 printer paper is about 6237 cubic millimeters in
exterior volume (i.e., including the interstices between fibers within the sheet).
Say you could soak a sheet of paper in soluble DNA and dry it such that you
ended up with 6000 mm^3 of DNA in and on it. That'd be roughly 4000 petabytes.
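A quick sanity check of the arithmetic above (the sheet thickness and soaked-DNA volume are the comment's assumptions, not measured values):

```python
# Back-of-envelope check of the DNA-soaked-paper estimate.
A4_AREA_MM2 = 210 * 297            # A4 sheet: 210 mm x 297 mm = 62370 mm^2
THICKNESS_MM = 0.1                 # assumed typical sheet thickness
volume_mm3 = A4_AREA_MM2 * THICKNESS_MM    # ~6237 mm^3 exterior volume

DNA_DENSITY_PBIT_PER_MM3 = 5.5     # experimentally reported density [2]
dna_volume_mm3 = 6000              # assumed DNA dried in and on the sheet

petabits = dna_volume_mm3 * DNA_DENSITY_PBIT_PER_MM3
petabytes = petabits / 8
print(f"{petabytes:.0f} PB")       # ~4125 PB, i.e. roughly 4000 petabytes
```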

[1] -
[https://en.wikipedia.org/wiki/DNA_digital_data_storage](https://en.wikipedia.org/wiki/DNA_digital_data_storage)
[2] -
[https://www.ncbi.nlm.nih.gov/pubmed/22903519](https://www.ncbi.nlm.nih.gov/pubmed/22903519)

~~~
milquetoastaf
Fascinating...how would you extract the data from the soaked paper?

~~~
snewk
that wasn't a requirement of this project

------
tabeth
AFAIK a single character is 1 byte.

An A4 sheet of paper is 62370 sq mm. A good printer can print a character in 1
sq mm. Therefore, using both sides, you have at minimum 62370 bytes * 2, so
basically 0.12 of a megabyte, which is pretty bad.

However, say you store something on the piece of paper that uses the seconds
since midnight as a translator, such that the contents mean something else
depending on which second of the day you read it. Let's also say this
translation does not use a hash-type function but is completely arithmetic,
and the formula to do it can be stored in its entirety on the back side of the
sheet. This takes the 0.12 MB down to 0.06 MB (you're no longer using both
sides) and multiplies it by every second of the day, so 86400 seconds *
0.06 MB = about 5 gigabytes.
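The estimate above can be sketched in a few lines (1 mm² per character and one byte per character are the comment's assumptions):

```python
# Sketch of the "seconds since midnight as a key" estimate.
chars_per_side = 210 * 297          # assumed 1 mm^2 per character on A4
bytes_both_sides = chars_per_side * 2       # ~124740 B, about 0.12 MB
one_side_mb = chars_per_side / 1e6          # ~0.06 MB (front side only)

seconds_per_day = 86400                     # one reading per second
total_mb = one_side_mb * seconds_per_day    # ~5389 MB, about 5 GB
print(f"{total_mb / 1000:.1f} GB")
```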

Honestly I think you could get into the zettabytes. There are other factors you
could use that I haven't considered:

1. Smell of the paper

2. Electrical charge

3. Feel of the paper (who said we're not printing in three dimensions?)

4. Taste (you could use a single piece of paper formed from different types,
which have unique tastes).

I think the main limitation is that as you add more factors to increase the
compression, you increase the complexity and the time it takes to decompress,
i.e. to get the information back. I'm sure there's some sort of law about
this. Let's add in the orientation of the piece of paper, in degrees, as well.
Say you can reliably use all 360 degrees to permute the existing formula. Now
you have 5 gigabytes * 360 = 1800 gigabytes. Let's just call it 2 terabytes.

~~~
odonnellryan
Even with such a function, aren't there limits to how much sensible data can
be stored? Something something Claude Shannon something something. I forget.

Otherwise you'd have infinite storage capacity, on any medium, which is very
obviously impossible.

Why infinite? Well, there are infinite numbers. If your algorithm was just
"keep applying f(i,j) by an increasing integer to get the next page of the
data" then yeah you've actually just discovered infinite storage.

This is literally a compression problem, isn't it? :)

Edit:
[http://mathforum.org/library/drmath/view/65726.html](http://mathforum.org/library/drmath/view/65726.html)

Something I came across a long time ago.

~~~
tabeth
Indeed. However, the hard limit here is the number of discernible, distinct
characters that can fit on the page. The whole seconds thing is really just a
way of permuting the existing data, so I'd say the data would only be infinite
if the number of characters that could fit on the page were also infinite.
Applying the function was just a way to represent the mechanism that gives you
each combination. The whole seconds-from-midnight thing may have been
needlessly convoluted.

For that portion of my comment, it would've been easier to simply say the
information is approximately equal to the number of characters that fit on a
page times all the combinations of their arrangement.

Of course, if the actual "processing" of the information lies outside of the
paper, I feel like that's kind of cheating. What do you think?

~~~
odonnellryan
I don't think it'd be cheating, just like using programs to compress files
isn't cheating.

However, there are two ways to look at this.

1) designing "some kind of data" specifically for this problem.

2) a general-purpose solution, where you could put any data on the
paper.

You could hit some obscenely-high number for 1), using some tricks or
whatever.

But 2 probably has some sensible solution on the order of MB.

Example: if our function is as simple as "raise each number in the sequence to
the next" we'd get some obscenely-large number, and we can put that function
right on the page.

But, finding an obscenely large number representing some kind of data that
actually means something, then coming up with a rule like that to reduce it?

Anyway, my argument would be: no, having an external compression algorithm
isn't cheating, but formulating your data to fit the problem is.

Anyway, there exists no general-purpose compression algorithm, so compression
is largely out for #2, unless we take a subset of the problem, like "how many
English words can fit...", for which of course we can come up with a good
compression algorithm (I think!) that would make it work.
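The "no general-purpose compression algorithm" claim follows from a simple counting (pigeonhole) argument, sketched here for 8-bit inputs:

```python
# Counting argument: there are 2**n distinct n-bit strings, but only
# 2**n - 1 strings shorter than n bits, so no lossless scheme can map
# every n-bit input to a strictly shorter output.
n = 8
inputs = 2 ** n                                   # 256 distinct 8-bit strings
shorter_outputs = sum(2 ** k for k in range(n))   # 255 strings of < 8 bits
assert shorter_outputs < inputs                   # pigeonhole: some input can't shrink
print(f"{inputs} inputs vs {shorter_outputs} shorter outputs")
```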

------
pasbesoin
Reminded me of this (as in, I thought I remembered something maybe named
something like "optar", and googled):

[http://ronja.twibright.com/optar/](http://ronja.twibright.com/optar/)

which I found via this repository, that makes a comment regarding apparent
abandonment of the original:

[https://github.com/colindean/optar](https://github.com/colindean/optar)

Also from memory, there is at least one other such "ginormous barcode"
utility/application that gained some attention, some years back. It may well
be the "paperback" one that another commenter mentions, here.

At least one of these had configurable resolution and redundancy parameters.

------
Pamar
According to this article:
[https://www.extremetech.com/extreme/134427-a-paper-based-backup-solution-not-as-stupid-as-it-sounds](https://www.extremetech.com/extreme/134427-a-paper-based-backup-solution-not-as-stupid-as-it-sounds)
you can cram around 3MB of data (in a reasonably robust way) onto one side of
a sheet of paper - it is unclear whether they mean A4 or Legal, though.

~~~
michaelflux
In the article they mention that they were using A4 sheets.

------
psyc
[https://en.wikipedia.org/wiki/Bekenstein_bound](https://en.wikipedia.org/wiki/Bekenstein_bound)

------
id122015
Related to this question, I want to ask: what is the best encoding method to
store as much information as possible in a file of finite size?

------
kleer001
just tomorrow's winning lottery numbers would be good enough for me

