
The Library of Babel as Seen from Within: Reproducing Borges’s Library Online - diodorus
http://www.theparisreview.org/blog/2015/07/23/the-library-of-babel-as-seen-from-within/
======
sethev
How would you name books in the Library of Babel? Giving each book a unique
name would require as much information as the contents of the library itself.
Unless you could re-arrange the library and bring the useful volumes together,
you would never be able to make a catalog for it.

~~~
eseehausen
Books don't need to have a unique name, just like in normal libraries. You can
basically assume a title is the first X characters of the total Y characters
of a given text. The goal of the Library of Babel isn't to be a perfect key-
value store, just to create all the permutations possible within certain
conditions.

------
eseehausen
Of course, the point of the story was not universal accessibility but the
decisions made by people confronted by this incredible randomness (some
burning whole shelves after finding one incomprehensible page, some
frantically trying to protect the works, some looking for the book(s) that
would contain the contents, but of course how to know if they're accurate?).
In some ways, the Borges story stands as a testament to the massive shift
we've undergone subsequently: infinite libraries (functionally) infinitely
reproducible. At the same time, though, it was still prescient: one of the main
problems of our times is no longer (re)production but curation. It's no
surprise then that Google is the behemoth that it is.

------
sbilstein
Awesome! After reading this story years ago in college I coded something up in
PHP to make a universal image generator as well. I love Borges's writing; I
really recommend it for anyone who loves math and short stories. Some of his
other greats are "Funes, the Memorious" (about a boy with a perfect memory
that drives him insane) and "The Lottery of Babylon" (about a town that uses a
lottery to coordinate all activities).

You can read all of Borges' short stories in English in one volume:
[http://www.amazon.com/Collected-Fictions-Jorge-Luis-
Borges/d...](http://www.amazon.com/Collected-Fictions-Jorge-Luis-
Borges/dp/0140286802)

I really recommend it.

------
lisper
> which now contains anything we ever have written or ever will write

It's actually possible to demonstrate that this can't be true. You can
calculate an upper bound on the number of different states enumerable in this
universe. Basically you assume that each elementary particle is a computer
operating at a clock rate of the Planck frequency and multiply by the time
between the big bang and the heat death of the universe. The number turns out
to be surprisingly small, about 2^500 or so. Once a text is longer than about
500 bits (i.e. any longer than a tweet), this universe is no longer capable of
enumerating all the possibilities.
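
A rough sketch of that arithmetic, for anyone who wants to check the order of magnitude. The inputs are assumptions rather than figures from the comment: about 10^80 particles, one state per Planck time per particle, and the current age of the universe as the duration (using a time-to-heat-death figure instead would give a larger exponent). The 25-symbol alphabet is the one Borges describes.

```python
import math

# All three inputs are assumed, order-of-magnitude values.
PARTICLES = 1e80               # rough particle count of the observable universe
PLANCK_TIME_S = 5.391e-44      # one "clock tick" per Planck time
DURATION_S = 13.8e9 * 3.156e7  # current age of the universe, in seconds


def log2_states(particles: float, tick_s: float, duration_s: float) -> float:
    """Return log2(particles * ticks_per_second * seconds)."""
    return math.log2(particles) + math.log2(1.0 / tick_s) + math.log2(duration_s)


bits = log2_states(PARTICLES, PLANCK_TIME_S, DURATION_S)

# How long a text over a 25-symbol alphabet carries that many bits?
chars = bits / math.log2(25)

print(f"enumerable states: about 2^{bits:.0f}")
print(f"equivalent text length: about {chars:.0f} characters")
```

With these inputs the exponent lands in the same ballpark as the 2^500 quoted above, and the equivalent text length is on the order of a hundred characters.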

~~~
jonotrain2
This would only be true if the library stored all of its books on disk. It
doesn't need to because of the algorithm it uses to generate the books, which
is described here:
[https://libraryofbabel.info/theory4.html](https://libraryofbabel.info/theory4.html)
(davegauer linked to this description as well)

You shouldn't think that this means the books do not exist. The concepts of
presence and absence which underlie our understanding of existence were long
ago inflected by the existence of virtual archives. Any digital archive, like
JSTOR for example, exists as a mass of zeroes and ones, and only enters a form
intelligible to the human eye once an article is loaded from its reserve. But
I doubt anyone would say that these articles don't exist before they are
loaded, come into being from nothing when someone opens them, and cease to
exist again when they are closed.

Already when Borges envisioned it, the library was more capacious than the
universe.

~~~
lisper
> This would only be true if the library stored all of its books on disk.

Disk? Who said anything about disk? The calculation assumes that a single sub-
atomic particle can store an entire work. Not only that, but that a sub-atomic
particle can generate a _different_ work every Planck time, and store them
_all_.

That's really the point: our imaginations are much bigger than the universe we
live in.

> the algorithm it uses to generate the books, which is described here:

Sorry, I don't see the description? All I see is:

"I found a successful formula combining modular arithmetic and bit-shifting
operations, and the result is the library you see today."

It doesn't actually say what the formula _is_.

> Any digital archive, like JSTOR for example, exists as a mass of zeroes and
> ones, and only enters a form intelligible to the human eye once an article
> is loaded from its reserve.

How is my model any different?

> the library was more capacious than the universe.

Borges was writing fiction so he was free to ignore the laws of physics. But
libraryofbabel.info actually exists, so it is not.

~~~
jonotrain2
1) You are assuming that the works need to be stored at all. They do not. As
long as the same block of text can be located based on the same input, it is
just as available as it would be if it were stored on disk or on any other
type of storage device you imagine.

2) The PRNG is irrelevant to this point. The article I linked to offers a
detailed description of how to generate text without storing it on disk.

3) Your model is identical to JSTOR's, as is libraryofbabel.info's. That's my
point - all of these are ways of creating a digital archive (or in your case
some sort of quantum archive). The point is that libraryofbabel.info is able
to archive more text than your model - more pages of text than there are atoms
in the universe.

4) The concepts you should think about more thoroughly are: fiction/reality,
existence/nonexistence, actuality/possibility/virtuality. The texts the
website contains have a potential existence - each of them can be summoned up
at any time. But they do not have the sort of actual existence which would
require one or more atoms to be available for the storage of each text. The
script capable of generating them all takes up less than a MB.
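
To make "same input, same text, nothing stored" concrete, here is a minimal sketch. It is not the site's formula, which is not spelled out in this thread. It is a plain base conversion between an integer index and a fixed-length page, and the 29-symbol alphabet, the 3200-character page length, and the function names are all assumptions made for illustration.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz ,."  # assumed 29-symbol alphabet
PAGE_LEN = 3200                               # assumed characters per page


def page_from_index(index: int, length: int = PAGE_LEN) -> str:
    """Generate the page at a given index by writing the index in base 29."""
    base = len(ALPHABET)
    if not 0 <= index < base ** length:
        raise ValueError("index out of range for this page length")
    chars = []
    for _ in range(length):
        index, digit = divmod(index, base)
        chars.append(ALPHABET[digit])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def index_from_page(page: str) -> int:
    """Inverse mapping: read the page back as a base-29 number."""
    index = 0
    for ch in page:
        index = index * len(ALPHABET) + ALPHABET.index(ch)
    return index


text = "hello, world."
key = index_from_page(text)
print(key)
print(page_from_index(key, len(text)))
```

Nothing is stored beyond the two functions, and every page of the chosen length has exactly one index. Note that the index is simply the page written in another notation.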

~~~
lisper
> As long as the same block of text can be located based on the same input, it
> is just as available as it would be if it were stored on disk or on any
> other type of storage device you imagine.

No, that's not true. There is a salient difference between JSTOR and LofB,
namely, that JSTOR keys are all much shorter than the works they summon. JSTOR
works thus have more information content than their keys. This is a
consequence of the fact that JSTOR only contains a tiny subset of all possible
works.

To get a work out of LofB I have to actually generate all of the information
content of that work in order to produce the key. LofB keys are, on average,
the same length as the works. So I have to do _more_ work to get something out
of LofB than I do to get something out of JSTOR. In fact, I have to do all of
the work. So it is not true that LofB works are "just as available" as JSTOR
works.
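
The key-length point is a counting argument, and a small sketch can show it. Assuming keys are written in the same alphabet as the texts (an assumption for illustration, as are the function names), there are simply too few short keys to address every text of a given length.

```python
from fractions import Fraction


def keys_up_to(length: int, alphabet_size: int) -> int:
    """Number of distinct keys of length 0..length over the alphabet."""
    return sum(alphabet_size ** k for k in range(length + 1))


def addressable_fraction(text_len: int, saved: int, alphabet_size: int) -> Fraction:
    """Upper bound on the share of texts of length text_len that can have
    a key at least `saved` characters shorter than the text itself."""
    keys = keys_up_to(text_len - saved, alphabet_size)
    texts = alphabet_size ** text_len
    return Fraction(keys, texts)


for saved in (1, 2, 5):
    share = addressable_fraction(3200, saved, 29)
    print(f"keys shorter by {saved}: at most {float(share):.3g} of all pages")
```

Each character saved in the key cuts the addressable share by roughly another factor of the alphabet size, so almost every page needs a key about as long as the page.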

LofB is like the parent who responds to a child's request for a bedtime story
by asking the child to describe in every detail the story they want to hear
and then parroting that back to them. Any story the child asks the parent to
tell, it will tell. But neither the child's mind nor the parent's mind
contains all possible stories.

LofB doesn't contain all possible works any more than a floating point
register contains all possible floating point numbers. LofB only contains (to
the extent that it can be said to contain anything) those works which are
actually summoned, just as a floating point register only contains those
floating point numbers which are actually stored in it, just as a human mind
contains only those stories it actually thinks about. "Contains" is not a
synonym for "can potentially generate given the right input."

~~~
jonotrain
First off, we should acknowledge that your argument has shifted significantly
- from an ontological to a practical concern. The ontological question,
whether the archive exists or not, is independent from the practical question
of whether or not it is useful. Initially you said that no archive could
contain more than ~10^80 objects because the number of atoms in the universe
placed an upper limit on its contents. Now you say that this website does not
contain its books because of the amount of work necessary to retrieve them.
Would you consider this to be different if the quantity of texts was less than
10^80? If not, we should recognize that you have shifted from your earlier
argument, which was untenable.

As for your present argument, I would first point out that you should not
treat quantitative differences as essential differences. Any archival system
requires some sort of key system for retrieval. Print libraries have the Dewey
decimal system, the internet has URLs, and libraryofbabel.info has its book
locations, strings of letters and numbers which you are correct to point out
are just as information-rich as the texts themselves. But you claim that
because of the size of keys, the texts in this digital archive do not exist -
this is fallacious. How large must a key-value system get before it ceases to
exist? If a poem's title is longer than its text does it cease to exist? Greek
philosophers parodied the confusion of quantitative and qualitative thought by
asking how many grains of sand we needed to add together before we had a heap.
Once again, you have confused a practical and an ontological argument - the
length of keys (and quantity of text) may affect the usefulness of
libraryofbabel.info, but they cannot affect its existence.

You consider three heterogeneous things in your examples - a floating point
register and the set of possible floating point numbers, libraryofbabel.info
and the texts it contains, the human mind and what it "actually thinks about".
Let us begin with the last - you claim that the human mind only contains what
it "actually thinks about." If this is so, what is a memory? Your failure to
grapple with the difficult category of possibility leads to the ontological
confusion in your argument.

If we limit our consideration to practicality, I would agree that the
libraryofbabel.info is not useful in the same way as a digital archive like
JSTOR. In other libraries, one can find texts by subject matter, author,
period, etc. and learn something new from their contents. libraryofbabel.info
exists to show us the same language we have encountered already in a new
context. This allows us to reflect on the essence of language, which is not
restricted to the intentions of a conscious and rational speaking subject (as
you imagine when you speak of what we "actually think about"), but is always
open on irrational excess.

~~~
lisper
I'm actually making an ontological argument, though I concede that I was not
entirely clear.

Imagine I give you a box which I claim contains all the positive integers. You
object because there are an infinite number of positive integers and no
physical object can possibly contain an infinite number of things. So to prove
my claim I invite you to query the box and ask it "Do you contain X" for any X
you care to name.

So you ask: "Do you contain 1?" and the box answers "yes". "Do you contain
842198843?" Yes. "Do you contain Graham's number?" Yes. The busy-beaver number
for a million-state Turing machine? Yes.

But then you ask it, "Do you contain negative one?" and the box answers "Yes."
And now you protest: I claimed that the box contained only positive integers.
No, I reply, I claimed that it contained _all_ of the positive integers, not
_only_ the positive integers. In fact it contains all the negative integers
too. But this in no way diminishes the power of my initial claim, because
surely if it is noteworthy that I built a box that contains all of the
positive integers then it must be twice as noteworthy that I built a box that
contains all of the positive integers _and_ all of the negative integers too!

So then you ask it, "Do you contain one half?" Yes. Again you protest because
one half is neither a positive nor a negative integer, and again I respond
with the same argument. "Do you contain pi?" Yes. Same argument. "Do you
contain the Gödel number of the proof that Peano arithmetic is consistent?"
Yes. Ditto.

So you ask, "Do you contain a unicorn?" Yes. Aha! Now you've got me, because
whatever else the box may contain, it obviously does not contain a unicorn.
No, I reply, of course it does not contain an actual physical unicorn. It
contains the _phrase_ "a unicorn". In fact, it contains all possible phrases.
It is not merely a box that answers "yes" to any question that is put to it.
It is a box that in fact contains all numbers and phrases (is this starting to
sound familiar?) and so "yes" is actually the _correct_ answer to _any_
question of the form, "Do you contain X" for any possible utterance X.

The LofB is just my box with a slightly different UI. It is a UI that does a
better job than mine of obfuscating the fact that what is underneath the UI is
completely uninteresting, and the question of whether or not my claim that the
box "really contains" all the things that I claim it contains is just
wordplay. The interesting question is not "to what questions will the box
answer 'yes'" (or "What works does the LofB contain?") because the answer is
"all of them." The interesting question is "What queries can be made of the
box/LofB?" Because the _queries_ have to exist _in this universe_ and so
_they_ are subject to the constraints of the laws of physics. And in
particular, their number is finite. Not only finite, but fairly small: less
than 2^500 or so. It's a number that's so small you can actually write it out
by hand in a matter of minutes!
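
The "write it out by hand" claim is easy to check. A short sketch (the 50-digit grouping is just a formatting choice):

```python
BOUND = 2 ** 500
digits = str(BOUND)


def grouped(text: str, width: int = 50) -> list[str]:
    """Split a digit string into rows of at most `width` characters."""
    return [text[i:i + width] for i in range(0, len(text), width)]


for row in grouped(digits):
    print(row)

print(len(digits), "digits")  # 151 digits
```

That is four short rows of digits, so a few minutes with a pen is about right.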

It is that tiny subset of the vast space of possible queries that is the
interesting thing. Of all the possible questions we could potentially choose
to ask, we will only ever be able to ask a tiny, tiny subset of them. So we
should choose wisely. And, I submit, quibbling further over the LofB would not
be a wise choice.

------
minikites
If you're interested in the math behind the library, you can't do better than
this book: [http://www.amazon.com/Unimaginable-Mathematics-Borges-
Librar...](http://www.amazon.com/Unimaginable-Mathematics-Borges-Library-
Babel/dp/0195334574)

~~~
davegauer
Or the library's own explanation:
[https://libraryofbabel.info/theory4.html](https://libraryofbabel.info/theory4.html)

The thing that really got me going was finding and using the _search_ feature!
Until then, the library looked to me like an afternoon of programming fun, but
little more.

------
edko
I prefer the Book of Sand, which collects an infinite selection of the most
legible pages from the Library of Babel.

------
mkehrt
Quine's commentary, relevant to computer science:
[http://hyperdiscordia.crywalt.com/universal_library.html](http://hyperdiscordia.crywalt.com/universal_library.html)

------
JulianMorrison

      strings < /dev/urandom | grep wisdom

------
prewett
I wonder what percentage of the library is actually an intelligible book? I
guess for purposes of calculation an "intelligible" book could be one that is
completely composed of valid words in its target language; that would at least
put an upper bound.

Sadly my combinatorics and number theory are not up to the task.

~~~
jlarocco
The exact math would depend on the definition of "intelligible" used and how
big of an "intelligible" fragment you're looking for.

Practically speaking, though, the answer is basically 0%. It would be a
fraction with a big number on top and a very, very, very huge number on the
bottom.
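
One hedged way to put a number on "basically 0%" is an entropy estimate. The sketch below treats "intelligible" as "about as compressible as English" and assumes roughly 1.3 bits of information per character for English text. That figure, the single-language framing, and the function name are all assumptions for illustration. The book format (410 pages of 40 lines of 80 characters, 25 symbols) is the one given in the story.

```python
import math

SYMBOLS = 25                   # orthographical symbols in Borges's library
CHARS_PER_BOOK = 410 * 40 * 80 # pages * lines * characters per line
ENGLISH_BITS_PER_CHAR = 1.3    # assumed rough entropy of English text


def log10_intelligible_fraction(chars: int, symbols: int, bits_per_char: float) -> float:
    """Return log10 of (intelligible books / all books) under the entropy model."""
    total_bits = chars * math.log2(symbols)
    intelligible_bits = chars * bits_per_char
    return (intelligible_bits - total_bits) * math.log10(2)


exponent = log10_intelligible_fraction(CHARS_PER_BOOK, SYMBOLS, ENGLISH_BITS_PER_CHAR)
print(f"roughly 1 book in 10^{-exponent:,.0f}")
```

Under these assumptions the fraction has well over a million zeros after the decimal point. Changing the assumed entropy moves the exponent but not the conclusion.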

~~~
jonotrain2
Consider what Borges says about the impossibility of meaninglessness before
feeling confident in such a calculation:

> In truth, the Library includes all verbal structures, all variations permitted
> by the twenty-five orthographical symbols, but not a single example of
> absolute nonsense. It is useless to observe that the best volume of the many
> hexagons under my administration is entitled The Combed Thunderclap and
> another The Plaster Cramp and another Axaxaxas mlö. These phrases, at first
> glance incoherent, can no doubt be justified in a cryptographical or
> allegorical manner; such a justification is verbal and, ex hypothesi, already
> figures in the Library. I cannot combine some characters - dhcmrlchtdj - which
> the divine Library has not foreseen and which in one of its secret tongues do
> not contain a terrible meaning. No one can articulate a syllable which is not
> filled with tenderness and fear, which is not, in one of these languages, the
> powerful name of a god.

------
m0hit
Exciting to see this on HN. npdoty and I have been cataloging this and similar
experiments at [http://webadaptation.org](http://webadaptation.org) along with
working on some of our own. If you're interested do send me an email!

------
intopieces
If you're going to pick up Borges, do yourself a big favor and get the Norman
Thomas di Giovanni translations. They were written with Borges and remain out
of print because of a copyright dispute.

------
scarmig
Library falls short on I18N standards, doesn't use Unicode.

~~~
dguaraglia
Interestingly enough, Borges decided to use the smallest possible alphabet for
the library, not even including characters in the Spanish alphabet. Then
again, this library is set in a completely separate universe... and it's the
universe itself, so I don't think I18N is a problem there.

~~~
mkehrt
Not the smallest--Quine on Borges:
[http://hyperdiscordia.crywalt.com/universal_library.html](http://hyperdiscordia.crywalt.com/universal_library.html)

