

Can all books be found somewhere within the number Pi? - jkbyc
http://www.jakubkotowski.com/2015/03/can-all-books-be-found-somewhere-within.html

======
brudgers
The idea of encoding a book as a string is incoherent once we look at it with
intellectual rigour. How are images in illuminated manuscripts or graphic
novels encoded? What does "all books" even mean? Are we talking about each
individual artefact past present and future or some idealized representation
of each: e.g. is my current copy of _Catcher in the Rye_ encoded separately
from the copy I was assigned in 10th grade English class?

It's all a matter of interpretation. We can just as easily choose a different
arbitrary encoding and claim to have found all the books in π. There's no need
to make things complicated. We are free to pick any interpretation we want
once we are claiming that numbers represent books. Let:

    
    
      B = {b1, b2,...bn}
    

be the set of all books. And let:

    
    
      def find_books(num):
          # B is the set of all books defined above.
          if 3.14 < num < 3.15:
              return B
          return "all the books not found"
    

The article assumes that there is some natural way of encoding books. But
digits of π are not Unicode or ASCII characters. Though we can interpret a
digit or string of digits as such, that encoding is arbitrary, not a property
of the natural or mathematical world.
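
To make the arbitrariness concrete, here is a minimal sketch that reads the same 20 decimals of π under two made-up schemes. Both decoder functions are hypothetical illustrations, not anything the article proposes; neither has a better claim to being "the" encoding.

```python
# Two invented decoding schemes applied to the same digits.
# Both are arbitrary choices made up for this illustration.
DIGITS = "14159265358979323846"  # first 20 decimals of pi


def decode_pairs_as_letters(digits):
    """Scheme A: read two digits at a time, map (value mod 26) to a-z."""
    pairs = [digits[i:i + 2] for i in range(0, len(digits) - 1, 2)]
    return "".join(chr(ord("a") + int(p) % 26) for p in pairs)


def decode_singles_as_letters(digits):
    """Scheme B: map each digit 0-9 directly to a-j."""
    return "".join(chr(ord("a") + int(d)) for d in digits)


print(decode_pairs_as_letters(DIGITS))
print(decode_singles_as_letters(DIGITS))
```

The two outputs share nothing but their source digits, so which "text" π contains at a given position depends entirely on the scheme we picked.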

~~~
Fargren
The thing about normality is that if a number is normal, it will contain all
books not only for a given encoding of a book, but for _any_ given encoding of
any book. That is true only for normal sequences, of which pi is assumed to be
an example.

~~~
millstone
Nitpick: This is NOT true for only normal sequences. See disjunctive
sequences:
[http://en.wikipedia.org/wiki/Disjunctive_sequence](http://en.wikipedia.org/wiki/Disjunctive_sequence)

Pi may contain every finite digit string as a substring and yet not be normal.

------
plikan13
Yes, but the number which indexes the position in pi where your book starts
will probably be longer than the book itself.

~~~
gus_massa
More precisely: Let's assume that pi is absolutely normal and we use a
256-letter alphabet (ASCII, Latin-1?).

I'm almost sure that if we have a "book" with N characters, then on average
the number of (decimal) digits of the index of that string in pi is
N * log(256) / log(10). (I'm too lazy to write the proof now, so perhaps I'm
making a mistake.) This is (essentially) equivalent to saying that the
expected position is 256^N. (But I'm taking averages willy-nilly.)

If this is correct, the position increases exponentially with the book length,
but the number of digits in the position increases linearly.
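
A rough sketch of that estimate, taking the unproven heuristic above at face value (first hit expected near 256^N). The function name and the sample book sizes are made up for illustration.

```python
import math


def expected_index_digits(n_chars, alphabet_size=256, base=10):
    """Heuristic only: if the first hit is near alphabet_size**n_chars,
    its index has about n_chars * log(alphabet_size) / log(base) digits."""
    return n_chars * math.log(alphabet_size) / math.log(base)


# Hypothetical book sizes in characters.
for n in (10, 1_000, 500_000):
    print(n, round(expected_index_digits(n)))
```

Under this heuristic the index costs about 2.4 decimal digits per character of the book, so writing down the position takes more digits than the book has characters.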

------
BuildTheRobots
Theoretically, everything can. And there's a FUSE implementation:
[https://github.com/philipl/pifs](https://github.com/philipl/pifs)

~~~
dannypgh
Theoretically? Citation needed. I believe this is only true if pi is normal,
and while there's a lot of conjecture that pi is normal and no evidence that
suggests it isn't, there is no proof.

~~~
PeterWhittaker
Normal or not normal? If a work can be found in pi, then one would think a
Finite State Gambler (FSG) would succeed on the portion of pi equivalent to
the work. But FSGs cannot succeed on normal sequences, meaning this sequence
would be non-normal.

Of course this raises the question of whether a number that is normal over an
infinite sequence can contain sub-sequences that are non-normal: for example,
runs of 100 tails will occur in sufficiently long, completely normal
(randomized) sequences of coin tosses.

~~~
Fargren
A normal sequence contains every finite sequence, including ones that look
non-normal on their own, by definition of normality.

------
axblount
The question is: is pi a 'normal' number? It's an open question. Note that
it's not particularly difficult or illuminating to generate a normal number,
for example: 0.(binary digits of 1)(binary digits of 2)...(binary digits of n)...

[http://en.m.wikipedia.org/wiki/Normal_number](http://en.m.wikipedia.org/wiki/Normal_number)
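
A minimal sketch of that construction (the base-2 Champernowne-style number), with a naive search for a bit pattern in a finite prefix. The helper names and the default `limit` are arbitrary choices for illustration.

```python
from itertools import count, islice


def binary_champernowne_bits():
    """Yield the bits of 0.1 10 11 100 101 ... (binary digits of 1, 2, 3, ...)."""
    for n in count(1):
        yield from format(n, "b")


def first_index(pattern, limit=1_000_000):
    """Index of the first occurrence of pattern in the first `limit` bits, or -1."""
    prefix = "".join(islice(binary_champernowne_bits(), limit))
    return prefix.find(pattern)


print(first_index("0000011111"))
```

Any finite bit pattern shows up eventually, because the pattern prefixed with a 1 is itself the binary form of some integer that gets written out.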

~~~
Fargren
That number is normal to base 2, but not necessarily to, say, base 10. Making
an absolutely normal number (that is, a number that is normal to every base) is
more involved[0]. It has only been recently shown that this can be done with
polynomial time complexity[1]. My licentiate thesis was an implementation of
the second algorithm.

[0][http://www.sciencedirect.com/science/article/pii/S0304397501...](http://www.sciencedirect.com/science/article/pii/S0304397501001700)
[1][http://www.dc.uba.ar/people/profesores/becher/poly.pdf](http://www.dc.uba.ar/people/profesores/becher/poly.pdf)

------
a3n
Disclaimer: I don't know what I'm talking about.

My dim understanding of the issues leads me to consider a conflict. Pi is more
or less considered to be random, or some flavor of random, notwithstanding
known patterns of Pi. "Random," to me, sounds a lot like "unorganized."

A book is definitely organized. A larger book is highly organized
(entropically speaking). So while you probably can find the same sequence of
words in a two-word or ten-word or other small book in Pi, at some point you
get a book that's too highly organized to appear in Pi.

However, Pi is also infinite, so it's infinitely possible to find any
sequence. (This sounds really hand-wavy to me).

But since Pi is infinite, then isn't it also infinitely unorganized?

~~~
jjoonathan
> "Random," to me, sounds a lot like "unorganized." A book is definitely
> organized.

Search for "microstate vs macrostate" and keep checking links until you hit an
explanation that you like.

Alternatively, let me give it a go: Enumerate all 10-character strings. The
string "Hi, there!" appears once, just as often as "l9.gn;omeh" (which also
appears once). We call these "microstates". However, if we label 10-char
strings green if they look like valid English and red if they don't, then
clearly most of our table is going to be colored red. Red and Green are
"macrostates". Unlike microstates, red and green do not have equal probability
if we choose an element of the enumeration at random.

    
    
        p(red)>>p(green)>>p("Hi, there!")=p("l9.gn;omeh")
    

OK, now you were probably picturing an "entropy trace" computed over windows of
digits of pi, like you would get out of binwalk. ''Entropy'' is a macrostate,
not a microstate. Even though all sequences of digits within a window are
equally likely, if you choose a sequence at random it will probably (!!!!)
have high ''entropy''. But possibly not. Having high entropy is like being
labeled "red" in the 10-character enumeration table: if you pick an entry at
random, you're probably going to hit red, but if you keep doing it then
eventually you will hit green, and if you do it even longer then eventually
you will hit "Hi, there!". Just like if you keep looking at digits of pi,
eventually you'll find a block of them where the ''entropy'' is low. In fact,
if pi has the properties that mathematicians conjecture it does, you'll find
infinitely many such blocks of anomalously low ''entropy''. If you keep
looking long enough, you'll find one with your book in it. These anomalies
aren't due to pi being less than perfectly random; they are due to the
definition of ''entropy'' being statistical in nature. It's only "right" about
randomness most of the time.
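
Here is a small sketch of that counting argument, using all 4-digit windows rather than real digits of pi. The empirical Shannon entropy of each window stands in for the ''entropy'' trace; the window length and the rounding are arbitrary choices for illustration.

```python
import math
from collections import Counter
from itertools import product


def shannon_entropy(s):
    """Empirical Shannon entropy (bits per symbol) of the symbols in s."""
    counts = Counter(s)
    return sum(c / len(s) * math.log2(len(s) / c) for c in counts.values())


# Every 4-digit string is one equally likely microstate.
windows = ["".join(p) for p in product("0123456789", repeat=4)]

# Group the microstates by entropy value: each distinct value is a macrostate.
macrostates = Counter(round(shannon_entropy(w), 3) for w in windows)
for entropy, n in sorted(macrostates.items()):
    print(f"entropy {entropy:.3f}: {n} of {len(windows)} windows")
```

Only 10 of the 10,000 windows (0000, 1111, ...) have zero entropy, while 5,040 have the maximum of 2 bits per symbol. Each individual window is still exactly as likely as any other.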

> However, Pi is also infinite, so it's infinitely possible to find any
> sequence. (This sounds really hand-wavy to me).

Here is the more precise mathematical analog from the article:

> PI is believed to be a normal number and therefore all possible finite
> sequences of characters appear equally often in it

It's still an open question, but the trouble doesn't lie in the definitions;
it lies in finding a proof, which nobody has done yet.

~~~
a3n
That's actually pretty good, thanks.

------
danbruc
Maybe it could be more efficient to go the other way round: instead of
searching in the digits of pi, somehow invert the BBP formula [1] and try to
calculate the position of what you are looking for. But because the BBP
formula involves rounding, this is certainly not a simple task, and I really
have no idea whether anything could be gained.

[1]
[http://en.wikipedia.org/wiki/Bailey%E2%80%93Borwein%E2%80%93...](http://en.wikipedia.org/wiki/Bailey%E2%80%93Borwein%E2%80%93Plouffe_formula)
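
For reference, a floating-point sketch of the forward direction only: the BBP formula gives the hexadecimal digit of pi at position n without computing the earlier digits. The function name is made up, and double precision makes this trustworthy only for modest n. It says nothing about whether the inverse problem is tractable.

```python
def pi_hex_digit(n):
    """Hex digit of pi at 0-based position n after the point (float sketch)."""

    def series(j):
        # Fractional part of sum over k of 16**(n - k) / (8k + j).
        total = 0.0
        for k in range(n + 1):
            denom = 8 * k + j
            # Modular exponentiation keeps the large powers of 16 small.
            total = (total + pow(16, n - k, denom) / denom) % 1.0
        k = n + 1
        while True:
            term = 16.0 ** (n - k) / (8 * k + j)
            if term < 1e-17:  # remaining tail is below double precision
                break
            total += term
            k += 1
        return total % 1.0

    x = (4 * series(1) - 2 * series(4) - series(5) - series(6)) % 1.0
    return "0123456789abcdef"[int(x * 16)]


print("".join(pi_hex_digit(i) for i in range(8)))
```

The modulo steps are where information is thrown away: many positions map to the same leading digits, which is one reason running the formula backwards is not straightforward.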

~~~
PeterWhittaker
I'm not sure the problem is invertible like that:
[https://researchspace.auckland.ac.nz/handle/2292/3516](https://researchspace.auckland.ac.nz/handle/2292/3516)

There may be no shortcut.

------
gchokov
Theoretically, you can also find all books in a random N-long number as well.
Just keep on generating.

------
antaviana
At which digit of Pi can I download the code used by the author of the blog
post?

------
hurin
It's been argued about a lot, and people far brighter than the author of this
blog have been unable to come to agreement.

The reason the linked article in itself is quite worthless is that it
trivializes the question from philosophy of mathematics to _oh pi is random
let's calculate probability_ - but obviously that has nothing to do with the
real problem.

