
Calculating Pi: My Attempt at Breaking the Pi World Record - olvy0
https://blog.timothymullican.com/calculating-pi-my-attempt-breaking-pi-record
======
simias
I appreciate the author sharing his experience, but I must admit that it was
less exciting than I expected it to be. He basically used off-the-shelf
hardware, ran an off-the-shelf program, and then waited for a while. There's no
obvious innovation; anybody who cares enough to break that record and has
~$20k lying around can do it.

If anything, it got me a lot more curious about this y-cruncher program and all
the fun optimizations it must implement.

~~~
krzepah
I understand your disappointment, but I'd like to point out this is a really
nice achievement on his portfolio as a sysadmin.

~~~
prox
It's also a great side project, and having tested his system he donated it
for STEM research. There’s lots to commend in this project.

------
tromp
Reminds me of my attempt to compute the number of Go positions, which
similarly needed many TB of disk space and ended up generating 30 petabytes of
disk IO.

[https://news.ycombinator.com/item?id=9167781](https://news.ycombinator.com/item?id=9167781)
Number of legal 18x18 Go positions computed. One more to go

[https://news.ycombinator.com/item?id=10950875](https://news.ycombinator.com/item?id=10950875)
Number of legal Go positions computed

Not much point in breaking that record, as 19x19 is the largest (and standard
size) Go board.

------
mikorym
For those who are interested in reciting digits of Pi (I would guess you can
either memorise the digits or calculate as you go), the current record is held
by a South African:
[https://www.youtube.com/watch?v=_mGjJMVKWcU](https://www.youtube.com/watch?v=_mGjJMVKWcU)

~~~
las_balas_tres
He only won the record for the fastest recall of pi: 1500 digits in 4 minutes.
Daniel Tammet recited 22,514 digits in 5 hours.
[https://en.wikipedia.org/wiki/Daniel_Tammet](https://en.wikipedia.org/wiki/Daniel_Tammet)

~~~
begemotz
The Guinness world record is over 70k digits, set by Rajveer Meena:
[https://www.guinnessworldrecords.com/world-records/most-pi-p...](https://www.guinnessworldrecords.com/world-records/most-pi-places-memorised)

~~~
imglorp
Such an awe inspiring achievement, to store that much structured data
accurately. What else is the brain capable of doing?

~~~
Ma8ee
Structured?

~~~
imglorp
Lossless, ordered set of symbols with no errors.

It's much different than remembering a story or a travel route or an image.

~~~
Ma8ee
Yes, but structured is in general in opposition to random, which the digits of
pi seem to be. No one knows of any structure in the digits of pi.

~~~
athriren
But pi defines its structure. By being seemingly random and irrational, it has
created a specific ordered set of symbols and imbued that set with meaning
because it has a referent: we call it pi.

~~~
Ma8ee
It’s not random, but that doesn’t mean it is structured. And there’s
absolutely no difference between the digits of pi and a completely random
sequence for anyone trying to memorise them.

------
gesman
>> As luck would have it, a transformer next to my house blew and the power
went out yet again. This means I had to restart from the last checkpoint yet
again, losing another 2 weeks worth of work.

Ouch!!!

~~~
nixpulvis
Man, it fucking sucks.

I remember a time back in freshman year of college... It was an extra credit
assignment: who could factor the largest number (or something like this) in
the least time before the due date. My friend and I devised an algorithm and
launched it on my computer. I honestly don't remember anything other than the
fact that the god damn power transformer downtown blew and fucked up our tests
because of a large-scale power outage. I assume we wouldn't have even come in
the top 5 for that assignment (who knows), but it's just so frustrating.

Basically, OSs should have better failure-recovery mechanisms than the
defaults.

~~~
malux85
The OS recovered, right?

It was your program that did not.

~~~
kragen
Running under EUMEL, L3, or KeyKOS would have enabled the program to continue
from a recent checkpoint, without requiring any logic for this in the program
itself.

~~~
malux85
Oh of course! EUMEL, L3 or KeyKOS! Why didn’t I think of them?!

~~~
kragen
Probably you're so familiar with modern computing environments — and
unfamiliar with any others — that you take their drawbacks for granted,
perhaps even assuming they are unavoidable.

~~~
malux85
This got me thinking, and I have a few weeks’ spare time - can you please
recommend something that’s as fundamentally different as possible, so that I
can get way out of my comfort zone?

I can’t reply to your comment below (I’m not sure why), but you seem
experienced in alternative computing environments, whereas I’m mostly an HPC
Python / C++ developer who’s spent the last 10ish years doing deep learning
and scientific computing. The newer environment doesn’t have to be practical
at all; I’m interested in using it for a change in perspective.

~~~
kragen
Well, what's your comfort zone?

I wish I had a recommendation based on experience for one of these really
strange operating systems like EUMEL, Guardian, OS/400, and L3, but I don't.
I've used CP/M and MS-DOS, but those are just really limited, not really
interesting. Although you could make them reasonably usable with ZCPR and
4DOS, it was like coming out of Plato's cave when I switched my primary
operating environment from 4DOS to csh on Ultrix.

Squeak is a pretty different operating environment that isn't simply
primitive. Oberon is another. They can both run as user processes on top of
Linux, as well as on bare metal. Both of them are somewhat alien.

Are you comfortable with embedded development? If not, try Arduino. It starts
out easy, since you program the boards in C++, but you have the opportunity to
build things that will run for months on an AA battery with submicrosecond
interrupt response time — because there's no OS. (It's routine for even
programming novices to write their own interrupt handlers.) Arduino instantly
gives you the ability to measure things on microsecond timescales, a thousand
times faster than you can normally see. Modern boards like the Blue Pill have
response latencies in the 100-nanosecond range when they're awake. That's the
time it takes light to go 30 meters, as you're probably aware.

In retrocomputing land, VMS was the first OS I used that was really usable.
The OpenVMS Hobbyist Program still exists, and it's actually possible to run
old versions of Mozilla on it. F-83 was an interactive Forth IDE that provided
higher-order programming, virtual memory, and multithreading under MS-DOS, in
1983 — without syntax or types. Turbo Pascal was also an IDE, in a way the
first modern IDE, around the same time; the first versions ran on CP/M and MS-
DOS. But I think that you kind of had to be grappling with the limitations of
BASIC on those systems to appreciate that.

There are Pick systems that still have enthusiastic users:
[https://www.pickwiki.com/index.php/Pick_Operating_System](https://www.pickwiki.com/index.php/Pick_Operating_System)
but they don't sound appealing to me. Other systems with cult fanbases include
FileMaker, HyperCard, and Lotus Agenda, the last of which I think you can run
successfully under FreeDOS. Agenda is interesting in part because it's so
alien. (It's easy to forget that it was normal at the time to have to use the
program manual to figure out how to exit.)

There are a bunch of modern specialized development environments that can do
strange things. Radare2 is an environment focused on reverse engineering.
Emacs is focused on text editing, but for some reason it's also the main user
interface for interactive proof assistants like Coq and Lean, which are
shaping up to be pretty interesting. R is focused on statistics. Jupyter is
sort of focused on data visualization, although not really. (Now I see you've
been doing deep learning for 10 years, so I guess Jupyter is your best
friend.) LibreOffice Calc is focused on rectangular arrays of mostly numerical
data (although in many cases their most advanced users use Excel instead). You
can develop applications in all of them.

How about math? It's one thing to invoke a Runge-Kutta integration method;
it's another to be able to prove convergence bounds on it. And machine-checked
formal proof is shaping up to be an interesting thing, like I said.

How about cryptography? That has the advantage that there are right answers
and wrong answers, so you can test your code.

How about shaders? Shadertoy is accessible and super fun. Maybe that's too
similar to HPC, but the shader parallelism model (similar to ispc) is pretty
different from both AVX and MPI.

How about mobile development? SIGCHI papers are full of experimental user
interface ideas to explore, and Android Studio is free and relatively usable,
if clumsy. Have you seen Onyx Ashanti's Beatjazz?

In the neighborhood of beatjazz, there's livecoding. It's a thrill to get a
nightclub full of people dancing to your code, and there are a bunch of
different environments.

GNU Radio with an RTL-SDR makes it possible for you to run DSP algorithms on
RF signals over a pretty wide frequency range, with applications in
communications and sensing. Maybe if you've been doing HPC, DSP is already
second nature, but if not it might be rewarding. And DSP has close connections
to control theory and image processing, as well as the more obvious
applications.

How about alternative programming paradigms? If you're comfortable in
procedural and OO programming, how about extreme alternatives — answer-set
programming like miniKANREN, constraint-logic programming (as supported by
modern Prologs
[https://www.metalevel.at/prolog/clpz](https://www.metalevel.at/prolog/clpz)
not just Mozart/Oz), Erlang-style fault-tolerance-focused programming, APL-
style array programming (though maybe you're familiar enough with that to take
it for granted), or Forth? How about strongly typed programming like Haskell,
Rust, or OCaml? (And of course Haskell is purely functional, and OCaml is
mostly so.)

And SMT solvers like Z3 can easily solve problems now that were infeasible
only a few years ago.

Also, wasm.

Or maybe try hacking together some games in Godot.

I don't know, myself I find that it's hard to avoid getting out of my comfort
zone in some direction, just because the world is _so big_ and my knowledge is
so small. Deep learning is the out-of-my-comfort-zone programming thing I want
to try next!

~~~
malux85
Holy moly. Over the past hour I've come back to your response and read it
several times. There's so much to unpack there.

I started writing several replies but felt I wasn't able to give you the
praise you deserved, but the passage of time compels me to respond so please
know you have my full gratitude

Thank you so much for such a detailed reply

~~~
kragen
I'm so glad it's helpful! I was worried it might be overwhelming.

------
kozak
By "compressed Pi digits", does the author mean simply an efficient encoding
of decimal digits in binary, or is there some way to compress Pi digits
further? I thought they were incompressible like random.

~~~
Someone
Typically, that means storing two digits in a byte (halves storage size
compared to a text string), 9 in each 32 bits (gains another 11%), 19 in each
64 bits (gains another 5%), or something similar (at this scale, I would guess
it uses at least the ‘19 digits in each 64 bits’)

Idea is to not use “one digit per byte”, but to keep addressing individual
digits cheap.

 _”I thought they were incompressible like random.”_

They’re easily compressed, if you accept taking this program and its
configuration file as a compressed version.

(And yes, the output of a pseudo-random number generator compresses extremely
well, too)
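For instance, packing 19 digits per 64-bit word might look like this (just a
sketch of the idea; y-cruncher's actual on-disk format may well differ):

```python
def pack_digits(digits):
    """Pack 19 decimal digits into one 64-bit word.
    19 * log2(10) is about 63.1 bits, so the value always fits in 64 bits."""
    assert len(digits) == 19
    word = 0
    for d in digits:
        word = word * 10 + d
    return word  # at most 10**19 - 1, which is < 2**64


def digit_at(word, i):
    """Read the i-th digit (0 = most significant) without unpacking the rest."""
    return (word // 10 ** (18 - i)) % 10
```

Addressing a digit is then one division and one modulo, instead of
decompressing a whole block.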

~~~
maweki
That's the notion of Kolmogorov complexity: the length of the smallest program
that can generate the output. Pi, and I guess any other algebraic number, no
matter how randomly distributed their digits are, are not that complex.
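As a concrete illustration (Gibbons' unbounded spigot, not what record
attempts actually use): a program a few lines long whose output is every digit
of pi, which is exactly the sense in which the digits are "compressible":

```python
def pi_digits():
    """Gibbons' unbounded spigot algorithm: yields decimal digits of pi forever."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n  # the next digit is now settled
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)
```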

~~~
sp332
Pi is transcendental, not algebraic.

~~~
maweki
Yeah, sorry, I was not clear on that one. I meant the ones you can write down.
Of course, pi has no finite expansion.

------
matsemann
But

> Did you win the Putnam? [0]

[0]:
[https://news.ycombinator.com/item?id=35076](https://news.ycombinator.com/item?id=35076)

------
thinkloop
> I also saw that it cost them around $200,000, which is very expensive. I’m
> aiming to stay below 5% of that overall amount.

He may have stayed below $10K in hardware, but there is no way that includes
the electricity needed to run the machines 24/7 for half a year.

~~~
dkdk8283
Assuming 3 kW and a PUE of 1.5 for cooling comes out to approx 20,000 kWh of
energy. Assuming a high-ish rate of $0.13/kWh, it comes out to around $2.5k
for 6 months. Not too bad.
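Spelling out that arithmetic (all of the inputs are assumptions, per the
above):

```python
power_kw = 3.0           # assumed average draw of the rig
pue = 1.5                # assumed overhead factor for cooling
hours = 6 * 30 * 24      # roughly six months = 4320 hours
rate_usd_per_kwh = 0.13  # assumed high-ish residential rate

energy_kwh = power_kw * pue * hours        # 19,440 kWh, i.e. ~20,000 kWh
cost_usd = energy_kwh * rate_usd_per_kwh   # ~$2,527, i.e. ~$2.5k
```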

------
fizixer
19-digits-per-64-bits packed storage requires about 19 TiB [0].

1 digit per byte requires about 45 TiB [1].

Can anyone explain why the final output requires 38 TiB?

[0]
[https://www.google.com/search?q=8+*+%285e13+%2F+19%29+%2F+10...](https://www.google.com/search?q=8+*+%285e13+%2F+19%29+%2F+1048576+%2F+1048576)

[1]
[https://www.google.com/search?q=5e13+%2F+1048576+%2F+1048576](https://www.google.com/search?q=5e13+%2F+1048576+%2F+1048576)

~~~
redcalx
If you cross byte boundaries, i.e. compactly pack the bits such that there are
no wasted bits, then you have 4 bits per decimal digit, and 50T digits becomes
200T bits == 25TB. Technically there is still some slack in there, because
each 4-bit block represents 16 values and we only use 10 of those.
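Checking those numbers, and quantifying the slack against the
information-theoretic floor of log2(10) ≈ 3.32 bits per digit:

```python
import math

digits = 50e12                                # 50 trillion decimal digits
bcd_tb = digits * 4 / 8 / 1e12                # one 4-bit nibble per digit: 25.0 TB
min_tb = digits * math.log2(10) / 8 / 1e12    # entropy floor: ~20.8 TB
```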

------
Pigo
I got curious about the BOINC platform he mentioned. It reminds me of the
SETI@home screensaver I used to run many years ago. I guess I'd just forgotten
about it at some point, but I used to enjoy watching the data it was
processing. It's pretty cool that there's another platform out there that lets
you contribute to other programs.

~~~
prox
The BOINC wiki mentions SETI@home is part of this platform; BOINC probably
grew out of that initiative.

------
kbob
Did he calculate the value in decimal? The record would probably be a lot
easier to achieve if he'd gone with, say, base 14. Everybody does decimal,
even though there's no theoretical advantage, and no practical advantage since
nobody actually needs trillions of digits of pi.

------
mNovak
Curious now to know how much would this cost to compute in AWS?

~~~
thesandlord
Using the setup for the 31.4 trillion digits that Emma calculated with GCP
(previous world record):
[https://cloud.google.com/blog/products/compute/calculating-3...](https://cloud.google.com/blog/products/compute/calculating-31-4-trillion-digits-of-archimedes-constant-on-google-cloud)

GCP Pricing Page:
[https://cloud.google.com/products/calculator#id=2eca3cef-746...](https://cloud.google.com/products/calculator#id=2eca3cef-7460-4422-8dff-b267dae5a768)

So ~$200 - $250k

Could probably save a good bit with committed use discounts.

Spot / Preemptible instances would not work, in fact before Emma did this
calculation a lot of people thought this kind of thing wasn't possible on
public cloud because of perceived instabilities in a multi-tenant system.

~~~
mNovak
Does AWS still give $100k credit to YC startups? I think a couple should team
up..

------
layer8
I wonder how many bit flips are to be expected on HDDs or in RAM for that
amount of data.

------
bn7t
Is there a link somewhere where he published the result?

~~~
ficklepickle
I found it, but clicking it caused my computer to run out of available memory.

Edit: on a more serious note, a site[0] that tracks these records says:

> Downloading of digits is no longer available due to the massive bandwidth
> requirements. Your best bet is to directly contact one of the record holders
> and see if they still have a copy of the digits.

[0]
[http://www.numberworld.org/digits/Pi](http://www.numberworld.org/digits/Pi)

~~~
LeifCarrotson
The compressed representation requires 44 TB of disk space.

Assuming the author has a typical home internet connection with about a 5 Mbps
upload rate, the transfer would take over two years, longer than it took to
actually run the calculation in the first place!
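The back-of-the-envelope version (44 TB and 5 Mbps are the assumptions from
the comments above):

```python
size_bytes = 44e12       # 44 TB compressed output
upload_bps = 5e6         # assumed 5 Mbit/s residential upload
seconds = size_bytes * 8 / upload_bps
days = seconds / 86400   # ~815 days, i.e. well over two years
```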

~~~
philshem
Here’s more info about the race between calculating and downloading digits of
pi

[https://opendata.stackexchange.com/a/4024/1511](https://opendata.stackexchange.com/a/4024/1511)

------
saagarjha
Have the results been verified yet?

~~~
aaronbwebber
It appears they have; the y-cruncher program he used does verification.
Alexander Yee, who wrote y-cruncher (which has been used for the previous
four world-record calculations of digits of Pi), has accepted it and posted
it on his site, along with screenshots of the output of the program.

[http://www.numberworld.org/y-cruncher/](http://www.numberworld.org/y-cruncher/)

~~~
hamiltont
One cool note per my reading here - prior record holder Emma Haruka Iwao was
working for GCP as a developer advocate, and GCP makes her 31.4 trillion
digits available via cloud images. Neat :-)

For those of us with less insane needs, they also exposed an API to grab the
specific digits of interest - [https://pi.delivery/](https://pi.delivery/)

------
saltyfamiliar
What an irresponsible use of energy. At least mine Bitcoin...

~~~
stjo
Of all the “irresponsible” uses of energy, this is the one I endorse the most.

~~~
goldenkey
It's not irresponsible if the dude needed heat in his home. Few people
realize that all electronics are heaters that happen to do computation in the
process of converting electricity to heat. Sure, it's not convective if
there aren't fans to disperse the heat... but it's a beefed-up computer, so I
am sure he had good PC fans.

In any case, heat that is created without convection fans to spread it
quickly, will still diffuse into the home albeit at a slower pace. It has to
go somewhere..

~~~
saagarjha
> Few people realize that all electronics are heaters that happen to do
> computation in the process of converting electricity to heat.

Surely anyone who's used a computer with a fan in it knows this…

~~~
goldenkey
That's easy to say, but pretty much everyone I've talked to, barring a few
oddities, thought that dedicated space heaters were way different from an
Xbox getting hot, or a computer getting hot. Think of the cognitive dissonance
of switching to all-LED bulbs in a cold climate, because you're told it's more
efficient, while using electric space heaters at the same time... It happens
more often than you think, given the politics.

Of course, there are some heating methods that are cheaper than electricity -
like natural gas. So one would have to factor that in, along with power plant
emissions vs home emissions.

It's a complex entrenchment. It'd be interesting to produce a video of street
interviews just to see what the average person thinks about electronics
getting hot. It would make for a good case study of the intersection of
common sense, basic physics knowledge, and energy/eco politics.

