I had an idea for a file transfer system based on digits of pi. You'd give every...

SillyUsername · on June 16, 2023

I think a lot of people had similar ideas, I know I did.

The idea of using Pi is essentially the same as having a shared dictionary as used in both regular compression and in lossy as waveforms (in layman's, terms - technically it's a best match) both ends have the dictionary and you simply index it.

In fact whole network protocols use this concept too such as protocolbuffers which are part of grpc.

The difference here being the dictionary is infinite and until we get quantum computers on the desktop indexing into pi will always be slow. Some of the address space may not be feasible too, e.g. your file may require a billion bit address/index.

It may still be feasible to do partial matches for a file, 50% one index, 50% another for performance improvement.

I guess the trade off is a balance between performance and number of indexes Vs the original file length, and where in Pi the address space becomes unfeasible.

adamgordonbell · on June 17, 2023

It's interesting how zstd can use a shared dictionary to compress small amounts of data.

It's a more practical ( then my idea ) shared dictionary.

wahern · on June 17, 2023

zlib also supports dictionaries, as does the deflate algorithm. The problem is that zlib doesn't provide any tooling to generate an optimal dictionary from a large sample, whereas zstd does provide such a utility.

Dylan16807 · on June 16, 2023

> I discovered that often the offset into pi is much larger than the data you are sending.

Shouldn't the offset be roughly the same size? If you look for a 1000 digit sequence, you need to try about 10^1000 starting points, which makes your offset about 1000 digits.

adamgordonbell · on June 17, 2023

Oh, is that true? Either way, certainly not an efficient compression scheme.

ithinkso · on June 16, 2023

> Also, it turns out that this area was already well understood. There are no free lunches with entropy.

If I ever go mad and I breach the 'crankpot line' this will be it. There is something in entropy that I just can't fully grasp even though I understand (or I can convince myself that I do) Shannon, Kolmogorov etc.

adamgordonbell · on June 17, 2023

I hear the compression tends to attract lots of 'out of the box' thinkers. So you'll have company.

fsckboy · on June 20, 2023

technically "outside the box," meaning "their creativity is expansive, not within the narrow confines of a box".

"out of the box" means works right away, as soon as you take it out of the box, no assembly required.