Hacker News new | past | comments | ask | show | jobs | submit login
New Proof Settles How to Approximate Numbers Like Pi (quantamagazine.org)
56 points by theafh 5 days ago | hide | past | web | favorite | 40 comments
 help




Ugh. In the title:

"Ancient" means 1941. "Simplify" means approximate. "Numbers like pi" means irrational numbers. "How to" means "how accurately you can".

Background to the actual conjecture: if you take any irrational number and ask how well you can approximate it by rationals p/q, then it turns out (the proof isn't trivial but e.g. plenty of undergraduate mathematicians would be able to follow it without too much trouble) you can always find infinitely many approximations where the error is no bigger than of order 1/q^2.

The actual conjecture: suppose you want it to be true that for _almost all_ irrational numbers, you can find infinitely many p/q with an approximation error no more than f(q). Then (so conjectured Duffin and Schaeffer, and so apparently Maynard and Koukoulopoulos have proved):

you can do it if and only if the sum (over all positive integers q) of f(q) phi(q) is infinite, where phi is the "Euler totient function": phi(q) = number of things <= q and coprime to q.

Very crudely, how fast does this allow f to decrease with q? Well, the sum of 1/q is infinite but "only just"[1], so we want f(q) phi(q) not to decrease much faster than 1/q. phi(q) varies fairly wildly but crudely is "not usually much smaller than q", so we want f(q) q not to decrease much faster than 1/q; that is, we want f(q) not to decrease much faster than 1/q^2. In other words, the "classical" result I started with is kinda-sorta about the best you can do. (I think that at this level of handwaviness this has been well known for a while; the Duffin-Schaeffer conjecture is about making it extra-precise.)

[1] The sum of 1/n is infinite but that of 1/n^(1+h) is finite if h>0. The sum of 1/(n log n) is infinite but that of 1/(n (log n)^(1+h)) is finite if h>0. The sum of 1/(n log n log log n) is infinite but that of 1/(n log n (log log n)^(1+h)) is finite if h>0. Etc. So in some sense the boundary between finite and infinite is very close to the (kinda-nonsensical) 1 / (n log n log log n log log log n ...).


To be clear, in the statement of the conjecture, we do not need f(q) to be a polynomial or “nice” function of q like 1/q or 1/q^2 — it may not even be an increasing function of q (so statements like “we want f(q) not to decrease much faster than 1/q^2” can be a bit misleading) — one can have arbitrary functions like, say, specifying that f(12)=1/4, f(20)=1/8, and for other q, f(q) = {0 if q is a multiple of 7 (so basically such denominators are never allowed), 1/3 if q does not contain the digit 9, 1/(q log q) otherwise}, etc. I imagine the fact that the proof has to work with arbitrary functions was part of the difficulty.

Yes, f can be any function at all here, and indeed this allows for the sort of more complicated things you mention, and indeed this is where a lot of the difficulty lies. Duffin and Schaeffer, for whom the conjecture is named, had already proved it subject to a restriction on f that kinda-sorta requires it not to vary too wildly.

(One other bit of precision that I glossed over: the conjecture is actually about approximations p/q for which p and q have no common factor and that genuinely makes a difference here.)


We've switched to the article title now. It was submitted as "New Proof Settles Ancient Question of How to Simplify Numbers Like Pi". Perhaps the publication got the message and changed it?

I feel this way about every quanta article. It's like they just take a bunch of mathy words that have meaning on their own and assemble them in ways that don't actually say anything about anything

I've felt that way about other Quanta articles, but this one seems substantially worse than usual to me.

Thank you. That title was completely nonsensical to me and it's bizarre that it would be worded that way.

I understand that the article is trying to simplify the topic for a general audience, but there are some things that are just plain counterproductive (even if the article clarifies what they mean later):

> New Proof Settles Ancient Question of How to Simplify Numbers Like Pi

The title, obviously. This should say "approximate", since we're not simplifying pi.

> Under what circumstances is it possible to represent irrational numbers that go on forever — like pi — with simple fractions, like 22/7?

Never, unless you're approximating.

> He proved that for every irrational number, there exist infinitely many fractions that approximate the number evermore closely. Specifically, the error of each fraction is no more than 1 divided by the square of the denominator.

The first part is obviously true and should have been merged with the second part.


To the article's credit, I was able to back its simplifications into a reasonable understanding of the underlying problem. I think the author does genuinely understand the underlying mathematics of at least the problem statement, because when they don't the attempt to simplify for a high-school-level audience usually renders the results total gibberish, as we've seen many times on HN.

Mathematically speaking, Pi can’t be “simplified”, but I’m personally curious what ratio is Pi to the highest precision necessary to calculate anything tangible (the circumference of the universe to the Plank length). Practically speaking one could say that is the simplification of Pi.

https://www.jpl.nasa.gov/edu/news/2016/3/16/how-many-decimal...

Let's go to the largest size there is: the visible universe. The radius of the universe is about 46 billion light years. Now let me ask a different question: How many digits of pi would we need to calculate the circumference of a circle with a radius of 46 billion light years to an accuracy equal to the diameter of a hydrogen atom (the simplest atom)? The answer is that you would need 39 or 40 decimal places. If you think about how fantastically vast the universe is — truly far beyond what we can conceive, and certainly far, far, far beyond what you can see with your eyes even on the darkest, most beautiful, star-filled night — and think about how incredibly tiny a single atom is, you can see that we would not need to use many digits of pi to cover the entire range.


You'd need something along the lines of 60ish digits: https://www.wolframalpha.com/input/?i=diameter+of+the+observ...

Yes, but what’s the ratio that’s precise within 10^-60

Here's one: 3141592653589793238462643383279502884197169399375105820974945/10^60.

In all fairness, though, when you read on, the word “approximate” is seen more frequently, and the word “represent” less so.

Here's a demonstration I made recently, of best rational approximations: https://shreevatsa.net/fraction/best — much of this Quanta magazine article sets up background on such approximations, so playing with the numbers on this web page may help build some intuition about the problem.

In particular, the Duffin-Schaeffer conjecture that has just been proved involves asking about the set of irrational numbers that are approximated to within an error of f(q) for each denominator q, while the webpage above simply finds the best possible approximations for a single fixed irrational number (well, rational…) from the (semi-)convergents of its continued fraction — so any denominator is allowed and the error (for convergents) turns out to be the Dirichlet bound 1/q^2.

(For best results, choose a really large bound on the denominator and only ask for the “bestest” rational approximations. The source code is just a single HTML file and a bunch of plain JavaScript files to do the computation, so if you can improve it, please do so and let me know: https://gitlab.com/svat/web-fraction/tree/master/website .)


In your calculator, what does "bestest" mean?

Best rational approximation of the second kind (as defined in the text higher up on the page). Partly a joke, as there's already a natural idea of “best rational approximations” (the first kind), and these (the second kind) are even better than that.

I'd like to see rational numbers much more used in computers instead of floating point. Not just for improved representability of widespread numbers such as 1/3. Also what the article hints towards: accuracy of pi in half-precision 16-bit floating point is around 1/(2^11) ~ 0.0005, while 355/113 is more precise 1/(113^2) ~ 0.00008 and the fraction is still possible to be stored in 16-bit.

$YOUR_FAVORITE_LANGUAGE almost certainly has a rational library for it. Give it a try and use it for some serious mathematical purpose; nothing will show you why we don't do that all the time like trying it.

I invite you to ignore any syntactic issues that may arise, like perhaps having to use x.Add(y) instead of x + y. If those were the fundamental problems, the solution would be trivial. That's not the fundamental problem with rational numbers being the pervasive default; the problem is that for your increased precision, you pay for algorithms that you are used to being O(1) in space and time, albeit with potentially imprecise results, suddenly taking on seemingly-random values for both, with disturbing amounts of things like "O((log2 denominator + log2 numerator)^2)" showing up and then stacking on top of each other (^4, ^6, ^8, etc.).


And yet if you announced to a general audience of programmers that you had a marvellous new programming language that guarantees performance but doesn't guarantee correctness, you'd be laughed out of the room.

Of course, a general audience of programmers prefers JavaScript and Python, because we want our languages to be both slow and incorrect.

More seriously, general programming always prefers performance over correctness, because correctnes is expensive and very few programs are rocket surgery where perfect correctness matters. We have languages decdicated to correctnes, like Ada and Coq. They are very niche because they are slow and they require a lot of work the programmer doesn't want to do to hold up their end of the bargain.


What operation takes O((log2 denominator + log2 numerator)^2) for fixed-length fractions? Basic arithmetic operations are constant time. Rounding is too mathematically well-understood and should have efficient implementation.

All the rest you say applies to floating point alike, plus with fractions is much easier to test whether the result is accurate or not.


Your original post used the term "rational" when I read it, which is usually used to describe a library in which numbers are represented as arbitrary-length integers, expanded as needed to maintain accuracy. Those tend to explode pretty quickly in real usage as the arbitrary-length integers grow quickly.

In this post you say "fixed-length fractions", but I don't know exactly what that is: https://www.google.com/search?hl=en&q=fixed%2Dlength%20fract... Some of my natural expansions I can see how they'd have algorithmic complexity equal to floats, but they're going to have similar (not identical, but similar) pathological cases.

Based on what you sketch in your post, if you mean, two fields of fixed-length integers, that's got all kinds of problems that will be a non-starter; redundant representations means the equality operator becomes more complicated than a bitwise comparison involving factoring (IEEE floating point has some issues here but they're at least designed to be amenable to hardware, this factoring thing is going to be a problem at base silicon levels), plus redundant representations means you're wasting valuable bit patterns, you can't represent anything bigger than a MAX_NUM / 1 so you're going to pay a lot for larger numbers, etc., and you still have the problem that your operations won't be closed and you'll be approximating anyhow; you're going to end up with most of floating point's problems anyhow.

I know there are alternatives to floating point, such as "universal coding": https://insidehpc.com/2018/05/universal-coding-reals-alterna... They would also have similar complexity to floats, but move the pathological behaviors around. (The question in terms of trying to replace floats isn't whether you can get rid of the pathological behaviors; you can't, just by the nature of the beast. But nothing says you can't hide the pathological behaviors under the carpet, under the couch, in the corner of the room, instead of having it pretty close to sitting in the middle of the commonly-traveled path.)


FWIW, Perl 6 has rational numbers with limited precision built in: https://docs.perl6.org/type/Rat . As soon as the denominator exceeds 64bit, then the Rat will be changed to a Num (aka float). If that's not desirable, one can also use FatRat: https://docs.perl6.org/type/FatRat . In that case, both the numerator as well as the denominator are BigInts.

> 355/113 is more precise 1/(113^2) ~ 0.00008 and the fraction is still possible to be stored in 16-bit

log2(355)+log2(113)≈15.3, so it seems like you could squeeze this into 16 bits, but what about a sign bit? And since 355>2^8, I think you also need to be reserving some bits to say how the remaining bits are partitioned between the numerator and denominator (see [1] for an interesting writeup of some experiments with "floating bar").

A fundamental fact of any fixed width number format is that you can only represent 2^n values with n bits, so designing a number representation is really all about deciding how you will distribute those values. Some properties you might want:

1. Enough dynamic range for your application 2. A smooth distribution of values within that range (clustering some values too close together implies gaps somewhere else). 3. Efficient arithmetic operations 4. Exact results in some important cases

Floating point optimizes for 1-3, somewhat at the expense of 4 although it is exact for (small enough) integers and fractions with a (small enough) power of 2 in the denominator.

Fixed-width rationals are less good at 1-3, but are better for 4. The distribution of rationals on the number line is pretty funky, and there's even redundancy because 2/4 and 1/2 represent the same value.

I don't think optimizing for exact results in more special cases is really the right move in most applications, so I don't think we should be surprised by floating point's dominance. On the other hand, I have worked on a project where we decided exact results in special cases really was important [2].

[1] https://iquilezles.org/www/articles/floatingbar/floatingbar.... [2] https://engineering.desmos.com/articles/intuitive-calculator...


Rational numbers are an absolute pain to work with once you perform more than a couple of additions with denominators that are relatively prime.

I like Go’s strategy. Consts have no type and are the only thing with implicit conversion. math.Pi is defined explicitly with a whole lot of digits and the type system can somehow do inline math on all those digits with full precision before downcasting the final result to an int, float32, float64, etc.

I prefer Common Lisp's way, where fractions are first class citizens. Pity it is not supported directly in hardware.

I'm not convinced supporting it in hardware would make it any faster, compared to doing what we do now with fast integer arithmetic and reasonable software algorithms. Hardware only makes things faster if there's some good parallelism you can't express in software, like doing integer addition using modern hardware versus doing it the long way bit-by-bit.

Here's a whole post on the issue by an actual hardware designer:

https://yosefk.com/blog/its-done-in-hardware-so-its-cheap.ht...

Quote:

To see the limitations of hardware support, let's first look at what hardware can do to speed things up. Roughly, you can really only do two things:

1. Specialization - save dispatching costs in speed and energy.

2. Parallelization - save time, but not energy, by throwing more hardware at the job.


I'm not sure how useful that would be. Pi might be represented more precisely, but the result of every operation you do with it is still going to have the issue of varying precision (except for powers of your base/ratio for the representation).

It's not exactly what you're talking about but there is some use for "block" floating point, where you manually track the decimal point through your arithmetic via shifts, which allows you to explicitly control the precision you need for a given portion of your arithmetic. The code is disgusting however.

I wonder though if there's a numeric representation that could make trigonometric functions into a shift/mask operation, kind of like multiplying by powers of the base for integers/floats. That would be real neat.


It doesn't work because the sizes of the numerator and denominator generally increase at an exponential rate, even if using variable-width integer containers for them, you end up squaring the running time of an algorithm.

Hang on, how do you store that fraction in 16 bits?

The naive way, 8 bits each for numerator and denominator, doesn’t work as 355 is too big.


Naive: 9 + 7 bits (admittedly, a bit of stretch)

Non-naive: https://en.wikipedia.org/wiki/Farey_sequence

If I'm correct with Fn, to represent all fractions with denominator 113 or less it requires 12 bits, this leaves 4 bits for the whole number part.


That Farey sequence is cool, but I’m a bit skeptical that it’s really useful as a general-purpose numeric format.

It’s dense around fractions with largish numerators and denominators, but sparser around small numerators, like 0, 1/2 and 1. That seems bad for many applications. It may be able to approximate pi quite well, but how about small angles close to 0?

If you know the range of numbers you’re going to store, just spreading the representations evenly seems like a natural choice -- normal fixed-point reals.

Floating-point seems like a pretty reasonable extension of that, when you don’t know in advance what the scale will be.


Sure there are tradeoffs. Everyone in the thread implies I want to completely obsolete FP or whatnot :(

Have you thought through the small angles example? Because sin(x) is equal to x for small angles. Then, as x raises further, we are entering densest area of farey sequence, so the precision is fine there too.


Sorry to be overly harsh! I think I was initially excited by the suggestion, then disappointed when I thought it through and found it less compelling that it sounded. :)

Re small angles, I don’t quite follow; if I have a couple of small angles, say 0.01 and 0.015, isn’t it useful to have good resolution around 0 so I can represent them reasonably accurately? The Farey sequence seems to have its biggest gap right at that point, between 0 and 1/N.


I don't follow either. When sin(0.01rad) = 0,0099998333... and you do need the 0,00000026 precision you just choose representation with enough bits, same as with floating point.

I hate how the article waffles between "almost none", "virtually none" and "negligible".

There is a word with precise mathematical meaning that matches naive non-mathematical intuition: "almost all" / "almost none" (aka full measure set).

This concept is easy to explain to lay people: count up lengths of intervals, do some limiting process trickery to deal with the fact that the set cannot be written as a finite union of intervals. Alternatively, choose a real number uniformly at random from some interval, probability one will hit the "almost all" set and probability zero will hit one of the others.

"negligible set" is often used to describe sets that have zero measure and small Baire category. "Virtually" also has different mathematical connotations.

Upside is that it was well explained that the contents of the new result is "don't worry about overlaps between different denominators".


Wow. Maynard FTW yet again. A Fields medal down the track would be no surprise at this rate. He has several more years to qualify.



Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: