Arb: Efficient Arbitrary-Precision Midpoint-Radius Interval Arithmetic (2016) (arxiv.org)
78 points by helltone 26 days ago | 42 comments

Author here. I will take the opportunity to advertise my blog at https://fredrikj.net/blog/ where I post occasional development updates about Arb and other projects.

Some improvements not covered in the 2016 preprint include much faster code for arithmetic and matrix multiplication (https://arxiv.org/abs/1901.04289, https://fredrikj.net/blog/2018/07/high-precision-linear-alge...), and code for rigorous arbitrary-precision numerical integration (https://arxiv.org/abs/1802.07942, http://fredrikj.net/blog/2017/11/new-rigorous-numerical-inte...).

I dream of a programming language where I would be able to specify a computation and maybe error bounds on input/output and get efficient code respecting the bounds automatically. Arb takes us close to that, but I still have to handle recomputing with greater precision by hand, and things get complicated with multiple inputs. What are your thoughts on making arbitrary precision/rigorous error bounds more ergonomic to use for programmers?

That's a great question. Of course, you can express the computation as a function or a symbolic expression and evaluate it using a function that recomputes with greater precision automatically (or automatically compiles efficient code with bounds). Integrating this kind of functionality into the core semantics of a programming language is a very interesting problem.

Any chance you could offer an RSS or Atom feed?

Good point. I added an RSS feed link to https://fredrikj.net/blog/. I just hacked together some quick code to generate it, so let me know if it doesn't work.

You should also link your rss feed in the meta tags of the blog page:

    <link rel="alternate" type="application/rss+xml"
    title="RSS" href="http://www.example.net/feed.rss" />
Also in the rss, this line is pointing to the wrong URL.

   <atom:link href="https://fredrikj.net/rss.xml" rel="self" type="application/rss+xml"/>

Should be fixed now. Thanks!

Thanks! Just tried to add it to feedly but unfortunately it still gives me an error, reporting that it can't find feed. Will have a look at the source to see if I can help figuring it out, once I am back at my computer.

For anyone wondering about the difference between endpoint-based interval arithmetic ([a,b]) and midpoint-radius ([m +/- r]) arithmetic (ball arithmetic): they are often interchangeable, but they have different tradeoffs. Roughly speaking, standard interval arithmetic is better for subdivision of space, while ball arithmetic is better for representing individual numbers.
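For concreteness, here is a tiny pure-Python sketch of the two representations for addition (names mine, not Arb's API; the rounding bound on the midpoint is a crude stand-in for what a real library does):

```python
from math import inf, nextafter

EPS = 2.0 ** -52  # unit roundoff for IEEE doubles

def iv_add(a, b):
    # endpoint intervals [lo, hi]: nudge lo down and hi up to absorb rounding
    return (nextafter(a[0] + b[0], -inf), nextafter(a[1] + b[1], inf))

def ball_add(x, y):
    # balls (mid, rad): add midpoints, add radii plus a bound on the
    # rounding error of the midpoint addition
    m = x[0] + y[0]
    r = x[1] + y[1] + abs(m) * EPS
    return (m, r)
```

The payoff of the ball form at high precision is that the radius can stay a low-precision number while only the midpoint carries the full precision.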

A good technical introduction to ball arithmetic is this paper by Joris van der Hoeven: https://www.texmacs.org/joris/ball/ball.html

Great write up!

I’m amused by your subdivision of space note, given Figure 1 :). There are clearly some problems and transforms that are better on N-spheres (N-balls?) and others on rectangles / N-cubes. Do you have a deeper intuition for which? (The x^2 example was simple and cute).

Years ago [1], we applied interval arithmetic to tracing groups of rays (and compared to geometric bounding via planes/frustums). I’d be curious to think through an equivalent with your midpoint ball arithmetic, but I feel like it would “need” to be parameterized as a ball of origins (easy) and then something else for the cone of directions: maybe theta/phi clusters or a “cluster of points on the unit sphere” (but converting back and forth is more expensive than the gains).

[1] http://graphics.stanford.edu/~boulos/papers/ia.pdf

Joris's paper discusses high-dimensional balls with various metrics (where the max-norm gives equality between the "ball" and "interval" concepts), but I skimmed your paper and it looks like you're only doing 1d. It isn't immediately obvious to me how one would perform computations on, say, 3-dimensional euclidean balls using your library. Is this something you've thought through (or was my skim too shallow)?

The max-norm in 3 dimensions is not euclidean. It is, however, a metric (which is why they call them "balls"). Note that the following formula for vectors x and y over R is a valid metric (it obeys the triangle inequality) for _all_ n >= 1 in _all_ dimensions, with n=2 being the usual euclidean metric:

    d(x, y) = (sum_i |x_i - y_i|^n)^(1/n)
The max-norm is the limit as n -> infinity, and also gives a proper metric space. (The collection of these spaces is called the Lp spaces.)

Of course in one dimension, all of these Lp metrics are identical.
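As a quick illustration in plain Python (function names mine), the L^p distances approach the max-norm distance as p grows:

```python
def lp_dist(x, y, p):
    # L^p distance between vectors; a valid metric for p >= 1
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

def max_dist(x, y):
    # the limit p -> infinity: the max-norm (Chebyshev) distance
    return max(abs(a - b) for a, b in zip(x, y))

# p=2 gives the euclidean distance sqrt(1+4+9) for (1,2,3);
# by p=50 the L^p distance is already very close to max_dist's 3
```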


The library doesn't support this directly. What you can do for normed vector spaces over R (or C) is to use the number types of Arb for the coordinate vectors, implementing your own norm and metric functions for the vector space on top of this. It's fairly easy to do since Arb handles all rounding errors in the underlying real arithmetic.

What's the history/provenance of ball arithmetic? Was this a new insight out of INRIA or a well known approach not previously implemented? All links seem to circle back to you and Joris van der Hoeven, and arb [in the 2018-present time line].

I think the name "ball arithmetic" is Joris's idea. I don't have a good reference at hand, but I believe the idea of using a centered form of intervals is as old as interval arithmetic itself. Ball arithmetic has been used quite a bit for complex interval arithmetic, where it sometimes offers better enclosures than rectangular intervals, and another nice point of ball arithmetic is that it generalizes to more general normed spaces. However, before Joris's and my own work, I think very few people realized how useful ball arithmetic is specifically for arbitrary-precision arithmetic.

Thank you.

So that sent me chasing interval arithmetic's history -- https://en.wikipedia.org/wiki/Interval_arithmetic -- and that page is also silent as to where it came from. But academic.ru has the (surprising to me) answer that none other than Archimedes (not a surprise) used it:

"Interval arithmetic is not a completely new phenomenon in mathematics; it has appeared several times under different names in the course of history.

For example Archimedes calculated lower and upper bounds 223/71 < π < 22/7 in the 3rd century BC. Actual calculations with intervals have neither been as popular as other numerical techniques, nor been completely forgotten."


Kinda reminds me of the difference between arithmetic coding and asymmetric numeral systems

Arithmetic coding encodes into a range, ANS into a single natural number. BTW, JPEG XL, which uses ANS, just got standardized.

Looks cool, but how does it get around the table-maker's dilemma? https://en.wikipedia.org/wiki/Rounding#Table-maker's_dilemma

"No general way exists to predict how many extra digits will have to be carried to compute a transcendental expression and round it correctly to some preassigned number of digits. Even the fact (if true) that a finite number of extra digits will ultimately suffice may be a deep theorem."

My guess is they aim for convergence and it works well in practice but it's still unknown if that's guaranteed to be correct... is that right?

Author here. The table-maker's dilemma is a problem if you insist on correct rounding. However, interval arithmetic does not require correct rounding. It merely requires rigorous error bounds. It turns out that a lot of the time when people ask how to obtain correctly rounded results, they are actually asking the wrong question, and what they really need is just error bounds.

This is why Arb has a lot more functions than MPFR: it's extremely difficult to implement correctly rounded functions as required by MPFR; Arb's contract (returning an error bound) is far easier to satisfy and works just as well 99% of the time.

From a software development point of view, the problem with correct rounding is that it doesn't compose: given correctly rounded functions f and g and correctly rounded input x, f(g(x)) is not (in general) correctly rounded. In interval arithmetic, the corresponding composition law (the inclusion property) does hold; this means that you can compose arbitrarily complicated functions without thinking.
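A toy sketch of that composition law in plain Python (names mine; the one-ulp `nextafter` nudge stands in for proper outward rounding): each function returns a slightly-too-wide interval, and composing them still encloses the true value even though no step is correctly rounded.

```python
import math

def iv_exp(x):
    # exp is increasing, so endpoints map to endpoints; widen one ulp outward
    return (math.nextafter(math.exp(x[0]), -math.inf),
            math.nextafter(math.exp(x[1]), math.inf))

def iv_sqrt(x):
    # sqrt is increasing too; same outward widening
    return (math.nextafter(math.sqrt(x[0]), -math.inf),
            math.nextafter(math.sqrt(x[1]), math.inf))

# the composed interval encloses sqrt(e) without any reasoning about
# how the rounding errors of exp and sqrt interact
lo, hi = iv_sqrt(iv_exp((1.0, 1.0)))
```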

Thank you for the reply! This is interesting, I didn't realize this. Unfortunately I'm still not sure I fully grasp it though: if I say 1.001 is "correctly rounded", that is equivalent to saying it has error bounds of +/- 0.0005. I could say either of these and they would mean the same thing. So what does it mean when you say you can establish rigorous error bounds but you can't guarantee correct rounding? If you can decrease the error bounds (which it seems you can, given this is arbitrary-precision) then isn't that equivalent to being able to round with more digits (precision)?

This is really cool, thank you. I implemented John Gustafson's "unum" intervals (as well as his "posits", that work of mine is more well known); and boy were they a pain in the butt. I'm not really working on this anymore, but it's neat to see a different approach (which I suspect is simpler in many ways).

IA always returns a correct, but possibly useless, result. There are multiple refinements of IA, like affine arithmetic (AA), that can produce correct and useful results for a wider range of problems. Also there is always a quick-and-dirty way to increase the precision when the result is deemed useless; this also happens to be a general approach to implementing arbitrary-precision math functions.

It depends what kinds of results you expect.

The table-makers dilemma only applies if you want a specific number of decimal places, correctly rounded. The midpoint-radius representation gives you a value and an upper bound for the distance to the true value.

You may find that the radius is too big to get the number of correctly rounded decimal places you want, but that doesn't make the result incorrect, just vague.

> You may find that the radius is too big to get the number of correctly rounded decimal places you want, but that doesn't make the result incorrect, just vague.

That's true, but if you can do this, then you can also increase the precision until the radius shrinks to something sharp, right? Which means you should be able to round correctly?

The value can be arbitrarily close to the round-off point for the chosen number of digits, so you'd need to make the radius arbitrarily small, which can take an unbounded amount of time.

Of course if you're not a table maker trying to make a fixed-width table with correctly rounded values, you can just increase your precision until the error is below 10⁻¹⁰⁰ instead of worrying about the rounding error of a 10-digit representation.

Interesting... I think we might be interpreting the statement differently? It seems like you're saying the statement claims the difficulty is in deciding whether f(x) = 0 or f(x) = 1 when we see f(x) ≈ 0.5? If that is the only problem then the obvious solution would be "just use f(x) = 0.5 instead", right? (i.e. allow for an extra digit.)

Somehow that's not how I've been interpreting the statement until now. Rather, the way I read it, we don't know how to predict the amount of work it takes to guarantee correct rounding to any number of digits (even if we allow for extra digits beyond those we really want)—not merely to a fixed number of digits. Or to put it another way, it sounded as if for pathological inputs even guaranteeing >= N digits of accuracy isn't possible with a predictable amount of effort.

What led me to believe this is a couple of different statements from Kahan [1]:

A desideratum generally acknowledged but not always achieved is: "In error by rather less than an ulp if by more than half."

Reputable math libraries compute elementary transcendental functions mostly within slightly more than half an ulp and almost always well within one ulp.

Both of these suggested to me that errors can occasionally exceed half an ulp by a non-negligible amount. So, if we were rounding to the nearest integer, this would imply occasionally not only rounding arbitrary overshoots like 0.5 + 1E-100 to zero (whose difficulty you're pointing out), but in fact occasionally even producing 0 when the output is 0.7. That's far worse. Is this interpretation wrong? If it is, then why should errors sometimes not be "well within one ulp"?


> If that is the only problem then the obvious solution would be "just use f(x) = 0.5 instead", right? (i.e. allow for an extra digit.)

Yes, but in many cases that's not desirable. E.g. if you want to compute things in hardware where adding an extra digit means adding more transistors to your chip. (And then why not always use that extra digit instead of rounding to one digit less?)

> mostly within slightly more than half an ulp

The reason this doesn't say "at most half an ulp" but allows for "slightly more" is the table-makers dilemma. (If you keep reading after the part you quoted, you'll see that he's talking about precisely the case where a value is close to x.5)

> almost always well within one ulp

This is a different case, concerning functions that are expensive to compute to a high degree of precision (e.g. due to high sensitivity to rounding of intermediate results) but "reputable" libraries will still try really hard to give good results.

In terms of the midpoint-radius representation, the first is about not knowing how much smaller the radius needs to be to achieve correct rounding to a predetermined number of digits, whereas the second is about needing to wait for a long (but usually predictable) time before the radius drops below a predetermined target.

> This is a different case, concerning functions that are expensive to compute to a high degree of precision (e.g. due to high sensitivity to rounding of intermediate results) but "reputable" libraries will still try really hard to give good results.

This is what I was referring to. See where it says "Why can't y^w be rounded within half an ulp like SQRT? Because nobody knows how much computation it would cost."... that's why I was confused how an arbitrary-precision library would handle something like this. User asks to raise some long decimal to another long decimal power and it seems to me you might sometimes either have to return a very wrong result or spend so much time computing that it won't be worth it.

> User asks to raise some long decimal to another long decimal power and it seems to me you might sometimes either have to return a very wrong result or spend so much time computing that it won't be worth it.

y^w isn't that expensive. Usually it gets expressed roughly as exp(log(y^w)) = exp(w * log(y)), where exp and log are implemented using polynomial series. So if you want more precision, you just need to evaluate a few more polynomial terms.
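A toy version of "just evaluate a few more terms" (exact rationals, nothing like a real libm kernel):

```python
from fractions import Fraction

def exp_series(x, terms):
    # truncated Taylor series for exp(x); more terms -> smaller truncation error
    total = term = Fraction(1)
    for n in range(1, terms):
        term *= Fraction(x) / n
        total += term
    return total
```

With 25 terms the truncation error at x = 1 is about 1/25!, far below double precision; the table-maker's problem is that no fixed number of extra terms guarantees the final *rounded* result is correct.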

If just a few more terms would suffice then why does he say "nobody knows how much computation it would cost"? Are you saying he's just referring to the half-ulp + epsilon case here too? If so then I've definitely mixed these up!

Yeah, the question he asks is "Why can't y^w be rounded within half an ulp like SQRT?" not "Why can't y^w be rounded within slightly more than half an ulp like SQRT?" It's really just that tiny epsilon difference that turns "just a few more terms" into "nobody knows how much it'd cost".

The reason SQRT can actually be rounded within half an ulp in a predictable amount of time is that SQRT(x) >= y + 0.5 precisely when x >= (y + 0.5)^2 = y^2 + y + 0.25, which means you can determine whether to round up or not with a simple comparison.
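In code (plain Python, integer square roots for simplicity), that comparison looks like:

```python
from math import isqrt

def sqrt_rounded(x):
    # round sqrt(x) to the nearest integer using only exact integer
    # comparisons: sqrt(x) >= y + 1/2  iff  4*x >= (2*y + 1)**2
    y = isqrt(x)                     # floor of the square root
    return y + 1 if 4 * x >= (2 * y + 1) ** 2 else y
```

The tie case never arises: 4*x is even while (2*y + 1)**2 is odd, so the comparison always decides cleanly.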

Gotcha. Thanks for clarifying this!

> My guess is they aim for convergence and it works well in practice but it's still unknown if that's guaranteed to be correct.

I guess you refer only to the computation of transcendentals sin, exp, etc. (because even in native floating-point formats, most other individual operations are guaranteed to be rounded correctly).

I haven't checked their docs carefully enough, but if it is like MPFR, the way they do it is by trading a time guarantee for a precision guarantee: they can't predict how long each operation will take (let alone guarantee that it will be fast), but in exchange they can guarantee that they meet your required precision.

Techniques like this are used to track chaotic dynamics. Iterations of

   k * x * (1-x)
can have arbitrarily long periodic orbits, but not if you use conventional computer math (e.g. with 32-bit numbers you could at most have a period of 4 billion).

It is possible to do grid scans of parameters and control the interval such that you know you used enough digits to trust the result.
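A rough sketch of one such interval step in plain Python (names mine; the one-ulp `nextafter` nudge is a heuristic stand-in for proper outward rounding, which would bound every intermediate operation):

```python
from math import inf, nextafter

def iv_logistic(k, x):
    # one interval step of f(x) = k*x*(1-x): evaluate f at both endpoints,
    # plus the interior maximum at x = 1/2 when it lies inside the interval
    cands = [k * x[0] * (1 - x[0]), k * x[1] * (1 - x[1])]
    if x[0] <= 0.5 <= x[1]:
        cands.append(k * 0.25)
    return (nextafter(min(cands), -inf), nextafter(max(cands), inf))

# iterate and watch the radius: in the chaotic regime the enclosure
# keeps growing, which tells you when to restart at higher precision
x = (0.3, 0.3)
for _ in range(10):
    x = iv_logistic(4.0, x)
```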

Relatedly, there's a relatively new Julia package called MultiFloats.jl [1] for doing arithmetic with numbers of the form a_1 + a_2 + ... + a_n, where a_{i+1} is on the order of the roundoff of a_i. So it's fixed precision, but the cool thing is you can easily generate code for arbitrary `n`. The package itself is just a few hundred lines of code. I've tried it on an NVIDIA GPU with fast double precision, and it seems to "just work" for basic linear algebra.

[1] https://github.com/dzhang314/MultiFloats.jl

This has various names: double-double, quad-double (etc.) arithmetic; floating-point expansions. It's definitely the best way to do arithmetic at precision up to a couple of hundred digits on modern hardware, though traditional arbitrary-precision arithmetic wins at higher precision (and in any case becomes necessary, due to the exponent range).
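The building block of these expansions is an error-free transformation; a minimal Python sketch (names mine):

```python
def two_sum(a, b):
    # Knuth's TwoSum: returns (s, err) with s = fl(a + b) and
    # a + b = s + err exactly
    s = a + b
    bv = s - a
    err = (a - (s - bv)) + (b - bv)
    return s, err

def dd_add(x, y):
    # simplified double-double addition of (hi, lo) pairs
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return two_sum(s, e)

# 1e16 + 1 is not representable in one double, but the pair keeps both parts
```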

It's unfortunate that this isn't something that was standardized and more widely available a long time ago. By contrast, IEEE binary128 and binary256 are practically useless due to lack of hardware support.

Yeah, I think only Power has a hardware 128-bit float, but it's not the same as IEEE binary128, because its exponent range is equal to double's exponent range.

I agree this package is not new conceptually, but it has convenient metaprogramming to generate code for arithmetic for any [hardware ieee number type] x N.

It also lists the complexity as a function of N; most operations scale as N^3, so yes, for larger N it might fail to be fast.

Arblib.jl [1] is a Julia wrapper for Arb. Most of the basic wrapping is auto-generated from Arb's documentation. There is still work left to make it interface well with the Julia ecosystem, and to document it, but it's on its way.


Whenever I encounter arbitrary precision libraries I want to try them out on extended Dudeney's digital century puzzle.

This is from TAOCP 4A part 1:

> (Dudeney's Digital Century puzzle.) There are many curious ways to obtain the number 100 by inserting arithmetical operators and possibly also parentheses into the sequence 123456789. For example, 100 = 1 + 2 x 3 + 4 x 5 - 6 + 7 + 8 x 9 = (1 + 2 - 3 - 4) x (5 - 6 - 7 - 8 - 9) = ((1/((2 + 3)/4 - 5 + 6)) x 7 + 8) x 9.

Of course, instead of 100 we can put something else, add exponentiation, decimal point and an operator that reinterprets the number as if written in base X (12345_6789 = 2124962081444303 in base 10).

I also like how Knuth adds the number of possible trees in a grammar that generates these expressions without redundant parentheses :D
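A brute-force sketch of the parenthesis-free variant in Python (my own code, not Knuth's algorithm; exact rationals so division comparisons are exact, and about 5^8 expressions, so it takes a little while):

```python
import re
from fractions import Fraction
from itertools import product

def digital_centuries(target=100):
    # insert '', '+', '-', '*' or '/' between the digits 1..9 and keep
    # the parenthesis-free expressions that evaluate exactly to target
    hits = []
    for ops in product(('', '+', '-', '*', '/'), repeat=8):
        expr = ''.join(d + op for d, op in zip('12345678', ops)) + '9'
        # evaluate with exact fractions; divisors are always nonzero
        # digit strings, so no ZeroDivisionError can occur
        exact = re.sub(r'\d+', lambda m: f'Fraction({m.group()})', expr)
        if eval(exact) == target:
            hits.append(expr)
    return hits
```

Handling parentheses, exponentiation, or a rebase operator would need a recursive enumeration over expression trees rather than a flat operator scan.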
