C library for multiple-precision floating-point arithmetic with correct rounding (mpfr.org)
121 points by Zaqball on June 3, 2022 | hide | past | favorite | 51 comments



Not mentioned in the list of users is SLEEF (https://github.com/shibatch/sleef), which provides fast approximations for various elementary functions. (It generates the coefficients for its approximations with MPFR.)

SLEEF itself is used by PyTorch.


Ok, I’ll bite.

What exactly is correct rounding? Sounds kinda like perfect approximation.

I search/skimmed all references to “round” on the linked page, but didn’t find a definition of what “correct” was. Wikipedia page on rounding doesn’t use the term correct anywhere. Is this a mathematical adjective in this case? Or just the opinion of the authors?

I’m well aware of banker’s rounding, and rounding up, and round down, and rounding towards zero, and rounding towards signed infinity, etc. But “correct” as a formal modifier is new to me. What did I miss?


That's a legitimate question. It's just floating-point jargon. Here's what it means:

> I’m well aware of banker’s rounding, and rounding up, and round down, and rounding towards zero, and rounding towards signed infinity, etc.

MPFR supports most of those rounding modes you mentioned:

https://www.mpfr.org/mpfr-current/mpfr.html#Rounding

> But “correct” as a formal modifier is new to me

It just means that once you select a rounding mode, it is guaranteed (modulo bugs) to give you the correct answer rounded according to your selected rounding mode.

At first one can think: "obviously, who wants incorrect rounding?". And indeed all recent floating-point hardware implements correct rounding... but only for 32- or 64-bit floats, and only for the basic operations: addition, subtraction, multiplication, division and square root.

By contrast, most libc/libm implementations do not have correct rounding for transcendental functions (cos, sin, tan, exp, log, etc.). There are some exceptions (Rlibm, CRlibm).

MPFR implements correct rounding:

- at any bit width and

- for all transcendental functions.
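What that guarantee buys you can be sketched with Python's stdlib decimal module standing in for MPFR (division in decimal is correctly rounded in the context's mode; everything below is Python's API, not MPFR's):

```python
from decimal import Decimal, getcontext, ROUND_FLOOR, ROUND_CEILING

# Pick a working precision of 30 significant digits.
getcontext().prec = 30

# Correct rounding: the exact quotient 1/3, rounded toward -infinity...
getcontext().rounding = ROUND_FLOOR
lo = Decimal(1) / Decimal(3)

# ...and the exact quotient rounded toward +infinity.
getcontext().rounding = ROUND_CEILING
hi = Decimal(1) / Decimal(3)

# The two correctly rounded results bracket the true value and differ
# by exactly one unit in the last place -- no more, no less.
print(lo)  # 0.333333333333333333333333333333
print(hi)  # 0.333333333333333333333333333334
```

MPFR gives you the same contract in binary, at any precision, for every function it provides.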


In a big application, wouldn't rounding to a random direction (a feature GNU MP supports) be the most accurate probabilistically? If you round up or down all your numbers and you perform a lot of arithmetic operations, you create a lot of bias.


Consider the logarithm. In almost any interval, rounding down will be more accurate than rounding up the majority of the time -- so random rounding will bias you away from an accurate result. Random rounding can be an okay default, but having fine-grained control over the rounding direction is vastly superior to always sticking to the default.

For another example, interval arithmetic is a really cool approach to keeping an eye on your accuracy. There, you want to round your upper bounds up, and round your lower bounds down.
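A minimal sketch of that directed-rounding trick, again with Python's stdlib decimal standing in for MPFR (the function name here is made up for illustration):

```python
from decimal import Decimal, localcontext, ROUND_FLOOR, ROUND_CEILING

def interval_scale(lo_in, hi_in, d):
    """Divide the interval [lo_in, hi_in] by a positive Decimal d,
    rounding the lower bound down and the upper bound up, so the set
    of true quotients is guaranteed to stay inside the result."""
    with localcontext() as ctx:
        ctx.prec = 10
        ctx.rounding = ROUND_FLOOR     # lower bound: round toward -inf
        lo = lo_in / d
        ctx.rounding = ROUND_CEILING   # upper bound: round toward +inf
        hi = hi_in / d
    return lo, hi

lo, hi = interval_scale(Decimal(1), Decimal(2), Decimal(7))
print(lo, hi)  # 0.1428571428 0.2857142858
```

The true range [1/7, 2/7] is provably enclosed, which is the whole point of interval arithmetic.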


Maybe, but having randomly varying "accurate" behavior in a program is generally worse than being incorrect in the same way every time.


Correct rounding means that if the function result has X bits, all those X bits should be the same as when computing the result with infinite precision and then rounding it using the specified rounding mode.

For the ordinary arithmetic operations, including for square root and for reciprocal square root, there are algorithms to do correct rounding without excessive computation. Because of that, the standard for floating-point arithmetic specifies that a conforming implementation must use correct rounding for those operations.

On the other hand, for transcendental functions, like exponential, logarithm, sine and so on, there is no general method to guarantee that the result will be correctly rounded without potentially having to compute an unbounded number of result bits before being able to round (the so-called Table Maker's Dilemma).

Because of this difficulty, many mathematical libraries, and also all hardware implementations of transcendental functions, round correctly for many of the possible inputs, but not for all of them.

Nevertheless, for the usual transcendental functions and for the frequently used bit widths, i.e. single, double and quadruple precision, there are known algorithms that round correctly; these have usually been found by exhaustive searches for the corner cases where more result bits must be computed than for most other inputs.
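The "compute extra digits, then round" strategy can be sketched with exact rational arithmetic from Python's stdlib. This toy exp is illustrative only (the function is hypothetical, and it dodges the dilemma by simply adding guard digits; an exact value pathologically close to a rounding boundary could still defeat it):

```python
from fractions import Fraction

def exp_rounded(x, digits):
    """Evaluate exp(x) for rational x >= 0 with exact Taylor terms,
    keeping 10 guard digits past the requested precision, then round
    the accumulated exact value to `digits` decimal places."""
    x = Fraction(x)
    term, total, n = Fraction(1), Fraction(1), 1
    guard = Fraction(1, 10 ** (digits + 10))
    while term > guard:          # stop once terms can't move the result
        term = term * x / n
        total += term
        n += 1
    scale = 10 ** digits
    # Python's round() is round-half-to-even -- the IEEE 754 default.
    return Fraction(round(total * scale), scale)

print(float(exp_rounded(1, 12)))  # e rounded to 12 decimal places
```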


Doesn't it mean that, when you compute - say - sin(123.45) then the resulting computed value is - of all the finite numbers of possible floating point values that the underlying MPFR representation can encode - the one that is closest to the true value of sin(123.45)?

That may not be what it does, but that's certainly what I'd expect "correct rounding" to do.


This is correct.


Imagine you get the result correct with infinite precision, and then round to the required precision in the specified way (round up, round down, round towards zero, round to nearest with ties to even, etc.). Correct rounding is equivalent to this. It is also reproducible, as there is only one possible correct result.

There is a similar concept, supported experimentally by MPFR, called faithful rounding. With faithful rounding, a number is rounded either to the closest representable value above or to the closest representable value below, but not necessarily to the nearer of the two. That is, if a and b are representable, no value between them is representable, and a < x < b, then with faithful rounding x may be rounded to either a or b. This can sometimes be faster, but it is no longer reproducible.
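Concretely, with Fraction from Python's stdlib (a toy at two decimal digits rather than MPFR bit widths):

```python
from fractions import Fraction

# The exact value and its two representable neighbours at two decimal digits.
x = Fraction(1, 3)
a, b = Fraction(33, 100), Fraction(34, 100)   # a < x < b

# Correct rounding (to nearest): the unique closest neighbour.
correct = min((a, b), key=lambda v: abs(v - x))
print(correct)   # 33/100 -- every correctly rounded library agrees

# Faithful rounding: *either* a or b is an acceptable answer, which is
# why two faithful implementations need not reproduce each other.
```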


Try building any common OSS from source and you quickly realize there are a few libraries at the core of everything. MPFR being one, GMP, MPC and Zlib being the others that come to mind.

In fact I believe MPFR depends on GMP.


MPFR does depend on GMP, but the reason it’s everywhere seems to be that GCC uses it—unlike GMP, which is also used for cryptography, underlies many languages’ large integer support, etc.


A similar library with even more capabilities:

https://www.ginac.de/CLN/

Includes not only infinite-precision [correction: arbitrary-precision] floats (not sure about the correct rounding part) but also integers (so subsumes GMP), rationals, complex numbers, and polynomials.


Arbitrary precision (the library claim) is not infinite-precision (your claim).

The original library's claim of correct rounding is very important for many tasks, and is the entire reason for that library. I doubt your library does it; if it did, it would claim it, since it's extremely hard to do and would be a great selling point.

Here's [1] a slightly older paper with lots of details on the problems in getting such a library to work. There's a massive literature on this.

[1] https://hal-ens-lyon.archives-ouvertes.fr/ensl-01529804/file...


A clarification, since this is a thing that frequently confuses people: the importance of correct rounding is that it is the easiest mechanism to unlock reproducibility. If I use library A, and you use library B, but both are correctly rounded, then we know that we get the same result.

When reproducibility doesn't matter, it's basically always better to use a not-necessarily-correctly-rounded implementation at somewhat higher precision, as this gets you smaller overall errors for less effort. But there are a lot of domains where reproducibility matters, and that's where correct rounding comes in.


>When reproducibility doesn't matter,

If it doesn't matter, then high precision rarely matters either. Once you have results with unknown noise in them, and you do a few operations on those, the noise compounds, at approximately sqrt(N) bits for N steps of computation (rough estimate, it certainly depends on algorithm selection, compiler nuances, stability of problem, etc.). So high precision degrades quickly when doing any work without using a reproducible and understood rounding mode.

>correct rounding is that it is the easiest mechanism to unlock reproducibility

Any rounding method works for reproducibility - truncate, round to zero, banker's rounding, etc. - all (roughly) equivalently hard/easy to implement. However, pretty much none of these is possible without the ability to do correct rounding first, as a result of the Table Makers' Dilemma.

For example, Banker's rounding is generally better for doing computation than the other usual rounding modes. Knuth has some papers on this somewhere which are fascinating.

Also note most of the non-reproducibility issues in numerical computation are compiler-added - compile code with GCC version X, get one result; compile with a later version, and results may change. Or across compilers. Negative zero, denormals, rounding modes - all need to be dealt with carefully. It can be done, but it is not trivial. Keeping floating point solid is very, very tricky across compilers and time.


> If it doesn't matter, then high precision rarely matters either. Once you have results with unknown noise in them, and you do a few operations on those, the noise compounds, at approximately sqrt(N) bits for N steps of computation (rough estimate, it certainly depends on algorithm selection, compiler nuances, stability of problem, etc.). So high precision degrades quickly when doing any work without using a reproducible and understood rounding mode.

This isn't really true for any of the numerous numerical algorithms that satisfy backwards error bounds; there you would always prefer to have a well-rounded arithmetic at higher precision, if reproducibility is not required.

> Any rounding method works works for reproducibility

Only if you round _correctly_ in that direction.

> Banker's rounding is generally better for doing computation than the usual correctly rounded.

Banker's rounding _is_ the default IEEE 754 rounding mode.


Note that bankers rounding is a subset of correct rounding since bankers rounding only affects ties.
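For example, in Python's stdlib decimal (round-half-even is the same rule):

```python
from decimal import Decimal, ROUND_HALF_EVEN

one = Decimal("1")
# Ties go to the even neighbour...
assert Decimal("2.5").quantize(one, rounding=ROUND_HALF_EVEN) == 2
assert Decimal("3.5").quantize(one, rounding=ROUND_HALF_EVEN) == 4
# ...but a non-tie rounds to the nearest value exactly as usual.
assert Decimal("2.6").quantize(one, rounding=ROUND_HALF_EVEN) == 3
```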


Wait, now I'm confused. If what you want is reproducibility, why does it matter that the rounding be correct? It seems to me that what matters in this case is that the rounding conform to some standard or specification. Why should it matter what that standard actually is?


One usecase, for instance, is if you're writing a compiler and you want to constant propagate floating point operations, and the compile host doesn't necessarily support the same FP formats the target hardware does. But you still want the results to be the same as if you compute it on the target HW at runtime (barring usage of a "damn-IEEE754-just-go-faster" compile option).

This, FWIW, is what GCC uses MPFR for.


Note that GCC actually has a bunch of bugs related to this where it will give you different results depending on whether a value gets computed at compile time or runtime.


Been a while since I was actively involved in GCC development, but back then most issues like this AFAIR were due to math functions and not basic arithmetic. GCC doesn't control the target libm, and correctly rounded libm's are rare and much slower than ones which permit a bit more slop in the results.


Sure, you could invent a different standard, but what are you going to write into it?

"If you input 0.1 into atan, the result should round up when outputting 5 digits, down with 8 digits and up with 12 digits"?

Also, "round up" is not any easier than "round correctly" if you require it to be consistent. Say you compute a result like 9.004 and want to round to two digits - if the exact result would have been 8.991, then rounding your computation as 9.01 is different from the standard-specified result of 9.00.

So your standard would literally have to specify the desired output for every combination of operation, input, and output precision - obviously impossible.
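The 9.004 / 8.991 example above, run through Python's stdlib decimal:

```python
from decimal import Decimal, ROUND_CEILING

exact    = Decimal("8.991")  # what infinite precision would have given
computed = Decimal("9.004")  # a nearby approximation from a sloppy routine
q = Decimal("0.01")          # round to two decimal places

# "Round up" applied to the approximation disagrees with "round up"
# applied to the exact value -- so a rounding mode alone doesn't pin
# down the answer unless the value being rounded is already correct.
print(exact.quantize(q, rounding=ROUND_CEILING))     # 9.00
print(computed.quantize(q, rounding=ROUND_CEILING))  # 9.01
```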


> Arbitrary precision (the library claim) is not infinite-precision

True. I've edited my comment to reflect the error.

> I doubt your library does it

Just for the record, CLN is not mine. I'm just a (casual) user. And you're right, it probably does not do correct rounding. I was unaware of the significance of this.

The above notwithstanding, I think CLN is pretty cool and deserves more attention than it gets.


Oh sorry, didn't mean to imply the library was yours. Only that you posted it.

Agreed - I think all such libraries are interesting. Knowing the pitfalls of them is also useful and not obvious.


Thank you for the link to this paper - it's amazing to see how much care and thought can go into analyzing FP computation, and good to be reminded of how much I still have to learn.


If you are interested and like this stuff - ask some questions and I'll post places you can learn what you want.


That's kind of you. I don't know how related it is, but my main interest now is a kind of empirical evaluation of numerical precision. I have some C code (actually students' C code), and one thing I do to characterize it is run it twice, once rounding towards negative infinity, and once towards positive infinity, and see how different they are. Increasing the crappiness of code tends to widen that window (between the two results), but not all crappy code generates a wide window. I would much rather, at a great computational expense that I'm happy to deal with, run the code many many times, each time randomizing the rounding direction just prior to any FP operation, so that I can get a sense of the distribution of numerical results over all possible roundings. I'm contemplating doing things like reading assembly code and injecting random rounding changes, and then compile and run that, but I'm also wondering if others have already done this, in the setting of vanilla C code.

And in case it's relevant: in this setting, students are only allowed to use single-precision (32-bit) floats and operations.
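In case it helps, here is a rough Monte Carlo sketch of that idea, using Python's stdlib decimal as a stand-in for 32-bit floats (prec=7 decimal digits loosely mimics single precision; the function name is made up):

```python
import random
from decimal import Decimal, localcontext, ROUND_FLOOR, ROUND_CEILING

def random_rounding_spread(n=500, prec=7, trials=100):
    """Repeat the same computation (summing 1/3 n times) many times,
    choosing a random rounding direction before every operation, and
    return the min/max of the results seen across trials."""
    totals = set()
    for _ in range(trials):
        with localcontext() as ctx:
            ctx.prec = prec
            total = Decimal(0)
            for _ in range(n):
                # Flip a coin for the rounding direction of the next
                # division and addition.
                ctx.rounding = random.choice((ROUND_FLOOR, ROUND_CEILING))
                total += Decimal(1) / Decimal(3)
            totals.add(total)
    return min(totals), max(totals)

lo, hi = random_rounding_spread()
print(lo, hi)  # the width hi - lo is a cheap fragility signal
```

This stays at the C-free prototype level, of course; doing it on real student binaries needs something like the assembly-injection or Valgrind-based tools discussed below in the thread.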


>run it twice, once rounding towards negative infinity, and once towards positive infinity, and see how different they are

That's actually a good method that lots of numerical analysts use. I think C99 has several rounding modes - I often have code that runs the same computation under each mode so I can look at what it's doing.

>not all crappy code generates a wide window

True, often you need to find the values that make it puke. I think there was a paper recently on doing this automatically... If I can find it again I'll post it - it was cool stuff.

>students are only allowed to use single (32-bit) precision floats and operations

For stuff this small you can often brute force all values, or at least all edge-case values. I sometimes do the trick of reinterpreting 32-bit integers as floats (watch out for weird C aliasing rules) to hit lots of numbers near precision edges - denormals and other edge cases - run them through, and pseudo-fuzz the results. This will sometimes turn up more.
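The bit-pattern trick is even simpler in Python than in C, since struct sidesteps the aliasing rules entirely:

```python
import struct

def bits_to_float(u):
    """Reinterpret a 32-bit unsigned int as an IEEE 754 single-precision
    value (the lossless Python equivalent of the C type-punning trick)."""
    return struct.unpack("<f", struct.pack("<I", u))[0]

# A few interesting edges to seed a fuzzer with:
print(bits_to_float(0x3F800000))  # 1.0
print(bits_to_float(0x00000001))  # 2**-149, smallest positive denormal
print(bits_to_float(0x7F7FFFFF))  # largest finite float32
```

Looping `u` over all 2**32 patterns (or just the neighbourhoods of exponent boundaries) then exercises every single-precision input exhaustively.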

If you teach, and have not worked through some numerical analysis books that provide good background for this stuff, read Goldberg's paper "What Every Computer Scientist Should Know About Floating-Point Arithmetic" [1]. The next good thing to dig through is Higham's book "Accuracy and Stability of Numerical Algorithms" [2]

Those will give you lots of places to look for issues, and the ability to analyze algorithms more concretely for floating point issues.

[1] https://docs.oracle.com/cd/E19957-01/800-7895/800-7895.pdf [2] http://ftp.demec.ufpr.br/CFD/bibliografia/Higham_2002_Accura...


I like and assign the Goldberg paper. I haven't spent much time with the Higham book; thank you for the full-text link.

In my class, FP computation is something I try to teach the students to be mindful of, but it is hardly the focus of the class (about certain kinds of data visualization). FP computation arises in lots of different places in many different functions, often in the context of a larger data processing pipeline that actually starts and ends with 16-bit integers (e.g. medical imaging data). So I'm not sure how I'd thoroughly exercise the FP-related code in the way that you describe. Hence my current use of something I can easily control at the outset of the whole pipeline (the direction of rounding).

Do you have any ideas on injecting randomized rounding mode changes prior to FP ops in the assembly, or something functionally equivalent for characterizing C code (ideally without code changes)? I just found [1] which seems extremely related, but I'll have to look into whether the code still works.

[1] http://www.iboview.org/random_rounding/


At the risk of just talking to myself, I'll note that a google scholar search on the [1] above led me to this phd [2] which in turn led me to two newer tools, Verrou [3] and Verificarlo [4] which involve Valgrind and Docker, respectively, though I'd have to understand what exactly is being contained by Docker for Verrou. But it makes me appreciate more the minimalist approach of Knizia et al. in [1], which does involve recompilation of modified assembly, but that's something I'm doing multiple times with student code already.

[2] https://tel.archives-ouvertes.fr/tel-03110553/document

[3] https://github.com/edf-hpc/verrou

[4] https://github.com/verificarlo/verificarlo


MPFR has some timing comparison on its page.

https://www.mpfr.org/mpfr-4.0.1/timings.html

Also, MPFR depends on GMP which has integers and rationals. Then there is MPC which in turn depends on MPFR and provides complex numbers.

https://www.multiprecision.org/mpc/


IMO, this highlights that you should pretty much always use Arb instead of MPFR if you have a choice.


Arb depends on MPFR, so it is not exactly "instead of".

https://arblib.org/setup.html#dependencies


This is a great library!

Also another amazing piece of software behind quickjs:

https://bellard.org/libbf/


MPFR has been the standard for a long time. It's also GPL if I recall correctly so... take from that what you will.


It's LGPL not GPL.


Which to you and me makes a huge difference, and to a lawyer/suit means they must be about the same thing. Sadly.


why is this on hn front page? am i missing sth?


Because it’s a cool piece of technology, and this is a place for computer professionals. If you can’t appreciate GNU MPFR, you might be hanging out at the wrong place.


Maybe the GP is being satirical. Maybe he was asking “why is this on HN without a reference to doing it in Rust” or one of the other commonly repeated threads on HN.

I personally wish more of HN were actually this kind of thing, and less about another web-tech piece that will be obsolete in 3 months anyway.

But hey variety is good and it takes all kinds of us and all kinds of what we all do.


My thought was "I'd expect the average HN reader to be already familiar with foundational libraries". GMP and MPFR are pretty much one step above glibc in terms of how basic/widespread they should be.


Good question. If we all posted every library we've ever worked on HN would have 0% relevant content instead of the usual 1%.


I remember years ago when I didn't know that much about software engineering, discovering and playing with MPFR was like magic. "I have super-science power in my hands, how many bits of precision can I fit in my RAM". It was a fun time. Can I do big integer? Bigger integer? HUGER integers!


Advent of Code seems to always have a big-integer question to trip up beginner programmers. It's a great way to learn.

Also coding a simple emulator, like for the chip-8 for example, and learning what a carry flag does would help with basic understanding of the problem.


I've coded more than one mini big-int library for some high school CS competitions back in the day. Also fun times. Simple arithmetic is easy, and we never got really complicated operations for big ints.

These days I'm super lazy and just Python the big-int problems on things like Advent of Code. Much more boring. And, presumably because of floating-point issues, I haven't seen them pose problems over the reals rather than whole numbers.


You should use MPFR to have more precision on those figures


People just post random old software and Wikipedia links. Weird but apparently allowed according to the rules.


I have a problem with the mathematical proof document of mpfr accuracy.

The mathematical setup work is not rigorous enough. I allow myself to say that because I know my maths teachers would have rejected such work.

Solution: clean up and redo, this time very rigorously, the maths setup work of this document. Yes, it will probably add a few pages.

I had a look at SLEEF, but I could not find the mathematical proof document of the accuracy of its approximants.


Which document are you talking about? This one [1]?

If so, "because my maths teachers" is not a very good rebuttal to what is obviously a form of program documentation, not a peer-reviewed publication.

The document could use some editorial cleanup, but I don't see anything systematically wrong with the math. It's understandable by people versed in the language of numerical analysis, which is certainly the target audience.

[1] https://www.mpfr.org/algorithms.pdf


Yes, this document does not have a rigorous math setup: see, for instance, rule 9.

I don't think there is an accuracy proof document of SLEEF approximants.



