Floating Point in the Browser, Part 2: Bad Epsilon (randomascii.wordpress.com)
75 points by todsacerdoti 9 months ago | hide | past | favorite | 12 comments

I think the author undersold the investigation they did. They came to understand the problem quite convincingly. But the blog post as written is misguided:

> I never investigated why the test had suddenly started failing but I suspect that the timer frequency or the timer start point had changed,

Do not act on suspicion alone.

Sometimes sporadic test failures are problems in the test framework. Suspicion plus evidence becomes a hypothesis. Find a way to reproduce the failure; that leads you to understand why the test usually passes, and then you can fix it with confidence.

Other times sporadic test failures point to problems in deeper layers. Maybe some errant signal handler was borking the FP rounding mode. Maybe there's real hardware errata. We may never know.

> It bothered me that it required esoteric floating-point knowledge to understand this problem so I wanted to fix googletest

This is a different problem. The test is failing, and also Google Test is written in a way that requires esoteric FP knowledge. Fix the first, then refactor to fix the second. These should absolutely be two independent commits. I think this opportunity was missed.

> I think that the googletest fix is ultimately more important than fixing the Chromium test

This is a false dichotomy. You can fix both!

If you increase the error bounds for some test, the failure no longer repros. That may or may not be the correct fix, but you have to be able to justify it.

In this case, one could mock the timer and search for times that would repro the failure. A nice FP trick is avoiding i386: its 80-bit FPU masks lots of other problems.

> These should absolutely be two independent commits. I think this opportunity was missed.

Uh, they were two independent commits. The test in Chromium was fixed in 2017. The change to googletest landed a few weeks ago in 2020. Separated by three years and in two different repositories - that sounds independent to me.

> This is a false dichotomy. You can fix both!

Yep, did that.

FWIW the failure was originally reported on an x64 test, so the i386 FPU with its 80-bit registers was not used.

>These should absolutely be two independent commits. I think this opportunity was missed.

Looks like two to me, unless I'm missing something?



BTW, I double checked and I'm reasonably certain that the test failure happened because we started running the test on machines with a lower QueryPerformanceCounter frequency (2.148 MHz) and the simulated counter values then created larger times that necessarily had less precision.

I updated the post to include that. Note that the error bounds are only increased for those test values that are unrealistically large. The error bounds are as small as possible for the more "normal" test values.

I think the real problem is that IEEE 754 floating point almost always gets treated as a black box that magically provides the behavior of real numbers, which is impossible. Instead of recognizing that the abstraction leaks in cases like these because the floating point representation only has so many bits, people just ignore the mathematical reality. The right thing to do is descending to the level of the representation (bits) and comparing ULP differences when necessary.

The most annoying thing is how the bit-level representation of floating point is ignored even when relative error is significant instead of absolute error, because ULP differences are just that - relative error.

EDIT: the epsilon fix in the test code in the blog post actually converts the relative error to absolute error so both could be done with the absolute error check. It would be more clear to instead do both checks explicitly. I think it would also be faster, which might matter in some situations.

Even though floating point numbers are usually treated like a black box, most programmers know that they are imprecise, and the common advice is usually "never check for equality".

Ok, fair enough, but how do you compare then? Standard libraries are rarely helpful here, and more often than not you have to roll your own. Usually, that's poorly done. I've seen it done wrong way too many times, and that's by competent programmers.

Why aren't there ULP-based functions in libmath for instance? I shouldn't have to cast float to int, do bit manipulations, and break portability for that.

Do something like an optimized version of

    int ulpdiff(double a, double b) {
        int ret = 0;
        if (a > b) {
            while (a > b) {
                b = nextafter(b, a);
                ret++;
            }
        } else {
            while (a < b) {
                a = nextafter(a, b);
                ret++;
            }
        }
        return ret;
    }

I used the following C code earlier this year (use at your own peril): https://github.com/nsajko/numericcompfricas/blob/master/chec...

Distance between 0.0 and -0.0 is taken to be 0. If x or y are NaN, the distance is taken to be the greatest positive value of the return type.

Replace mfloat_t with double. Replace int64 with long or similar.

  int64 ud(mfloat_t x, mfloat_t y) {
          // Handle NaNs.
          if (x != x || y != y) {
                  return (int64)(~0UL >> 1);
          }

          union {mfloat_t X; uint64 a;} u1;
          union {mfloat_t Y; uint64 b;} u2;
          u1.X = x;
          u2.Y = y;
          uint64 a = u1.a, b = u2.b;

          if ((a & (1UL << 63)) == (b & (1UL << 63))) {
                  // a and b are of the same sign.
                  a = a - b;
                  if ((a & (1UL << 63))) {
                          return -(int64)a;
                  }
                  return (int64)a;
          }
          return (int64)(a - (1UL << 63) + b);
  }
Using memcpy could be more portable than using unions, but I think it doesn't really matter.

EDIT: The Musl libc, for example, also uses unions instead of memcpy: https://git.musl-libc.org/cgit/musl/tree/src/math/nextafterf...

As usual, Rust has a crate for that: https://crates.io/crates/float-cmp

That does not provide a ULP distance function, rather it tests for approximate equality while using both relative and absolute error margins.

Also, a warning: the code behaves as if numbers of opposite sign are never "approximately equal". Also, the documentation sucks: it doesn't even mention the sign issue, nor does it mention how NaN values are handled, but it nevertheless contains irrelevant anecdotes.

That led me to https://crates.io/crates/efloat/0.2.0 which sounds even more useful as you can actually track what the error bounds are after you've done all your math. That sounds eminently more useful than just making epsilon math correct, no?

That just tracks the error bounds on individual "small" operations. I think this is more like what you want, an interval arithmetic library: https://github.com/eseraygun/rust-honestintervals

The point is that AFAIK you need the whole algorithm (like, e.g., logarithm) analysed to get sensible error bounds.

> I think the real problem is that IEEE 754 floating point almost always gets treated as black box that magically provides the behavior of real numbers, which is impossible.

That's true for integers, too, though the edge cases are fewer; lots of problems happen around catastrophic surface areas like overflow.

