My C code works with -O3 but not with -O0 (mulle-kybernetik.com)
177 points by mulle_nat on Jan 16, 2020 | 159 comments



The title is actually wrong - the -O0 version is correct, the -O3 version is not (despite giving the output the author expected).

Casting the value to double ends up converting the long value 0x7fffffffffffffff to the nearest double value: 0x8000000000000000. As the -O0 version CORRECTLY reports, this does not round-trip back to the same value in the "long" type. Many other values, though not all, down to about 1/1024 of that value (1 / 2^(63-53)) will also fail to round-trip for similar reasons.
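
To make this concrete, here is a minimal check one can run (a sketch of mine, assuming 8-byte longs, IEEE doubles and the usual round-to-nearest conversion):

    #include <limits.h>
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double d = (double) LONG_MAX;          /* 2^63 - 1 has no exact double; here it rounds to 2^63 */
        printf("%.1f\n", d);                   /* prints 9223372036854775808.0 */
        printf("%d\n", d == ldexp(1.0, 63));   /* prints 1: the nearest double is exactly 2^63 */
        /* converting d back to long would be undefined behaviour, so we don't */
        return 0;
    }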

Unless my coffee-deficient brain is missing something at the moment, it should be the case that any integer with 53 bits or fewer between the first and last 1 bit (inclusive) will roundtrip cleanly. Any other integer will not.

Edit: fixed a typo above, and coded up the idea I expected to work and ran it through quickcheck for a few min, and this version seems to be correct ('int' return rather than bool is just because haskell's ffi doesn't natively support C99 bool):

    #include <limits.h>
    
    int fits(long x) {
      if (x == LONG_MIN) return 0;       /* avoid UB when negating LONG_MIN below */
    
      unsigned long ux = x < 0 ? -x : x;
      /* strip trailing zero bits until at most 53 significant bits remain */
      while (ux > 0x1fffffffffffffUL && !(ux & 1)) {
        ux /= 2;
      }
    
      return ux <= 0x1fffffffffffffUL;   /* 0x1fffffffffffff == 2^53 - 1 */
    }


So the problem here is that in this line:

      if( value < (double) LONG_MIN || value > (double) LONG_MAX)
          return( 0);
The cast of LONG_MAX to double rounds upwards (which is allowed by the standard, since the rounding direction is "implementation defined"), so "value > (double) LONG_MAX" is false, right? Even though the mathematical value of "value" is larger than LONG_MAX?

Which then leads to this line:

   l_val = (long) value;
Where value is cast to a long despite being outside of the range of longs, thus causing undefined behavior. So to be clear, BOTH the -O0 and -O3 versions are "correct", since the program invokes undefined behaviour.

When -O0 and -O3 give different results, either at least one of them is incorrect and you've stumbled on a compiler bug, or both of them are correct and you're invoking UB (the far more common situation).

EDIT: no, I think I misunderstood it: it's not that value is larger than (double) LONG_MAX, it IS (double)LONG_MAX, so of course "value > (double)LONG_MAX" is false.

The problem is that "(long)((double)LONG_MAX)" is undefined behaviour on implementations that round (double)LONG_MAX upwards instead of downwards. Which is allowed by the standard. Ok, cool :)


This ^^^^ is the core issue. It's insidious that LONG_MAX is too big to be exactly representable as a double.

The answer by ambrop7 below solves a different problem but appears to be correct. The reason escaped me at first and is super subtle. The trick is that LONG_MAX is 2^63 - 1, not 2^63. And the subtlety is that 2^63 is guaranteed to be exactly representable in IEEE double because it is a power of 2, which 2^63-1 is not.

I don't care much for runtime ldexp() anyhow. So I'd be tempted to just pre-compute the exact limits -2^63 and nextafter(+2^63, 0) and encode them as doubles manually (omitting some #if method of portably determining the width of LONG):

  #define FLONG_MIN -9223372036854775808.0 // exact -2^63
  #define FLONG_MAX 9223372036854774784.0  // exact nextafter(2^63, 0)

  if (value < FLONG_MIN || value > FLONG_MAX)
     return(0);
Then the UB is avoided. I think the rest of the module works as is. Now, I question whether the author really wants integers between 2^53 and 2^63 to sparsely return true. So it might be better to just change the whole design to use +-2^53 as a hard limit, for trivially guaranteed round trip of the entire range and dispense with these nasty edge cases.
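
As a sanity check on those two constants, one can compare them against ldexp()/nextafter() at runtime (a quick sketch, assuming IEEE doubles; build with -lm):

    #include <assert.h>
    #include <math.h>

    int main(void)
    {
        assert(-9223372036854775808.0 == -ldexp(1.0, 63));                /* exact -2^63 */
        assert(9223372036854774784.0 == nextafter(ldexp(1.0, 63), 0.0));  /* largest double below 2^63 */
        return 0;
    }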


Huh. So, how does one check that a cast from double to long would succeed in C? Just go and do lround()+errno check?


Here's a reliable/portable solution:

    #include <math.h>
    #include <stdbool.h>
    #include <stdint.h>

    bool double_to_uint64 (double x, uint64_t *out)
    {
        double y = trunc(x);
        if (y >= 0.0 && y < ldexp(1.0, 64)) {
            *out = y;
            return true;
        } else {
            return false;
        }
    }
If you need different rounding behavior, just change trunc() to round(), floor() or ceil(). Note that it is important that the result is produced by converting the rounded double (y) to an integer type, not the original value (x).

Explanation:

- we first round the value to an integer (but still a floating point value),

- we then check that this integer is in the valid range of the target integer type by comparing it with exact integer values (0 and 2^N),

- if the check passes, then converting this integer to the target integer type is safe, and if the check fails, then conversion is not possible.

Of course if you literally need to convert to "long" you have a problem because the width of "long" is not known, but that is a rather different concern. I argue types with unknown width like "long" should almost never be used anyway.

(based on my answer here: https://stackoverflow.com/questions/8905246/how-to-check-if-...)
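
For what it's worth, a signed counterpart along the same lines might look like this (my sketch, not part of the linked answer; assumes int64_t and IEEE doubles):

    #include <math.h>
    #include <stdbool.h>
    #include <stdint.h>

    bool double_to_int64 (double x, int64_t *out)
    {
        double y = trunc(x);
        /* the valid range after truncation is [-2^63, 2^63); both bounds are exact doubles */
        if (y >= -ldexp(1.0, 63) && y < ldexp(1.0, 63)) {
            *out = y;
            return true;
        }
        return false;
    }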


I think the most correct way would be to explicitly set the rounding mode to down or towards-zero, then calculate largest_double_that_fits = (double) LONG_MAX, then check value <= largest_double_that_fits. Since largest_double_that_fits is guaranteed to be <= LONG_MAX, and value <= largest_double_that_fits, it follows that value <= LONG_MAX.


This is not right. The current rounding mode is not guaranteed to apply to integer to floating point conversions. Quoting C99, section 6.3.1.4, paragraph 2:

"When a value of integer type is converted to a real floating type, if the value being converted can be represented exactly in the new type, it is unchanged. If the value being converted is in the range of values that can be represented but cannot be represented exactly, the result is either the nearest higher or nearest lower representable value, chosen in an implementation-defined manner. If the value being converted is outside the range of values that can be represented, the behavior is undefined."

See it says implementation-defined manner, not according to the current rounding mode.

Test case:

    #pragma STDC FENV_ACCESS ON
    #include <stdio.h>
    #include <stdint.h>
    #include <fenv.h>

    int main()
    {
        fesetround(FE_DOWNWARD);
        printf("%f\n", (double)UINT64_MAX);
        fesetround(FE_UPWARD);
        printf("%f\n", (double)UINT64_MAX);
        return 0;
    }

    $ gcc -std=c99 -frounding-math x.c -lm
    $ ./a.out 
    18446744073709551616.000000
    18446744073709551616.000000
In both cases it rounded up.


I stand corrected, thanks! What a bizarre choice when floating point standards already have a well-defined system for determining how to round things.


> converting the long value 0x7fffffffffffffff to the nearest double value: 0x8000000000000000

From what I understand from the spec it should be the nearest in value, no? Not the nearest in memory representation.


Who said anything about memory representations?

The language spec (or at least a summary of it) is linked in the article, and it is pretty loose: nearest higher or nearest lower, chosen by the implementation (regardless of which is nearer).

IEEE double-precision float cannot store LONG_MAX (which is 2^63-1 with 8-byte longs) precisely, so it gets converted to 2^63; which you cannot cast back to a long, because it doesn't fit (resulting in undefined behaviour).


OP wants the opposite; they're converting from double to long.


“Unless my coffee-deficient brain is missing something at the moment”

Add enough zeroes, and you’ll run out of exponent range.


Hoo boy. This is what happens the first time C programmers start to work with floating point and don't know the fundamentals.

When you work with floating point, you need to remember you work with a tolerance to epsilon for comparisons because you are rounding to 1/n^2 precision and different floating point units perform the conversion in different ways.

You must abandon the idea of '==' for floats.

This is why his code is unpredictable, because you cannot guarantee the conversion of any integer to and from float is the same number. Period. The LSBs of the mantissa can and do change, which is why we mask to a precision or use signal-to-noise comparisons when evaluating bit drift between FP computations.

He has the first part correct, < and > are your friend with FP. But to get past the '==' hurdle, he needs to define his tolerance, the code should be something like:

if (fabs(f1 - f2) < TOLERANCE) ... fits = true.

I was irked by his arrogance when he asks, "Intel CPUs have a history of bugs. Did I hit one of those?" First, learn about floating point, then, work on an FPUnit team for 10 years, and even then, don't assume you're smarter than a team of floating point architects, you're not.


Floating point is indeed hard. It's made harder because the C programming language, due to its history supporting a large number of targets, especially for high performance computing, does not even mandate IEEE 754. (C predated IEEE 754). IEEE 754 is actually very precise about rounding, rounding modes, etc. The x87 floating point coprocessor is also heavily to blame, because of its internal 80-bit precision. Decades of headaches. Other headaches come from ARM NEON's vector instructions not implementing subnormals (RTZ mode), which is not IEEE, which makes vector math differ from scalar math in corner cases. GPUs also went through a similar evolution. Slowly the industry is moving to all-IEEE 754 compliant arithmetic. There's a lot more to say, but yes, I agree with you, it's complicated.


Yeah and then we have the problem that most floating point variables shouldn't be IEEE 754. And we're mostly fucked because most languages don't support anything else.


If they shouldn't be IEEE 754 then what would you use instead? IEEE 754 exists because a lot of smart people put a lot of thought into what the best floating point representation should be. Then the hardware folks put a lot of work into making it fast and accurate.


IEEE 754 is designed for scientific computation and simulation. And it's very good for that. However it's terrible for representing decimal numbers and fractions. You end up with the exact problems described in the article.

It's kind of a problem since most programmers aren't writing programs to do scientific computing/modeling. They're writing programs that are doing decimal math and currency where IEEE 754 is inappropriate.


I ask again: what would you use instead? The industry has decided that IEEE 754 is the best all-around compromise, and everything is optimized for that case. Most programs aren't doing "decimal math", but programmer expectations are shaped by that since that's what they've used all their life.


>I ask again: what would you use instead?

And I'll repeat what the parent said, again: decimal math.

>Most programs aren't doing "decimal math"

And yet they do. Most programs do decimal math with floats, either because the programmers don't know that's not a thing (and consider floats the same as decimals), or because the language doesn't give them anything else (e.g. Javascript up until recently), or because workarounds are cumbersome (e.g. representing as fractions or working with integers for money with a scale factor and downscaling in the end, etc).

But in almost all programs, the kind of math the program needs to do, from money handling to estimating distances to layout, is better done as decimal.

Floating point doesn't represent a way to solve any kind of actual problem, except the problem of speed and memory representation. So its use is down to computer constraints, not because the problem domain requires floating-point calculations.


The only point I'll concede is monetary calculations. Estimating distances? Layout? I can't see how the difference between decimal and binary changes anything.

Decimal math isn't built into most of the languages or processors we use, so that's why I keep asking what you'll use. It's a kind of hand-wavy response. You'll need to seek out an appropriate library and use it consistently, which is a lot more trouble than just using the built-in IEEE. Not impossible, just usually more trouble than it's worth.


>Estimating distances? Layout? I can't see how the difference between decimal and binary changes anything.

Whether it changes everything is neither here nor there.

It might still result in the same output (e.g. the lack of precision might not matter for layout purposes), but there's absolutely no domain need to do the calculations in FP except for performance or lack of a better number type in the language.

In all of these cases, exact math would be better than the lack of precision, aside from the performance and similar (memory, etc) costs.

That is, if decimal was computing-wise as fast and as available as FP, there would be absolutely no reason to use FP. It's not a choice of mathematical need (e.g. not imposed by the calculation) but a choice of computing constraints.

>Decimal math isn't built into most of the languages or processors we use, so that's why I keep asking what you'll use.

In my experience many languages have it (C#, Java, Python, Javascript is getting it, etc) and people don't use it.

And in most cases, being in the processors wouldn't matter - we use FP for all kinds of non-critical performance calculations, not because it's needed for the speedup, but because it's there.

When there is indeed a need for the CPU-support, I'd use FP myself, sure. But not in most other cases, and absolutely not for money and other cases where precision matters (but people still use FP).


> I ask again: what would you use instead?

bfloat.

incompatible with and far superior to IEEE 754 for one very obvious application. surprised you missed it.


I hope you noticed that IEEE 754 merged DFP support in 2008... Which means it's actually quite good at base 10 if you have a machine/compiler that supports it.


> you cannot guarantee the conversion of any integer to and from float is the same number.

Totally false. Any integer between -(2^53) and 2^53 can be converted to IEEE 64-bit double without any loss of information.


Read the parent again.

Those are a specific integer range, not "any integer".


I'm just saying that the claim made by the parent is overly strong. For most of the integers that we will deal with, they can be converted back and forth to doubles without problems. Even when they can't the double will contain an integer value, it just won't be the correct value. The usual problems people have with doubles are the bits to the right of the decimal point.


Absolutely. It's true that Intel CPUs have a history of bugs, and it's always a danger when working with any CPU. But unless you really, really know what you're doing you probably didn't, especially if you are doing something relatively basic and think you've somehow uncovered a new bug.


Choosing a tolerance is hard enough when you know the floating point model you're using, but it seems like an impossible task to try to support all possible floating point hardware. I couldn't tell you what guarantees you have when __STDC_IEC_559__ remains undefined.

These days I write my floating point code assuming float is a 32-bit IEEE 754 floating point number, and double is 64-bit. You can get those guarantees on any desktop hardware with the right flags, and the same semantics are commonly available on other platforms too.

Picking a well-defined implementation makes it much easier to reason about conversions between integer and floating point types. In fact, it allows you to reason about a lot of operations, e.g. 1.0 + 2.0 == 3.0 is true; (float)183 == 183.0 is true; 0.1 == (double)0.1f is false; etc.
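
For instance, a few of those guarantees can be checked directly (a minimal sketch, assuming IEEE 754 float/double):

    #include <assert.h>

    int main(void)
    {
        assert(1.0 + 2.0 == 3.0);          /* small integers are exact */
        assert((float)183 == 183.0);       /* 183 fits in a float exactly */
        assert(!(0.1 == (double)0.1f));    /* 0.1f carries less precision than 0.1 */
        return 0;
    }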


You don't always need epsilon. I've written lots of floating point code that works well without epsilon checks. See my comment in this thread (double_to_uint64) how what the OP needs can be done correctly without an epsilon check.


How is == still defined for floats (in typeful languages)? Fixed precision decimals should be on offer and suggested by compilers in response to an equality-on-floats.


Each floating point value that is not a NaN represents a certain real number, -inf or +inf (this can be expressed in terms of the sign bit, exponent and mantissa). Knowing that, a == b when neither operand is a NaN is defined as equality of what they represent, in purely mathematical terms. Similar can be said for inequality operators.

Be aware that +0.0 and -0.0 are different floating point values but represent the same real number, so +0.0 == -0.0 follows.

People who say == means nothing for floating point and you always need epsilon checks are wrong, plain and simple. == is very well defined. Don't confuse the definition of floating point operations with common practices for using them effectively.

You can iterate through all non-NaN values and check that successive ones are indeed not equal:

    #include <math.h>
    #include <stdint.h>
    #include <assert.h>
    #include <stdio.h>
    #include <inttypes.h>

    int main()
    {
        float x = (float)-INFINITY;
        uint64_t count = 1;
        while (x != (float)INFINITY) {
            float y = nextafterf(x, (float)INFINITY);
            assert(y != x);
            x = y;
            ++count;
        }
        printf("Found %" PRIu64 " floats.\n", count);
        return 0;
    }
    
    $ gcc -std=c99 -O3 a.c -lm -o a
    $ ./a
    Found 4278190081 floats.
(a little bit harder for doubles)

Interestingly, this only finds one zero (-0.0), hence the assert doesn't actually fail around zero.


Each int also corresponds to a certain real number, but if you write "x*(1/x)==1" for integer x, you'll either

(a) get x casted to a float

(b) get `div` instead (like in Python 2) and obtain a false result

(c) get cursed out with a type error.

These three options are available because it's possible to determine whether a given real number is representable as an int or not. This is not possible with floats.


Programming languages aren't able to express arbitrary real numbers, so to "determine whether a given real number is representable as an int or not" is mostly meaningless in a programming language as opposed to a computer algebra system.

What you can do is:

- determine if a float is an integer (trunc(x) == x),

- convert a float to a certain integer type with some kind of rounding, or get an error if it's out of range (see my comment with double_to_uint64),

- convert a float to a certain integer type exactly, or get an error if it's not representable (e.g. by doing both of the above).

The basic reason that so many people fail to use floats correctly is that they act like operations on floats are equivalent to operations on the real numbers they represent, when in fact they are usually defined as the operation on real numbers rounded to a representable value.
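
The third point, combining the first two, might look like this (a sketch of mine, assuming int64_t and IEEE doubles):

    #include <math.h>
    #include <stdbool.h>
    #include <stdint.h>

    bool double_to_int64_exact (double x, int64_t *out)
    {
        if (trunc(x) != x)                                  /* not an integer (also catches NaN) */
            return false;
        if (!(x >= -ldexp(1.0, 63) && x < ldexp(1.0, 63)))  /* outside [-2^63, 2^63) (also catches +-inf) */
            return false;
        *out = x;                                           /* safe: integral and in range */
        return true;
    }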


> Programming languages aren't able to express arbitrary real numbers,

No system of finite representations is able to express arbitrary real numbers.


Depends what you mean by an arbitrary real number. Any constructed real number could be stored as its construction steps. Eg use a series for Pi.

But paradoxically you can store “any” real number, eg the number made up of all future lottery draws.


This may be an instance of "you should really know the gcc/clang sanitizers and use them to test your code":

clang test.c -O0 -fsanitize=undefined

./a.out

[...]

test.c:17:12: runtime error: 9.22337e+18 is outside the range of representable values of type 'long'

Interestingly gcc doesn't throw that warning.


I second this. Most weird behaviors in C++ today can be detected by ASAN and UBSAN.

There also exists a low-cost random-sampling-based ASAN implementation that can be enabled in production: Google uses GWP-ASAN for all server-side applications as well as Chrome on Windows/Mac. See https://www.youtube.com/watch?v=RQGWMLkwrKc for details.


According to most surveys, they aren't used as much in real life.

Here 14%, https://www.jetbrains.com/lp/devecosystem-2019/cpp/

Here 40 - 55%, https://www.bfilipek.com/2019/12/cpp-status-2019.html

At CppCon 2015 or something, at Herb's question during his keynote, about 1% of the audience as per his comment on the video.


This looks interesting. I was wondering if it's due to poor build tool support. In real life I found rr + sanitizer to be a strictly better choice than plain gdb.


Just yesterday I had upgraded the version of clang used to compile Android's emulator. It got reverted due to some post submit test failing suddenly. The test case was shifting values in a byte array into one value but the LHS of a left shift didn't have enough bits to represent the shift, which is explicit UB. The statement had multiple shift operations (and other sub expressions with templated types) so it wasn't immediately clear that was the issue. -fsanitize=undefined found it immediately. "Spot the UB" is seemingly becoming my pastime.


Does that verifier have a strong possibility of false positives? I'm curious why C compilers have such a strong history of making reasonable checks optional and hidden behind a bunch of switches.


No, actually the false positive rate of these flags is practically zero. (I'm not sure if it's 100% zero, but I used those extensively, reported many bugs and every time a developer told me "this is a false positive" they were wrong.)

The reason they aren't enabled by default is that that's not what they're designed for. They have a significant performance impact, you can't enable them all at once, they conflict with other security features and they may introduce security issues.

These are developer features. They aren't there to run your production code, they are there to test during development and bug finding.


I know TSan has zero false-positives, because it only flags a data race when multiple threads access the same memory without synchronization (and at least one of those accesses is a write). Not sure about the other LLVM sanitizers.


Here's one that I encountered that I cannot figure out:

  $ g++ -x c++ -fsanitize=undefined -
  #include <iostream>
  
  int main() {
      int *a = nullptr;
      std::cout << std::addressof(*a) << std::endl;
  }
  ^D
  $ ./a.out
  <stdin>:5:29: runtime error: reference binding to null pointer of type 'int'
  0


You are not allowed to dereference a null pointer no matter how you use it, even if you later convert it back to an address.


You are allowed to dereference a null pointer, but you are not allowed to access the result or use it as an initialiser for a reference. You can only immediately discard the result, or immediately take the result's address. Using std::addressof(* a) first binds a reference, then uses that reference to take the address, hence the error. You won't get any UBSAN error with &*a.


> You are allowed to dereference a null pointer

Can you cite me something for that? The very first mention of dereferencing in the C++03 standard - ISO/IEC 14882:2003 1.9 Program execution ¶ 4 (page 5) - would seem to disagree:

> Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer). [Note: this International Standard imposes no requirements on the behavior of programs that contain undefined behavior. ]

EDIT: As does 8.3.2 References ¶ 4 (page 136):

> [...] [Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. As described in 9.6, a reference cannot be bound directly to a bit-field. ]


The standard doesn't say dereferencing a null pointer is invalid. The fact that it gives it as an example of undefined behaviour is a defect in the standard. In the discussion on core language issue #232, the intent has been explicitly stated:

> Notes from the October 2003 meeting:

> ...

> We agreed that the approach in the standard seems okay: p = 0; *p; is not inherently an error. An lvalue-to-rvalue conversion would give it undefined behavior.

http://open-std.org/JTC1/SC22/WG21/docs/cwg_active.html#232

WRT your edit: no, that says dereferencing a null pointer and binding a reference to the result produces undefined behaviour. That agrees with what I was saying. "which" refers to the whole of "to bind it to the "object" obtained by dereferencing a null pointer", not just to "dereferencing a null pointer".


I see! Thanks for the reference.


In C99 however, you are allowed to do &*a because when the operand of & is the result of the unary * operator

> neither that operator nor the & operator is evaluated and the result is as if both were omitted, except that the constraints on the operators still apply and the result is not an lvalue


For anyone who wants it, this is in section 6.5.3.2 Address and indirection operators, paragraph 3.


And to elaborate on how the error message relates to this: std::addressof takes a reference which also can’t be null. So when you dereference the pointer you’re actually turning it into a reference (or binding the reference to the pointer, as the error message says).


A pretty common way to do the offset_of thing is to take a null pointer. Eg: &foo->member where foo is null.

sizeof on invalid pointer dereferences is also pretty common.


Doesn't addressof() take a reference?


> every time a developer told me "this is a false positive" they were wrong

This sounds like a good UB learning experience. The parent comment sounds like the author is running into UB more than they really want to.


In my experience both clang ASAN and UBSAN (and TSAN, the thread sanitizer) are very solid tools, IIRC I haven't seen a false positive yet (of course some code may be specifically written to rely on undefined behaviour, but of course that's a mine field).

On the other hand, the clang static analyzer may produce a shocking number of messages when first run on a large existing code base, and some of those warnings can be considered more "opinions" than warnings. It still makes sense and is very rewarding to make a code base "static analyzer clean".

The runtime sanitizers in comparison are very precise and always pointed to actual "sleeper bugs"; it's almost definitely a good idea to use them and take their warnings seriously.

But anyway, clang ASAN, UBSAN, TSAN and the static analyzer are all really excellent and important tools for everybody writing C or C++ code.

PS: the reason why those checks are optional is that they increase compilation time (sometimes dramatically, like 10x slower compilation or more), and they add runtime instrumentation code which both increases the executables size and decreases performance dramatically (also 2..10x times or more, although the clang sanitizers are really quite fast compared to other solutions).
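
For reference, typical invocations look something like this (mirroring the clang command earlier in the thread; exact flag spellings vary a bit between toolchains):

    $ clang -g -O1 -fsanitize=address,undefined test.c -o test   # ASAN + UBSAN together
    $ clang -g -O1 -fsanitize=thread test.c -o test              # TSAN (cannot be combined with ASAN)
    $ scan-build make                                            # clang static analyzer over an existing build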


A lot of developers I've worked with don't want to use these tools because they are working with already-awful legacy codebases that will emit screenfuls of (legit) positives when run, and it feels overwhelming. I remember working on some teams where all compiler warnings were turned off because if you turned on even the default ones, you'd get hundreds of thousands of them during the build, and it looked bad. Since nobody had the budget to spend any time cleaning up warnings because "customers don't care about compiler warnings" they would never get fixed. Forget even sanitizers: You wouldn't even get the splash screen displayed before ASAN would barf on you.

The best projects are the ones that start out from day 1 with all the pedantic warnings turned on, warnings-as-errors, static analysis and sanitizers run as part of the automated build and any peep out of them gets treated just like an error would get treated. When you start the project out that way, there isn't that de-motivating initial hump to get over.


Kind of opposite mindset here.

Current codebase generates around 3000 casting errors. I doubt I'm the only one with this kind of "history" to deal with AND the application is crashing with memory access errors.

What's my cleanup plan for this? Multiple compilers with every warning enabled on multiple platforms. Even a platform we're not targeting. We've mapped out which files and which lines have the biggest code smell. Now we have a giant map on a 65" tv to guide us.

Why all this attention? We're moving this nightmare from 32 to 64 bits. Parts of it were originally 16bit which have already been updated. Cast errors alone are now signposts to other bad code.


> Why all this attention? We're moving this nightmare from 32 to 64 bits

It helps that you then almost certainly have buy-in and are allowed to treat this as a priority because it's tied to work that the business is willing to prioritize. You're already over the hurdle of "customers don't pay to fix compiler warnings".


Can't debug anything with a soup of warnings and other mess in the way. They either clear the rot or start from scratch. Our estimate is two years devel to get back to now. So, no time to restart.

Getting rid of the warnings and more importantly the related other debris means being able to use enhancements we make each week. Well... after qa/test have done their thing.

Been good practice for my own indie dev. Treat every warning as an error and make a habit of running a detailed reporting build each day or week. The trick of using other platforms and compilers I brought from years back.


Been there done that. One great feature I ran across was that the intel compiler has a #pragma incantation that enables many checks on a per-file basis. This lets you create new code and selectively fix old code without these obvious defects.



When most people who know C learned C, this technology didn't exist. Currently John Regehr teaches his students that they must use sanitizers.

Note that ruining sanitizers in prod might be insecure. They're for development.


*running, obviously. Too late to edit.


UBSan and ASan have essentially zero false positives. They do point out undefined behavior which happens to work on your platform, but at the very least those are still portability bugs.


This is a runtime check and so has a (small) performance overhead.


That's true. I am using memory sanitizers in my workflow, but I haven't been using the `undefined` sanitizer. This could have saved me a day's worth of effort.


or just start moving away from using c. the last five years have brought nice alternatives like zig and rust


So start over 40 years of progress? Everything in my OS is written in C. Why am I always told not to use this language in HN if it's what my computer runs on. It's dangerous, sure, but seems like as a software engineer it's something I need to learn instead of run away from.


Progress is exactly what brought us safe languages. Proper engineering dictates eliminating, or reducing as much as possible dangerous/unsafe/etc. behavior. There's a reason C and C++ top the CVE lists with their buffer overflows and undefined behavior.

It's pretty much established that even expert C and C++ programmers, especially for larger code bases, will end up making some sort of mistake that will cause a security vulnerability or undefined behavior.


That's not my point, I don't contest that safe languages are safer. I contest the idea that we should "just use" something else. All my tools are written in C, the entire GNU toolchain is in C, how am I supposed to operate in this world if I "just don't use" C?


That is the disadvantage of open source: it forces you to use the language of someone else's source.

I used Delphi on Windows. Most of the code I wrote in the last 20 years is in Pascal.

And that works really well on Windows. You have a stable API, and it does not matter what language the API is written in. Any language can use the API in the same way.

I tried to run some old projects this month. My 20 year old Delphi Windows programs run better in WINE on Linux than most programs I wrote 5 years ago using Linux tools, because the libraries have changed, but the API has not.


I think OP was implying not to write new code in C. If possible, we can also use replacement tools written in safer languages if they offer the functionality we need (e.g. `ripgrep` instead of `grep`, or web servers written in safe languages over those written in C/C++).


ripgrep isn't POSIX compliant. It's hardly a proper replacement.


Right, it is intentionally not POSIX compliant. This saves fairly significant development effort and also permits expanding the tool to be more user friendly, such as transparently searching UTF-16 encoded files.

ripgrep cannot be a drop-in replacement. Despite that, it can certainly replace grep in a wide variety of use cases. See: https://github.com/BurntSushi/ripgrep/blob/master/FAQ.md#pos...


Congratulations. You have successfully regurgitated the same sentence I've read here a thousand times.


Not only the last 5 years, but I digress.


Yeah, it's kind of weird how little attention some of C's (historical) competitors in the systems programming space get. You'd expect to see a lot of comparisons to whatever Ada* or Pascal would do in the same context.

* Probably because the Ada practitioners are all trapped in SCIFs somewhere in the inner mantle of the Earth.


That is why I use Pascal

20 years ago it was advertised as the safe C alternative.


No. I'd rather learn how to use my tools (that I have invested years to be comfortable with) properly than starting over.


I would have trouble convincing myself I'm capable of writing C that doesn't blow up randomly, when everyone else has been failing at that for my entire career and more.


C is not very productive.


Maybe, but it’s super fun and elegant. C and Scheme are at the top of my “most elegant” list. Though, of course, for different reasons.


A precise integer value is only guaranteed to be representable losslessly in a double if it is up to `64 - 1 (sign) - 11 (exponent) = 52` bits in magnitude.

This should be fairly obvious with knowledge about how floating point numbers are represented internally IMO.

Edit: Be more precise about what can be represented.


Perhaps it should be, but every 53 bit integer is exactly representable in double, because there's an implicit leading significand bit in the representation.

It's also worth noting that every finite double with magnitude larger than 2^52 has a precise integer value; it's just that once you get beyond 2^53, not every integer is representable.


Yes you're right - thanks for clarifying. I meant to say that not every integer in magnitude greater than 52 bit has an exact floating point representation in IEEE doubles.


IIRC not every 53-bit integer is representable in double, since float has two zero representations, but twos-complement integers have only one.

[edit]

Since the extra value is precisely a power of 2 (-2^52), then it will round correctly, however the value is arguably not precisely -2^52 since it has an epsilon of greater than 1.


The spacing of floating point numbers doesn't have any bearing on their values.

To be precise: every integer, positive or negative, with magnitude less than 2^53+1, is exactly represented in double-precision. “Extra values in two’s complement” don’t (and couldn’t possibly) affect this at all, since it is a statement about abstract integers and floating-point numbers, neither of which depends on two’s complement representations.

In particular, -2^52 has a sign field of 1, an exponent field of 1023+52=1075, and an all-zero significand field. This number is exactly -2^52.
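
One can check that bit pattern directly (a sketch, assuming IEEE binary64 doubles and a 64-bit uint64_t to copy the representation into):

    #include <inttypes.h>
    #include <math.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        double d = -ldexp(1.0, 52);   /* -2^52 */
        uint64_t bits;
        memcpy(&bits, &d, sizeof bits);
        printf("sign=%" PRIu64 " exponent=%" PRIu64 " significand=%" PRIx64 "\n",
               bits >> 63, (bits >> 52) & 0x7ff, bits & 0xfffffffffffffULL);
        /* prints: sign=1 exponent=1075 significand=0 */
        return 0;
    }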


So the correct implementation of fits_long would be this?

    int fits_long(double value)
    {
        double max = (double) (1L << 52);
        double min = (double) -max;
        return min <= value && value <= max;
    }
(assuming a 64-bit long)


It depends on the semantics of fits_long. There are values outside of your min/max range that are representable by 64-bit integers, but if you convert them to a double and back, you may get a different value (off by as much as 4096, since, IIRC, the specification allows any rounding strategy).


That would allow a double with a fractional part to return true though, which is not the intention. If you do this before the casted comparison, the cast shouldn't produce an undefined value.

I think this is a good solution without using the math library or having access to the FP status registers. Otherwise the lrint solution seems better.

    int fits_long(double value)
    {
        double max = (double) (1L << 52);
        double min = (double) -max;

        if( min <= value && value <= max)
            return( (double) (long) value == value);
        return( 0);
    }


Why not nix the range check and just rely on the cast? As pointed out by another poster above, there are double values outside ±2^53 that fit inside a long.


The range check filters out the cases where the cast is undefined.


I disagree.

The value 2^80 can be exactly represented in double-precision as well, but it is also the most precise representation of 2^28 other integers, so it is ambiguous which integer is being represented.

[edit]

To clarify, the representation you suggest would also be the best possible representation of -2^52-1.


That's not how floating-point works. The IEEE 754 standard is quite clear that each floating-point datum represents exactly one real number (or infinity or NaN). Floating-point numbers are not intervals.

Analogously, `float x = 2.25; int a = x;` assigns the value two to a, but this does not imply that the integer two also represents 2.25. Two is just two.


IMO that's a very wrongheaded way to think about floating-point values. As far as IEEE 754 goes, my only praise for it is that it's significantly better than no floating-point standard (which we had before).

Integers are inherently different because calculations with integers are naturally discrete, while floating-point calculations are a discrete approximation of the reals, which are not discrete.

There are two general purposes for floating-point numbers: scientific computing, where you start with an imprecise value and the precision multiplicatively accumulates (the original use). And as a hardware optimization for calculation of non-integer values (a common use-case today). In neither case does it make sense to treat a floating-point value as a precise number, which matches common advice to not compare floating-point values by equality.


I came here to comment that I think the problem is poorly posed and the algorithm chosen to implement it is "backwards".

But I also see that StephenCanon has spoken :-); there is going to be very little chance he is wrong when it comes to arguing about floating point! He also notes the following, but not quite so explicitly...

The problem is poorly posed because the question is, which mantissas, exponents and sign bits fit into a long.

The algorithm is backwards because simple comparisons in integer space cannot compute this; but I think the algorithm should be,

  int fits_long(double value) {
      unsigned int trailingZeros = countOfTrailingZeros(mantissaOf(value));
      bool fits = (bitsInAMantissa + 2 - trailingZeros + exponentOf(value) < bitsInALong);
      return fits;
  }
[Notes:

if double is always > 0 and going to an unsigned long, the 2 above would be 1. The 2 represents the sign bit in the double. Given that sizeof(double) == sizeof(long) all bits in the two representations are accounted for, so there is no information loss.

checking:

mantissa = 0 (with a hidden msb of 1) means, trailingZeros = bitsInAMantissa -> fits will be true when the exponent value can be 0..62. So this represents each of the +/- 2^exponent values

mantissa = all ones (bitsInAMantissa of them) means trailingZeros = 0 -> fits will be true when the exponent value can be 0..(bitsInALong-BitsInAMantissa-2). So, +/- (all ones) * 2^exponent value.

The representation for going from double to long is not "smooth".

]

countOfTrailingZeros() is a favorite of bit twiddlers. "Hackers Delight" or "bithacks" https://graphics.stanford.edu/~seander/bithacks.html

[edited to improve readability]


I've seen quite a lot of errors stemming from assumptions about floating point numbers. Not sure there is a good way of handling this in the end, except to exercise caution. Even basic assumptions like f + 1 > f will have this issue.
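
A two-line illustration of the f + 1 > f failure (assuming IEEE doubles):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double f = ldexp(1.0, 60);     /* 2^60: neighbouring doubles are 256 apart here */
        printf("%d\n", f + 1 > f);     /* prints 0: f + 1 rounds back down to f */
        return 0;
    }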


The problem is that in most languages, to novices floating point numbers swim like a duck, quack like a duck, but it turns out they're alligators.

A good way to get around this is reading https://floating-point-gui.de/ to weed out any preconceptions, but yeah it's difficult to steer novices there without them stepping on one of the pitfalls first.


Note that f + 1 > f can also fail for integers in many languages (e.g. it does not hold for signed integers in C or C++, because the behavior is undefined when you add 1 to INT_MAX, and unsigned integers always wrap around). This particular gotcha is not unique to floating-point.
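
A quick demonstration with unsigned wraparound (which, unlike the signed case, is well-defined):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int i = UINT_MAX;
        printf("%d\n", i + 1 > i);   /* prints 0: i + 1 wraps around to 0 */
        return 0;
    }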


Yes, signed overflow can make f+1 > f be false. However in C/C++, because signed overflow is undefined behavior, the compiler is perfectly allowed to simplify the expression f+1 > f to always be true.


Unsigned integers are still specified to wrap.


The compiler can make that simplification, but there's no guarantee that every compiler does, so you cannot assume that it will do so; it will sometimes happen to be true in a specific compiler, but it is never true in the standardized language model (which is what programs are written against).


Which would then have the same effect as in the submitted article: a bug with -O0 but working with -O3.


It _could_ have that effect, but it would depend on signed integers wrapping.


i+1>i breaks even for unsigneds


At least for unsigned integers i+1!=i, and the only exception to i+1>i is i==UINT_MAX. For signed integers it may be undefined behavior, of course—again in just one instance—but for floating point you may have f+1==f, or (f+1)-f>1 depending on the rounding mode, for many values of f which are nowhere near the limits of the range, not to mention other oddities like (f==f)==false when f is NaN. If your code involves floating-point operations then the number of corner cases you need to test increases greatly.


Although it's at least not UB


Well, it may seem fairly obvious but it's also wrong.


When doing floating point arithmetic on the x86 though, it can extend to `80 - 1 (sign) - 15 (exponent) = 64` bits. So if the result of a floating point operation just so happens to have a zero exponent, the mantissa can fit just right in a long int.


This hasn't been dependably true on x86 for almost two decades. SSE2 does double-precision computation at native width, not in the extended 80-bit format. Some 32b compilers still use 80-bit x87, but almost no 64b compilers do so.


> This hasn't been dependably true on x86 for almost two decades. SSE2 does double-precision computation at native width, not in the extended 80-bit format. Some 32b compilers still use 80-bit x87, but almost no 64b compilers do so.

My experience has been different -- forcing SSE instructions gives me a different result on some math calculations. Core2 cpu, boost odeint calculations. Clang or gcc.

Do you have a reference for why it's rare?


x87 is the default for 32b processes on Windows and Linux. 32b processes on macOS and 64b processes on all three OSes use SSE2 for double precision.


Clear your floating point exception register by calling feclearexcept(FE_ALL_EXCEPT). Convert to long by calling lrint(rint(x)). Then check your exception register using fetestexcept(). FE_INEXACT will indicate that the input wasn't an integer, and FE_INVALID will indicate that the result doesn't fit in a long.

Edit: check for me whether just calling lrint(x) works. The manpage doesn't specify that lrint() will set FE_INEXACT, but it seems weird to me that it wouldn't.


As someone who's had to read C and C++ code using `double`, it's been a few years since I've heard of `feclearexcept` and how important it is.

Great, thanks, now I have to go back and restart some of those code reviews I've been doing of certain third party matrix math libraries...


> The manpage doesn't specify that lrint() will set FE_INEXACT, but it seems weird to me that it wouldn't.

Annex F:

The lrint and llrint functions provide floating-to-integer conversion as prescribed by IEC 60559. They round according to the current rounding direction. If the rounded value is outside the range of the return type, the numeric result is unspecified and the ''invalid'' floating-point exception is raised. When they raise no other floating-point exception and the result differs from the argument, they raise the ''inexact'' floating-point exception.


Thanks. I should file a bug about this against the Linux man-pages project.


I had no idea this feature existed! Does it behave usefully in a multithreaded context?


Yes it does. N1570 7.6:

> The floating-point environment has thread storage duration. The initial state for a thread's floating-point environment is the current state of the floating-point environment of the thread that creates it at the time of creation.


I think it uses thread-local storage like errno does, but I'd have to verify.


To save some time for new readers, the author is unfamiliar with floating point representation and thinks that a double precision number, since it is 64 bits, can hold any 64 bit integer, and is somewhat confused as to what an xmm register can hold (they believe that it has 128 bits of precision instead of being able to hold 2 64-bit doubles, or 4 32-bit singles). They attempt to find the issue a few ways. The correct solution is not to convert any integer larger than 2^53 in absolute value, since only integers up to that size can be successfully converted to double and back (aside from a few larger ones that exist sparsely).


> When something very basic goes wrong, I have this hierarchy of potential culprits:

I don't know if this is supposed to be a joke or part of the setup for an explanatory post about undefined behaviour, but that list is in exactly the wrong order.


I agree, but I'd also say that silicon bugs are rarer, so I put them at the end of the list.


SSE xmm registers might be 128 bits wide, but the precision is still 64 bits. The additional (high) bits are zeroed out.

What you're seeing is not excess precision due to wide registers but excess precision due to optimization and constant propagation, which means GCC calculates a fast path for (argc == 1) that doesn't round correctly and ends up with "it fits".

Interestingly it does optimize to the correct "doesn't fit" with -mfpmath=387 -fexcess-precision=standard, so I guess this is a bug in how GCC treats SSE math. The sanitizer (-fsanitize=float-cast-overflow) also notices the problem.


Based on my experience, this title is a strong hint that some undefined behaviour is triggered.


    if( ! fits)
Why this (consistently) terrible formatting though? Never seen anyone using this style.


Gotta say, without seeing it in context, it's flat-out clear what it means and unambiguous.


Sure, but it's anti-mathematical let's say :) Left and right brackets should have symmetrical formatting. Never seen a style with such asymmetry, it's pretty odd.


Ahh you ought to see my style. I've been told it's unique and quite ugly. Linters tend to very much dislike it. Nonetheless my style has a purpose to me and I'm sure so does the author's.


I think glueing punctuation to the previous word but always having a space before the next one makes the words more readable. Probably makes sense, although you always need research to back up such claims. For example there's research that says snake_case is more readable than camelCase (and yet most languages encourage camelCase for some reason).


That's perfectly fine if you're working alone.


With the help of your comments, I could now write the conclusion to my article. In a nutshell this is the solution:

    #include <math.h>
    #include <fenv.h>


    int   fits_long( double d)
    {
       long     l_val;
       double   d_val;
 
    // may be needed ?
    // #pragma STDC FENV_ACCESS ON
 
       feclearexcept( FE_INVALID);   
       l_val = lrint( d);            
       d_val = (double) l_val;       
       if( fetestexcept( FE_INVALID))
          return( 0);
 
       return( d_val == d);
    }
The article explains it in more detail. Thanks for the help.
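
A few spot checks, for anyone who wants to try it locally (assumes the fits_long() above is in scope, a 64-bit long, IEEE doubles, and the default rounding mode):

    #include <math.h>
    #include <stdio.h>

    int   main( void)
    {
       printf( "%d\n", fits_long( 5.0));               // 1: integral and in range
       printf( "%d\n", fits_long( 5.5));               // 0: lrint gives 6, and 6.0 != 5.5
       printf( "%d\n", fits_long( ldexp( 1.0, 63)));   // 0: 2^63 is out of range, FE_INVALID raised
       printf( "%d\n", fits_long( -ldexp( 1.0, 63)));  // 1: -2^63 is exactly LONG_MIN
       return( 0);
    }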


The first rule of floating point comparison is you do not compare them for equality, but instead calculate the difference and check if the difference is less than epsilon.


This is super common advice but it is generally wrong, at least the second part.

Comparing floats is more subtle than most programmers realize, and there really isn't a one-size-fits-all solution.

Things to consider due to the nature of fp representation:

- comparing results close to zero is different (i.e. "is a small" needs a different test than "are a & b close")

- the distance between fp numbers depends on their magnitude, so comparing two large numbers to each other shouldn't have the same bounds as comparing two numbers near 1, say[1].

- if you aren't quite careful you can easily create tests where a == b but b != a, which can cause sorting issues, etc.

Hand-wavily speaking if you want to do this "right", you should probably look at doing the analysis in ULP (units in last place) rather than directly on the floats. Don't do it for values near zero though. And have a fast path for differently signed values.

The above doesn't even get into denormalized values.

[1] note that what people usually mean for epsilon is the version of machine epsilon that is the difference between 1 and the next representable float above 1 [2]. So by definition this is smaller than the representable difference between any two numbers in larger decades

[2] MS .NET somewhat confusingly defines Epsilon as the smallest representable normalized number.


>I am still looking for a better way to check, if a double will convert cleanly to an integer of the same size or not.

I'd say the cleanest would be to decode exponent and mantissa, check if the exponent is within the 64-bit limit of long, then check if there are any bits set below the decimal point (plus some extra care for two's complement negative numbers).

The problem with this is of course that this would be platform dependent.


I'd say the correct way to is to use lrint or lround and check for errors the standard way.


This is why using safe languages is important. Even frequent users of C and C++ end up making mistakes that are difficult to track down.


I'm not sure what you mean by safe in this context. A language that forces type casts to be explicit? That throws runtime error when invalid/imprecise cast is done?


The most surprising thing for me out of this is that casting a high positive integer to double will output the nearest double which could be higher, not the highest one smaller than or equal to the integer value.

Is there a way to get the largest double smaller than or equal to some positive integer?


> When something very basic goes wrong, I have this hierarchy of potential culprits: the compiler, buggy hardware, the OS vendor, and last and least me, because I don’t make mistakes :)

I really dislike the arrogant programmer trope. Can we all stop?


How did you decide that "the method works for LONG_MIN"? Did the method return the expected output of false? Because it really seems like the code is working correctly on `-O0` and incorrectly on `-O3`...


Why would you expect the output to be false? On any typical system around (with 8-byte longs), LONG_MIN has the value of -2^63 which (converted to a IEEE double-precision float) passes the function's checks just fine -- even if the values around it don't.


Ah, my mistake. Still, seems like the -O0 is actually what's correct and the -O3 is reporting an incorrect answer.


I had something similar happen but with GCC generating an internal compiler error and just plain failing. Still haven't figured out why.


[flagged]


Rude and unnecessary. Now what could be a learning experience is diminished. Saying nothing is free.


Necessary. Some people, especially in management, think that something which works but does so by exploiting accidental interaction of the build environment is acceptable, and then the blame falls on the person who makes the unrelated change that breaks the trashy code, so the thing is cast as blame on the guy who changes things rather than an opportunity to fix latent bugs.

It is a common attitude among non-programmers too, and among people who think delivering fast matters the most.


that's a completely different situation than we're talking about here.


"Lousy programmers" was very uncalled for, and is not productive. We're all learners.


Not every driver can be Ayrton Senna. It is best to make automobiles work in such a way as to mitigate sloppy driving if possible.


Am I misreading your comment, or are you holding up a driver who died by driving into a wall at 145mph as an example of how to do things carefully?


If it's working by accident, it's hard to tell, isn't it?

You wouldn't necessarily think to check for something wrong if it appears to work.


It's good to be called a lousy programmer when deserved. Incentive to keep improving. If you get hurt by that, imagine what Torvalds would have told you.


Insulting someone because of his/her mistake is of no value, no service to the person who made the mistake. Much better is to teach the right way so that the person is given the opportunity to learn. If yes, everyone wins. Insulting the person could work sometimes, but then you're just adding to hatred and fear. Which the world is full of.


If it is deserved, that is, based on truth, then it is not an insult. It is up to the individual to be honest about it instead of feeling offended. I do agree that merely saying they are bad programmers is not enough and it is more productive to also show them how they can improve.


Fair enough. Things are not black or white after all. There are infinite ways you can give feedback.


Lousy means 'in my opinion, not good enough'. The problem with that is there is no information about what to get better at, and it's also just an attack on the person. Torvalds? What a famous programmer might say based on internet stereotypes is also of no concern.

What is needed is constructive criticism. "You should spend time thinking about the structure and purpose of the code before you start writing code, perhaps draw a diagram on paper first" might be an example.


Code is a social activity. "lousy programmers" are the ones that are rude jerks to others.


I don't like jerks and there is a social aspect to programming. But there are plenty, far too many, lousy programmers that are really nice guys too. Having social skills is extremely helpful. Being able to code is mandatory.


I'd say just put volatile and be done with it. Now your -O3 will also break, but at least it's consistent with -O0 :p



