Why I find IEEE 754 frustrating (p5r.org)
140 points by plesner on Nov 8, 2014 | 58 comments

Good essay. The dynamic scoping / mutable global state issue, especially with respect to rounding modes, is a total nightmare. We've been trying our hardest to provide great support for rounding modes in Julia and it just doesn't work well. The trouble is that there are functions that need normal rounding regardless of the current rounding mode while other operations can safely respect the current rounding mode. There are still other functions for which neither is appropriate. A software-implemented sin function, for example, may completely break if used with an unexpected rounding mode, so you don't want that. But you also don't want to just completely ignore the rounding mode either. What you really want is a sin function that returns a correct result, rounded according to the rounding mode. But that means you really need five variants of the sin function and you need to dynamically dispatch between them based on the current rounding mode – i.e. global mutable state – which is, of course, quite bad for performance, not to mention the explosion of complexity. I've come to believe that the rounding mode concept as designed in IEEE 754 was only ever feasible to use when you were writing your own assembly code.

The essay is not quite right about the motivation for rounding modes – they are not intended to support interval arithmetic. When we got the chance to ask him about it, Kahan was quite negative about interval arithmetic, noting that intervals tend to grow exponentially large as your computation progresses. Instead, the motivation he suggested was that you could run a computation in all the rounding modes, and if the results in all of them were reasonably close, you could be fairly certain that the result was accurate. Of course, at the higher level the essay is exactly right: the fact that this motivation isn't explained in the IEEE 754 spec is precisely what allowed this misunderstanding about the motivation to occur at all.
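
A minimal sketch of the "run it in every rounding mode and compare" idea, using Python's decimal module as a stand-in for hardware rounding modes. The helper run_in_all_modes and the test computation harmonic are hypothetical names chosen for illustration; a small spread between the results is a hint about accuracy, not a guarantee.

```python
from decimal import (Decimal, localcontext, ROUND_CEILING, ROUND_DOWN,
                     ROUND_FLOOR, ROUND_HALF_EVEN)

MODES = [ROUND_HALF_EVEN, ROUND_CEILING, ROUND_FLOOR, ROUND_DOWN]

def run_in_all_modes(computation, prec=16):
    """Run `computation` once per rounding mode and collect the results."""
    results = {}
    for mode in MODES:
        with localcontext() as ctx:
            ctx.prec = prec
            ctx.rounding = mode
            results[mode] = computation()
    return results

def harmonic(n=1000):
    # Sum of 1/k; every division and addition rounds in the active mode.
    total = Decimal(0)
    for k in range(1, n + 1):
        total += Decimal(1) / Decimal(k)
    return total

results = run_in_all_modes(harmonic)
# If the spread is tiny relative to the result, the computation is
# probably (not provably) insensitive to rounding.
spread = max(results.values()) - min(results.values())
```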

> But that means you really need five variants of the sin function and you need to dynamically dispatch between them based on the current rounding mode.

That would be one way to achieve the behavior you describe, but it is certainly not the only way, nor is it the best. You could, for example, do a computation that produces a result rounded to odd with a few extra bits (independent of the prevailing rounding mode), then rounds that result to the destination precision using the prevailing mode.
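
A rough sketch of that two-step idea in Python's decimal module. Decimal has no round-to-odd, so this swaps in ROUND_05UP, which plays the same "sticky digit" role for base 10; the function name divide_two_step and the choice of three guard digits are assumptions made for illustration.

```python
from decimal import Decimal, localcontext, ROUND_05UP, ROUND_CEILING

def divide_two_step(a, b, extra_digits=3):
    """Divide with guard digits under ROUND_05UP, then round once more
    in whatever context the caller has active."""
    with localcontext() as ctx:
        ctx.prec += extra_digits
        ctx.rounding = ROUND_05UP  # fixed internal mode, independent of the caller's
        wide = a / b
    return +wide                   # unary plus re-rounds in the caller's context

with localcontext() as caller:
    caller.prec = 5
    caller.rounding = ROUND_CEILING
    two_step = divide_two_step(Decimal(1), Decimal(3))
    direct = Decimal(1) / Decimal(3)
```

The internal computation never looks at the caller's mode, yet the final result agrees with rounding the exact quotient directly in that mode, so no per-mode variants are needed.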

This assumes that you really want correct-rounded transcendentals anyway, which at present are sufficiently slow that you might as well dispatch based on rounding mode (despite the herculean efforts of the CRLibm crew). If faithful rounding is all that is required, even simpler methods exist that are extremely performant.

Yes, this is a very fair point – it's not like our transcendental function implementations are uniformly correctly rounded anyway – being within 1 ulp is good enough. I suppose I might be happy if there were a lexical flag for each floating-point instruction (i.e. a bit of the opcode) that chose whether to use the current rounding mode or the normal one. The trouble now is that you have no choice but to use the dynamically scoped one, which is problematic often enough that one is pushed towards just opting out of changing rounding modes altogether.

> Instead, the motivation he suggested was that you could run a computation in all the rounding modes, and if the results in all of them were reasonably close, you could be fairly certain that the result was accurate.

I actually thought that was the use case I was describing, though I would expect round-positive and round-negative to be enough. Don't the other rounding modes yield results within those bounds?

Consider something like

To obtain the correct lower bound, you need to round the addition up, and the subtraction down. This is also incredibly inefficient on current processors, as the registers are flushed after each mode change.
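
To make the point concrete, here is a small sketch for one expression of this shape, a - (b + c). The expression and the operand values are assumptions chosen for illustration, and Python's decimal module with a deliberately tiny precision stands in for binary floating point.

```python
from decimal import Context, Decimal, ROUND_CEILING, ROUND_FLOOR

PREC = 4  # deliberately tiny so the rounding is visible
UP = Context(prec=PREC, rounding=ROUND_CEILING)
DOWN = Context(prec=PREC, rounding=ROUND_FLOOR)

def lower_bound(a, b, c):
    # a - (b + c) is smallest when (b + c) is largest:
    # round the sum up, then round the difference down.
    return DOWN.subtract(a, UP.add(b, c))

def all_round_down(a, b, c):
    # What a single "round down" mode for the whole expression computes.
    return DOWN.subtract(a, DOWN.add(b, c))

a, b, c = Decimal("10"), Decimal("6.666"), Decimal("0.0006")
exact = Decimal("3.3334")  # 10 - 6.6666, computed by hand
```

With these inputs, lower_bound gives 3.333, which is a valid lower bound, while all_round_down gives 3.334, which lies above the exact value.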

You're right. To be absolutely sure I tried writing a little program that does the thing I describe in the post, https://gist.github.com/plesner/7a13b5c8d269315df645, and sure enough it breaks down completely,

    round default: 0.072817
    round up: 0.072703
    round down: 0.072931

Now I don't understand how you get a reliable result using rounding modes at all.

Interval arithmetic is tricky: you have to make sure you round in the most conservative direction for each operation. As other commenters mentioned, these bounds can end up being so large as to be useless.

The way I hear this is: you have two options.

- Either use interval arithmetic which is tricky and may give you useless large bounds, but those bounds are guaranteed to hold.

- Or just try all the rounding modes which is less likely to give you large bounds, but now you have no guarantee that those bounds say something meaningful about your computation.

Or does the second case give meaningful guarantees?

Not strictly, no. From what I understand, Kahan's idea was that you run the computation under different rounding modes and see how the behaviour changes, see for example:


Thanks for the link, I hadn't seen that. I have two reactions.

Firstly, I find his example contrived. He wants to determine the accuracy of a black box by subtly tweaking how floating-point operations spread throughout that black box behave? It seems like an okay hook for ad-hoc debugging -- if those flags are around anyway -- but as the primary rationale for the entire rounding mode mechanism? That's not a lot of bang for a whole lot of complexity.

Secondly, this is exactly the kind of discussion around rationales and use cases I'm looking for. Even if I don't buy his solution the problem still stands and might be solvable some other way, for instance through better control over external dependencies. Maybe something like newspeak's modules-as-objects and hierarchy inheritance mechanisms could be applied here.

I agree. I've never heard of anyone else utilising this approach, and allowing programs to change the mode of other programs is a recipe for shooting yourself in the foot.

One of the most infuriating things about global rounding modes is that they are so slow for things when changing rounding modes would actually be useful. A nice example is Euclidean division: given x,y, find the largest integer q such that y >= q*x (in exact arithmetic). If you change the rounding mode to down, then floor(y/x) would get you q. However the mode change is typically so slow that it is quicker to do something like round((y-fmod(y,x))/x).
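
A small illustration of why the rounding direction of the division matters here, sketched with Python's decimal module at three digits of precision instead of binary floats, and assuming positive operands. The helper name floor_quotient is made up for this example.

```python
from decimal import Context, Decimal, ROUND_FLOOR, ROUND_HALF_EVEN

PREC = 3
NEAREST = Context(prec=PREC, rounding=ROUND_HALF_EVEN)
DOWN = Context(prec=PREC, rounding=ROUND_FLOOR)

def floor_quotient(ctx, y, x):
    # floor(y / x), where the division itself rounds according to ctx.
    return ctx.divide(y, x).to_integral_value(rounding=ROUND_FLOOR)

x, y = Decimal("3"), Decimal("8.99")
q_nearest = floor_quotient(NEAREST, y, x)  # 2.9966... rounds to 3.00 first
q_down = floor_quotient(DOWN, y, x)        # 2.9966... rounds to 2.99 first
```

Under round-to-nearest the quotient is rounded up to 3.00 before the floor is taken, so the result is one too large; rounding the division down gives q = 2.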

Honestly this sounds like a perfect use case for monads - you write your computation with a dependency on the rounding mode, and you can either ripple that all the way up (to your whole program if need be), or specify an explicit rounding mode at a level where it makes sense. It would also allow sensible (if poorly-performing) rounding for subtraction cases. Has anyone implemented it like that?

I think monads might possibly somewhat alleviate the problem of writing 5 different versions of sin by hand but I think you would still need to do some dynamic dispatching to handle the rounding modes, which according to Stephan is going to be a big performance hit.

How important is it to implement floating point arithmetic exactly as the standard specifies it? What if the same set of features is generally available, but not quite in the required way?

That depends on what you want. Some people don't care at all, but others have legitimate needs for a lot of the features described in the spec. So if you can at all, it's best to follow the spec, imo. That way languages don't do a lot of annoyingly different, similar things. And, as this essay points out, the spec is wildly successful – it is arguably the most successful spec of all time – so second-guessing it is usually a mistake. Kahan et al. have forgotten more about numerical analysis than most of us will ever know.

Factor has a library for setting up floating point environments that might interest you. I think we implemented all of the functionality you might need and it's nestable with dynamic scoping. I haven't needed to use it for anything though.

Code: https://github.com/slavapestov/factor/blob/master/basis/math...

Tests: https://github.com/slavapestov/factor/blob/master/basis/math...

Docs: http://docs.factorcode.org/content/article-math.floats.env.h...

A lot of analysis is based on IEEE 754, so deviating from it would mean developers would have to redo the analysis and change the finicky algorithms. I'm personally not able to do it. I nod at the research papers I read, but I would not be able to write them.

>> But that means you really need five variants of the sin function and you need to dynamically dispatch between them based on the current rounding mode

Just define a rounding behavior for the language and implement it that way. Don't claim full 754 support, just specify the strategy used by the language. A sin function should behave according to the design of that function and should be able to ignore any previous state of the FP hardware. I have not seen a language that directly supports setting the rounding modes, so any language libraries can do what they like - you don't need to preserve or worry about something you don't offer the option to modify.

Check out my other comment below. Factor has with-rounding-mode and you can nest these modes.


    +round-up+ [
        1.0 3.0 /f double>bits .h
        +round-zero+ [
            1.0 3.0 /f double>bits .h
        ] with-rounding-mode
    ] with-rounding-mode
    ! output
    ! 3fd5555555555556
    ! 3fd5555555555555

Julia also supports this:

    with_rounding(Float64, RoundUp) do
        println(bits(1.0 / 3.0))
        with_rounding(Float64, RoundToZero) do
            println(bits(1.0 / 3.0))
        end
    end

With all the state and modes and such, is there a mode that just does perhaps: round to even, clamp to infinity, never produce NaN, and never trigger an exception? Because that's basically what a lot of people want - do the math and handle the extremes in reasonable ways without throwing exceptions.

In your "never produce NaN" arithmetic, what do you want sqrt(-1) to be, and why is that a better answer than NaN?

I want it to throw an exception - a language-level exception - and I'm fine with sacrificing a bit of performance (i.e. checking the flags after every operation) to do so. This is a better answer because it means I see the error where it actually happened, rather than getting a NaN out the end of my calculation and having no idea which particular computation it came from.
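
For comparison, Python's decimal module lets you pick either behavior per context. A minimal sketch (the two function names are made up for illustration): with the trap disabled you get a NaN plus a status flag, and with it enabled the error surfaces at the faulting operation.

```python
from decimal import Decimal, InvalidOperation, localcontext

def sqrt_quiet(x):
    """Flag-only handling: return NaN and report whether the flag was raised."""
    with localcontext() as ctx:
        ctx.traps[InvalidOperation] = False
        result = x.sqrt()
        return result, bool(ctx.flags[InvalidOperation])

def sqrt_trapping(x):
    """Trap at the faulting operation; Python raises InvalidOperation here."""
    with localcontext() as ctx:
        ctx.traps[InvalidOperation] = True
        return x.sqrt()
```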

This is what we had before IEEE-754. Instead of having a closed arithmetic system, exceptional conditions caused a trap (the "language-level exception" of its day). It was a terrible situation, and the direct cause of several "software-caused disasters" that you may have learned about in an engineering class.

Having division by zero being an unchecked exception is a terrible idea, as you say.

But that's not what I want. I want a paranoid language. I want a language where potential division by zero is a checked exception. One where "a = b / c" won't even compile if c might be 0. One that won't compile if it can find an example of an input to a function where an assertion fires. I want one where there is no such thing as an unchecked exception. Or rather, one where you can explicitly override checked exceptions to be the equivalent of (read: syntactic sugar for) "except: print(readable exception trace); exit(<code>)" - but you need to explicitly override it to do so.

Would it be a pain to write in? Yes. But at the same time there's a lot of software that would be best written in this manner. A language where the language itself forces you to be paranoid.

> One where "a = b / c" won't even compile if c might be 0.

Dependently typed languages can provide this.

Can you give an example? I have yet to run into a language that doesn't require a proof of correctness, but will just attempt to find a counterexample.

Well, it sounds like what you're looking for is property based testing. You can set up something like QuickCheck to run at compilation.

You need to provide a proof, but in e.g. Idris the language gives you the tools to make that proof quite easy.

[Citation needed]

Again: I am looking for a language that doesn't require you to provide a proof. I'm looking for a language that is a "logical extension" of what currently is available - that is, I am looking for a language that will attempt to find a counterexample on compilation and will bail if it can.

But non-exhaustively? That exists already - plenty of languages will warn or error if they can tell you're dividing by zero, but don't catch every possible case.

Any working program will in some sense be a proof, by Curry-Howard. So I think asking to not have to provide a proof is backwards; what you want is a language that makes it easy to express the program and manipulate it as a proof.

Only because of inadequate language support (or, in the case of that Ariane 5, deliberately overriding the language support)

I know that's what a lot of people want, but I do not see how one can 'handle the extremes in reasonable ways' without producing NaNs. I have heard people advocate 0/0 = 0, but IMO that is NOT 'handling the extremes in reasonable ways'. What would you propose for 0/0?

More than that, a signaling NaN should raise an exception if used in a comparison. Otherwise, you get a bogus comparison and a bogus branch.
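
A short sketch of the "bogus branch" with ordinary Python floats, whose NaNs are quiet and compare without raising anything. The classify function is a made-up example.

```python
import math

def classify(x, limit=1.0):
    # Both comparisons are False for a NaN, so it falls through to the last branch.
    if x < limit:
        return "below"
    elif x >= limit:
        return "at or above"
    else:
        return "unordered"

nan = math.inf - math.inf  # an operation with no meaningful numeric result
```

Code that only writes the first two branches, or a plain if/else, silently sends the NaN down one of them.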

It isn't possible to never produce a NaN. 0/0 and inf-inf, for example, have no reasonable result that can be produced. And indeed, the spec defines those operations as resulting in NaN.

When confronted by closed standards with open drafts I generally just implement the last draft. I might buy the closed spec to check, but my code comments and documentation will all reference the draft since that is what people can read.

You might tuck a copy of these into your personal library in case IEEE purges them.

IEEE754 base document: http://www.validlab.com/754R/standards/754.pdf

IEEE754r draft: http://www.validlab.com/754R/drafts/archive/2006-10-04.pdf

Perhaps someone who has seen both versions can comment on how close these are to the closed versions.

The 754 (2008) draft you link to is reasonably close to the final standard in content, but there are definitely a number of significant changes that came in the two years between that draft and final publication (I'm "STC" from the change history, for context).

It's not hard to find copies of the final standard online, but the availability issue is definitely something that the committee is aware of.

I don't know, that seems bad just in a different way. Maybe if IEEE hosted those documents themselves it would make sense to link to them but even then the copyright notice on the draft looks pretty onerous.

Python's Decimal module (though not its floats, for some reason) has, IMO, a pretty good implementation of these features.


Basically, it encapsulates attributes and status flags into a thread-local "context" which you get/set through normal function calls. There's also the helpful "with" syntax which allows you to say "run this code block (and anything it calls, etc) with this context instead of the current one, then restore the current one on exit".

A sibling comment talked about a sin() example where you want to use an explicit rounding mode for your calculations, then apply the global rounding mode to the result. Under this paradigm it would look something like:

    with MySpecialContext(settings, etc) as ctx:
      check status flags in ctx
      get result
    round result # this uses the parent context
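
Fleshed out a little, that pattern might look like the following sketch. It uses a Taylor series for e**x rather than sin, and the function name exp_series, the four guard digits, and the term count are all assumptions for illustration.

```python
from decimal import Decimal, Inexact, localcontext, ROUND_FLOOR, ROUND_HALF_EVEN

def exp_series(x, terms=40):
    """Taylor series for e**x, computed in a private context."""
    with localcontext() as ctx:
        ctx.prec += 4                    # guard digits for the internal work
        ctx.rounding = ROUND_HALF_EVEN   # internals ignore the caller's mode
        ctx.clear_flags()
        total = term = Decimal(1)
        for n in range(1, terms):
            term = term * x / n
            total += term
        was_inexact = bool(ctx.flags[Inexact])  # check status flags in ctx
    # Unary plus rounds to the parent context's precision and rounding mode.
    return +total, was_inexact

with localcontext() as caller:
    caller.prec = 10
    caller.rounding = ROUND_FLOOR
    low, inexact = exp_series(Decimal(1))
```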

Looking at the broader issues here, I know I've had the same sort of problems with the ISO 10303 (STEP) standard [1]. Overall, it consists of dozens of $100+ books, most of which amount to little more than a long list of descriptions of the classes that can be used to transmit CAD data. Everything is in turgid bureaucratese. I've seen nowhere in the standard with any sort of high level description of how those classes are intended to be used, no motivation for why things are the way they are. There are some recommended practices documents, but they mostly seem to cover fringe areas like how to handle colors rather than core areas like the preferred approach for handling CAD geometric data.

It just seems so odd to spend so much effort to develop a public standard, then make it expensive and hard to use. Doesn't that defeat the entire point of having a standard?

[1] http://en.wikipedia.org/wiki/ISO_10303

This essay raises a lot of concerns about global state, especially with regard to rounding and flags (or exceptions). That's a common misconception, but nothing in IEEE-754 requires that this state be global. In the C language bindings, for example, dynamic rounding mode and status flags have thread scope.

In fact, dynamic rounding modes are not required at all by IEEE-754 (2008). The revised standard requires that languages provide a means to specify static rounding at "block" scope, which is a language-defined syntactic unit.

> (4.1) the implementation shall provide language-defined means, such as compiler directives, to specify a constant value for the attribute parameter for all standard operations in a block; the scope of the attribute value is the block with which it is associated.

> (2.1.7) block: A language-defined syntactic unit for which a user can specify attributes.

You can take "block" to mean whatever makes sense for your language: it could be a single arithmetic operation[1] or it could be the whole program (though it's more useful if it isn't). It is recommended, but not required, that languages provide a means to access "dynamic" rounding modes as well, which correspond roughly to what most people think of when they think of IEEE-754 rounding modes as widely implemented, but again a huge amount of flexibility is left to the languages to choose exactly what scope and precedence rules make sense for their language.

[1] efficient hardware support for such fine-grained static rounding is still somewhat lacking in the commodity CPU world. On GPUs and some other compute devices, it is quite natural (and "dynamic" rounding is sometimes quite a hassle). AVX-512 will bring support for per-instruction static rounding to mainstream CPUs.

When we look at flags, the situation is much the same. Languages completely specify the scope of flags. There is no requirement of mutable global state. For example:

> (7.1) Language standards should specify defaults in the absence of any explicit user specification, governing ... whether flags raised in invoked functions raise flags in invoking functions.

Like with rounding, current commodity CPUs make it easier to provide flags with thread scope, but IEEE-754 does not require it. Commodity hardware works the way it does because mainstream languages work that way. If a different model makes sense for your language, do that.

Finally, the concern about "exceptions" is entirely misplaced. "Exception" in IEEE-754 simply means "an event that occurs when an operation on some particular operands has no outcome suitable for every reasonable application," which is a rather different meaning than the way "exception" is understood in colloquial PL usage. Under default exception handling, which is all that IEEE-754 requires implementations to support, all that needs to happen in the case of an exception is for the implementation to raise the corresponding flag, the scope of which is (as previously discussed) up to the language to specify.

I would encourage you to direct questions like these about the spec to committee members. If you work for a big company, a few members probably work with you. If you don't, most committee members are happy to answer questions, even from people they don't know.

The concerns about access to the spec itself and to the minutes are well-placed, and definitely something that the committee is aware of. (But mostly out of the committee's hands; it's up to IEEE to set pricing. Send them your comments!)

> nothing in IEEE-754 requires that this state be global. In the C language bindings, for example, dynamic rounding mode and status flags have thread scope.

My example, RegExp captures in JavaScript, also have thread scope. Thread local global state is still not good. What makes the flags global is that access to them is provided implicitly and ubiquitously, independent of scope. All the operations as well as the flag functions like saveAllFlags are given read and/or write access and none of them take arguments that control the scope of the flags they're manipulating. They get pulled from thin air. This is problematic.
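
For contrast, a sketch of what non-ambient access can look like, using explicit Context objects from Python's decimal module. The function third is hypothetical; the point is that the rounding mode and the flags travel as an ordinary argument instead of being pulled from thin air.

```python
from decimal import Context, Decimal, Inexact, ROUND_CEILING, ROUND_FLOOR

def third(ctx):
    # The context is an ordinary argument: rounding and flags both live in it.
    return ctx.divide(Decimal(1), Decimal(3))

up = Context(prec=6, rounding=ROUND_CEILING)
down = Context(prec=6, rounding=ROUND_FLOOR)
untouched = Context(prec=6)

hi = third(up)    # flags are recorded in `up` only
lo = third(down)  # flags are recorded in `down` only
```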

I deliberately never say that the rounding flags are global, my strawmen are that they're either lexically or dynamically scoped, both of which are problematic.

> You can take "block" to mean whatever makes sense for your language: it could be a single arithmetic operation[1] or it could be the whole program (though it's more useful if it isn't).

What I'm arguing is that there is no interpretation of "block" that yields a satisfying result. I didn't consider applying it to individual operations because the whole motivation for the rounding mode mechanism is to keep the same mode in effect across multiple operations. The lack of hardware support suggests the same thing.

Maybe you can give an example of how you see this working where the attributes are more tightly scoped than per-thread?

> Finally, the concern about "exceptions" is entirely misplaced. "Exception" in IEEE-754 simply means "an event that occurs when an operation on some particular operands has no outcome suitable for every reasonable application," which is a rather different meaning than the way "exception" is understood in colloquial PL usage.

Fair enough, though that introduces a new problem: how do you implement fp-exceptions if they're not language level exceptions? But maybe if a reasonable solution can be found for the rounding flags something like that would work for the exceptions too.

> I would encourage you to direct questions like these about the spec to committee members. If you work for a big company, a few members probably work with you. If you don't, most committee members are happy to answer questions, even from people they don't know.

I did, I raised some of these issues, including the licensing issue, with David Bindel a year ago.

For floating point exceptions, I'd say go with a mechanism like C's fenv.h [1]. It's out of the way, and handles rounding flags too. If you're passing attributes around between threads, my first thought would be to (ab)use the language's type/object system to make sure that all your threads are using the same assumptions for floating point behavior; something like classes named FloatUp, FloatDown, FloatZero, FloatClose.
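
A rough sketch of the type-based idea, built on Python's decimal module. Only the four class names come from the comment above; the Rounded base class, the six-digit precision, and the rule that mixed-mode operands are rejected are assumptions made for illustration.

```python
from decimal import (Context, Decimal, ROUND_CEILING, ROUND_DOWN,
                     ROUND_FLOOR, ROUND_HALF_EVEN)

class Rounded:
    """A number whose type fixes the rounding mode of its arithmetic."""
    ctx = Context(prec=6, rounding=ROUND_HALF_EVEN)

    def __init__(self, value):
        self.value = Decimal(value)

    def _operand(self, other):
        # Refuse to mix values that carry different rounding assumptions.
        if type(other) is not type(self):
            raise TypeError("operands use different rounding modes")
        return other.value

    def __add__(self, other):
        return type(self)(self.ctx.add(self.value, self._operand(other)))

    def __truediv__(self, other):
        return type(self)(self.ctx.divide(self.value, self._operand(other)))

class FloatClose(Rounded):
    pass  # inherits round-half-even

class FloatUp(Rounded):
    ctx = Context(prec=6, rounding=ROUND_CEILING)

class FloatDown(Rounded):
    ctx = Context(prec=6, rounding=ROUND_FLOOR)

class FloatZero(Rounded):
    ctx = Context(prec=6, rounding=ROUND_DOWN)
```

Because the mode is part of the type, passing a value to another thread carries its rounding assumption along with it.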


fenv.h appears to be implemented as a simple library, but it actually requires major compiler support. In order to make it work, the C standard added #pragma STDC FENV_ACCESS to the language itself, which is described in the link you posted. Compilers such as clang and GCC haven't implemented that pragma or the features it entails yet, so fenv.h doesn't work reliably in practice.

The underlying problem is that programming languages and compilers want to model something like "add" as an operation which has two inputs, one output, and no side effects. The need to support flags conflicts with this. Declaring that flags don't cross function boundaries or any other boundaries doesn't make the problem go away.

> AVX-512 will bring support for per-instruction static rounding to mainstream CPUs.

That is a relief – this might actually make rounding modes usable. Unfortunately, it will only be usable on very new hardware, meaning that it's pretty hard to actually use in a language. Of course, as you point out, this is more of a hardware issue than a spec issue, but ultimately, I see one of the major responsibilities of a good spec to be ensuring that compliant hardware has all the primitives necessary to use features like rounding modes effectively. In my view, IEEE 754 has failed in this area. If a new version of the spec were to fix this, we would be in great shape – in 20 years.

The wonderful thing about very new hardware is that in a few years, it's mainstream hardware, and a few years later it's essentially all hardware.

Unsurprisingly, it's the hardware manufacturers who block most new requirements on hardware that IEEE-754 might want to add. They very much want the committee to standardize existing practice, and short of reforming the membership rules, there's very little that could be done to prevent them from blocking changes.

The mutable global state problem is hard, and its implications are woven throughout the spec. For example, pow(nan, 0) is 1, not nan, because that's slightly more convenient in some cases, and it's assumed that you can always check the Invalid flag to see if any nans were produced and swallowed.
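
As one concrete data point, Python's math.pow documents two cases in which a NaN operand is swallowed: a zero exponent and a base of exactly 1. A quick check (the variable names are just labels for this example):

```python
import math

nan = math.nan

swallowed_by_zero_exponent = math.pow(nan, 0.0)  # 1.0
swallowed_by_unit_base = math.pow(1.0, nan)      # 1.0
propagated = math.pow(0.0, nan)                  # nan
```

Python's float type gives no portable way to read the Invalid flag afterwards, so a caller cannot tell that a NaN went in.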

At the same time though, it's not designed that way accidentally or in ignorance of the problems it creates. IEEE-754 knew that programming languages wouldn't be very happy about global state, and chose to keep it because they believed it was still the best approach. In many other areas, IEEE-754 pushed against people who said it would be too hard to implement, and in retrospect they ended up being right in many cases. It's tempting to wonder if global mutable state really was too much of a tradeoff though, in retrospect.

> they ended up being right in many cases

For clarity, is 'they' referring to IEEE-754 or the people who said it was too hard to implement?

I meant that IEEE-754 got a lot of stuff right, from our current perspective.

Very actionable suggestions. Hopefully some of the ieee folks are reading.

Can't we write an open-source description of IEEE 754 which resolves the issues? As long as it doesn't describe things the same way the spec does, there should be no legal problems, right?

How are you going to create a free version of the spec without making it derivative work? If you read the original and you paraphrase it, that's derivative work, so it would have the same copyright as the original.

This is a rather poor rant. There's pretty much an easy answer to everything he describes that's been done in the academic world, and in other programming languages for decades, and almost certainly in his own programming language for similar problems.

    Can I copy it into my own spec or would that be 
    copyright infringement? Can I write something myself  
    that means the same? Or do I have to link to the spec 
    and require my users to pay to read it? 
Giving a summary is allowed even in the most draconian of interpretations of copyright. More loosely, pulling a quote of a few lines is standard for any sort of academic exercise; essays on novels certainly don't have you pull out the novel when they want to quote a bit of text.

    What if your language doesn’t have exceptions? What 
    does it mean to provide a status flag? 
You certainly have places elsewhere in your language for handling errors. How do you handle integer divide by zero? How do you handle timeouts when doing networking? How do you handle disk full errors? I hope you don't go stomping along after an error in those cases.

    This may have made sense at the time of the original 
    spec in 1985 but it’s mutable global state – something 
    that has long been recognized as problematic and which 
    languages are increasingly moving away from.
This is because statelessness is a leaky abstraction. Your machine's saving to disk, it's allocating memory for other things, it's pulling in stuff and shooting stuff over the network. Floating point has lots of knobs that you need to turn because there are various rounding rules for various problem domains. You've almost certainly got a type system for your language, perhaps your answer is to create, or have a way to create, a floatRoundUp type for when a user needs to go beyond your default rules. In a pure functional definition, I think one way to consider the problem is to think of a floating point type as the actual number and the state registers. If you don't care, then just use the number part, if the state register matters, put those in too. Saving flags means that you can get repeatability. With a given set of flags, you'll always get the same answer.
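
One way to read the "number plus state registers" idea is as a pure function that takes the rounding mode as an argument and returns the flags alongside the value. A minimal sketch using Python's decimal module; the Result type and the divide signature are hypothetical.

```python
from dataclasses import dataclass
from decimal import Context, Decimal, Inexact, ROUND_CEILING

@dataclass(frozen=True)
class Result:
    value: Decimal
    inexact: bool  # the "state register" part, returned instead of stored globally

def divide(a, b, rounding, prec=6):
    ctx = Context(prec=prec, rounding=rounding)  # fresh, private state per call
    value = ctx.divide(a, b)
    return Result(value, bool(ctx.flags[Inexact]))

first = divide(Decimal(1), Decimal(3), ROUND_CEILING)
second = divide(Decimal(1), Decimal(3), ROUND_CEILING)
```

Callers who don't care can read only .value; the same inputs and mode always give the same Result.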

    Program block? Attribute specification? Nothing else in 
    my language works this way. How do I even apply this in 
    practice? If attributes are lexically scoped then that 
    would mean rounding modes only apply to code textually 
    within the scope – if you call out to a function its 
    operations will be unaffected. If on the other hand 
    attributes are dynamically scoped then it works with 
    the functions you call – but dynamically scoped state 
    of this flavor is really uncommon these days, most 
    likely nothing else in your language behaves this way.
There are two ways of handling this problem. Decide for the user, or expose the functionality needed to provide scoped floating point. The most "pure" way sounds like using the aforementioned type system to give defaults, but have a way for the user to tweak the rules as needed. You're the designer, one of your jobs is to make these sorts of decisions for the user. Things like your i/o library probably have similar problems, I don't see why floating point is all that much different.

A quick explanation of my downvote: You seem to be mistaking "I know the answer" for "everybody should know the answer".

You could have written basically the same reply, but starting out with, "Hey, those are good questions. Because of my time in the academic world, I happen to know some of the answers. Let me see if I can help."

But instead you had to be a dick. That's undeniably fun, but contempt for people asking reasonable questions doesn't make them smarter; it just stops them from asking questions.

I think what set me off was certain statements that he was coming from a position of authority, like that he was a language design guy. At the same time, he didn't give any statements to back up how deeply he's researched the issues. For example, there's no discussion of C's fenv.h, or how Fortran's IEEE intrinsics work. Add to that the jabs on the age of the specification, and it just ended up giving the feeling of being a much more shallow piece than it could have been; it just felt like he was saying, "I don't know what to do. Internet, decide for me."

I thought it was mainly a post about how he personally found this standard frustrating. And I started getting that impression about the time I read the title, "Why I find IEEE 754 frustrating".

The guy has a reasonable background as a language guy. He was trying to deal with the spec; he's allowed to opine on it.

That you felt things? Those are your feelings. That you think he should be obliged to address your personal concerns preemptively? That's your problem, not his. If you want to know how deeply he's researched the issues, you could say, "Hey, have you looked at C's fenv.h?" Rather than just assuming the answer that lets you be a dick.

I'm not claiming authority. What I'm saying is: there are two camps, the FP guys and the language guys, and there's not a huge amount of overlap. I belong to the language camp.

$88 is not a lot of money for a professional software person, especially if you can get your employer to pay for it.
