Hacker News new | past | comments | ask | show | jobs | submit login
0.30000000000000004 (30000000000000004.com)
763 points by beznet 5 days ago | hide | past | web | favorite | 399 comments





The big issue here is what you're going to use your numbers for. If you're going to do a lot of fast floating point operations for something like graphics or neural networks, these errors are fine. Speed is more important than exact accuracy.

If you're handling money, or numbers representing some other real, important concern where accuracy matters, most likely any number you intend to show to the user as a number, floats are not what you need.

Back when I started using Groovy, I was very pleased to discover that Groovy's default decimal number literal was translated to a BigDecimal rather than a float. For any sort of website, 9 times out of 10, that's what you need.

I'd really appreciate it if Javascript had a native decimal number type like that.


Decimal numbers are not conceptually any more or less exact than binary numbers. For example, you can't represent 1/3 exactly in decimal, just like you can't represent 1/5 exactly in binary.

When handling money, we care about faithfully reproducing the human-centric quirks of decimal numbers, not "being more accurate". There's no reason in principle to regard a system that can't represent 1/3 as being fundamentally more accurate because it happens to be able to represent 1/5.


Money are really best dealt with as integers, any time you'd use a non-integer number, use some fixed multiple that makes it an integer, then divide by the excess factor at the end of the calculation. For instance computing 2.15% yearly interest on a bank account might be done as follows:

  DaysInYear = 366
  InterestRate = 215
  DayBalanceSum = 0
  for each Day in Year
    DayBalanceSum += Day.Balance
  InterestRaw = DayBalanceSum * InterestRate
  InterestRaw += DaysInYear * 5000
  
  Interest = InterestRaw / (DaysInYear * 10000)
  Balance += Interest
Balance should always be expressed in the smallest fraction of currency that we conventionally round to, like 1 yen or 1/100 dollar. Adding in half of the divisor before dividing effectively turns floor division into correctly rounded division.

This is called fixed-point arithmetic:

https://en.wikipedia.org/wiki/Fixed-point_arithmetic

> In computing, a fixed-point number representation is a real data type for a number that has a fixed number of digits after (and sometimes also before) the radix point.

> A value of a fixed-point data type is essentially an integer that is scaled by an implicit specific factor determined by the type.


Yeah, though that notion tends to come with some conceptual shortcomings, like presuming a power of 10 radix. In the above code the radix is implicitly different on leap years, applying such tricks is usually not possible with a fixed point library or language construct.

Sounds like fractions cleanly describe what you're saying?

But that practically holds only for a reasonable amount of simple arithmetics. Fractional components tend to grow exponential for many numerical methods repeated multiple times. This can happen if you're describing money and want to apply a complex numerical method from an economics article for whatever purpose. Might be worth it but be careful not to carry ever expanding fractions in your system.


This only for dealing with actual money, generally our banking systems have rounding rules that prevent the fractions from getting out of hand.

If you are running an economic simulation you generally don't have to worry about rounding, the whole thing is only approximate anyway.


Yup. Once worked on a big project with one of the largest US exchanges. We were migrating large OTC (over the counter) CDS (credit default swaps) contracts to standardized centralized contracts. We were testing with large contracts, millions of contracts worth trillions of dollars. I was off by a single penny and failed the test. Took a while to find, but it was due to a truncate to zero instead of a proper round. I was using a floating point type instead of a proper decimal. Dont think the language I was using had a proper decimal type at the time, though it does now, 11 years later.

>Money are really best dealt with as integers

I wish I could up vote you more than once. You are bang on.


The real lesson is, no matter what base (radix) you use, floating point math is inexact.

The value of floating point is that it can represent extremely huge or extremely infinitesimal values.

If you're working with currency / money, floating point is the wrong thing to use. For the entire history of human civilization, currency has always been an integer type, possibly with a fixed decimal point. Money has always been integers for as long as commerce has existed, and long before computers.

If you're building games, or AI, or navigating to Pluto, then floating point is the tool to use.


> The real lesson is, no matter what base (radix) you use, floating point math is inexact.

This is just not true. If you add 1.5 + 4.25 with IEEE754, there is nothing inexact or rounded. That you cannot exactly represent 0.1 in base2 FP is a problem of base2, not FP.

You get inexact results with FP math for underflows, overflows, or if you don't have enough precision for the result (or an intermediate result). But the same is true for normal integer types.


I think what that commentator meant is that floating-point math is not an accurate model of rational-number arithmetic, not that there aren't certain computations that are in fact exact. (As you point out, there are: 1.5 + 4.25 is indeed exact)

> is that floating-point math is not an accurate model of rational-number arithmetic

Well, this is true. But integer math is also not an accurate model of rational-number arithmetic, yet nobody would claim that integer math is inexact.


Unsigned integer math (on typical machines) is an exact model of the ring of integers modulo 2^64. Floating point arithmetic is not an exact model of anything with nice properties that people are used to from algebra.

> Integer math (on typical machines) is an exact model of the ring of integers modulo 2^64.

And even this is only true if you retrict yourself to unsigned integers. For signed integers you have quirks (-0x8000.. = 0x8000..) or minefields (undefined overflow semantics in C, which can yield non-associativity, tests deleted by the compiler, etc.).

And I'd argue that whoever understands the ring of integers modulo 2^64, will also understand the IEEE754 semantics (which are, I agree, sometimes unfortunate. But not inexact).


> And even this is only true if you retrict yourself to unsigned integers

Fair point. I've edited my comment to include the word "unsigned".

> I'd argue that whoever understands the ring of integers modulo 2^64, will also understand the IEEE754 semantics

I'm an existence proof that that is not true :). Although I'm sure I could learn the IEEE754 semantics if I put enough effort into reading the spec.

But even if they don't know the word "ring", I think most programmers do understand how modulo arithmetic works, and they have algebraic intuitions about it that turn out to be true: both operations are commutative and associative, multiplication distributes over addition, equality of a forumla involving * and + is true if it's true in the actual integers, and so on.


>> I'd argue that whoever understands the ring of integers modulo 2^64, will also understand the IEEE754 semantics

> I'm an existence proof that that is not true :). Although I'm sure I could learn the IEEE754 semantics if I put enough effort into reading the spec.

This was sloppy writing on my side. I wanted to say "whoever understands the ring of integers modulo 2^64, can also understand". And I'm sure you could :)

And you don't even have to read the spec. The core idea (mantissa, exponent, and sign) is super easy and writing a FP emulation for addition and multiplation is a really nice task to understand what is actually going on. The only really unfamiliar idea is binary fractions and I think this is a cool idea to understand on its own.

> But even if they don't know the word "ring", I think most programmers do understand how modulo arithmetic works, and they have algebraic intuitions about it that turn out to be true: both operations are commutative and associative, multiplication distributes over addition, equality is true if it's true in the actual integers, and so on.

Well that is all fine but scrolling back to the grand grand grand parent: That would also be a completely wrong abstraction to model financial stuff. I'm not saying FP is the solution, but for sure modulo arithmetic is also how you not want to do finance :)


I think the big difference is that integers are accurate within a well-defined range, in a way that's easy to understand. Floating points work within a much larger range, but are inaccurate in most of that range, and it's harder for people to understand why.

There's no such thing as a "problem of base2". Base 2 is an ineffable fact of the universe, and it is neither virtuous nor problematic. All the problems you are describing are problems of floating-point arithmetic.

> There's no such thing as a "problem of base2".

That you cannot represent 1/3 as a non-periodic decimal number is a problem of base 10.

That you cannot represent 1/10 as a non-periodic binary number is a problem of base 2.

These are just mathematic facts. Maybe you don't like the world "problem", but it does not change that this is where we are.

The problem that you cannot represent 0.1 in base 2 FP, is a problem of base 2. You can represent it exactly in base 10 FP.


A 32 bit floating point number can only have around 4 billion unique values, yet must represent numbers from 10^38, to very small decimals. 99.99999% of numbers in this range cannot be accurately represented in floating point form.

Compare that to a 32 bit integer, which can have 4 billion unique values, and supports numbers from 0 to 4 billion. It's a 1:1 mapping.


To be mathematically pedantic, 100% of numbers in that range cannot be accurately represented in floating point form.

> yet must represent numbers from 10^38

No, they don't must represent all number in the range. I don't know where you get from that they must. An integer also can't represent all real numbers in its range.


> Decimal numbers are not conceptually any more or less exact than binary numbers.

True but irrelevant. The problem isn't with the math fundamentals, it's the programmers.

The issue is if you get your integer handling wrong it usually stands out. Maybe that's because integers truncate rather than round, maybe it's because the program has to handle all those fractions of cents manually rather than letting the hardware do it so he has to think about it.

In any case integer code that works in unit tests usually continues to work, but floating point code passing all unit tests will be broken on some floating point implementations and not others. The reason is pretty obvious: floating point is inexact, but the implementations contain a ton of optimisations to hide that inexactness so it rarely raises it's ugly head.

When it does it's in the worst possible way. In a past day job I build cash registers and accounting systems. If you use floating point where exact results are required I can guarantee you your future self will be haunted by a never ending stream of phone calls from auditors telling you code that has worked solidly in thousands of installations over a decade can not add up. And god help you if you ever made the mistake of writing "if a == b" because you forgot a and b are floating point. Compiler writers should do us all a favour and not define == and != for floating point.

Back when I was doing this no complier implemented anything beyond 32 bit integer arithmetic, in fact there was no open source either. So you had to write a multi precision library and all expression evaluation had to be done using function calls. Despite floating point giving you hardware 56 bit arithmetic (which was enough), you were still better off using those clunky integers.

As others have said here: if you need exact results (and, yes currency is the most common use case), for the love of god do it using integers.


> If you're going to do a lot of fast floating point operations for something like graphics or neural networks, these errors are fine. Speed is more important than exact accuracy.

Um... that really depends. If you have an algorithm that is numerically unstable, these errors will quickly lead to a completely wrong result. Using a different type is not going to fix that, of course, and you need to fix the algorithm.


From your description, I fail to understand how does it depend. You're saying that the algorithm is wrong, and changing the type doesn't help. If the type is not the issue, what difference does it make?

A single problem can be solved by using many different algorithms.

However, even though algorithm A and B are "correct" they can behave differently when rounding errors are introduced.

For example – if algorithm A uses

https://en.wikipedia.org/wiki/Kahan_summation_algorithm

and B uses naive summation then you can expect the end result of A to be more precise than the end result of B – even though both algorithms are correct.


In the world of money, it is rare to have to work past 3 decimal places. Bond traders operate on 32nds, so that might present some difficulties, but they really just want rounding at the hundreds.

Now, when you’re talking about central bank accruals (or similar sized deposits) that’s a bit different. In these cases, you have a very specific accrual multiple, multiplied by a balance in the multiple hundreds of billions or trillions. In these cases, precision with regards to the interest accrual calculation is quite significant, as rounding can short the payor/payee by several millions of dollars.

Hence the reason bond traders have historically traded in fractions of 32.

A sample bond trade:

‘Twenty sticks at a buck two and five eights bid’ ‘Offer At 103 full’ ‘Don’t break my balls with this, I got last round at delmonicos last night’ ‘Offer 103 firm, what are we doing’ ‘102-7 for 50 sticks’ ‘Should have called me earlier and pulled the trigger, 50 sticks offer 103-2’ ‘Fuck you, I’m your daughter’s godfather’ ‘In that case, 40 sticks, 103-7 offer’ ‘Fuck you, 10 sticks, 102-7, and you buy me a steak, and my daughter a new dress’ ‘5 sticks at 104, 45 at 102-3 off tape, and you pick up bar tab and green fees’ ‘Done’ ‘You own it’

That’s kinda how bonds are traded.

Ref: Stick: million Bond pricing: dollar price + number divided by 32 Delmonicos: money bonfire with meals served


I'm curious about the "off tape" part. Presumably this means not on a ticker or not made public somehow - how are these transactions publicized and/or hidden?

Hear, hear! It would be great if javascript had any integral type that we could build decimals, rationals, arbitrarily-large integers and so on off. It’s technically doable with doubles if you really know what you’re doing, but it would be so much easier with an integral type.

ES does have an arbitrarily large integer type, BigInt.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...


It’s not supported everywhere though, so it’s not like you could use it to actually build a library, you would need to use something that fell back to Doubles anyway.

Because the double type can guarantee accurate reproduction of values up to the size of its mantissa (52 bits) you can effectively use than as integers up to that size. It would be nice to be able to just have an integer directly though as that would be more efficient

IIRC some JS engines are capable of detecting many circumstances where floating-point is not needed, particularly for simple cases like loop counters, and their JiT compilers will produce code that uses integer values instead of floats for those purposes - but how reliable that is for cases any more complex than that I don't know.


Though the lack of support in IE, current Edge, and Safari, blocks that from client-side use for many.

There are several BigInt libraries out there that you could use, though obviously this is not as convenient and even if they wrap BigInt when available will be less efficient.


Latest Edge dev preview has supported it since the switch to Chromium. The Chromium-based Edge launches on Jan 15th, at which point Edge will support it.

Safari (WebKit) actually has a fully working implementation, they just haven't shipped it yet. Search the release notes for "BigInt": https://developer.apple.com/safari/technology-preview/releas...


How is a true integer easier than just pretending a double is an integer? In both cases, you have to be aware of the range of values they can hold to prevent overflow (integers) or rounding (doubles), and you have to be careful not to perform operations that aren't valid for integers to avoid truncation (integers) or non-zero decimal places (doubles).

'Decimal' is a red herring. The number base doesn't matter. (And what are you going to do when you need currency coversions, anyways?)

Floats are a digital approximation of real numbers, because computers were originally designed for solving math problems - trigonometry and calculus, that is.

For money you want rational numbers, not reals. Unfortunately, computers never got a native rational number type, so you'll have to roll your own.


Historically, it's correct-but-too-vague to say computers were for "solving math problems". Historic computer problems should be divided into two types: business problems and scientific/engineering problems. Business problems include things like tabulation and accounting. Programmable digital computers go back at least as far as UNIVAC I, in 1951 (using programmable digital computers for science doesn't go back THAT MUCH farther).

Prior to the IBM/360 (1964), mainframes sold for business purposes generally had no support for floating point arithmetic. They used fixed-point arithmetic. At the hardware level I think this is just integer math (I think?), but at a compiler level you can have different data types which are seen to be fractions with fixed accuracy. I believe I've read that COBOL had this feature since I-don't-know-how-far-back.

This sort of software fixed-point is still standard in SQL and many other places. Some languages, and many application-specific frameworks, have pre-existing fixed-point support. So it's also not accurate to say that you necessarily need to roll your own, though certainly in some contexts you'll need to.

And for money, you very much do not want arbitrary rational numbers. The important thing with money is that results are predictable and not fudgable. The problem with .1 + .2 != .3 is not that anyone cares about 4E-17 dollars, it's that they freak out when the math isn't predictable. Using rationals might be more predictable than using floats, but fixed-point is better still. And that's fixed-point base-10, because it's what your customers use when they check your work.


Agree that rational isn't it. But "reproducing the existing quirks" seems like an accurate description. If you want to pay 7% APR on month-end balances, then that's a real-number calculation, but to match what customers expect you need in addition to specify when to round off to cents.

I enjoy Haskell's approach to numbers.

The type of any numeric literal is any type of the `Num` class. That means that they can be floating point, fractional, or integers "for free" depending on where you use them in your programs.

`0.75 + pi` is of type `Floating a => a`, but `0.75 + 1%4` is of type `Rational`.


Hm... what happens if you've got a neural network trained to make decisions in the financial domain?

Is there a way to exploit the difference between numeric precision underlying the neural network and the precision used to represent the financial transactions?


Neural networks are by their very nature a bit vague, random and unpredictable. Their output is not suitable as a direct, real monetary value you can rely on. At best, they predict trends, approximations or classifications.

> I'd really appreciate it if Javascript had a native decimal number type like that.

Was proposed in the late 90's Mike Cowlishaw but the rest of the standards committee would have none of it.


A new proposal for adding arbitrary-precision Decimal support to JavaScript is being presented at TC39 this week.

Proposal: https://github.com/littledan/proposal-bigdecimal

Slides: https://docs.google.com/presentation/d/1qceGOynkiypIgvv0Ju8u...


I'd agree for saner defaults, especially in web development. I can understand that if you want to have strictly one number type it may make sense to opt for floating point to eke out the performance when you do need it, but I'd rather see high-precision as the default (as most expect that you'd be able to write an accurate calculator app in JavaScript without much work) and opt-in to the benefit of floating point operations.

MS Excel tries to be clever and disguise the most common places this is noticed.

Give it =0.1+0.2-0.3 and it will see what you are trying to do and return 0.

Give it anything slightly more complicated such as =(0.1+0.2-0.3) and this won't trip, in this example displaying 5.55112E-17 or similar.


Are you sure it is not showing the exact answer because the the the cell precision set to a single decimal digit?

Yup: https://i.imgur.com/VuawaE1.png, on Excel v1911 (Build 12228.20332).

Kahan (architect of IEEE 754) has a nice rant on it:

https://people.eecs.berkeley.edu/~wkahan/Mind1ess.pdf

(and plenty of other rants...:

https://people.eecs.berkeley.edu/~wkahan/ )


I remember in college when we learned about this and I had the thought, "Why don't we just store the numerator and denominator?", and threw together a little C++ class complete with (then novel, to me) operator-overloads, which implemented the concept. I felt very proud of myself. Then years later I learned that it's a thing people actually use: https://en.wikipedia.org/wiki/Rational_data_type

An other compromise in to use fixed point which is effectively a rational with a fixed denominator. Extremely popular on machines which can handle integer arithmetics but not floating point (since you can trivially do fixed-point arithmetics using integer operations, you just need to be very careful when you handle overflows). If you look at the code of old school games (including classics like Doom if memory serves) the game engine used fixed-point to work on commodity hardware without FPU.

There's also BCD (binary coded decimal) that can solve some problems by avoiding the decimal-to-binary conversions if you're mainly dealing with decimal values. For instance 0.2 can't usually be represented in binary but of course it poses no problem in BCD.


Beware that BCD, and decimal in general, accumulates roundoff error at a much higher rate than binary, if you do any inexact operations.

It is more common these days to use base-1000, instead, when you need exact decimal representations. You can fit three base-1000 "digits" in a 32-bit word, with two bits left over for sign plus any other flag you find useful. (One such use could be to make a zero in the second place indicate that the rest of the word is actually binary; then regular arithmetic works on such words.) Calculations in base-1000 are quite a lot faster than BCD.

Almost always when people think they need decimal, binary -- even binary floating-point, if the numbers are small enough -- is much, much better. Just be sure to represent everything as an integer number of the smallest unit, say pennies; and scale (*100, /100) on I/O.


"Much, much better" in what sense? Just performance?

Performance, correctness, and maintainability. The amount of code needed is very small, and uses native instructions for the work, which are pretty well-tested.

Fixed/floating is an interesting tradeoff for many real-time strategy games too where changes in game state are a synchronized simulation. Fixed point math in software can give more reliable and cross-platform math operations, but with a performance cost (eg: Homeworld: Deserts of Kharak). Using the CPU's floating-point hardware is faster, but you often have to ensure the correct CPU registers are set before doing calculations and those registers can be changed by other software such as a DirectX driver or the operating system (eg: Age of Empires II, Rise of Nations. etc).

I currently build deterministic multiplayer WebGL games in Unity, built via C#->IL2CPP->Emscripten->WASM. The server is the same code base running on Microsoft's .Net runtime.

The chances of being able to run deterministic floating point calculations across this stack is basically zero (even leaving aside that the games are often run on ARM chips), and so we use this library when floats are absolutely necessary (but more often just plain longs):

https://github.com/asik/FixedMath.Net

It is a little terrifying that e.g. normalizing a vector involves a while loop, but all things considered the whole thing runs surprisingly well.

(I agree with everything in your post, just thought I could add a real world field report)


We also built and shipped a deterministic multiplayer WebGL game[1], but using CoffeeScript[2] + C++ -> Emscripten/dylib/DLLs to run the game in the browser and on Windows and Mac.

Our game would snapshot the entire game state every few seconds and send that back to server to detect desyncs and cheaters. Floating point math, to our astonishment, was not the source of any non-determinism.

I'm 80% sure that only source of non-determinism we encountered were from trig functions, so we just hard-coded lookup tables.

1: https://guardiansofatlas.com/

2: It was 2012 when we started.


You use that library when you want fractional values right? That is, numbers with a binary point but not floats.

For the most part, I use longs (for instance a FixedVec is a (long,long,long) struct where 1 = 1/1000 of a meter).

However, complicated calculations or anything involving angles or other math functions quickly becomes more convenient when expressed as a Fix64, which is more or less a drop in replacement for float.

I would ideally use Fix64 everywhere, but given the torturous route the C# takes to be transformed into something that's executed on the client machines, my faith in the compiler's ability to generate good code for that is basically zero. I mentally treat long + long as a single instruction, but Fix64 + Fix64 as a function call.


That's rough, fortunately for my own projects I'm only doing Unity on desktop, so I haven't had to go this far.

Even something simple like multiplying up and dividing down quickly adds a lot of overhead, and when running on mobiles you really need all the speed you can get.


> There's also BCD (binary coded decimal) that can solve some problems by avoiding the decimal-to-binary conversions if you're mainly dealing with decimal values. For instance 0.2 can't usually be represented in binary but of course it poses no problem in BCD.

BCD is/was super common in measurement equipment for internal calculations for this reason, and also because it is trivial to format for display (LED/LCD/VFDs) or text output (bus system, printer/plotter).


Many CPUs support BCD, at least in a limited number of ways compared to their normal binary representation.

The 8086 (and its descendants, of course) supports BCD by having instructions to adjust the result after the basic add/sub/mul/div instructions, though only one byte at a time.

The 6502's add and subtract instructions would operate on, and output, BCD values if the special purpose "decimal" flag was set. Again only in 8-bit (two digit) chunks but that is to be expected as it was an 8-bit chip generally.


It's actually in use in many places, for things like handling currency and money, and for when you get funny corner cases involving rounding such numbers and pooling the change.

Whenever I see someone handling currency in floats, something inside me wither and die a small death.


> Whenever I see someone handling currency in floats, something inside me wither and die a small death.

Meh. When used correctly in the right circumstances it is acceptable to use floats.

Here's an example. Suppose you are pricing bonds, annuities, or derivatives. All the intermediate calculations make essential use of floating point operation. The Black–Scholes model for example requires the logarithm, the exponential, the square root, and the CDF of the normal distribution. None of that is doable without floats.

Even for simpler examples it is sometimes okay to use floats. If you only ever need to store an exact number of cents, you can totally store the number of cents in a double. Integer operations are exact using IEEE-754 double operations when they are smaller than 2^53-1 or so. There's usually no benefit of doing so, but hey it's possible.


Currency handling is almost never done with rationals (numerator and denominator) and is frequently (and correctly so!) done with fixed or floating point decimal types.

I develop accounting software for banks, brokerage houses and likes.

Currency, taxes, rebates, etc. handling is NEVER done with floating point.

Whatever you do with money you need predictable, reproducible results. It is norm that calculations are checked by software at two companies on both sides of transaction. Any discrepancies are alarms, bug reports, unhappy customers.

Every significant operation is exactly specified with rounding rules, etc.

For card payments and especially on terminals usually BCD is used.

For everything else usually some kind of arbitrary length decimal library (BigInteger, BigDecimal).


> Currency, taxes, rebates, etc. handling is NEVER done with floating point.

Nonsense. I’ve seen real banking code at reputable banks that uses floats.

> Whatever you do with money you need predictable, reproducible results.

Floats aren’t random. They’re perfectly deterministic, predictable and reproducible. If you do the same operation in two places you get the same result.


I write real banking code. There is definitely a banking code that uses floats, e.g. valuation of financial instruments. The parent comment talks about software that does transactions and “simpler” calculations, like taxes and fees etc.

When people talk about non-determinism of floating point, what they usually mean is non-associativity, that is (x+y)+z may not be exactly equal to x+(y+z).


> When people talk about non-determinism of floating point, what they usually mean is non-associativity, that is (x+y)+z may not be exactly equal to x+(y+z).

Good example of this, in Python 3:

    >>> (0.1 + 0.2) + 0.3
    0.6000000000000001
    >>> 0.1 + (0.2 + 0.3)
    0.6

Every single time you run those two statements, you’ll get the same result. Yes they're non-associative. But that's specified and documented. That's not the same thing as non-deterministic in any way.

Yet, in accounting, you are expected to be able to sum a set of numbers in different ways and still get the same result

Yes, sorry, I was just intending to highlight non-associativity :) I agree it's not "non-deterministic".

The same code might be optimised in different ways by different compilers, though (or the same compiler with different flags). This might lead to different results for the same code. In that sense, it's non-deterministic.

> The same code might be optimised in different ways by different compilers, though

It's not an optimisation if it changes the result! And if you use non-standard flags that's your problem.


What is and what is not optimization and what changes are allowed or not depends on the application.

MP3 is an optimization of WAV, yet it changes the result.

Some applications are ok with reducing precision of calculations because they are not sensitive enough to small inaccuracies or they take effort to control inaccuarcies.

For example, graphics applications are typically heavy in FP calculations and yet they tend to not care much about precision and much more about performance. For those applications reducing accuracy for slight performance increase is likely win.


> Floats aren’t random. They’re perfectly deterministic, predictable and reproducible. If you do the same operation in two places you get the same result.

That's not exactly true in real hardware, or at least it wasn't until ~10 years ago. With the x87 FPU, internal precision was 80 bits, while the x86 registers were at most 64 bits. So, depending on the way the program would transfer data between the CPU and FPU your could get different results. It is very likely that different compilers and different optimization decisions could change the way these operations were implemented, so you would get slight differences between different versions of the software.

There are/were also several global FP flags that could get changed by other programs running on the same CPU/FPU that could impact the result of calculations. So, if you want 100% reproducible FP, you would have to either audit all software running on the same machine to ensure it doesn't touch those flags, or set the flags yourself for every FP calculation in your your program.


In a language like Java, all these factors are specified and fully deterministic.

False. Floating-point arithmetic in Java is generally nondeterministic. You will notice that the strictfp keyword exists and is off by default.

It's not false - strictfp mandates deterministic FP. If you use that your program will always run all floating point calculations in exactly the same way, full stop.

Secondly, on mainstream implementations, strictfp is already documented the same as default! They're planning to remove it anyway as it's a no-op in almost all cases.

See JEP 306.


> It's not false - strictfp mandates deterministic FP. If you use that your program will always run all floating point calculations in exactly the same way, full stop.

If you use it. Which is not the default. Your original claim remains false.


It does not matter. When you are doing accounting you are supposed to be able to sum large collections of numbers and get the same result regardless of the order.

That's something FP does not provide and it makes it completely unusable for accounting.


> regardless of the order.

That seems like a completely arbitrary requirement. Do accounting laws prohibit sort? Does 1 + 1 have to equal green on Tuesdays?


It seems you have no idea what double-side accounting is.

Each operation is accounted on two opposite sides of various account in a way that always keeps sides balanced (ie. they must sum up to the same value).

When you go to your bank account, for example, you have various sums on both sides of your account. Yet when you sum them up they MUST agree or you will be crying blood and suing your bank.


True, that's a good point. I was thinking of C & C++, but you're right, newer languages do a much better job of specifying and controlling this behavior.

Wonder if JS does something similar or not.


All major C/C++ compilers implement IEEE754. If you are telling the compiler to disregard it, that is on you.

It's not about IEEE754, it's about the precision that the FP co-processor offers. The results you get are correct per IEEE754, it's just that they may have even less error than required by IEEE754 in some cases. But, this is enough to make the results non-deterministic between different compilation options.

Also, changes applied to the FP co-processor by other processes on the machine could impact your process, regardless of your own compilation settings.


Are you talking about x87?

That's ancient history. Compilers don't use that instruction set any more in normal operation.

GCC, Java, LLVM, etc, will normally emit SSE2 in order to be standards compliant. They will only relax this if you tell them to, then it's your problem.


Yes, I was explicitly talking about the x87, and did mention that it has stopped being relevant for at least 10 years.

I believe there is still quite a bit of cautionary discussion of floating point numbers that was written in the age of the x87, so it's important to understand that people were not just misunderstanding IEEE754, even though their concerns are no longer applicable to modern hardware.


I did not say floats are random. But when you do accounting you need to be able to sum large sets of numbers and compare results with another sum of different numbers and the sum must match. This just does not work with FP.

Poor souls that use FP for accounting are scourge of the industry and source of jokes.


That's what I used to think, then I met these banking types, and they told me 'no we understand their semantics and we use them correctly and we know it is safe for our programs.' These teams have compiler experts on them - they aren't ignorant.

I started working on accounting software in 2002 and right now work for Citi. Compiler experts in accounting? If you are doing HFT you are not doing accounting. Accounting is what happens later when all those transactions need to actually be accounted for and balance calculated

If you rely on compiler implementations for accounting, you're already lost.

For anything imprecise and scientific, doubles will normally work well.

Accounting rules regarding truncation and rounding as specified, seems unaccounted for by most until they meet such stringent reqs.


You're confusing foreign exchange conversion with accounting arithmetic. Two different things.

This is false. It's not correct to handle currency with floating point types.

I don't see any problem with it if it's decimal. Here's an accepted answer on stack overflow with hundreds of upvotes recommending the use of `decimal` to store currency amounts in C#. That's a decimal floating point type.

https://stackoverflow.com/a/693376/44743


They said floating point decimal types which probably means BCD.

There are different implementations, and BCD is only one of them. Another popular one is a mantissa and exponent, but the exponent is for a 10-based shift rather than the typical floating point.


Tbey mean radix-10 floating point, as compared to the radix-2 floating point you are thinking of. The packing of the decimal fractional digits in the significand of a radix-10 FP number need not be in BCD, it can use other encodings (e.g., DPD or something else).

0.3 is exactly representable in radix-10 floating point but not radix-2 FP (would be rounded to a maximum of 0.5 ulp error as seen in the title), for instance, just as 1/3 = 0.3333... is exactly representable in radix-3 floating point but neither radix-2 or radix-10 FP, etc.


Right, it is not correct. But many programs do it wrong. If you just do a couple of additions the problem will never be noticed. It's easy to write a program that sums up 0.01 until the result is not equal to n * 0.01. Not at my computer now, so I can't do it again. I remember n was bigger to be relevant for any supermarket cashier. But of course applications exist where it matters.

But it is correct.

> It's easy to write a program that sums up 0.01 until the result is not equal to n * 0.01.

It's not easy to do that if you use a floating point decimal type, like I recommended. For instance, using C#'s decimal, that will take you somewhere in the neighborhood of 10 to the 26 iterations. With a binary floating point number, it's less than 10.


Of course with a decimal type there is no rounding issue. That's not what 0.30000000000000004 is about.

Many languages have no decimal support built in or at least it is not the default type. With a binary type the rounding becomes already visible after 10959 additions of 1 cent.

  #include <stdbool.h>
  #include <stdio.h>
  #include <string.h>
  
  bool compare(int cents, float sum) {
    char buf[20], floatbuf[24];
    int len;
    bool result;
    
    len = sprintf(buf, "%d", cents / 100 ) ;
    sprintf(buf + len , ".%02d" , cents % 100 ) ;
    sprintf(floatbuf, "%0.2f", sum) ;
  
    result = ! strcmp(buf, floatbuf) ;
    if (! result)
      printf( "Cents: %d, exact: %s, calculated %s\n", cents, buf, floatbuf) ;
    return result;
  }
  
  int main() {
    float cent = 0.01f, sum = 0.0f;
  
    for (int i=0 ; compare(i, sum) ; i++) {
      sum += cent;
    }
    return 0;
  }
Result:

  Cents: 10959, exact: 109.59, calculated 109.60
This is on my 64 bit Intel, Linux, gcc, glibc. But I guess most machines use IEEE floating point these days so it should not vary a lot.

That is simply not true. The C# decimal type doesn't accumulate errors when adding, unless you exceed its ~28 digits of precision. E.g. see here: https://rextester.com/RMHNNF58645

> unless you exceed its ~28 digits of precision

Precisely. That's why I specified ~ 10^26 addition operations.


It's not correct, but it happens anyway, even in large ERP systems that really should know better but somehow don't.

It is correct! Using decimal types is the widely recommended way of solving this problem. That includes fixed and floating point types. The problem is using base-2 floating point types, since those are subject to the kinds of rounding errors in the OP. But decimal floating point types are not subject to these kinds of rounding errors.

But they still can't precisely represent quantities like 1/3 or pi.


> Using decimal types is the widely recommended way of solving this problem.

No, it's not. The widely recommended way of solving this problem is to use fixed-point numbers. Or, if one's language/platform does not support fixed-point numbers, then the widely recommended way of solving this problem is to emulate fixed-point numbers with integers.

There is zero legitimate reason to use floating-point numbers in this context, regardless of whether those numbers are in base-2 or base-10 or base-pi or whatever. The absolute smallest unit of currency any (US) financial institution is ever likely to use is the mill (one tenth of a cent), and you can represent 9,223,372,036,854,775,807 of them in a 64-bit signed integer. That's more than $9 quadrillion, which is 121-ish times the current gross world product; if you're really at the point where you need to represent such massive amounts of money (and/or do arithmetic on them), then you can probably afford to design and fabricate your own 128-bit computer to do those calculations instead of even shoehorning it onto a 64-bit CPU, let alone resorting to floating-point.

Regardless of all that, my actual point (pun intended) is that there are plenty of big ERP systems (e.g. NetSuite) that use binary floating point numbers for monetary values, and that's phenomenally bad.


It's not correct, but in many cases it's plenty accurate

If you are dealing with other people’s money, the only accurate is accurate. Close enough should not be in any financial engineer’s mindset, imho.

In this case, it's both. Decimal floating point types do not lose precision with base-10 numbers, unless using trig, square roots, arbitrary division and the like.

> arbitrary division

Like commonly happens doing financial calculations, especially doing interest calculations.


It's not always terrible. I've seen doubles appropriately used in cases where performance was paramount, and floating point error was either not relevant or less important.

That said, yeah, when working with money in situations where money matters, some sort of decimal or rational datatype should be the rule, not the exception.


Storing money in floating point is always terrible. If speed is an issue, store it in integer types representing the smallest unit in the currency, e.g. pennies.

Unless you’re doing, what, massively parallel GPU algos on batches of independent amounts? But even then you could use the float as an int in that way... Honestly when is float ever actually good for money? Not for speed, not for correctness, ...


I think you mean that storing money in floating point is always terrible for accounting. Not all of finance is accounting.

Imagine you work at a hedge fund, and you have a model that predicts the true value of some option. Assume the option is trading for $3.00. You do not really care if your model spits out $3.5 or $3.5000000001, you are going to buy either way. And your model probably involves a bunch of transcendental functions or maybe even non-deterministic machine learning, so it's not really meaningful to expect it to be “exact” to some decimal or even rational value.

Even more saliently, you probably don't care whether your model outputs 2.9999999 or 3.000000 or 3.000001, either, because in any of those cases the actual correct interpretation is “we’re just not sure whether to buy or not”.

I think a good first-order characterization of domains where floating point can safely be used is “when the difference between < and <= is not very meaningful” (in calculus terms: when “how meaningful is a difference of `x`” is a continuous function of `x`).


I think the "floating point are bad for storing currencies" is one of the most common misconception about floating point.

Most people don't realize that the IEEE-754 single precision floating point represent real numbers with 9 decimal digits (or 23 binary digits). The double, on the other hand, represents the real numbers with 17 decimal digits.

This means that the double error UPPER BOUND is (0.00000000000000001)/2 per operation. But in reality the error is lower because of the rounding operations.

Also, it is posssible to extend the range using denormals, but most (all?) compilers disable them when compiling with anything other than O0 to avoid performance degradation.

The overheads associate with dealing with non-float types for most applications might not be worth it the cost and risk. If course, if the language are working with provides a currency type, go for it. But if doesn't , there is no need to worry.


> Most people don't realize that the IEEE-754 single precision floating point represent real numbers with 9 decimal digits (or 23 binary digits). The double, on the other hand, represents the real numbers with 17 decimal digits.

No, they don't. They merely can be converted back to decimal with those numbers of significant digits without loss of information.

That is important because (a) if this matters, you have to make sure you actually control the number of significant digits when converting to decimal, or you might end up with a different decimal, and (b) the operations that you do on the floats do not reliably behave as if there was the supposedly represented decimal number stored in them.

Now, sure, you can use floats for currency, if you know what you are doing, but the point of the warning against it is that you have to know what you are doing, and chances are you don't, or if you do, then you know where you can ignore it anyway.

(That is, unless you mean nothing more than that you can encode the information contained in an n-digit decimal in a float/double--which of course is true, but not particular to floating point numbers, as any state with a certain number of bits can, of course, encode any information of no more than that many bits, somehow.)


In a previous discussion, someone was worrying about using floats to represent price in JS. I think this is a consequence on the fear mongering on using floats to store currencies.

Floating Points are hard. There is a study done with academics that shows that even researches that works with float point everyday forget about the format intricacies. And the study didn't even look into the compiler mess.

But I agree with you, some (a lot?) of caution is needed when working with float point is good.


World GDP is around 87 trillion dollars.

    $ ruby -e 'pp 87e12 + 0.01'
    87000000000000.02
If you're certain that your software will never handle national-economy-scale or hyperinflationary use cases, then sure, you may be able to get away with 64-bit floats, but I think "no need to worry" is overstating your case. Please do worry about precision until you've proven you don't need to.

You probably want some smaller unit than a dollar for currency as well, in which case it becomes of a problem with even smaller amounts.

I really see no reason to use any other representation for currency than decimal fixed point. Store the amount as mils or whatever unit suits your use case.


Yeah, I should had be more careful with my words.

For the vast majority of person, there is no need to worry so much about using fp to represent currencies. There are other issues with float that will bite you in your back before precision became one of them.


Depending on context, you can assume that precision will bite you.

The problem is that rounding is kind of a big deal in certain financial contexts, and the process of rounding can greatly magnify floating point's decimal precision problems when you're dealing with numbers that are close to the .5's.

When I said up above that there are some contexts where IEEE floats are fine, those contexts are largely ones where you never have to round, or where you can guarantee that an accountant is never going to see or care how you rounded. So, to an approximation: Go ahead and fearlessly implement the Black-Scholes formula using doubles, but never, ever use them to do something simple like calculating an invoice.


Fun, tangential anecdote:

I worked with a CSV containing, among other things, phone numbers. A coworker called and complained that the phone numbers were all wrong. He'd edited the thing in MS Excel, which promptly converted the phone numbers to floating point with a loss in precision. When he saved it, those new numbers were happily written back to the disk.


I agree with your overall point: it most likely does not matter when the values are close enough. However :)

There can be two companies with 100M market cap. Corp A has issued 10M shares @ 10 each, Corp B has 10B shares priced at 0.01

A +/-0.001 change in Corp A share price is just 0.01% and moves the market cap by +/-10k, so probably nothing significant. The same nominal change in Corp B amounts to 10%, or +/- 10M in the company value, which is quite a big deal.

Also I think there may be some money to be made in changes at the 7th decimal place with large enough volume of high frequency transactions.


Because of the way floating point numbers work, you'd get an accurate amount for both cases, as it's really the number of significant figures, not decimals.

And that roughly captures the spot where I was seeing doubles used.

Yes, they could have used fixed point. I am guessing that what happened is that someone who had thought way more deeply about this than I ever needed to (I worked on the accounting side, where, yep, we always used decimals) either determined that, where the modeling was concerned, floating point errors were not worth worrying about, or estimated that the expected cost to the company stemming from bugs due to to fixed point math being easier to goof up on would have been smaller than the expected cost to the company due to floating point error.


To see 0.1 error using _double_ you have to do at least 2*10^17 operations (assuming the worst case scenario and no subnormals).

If you are working with such huge numbers, 0.1 cents is probably a cost you are willing to pay to avoid expending thousands in a software solution. The saving with power saving using a floating point is likely greater than power your computers will have to expend to get a precise solution.


You can get a larger error than that using one operation.

  fn main() {
      let x: f64 = 9007199254740992.0;
      assert_eq!(x + 1.0, x);
  }

You are absolutely right.

When adding numbers with large magnitudes differences (around 10^17 I think) it might exceed the format precision. I should have taken that in account when defining the error boundaries.

In dollars, you start having issues with cents when working with a tens of trillions.

For the vast majority of people this won't be an issue.


I can give you 1.0 error. Take a handful of numbers that add up to 1.5, sum them, and then round that result to the nearest unit.

I'm too lazy to figure out a specific example, but sets of numbers where doubles round up and decimals round down (or vice versa) aren't terribly uncommon.


My day job is high performance financial model implementation. Floats storing dollar amounts are the norm for predictions. Operating on values that are linear combinations of integer fractions multiplied by irrational constants (such as Euler’s number) is perfectly possible, but it’s much more performant to be aware of floating point epsilon when writing modeling code.

Financial models are predictive, they don't have to be accurate to a penny, right? Unlike processing actual money people own.

(I do some work with predictive simulations about money, but outside finance, and there we care that the result has accurate order of magnitude. Floats were used extensively in the project; I actually upgraded them to doubles for the sake of handling larger order of magnitude spans.)


That’s right. The trading desk also uses floats for analysis and regulatory reporting. Actual account balances come through an API that gives us floats, but rumor has it that it’s backed by Hollerith a punch card library maintained by cybernetic undead, encoded in 1215-EBCDIC-BLACKTONGUE.

I stand corrected, thanks for this example.

> If speed is an issue, store it in integer types representing the smallest unit in the currency, e.g. pennies

More typically, mills[1] (tenth of a cent).

[1]: https://en.m.wikipedia.org/wiki/Mill_(currency)


Amazon's EC2 hourly prices are rounded to mils ($0.011/hour).

https://aws.amazon.com/emr/pricing/

Azure has some hourly prices with ten-thousandths of a cent ($0.0102/hour):

https://azure.microsoft.com/en-ca/pricing/details/virtual-ma...

Microsoft should use gas station 9/10 pricing conventions to just barely undercut Amazon's lowest price $0.011 with $0.0109.

https://www.marketplace.org/2018/10/11/why-do-gas-prices-end...

>“They found out that if you priced your gas 1/10 of a cent below a break point, let’s say 40 cents a gallon, ‘.399’ just looked to the public like 39 cents…”


Tarsnap goes as low as counting attodollars. Yes, that's 10^-18 dollars, judging by the precision with which individual line items and total account funds are reported. Storage price is 250 picodollars per byte-month.

If it’s not possible to charge such amounts, what exactly is the point of the accuracy?

They're usually charging you for a shitload of them!

Tarsnap is prepaid.

"Tarsnap's author is a geek." ;)

https://www.tarsnap.com/picoUSD-why.html


"when it internally converts storage prices from picodollars per month to attodollars per day, it rounds the prices down to benefit the customer."

A gentleman and a scholar.


Storing money in floating point is fine. Just round to the nearest atomic unit when displaying. Sometimes this is a necessity when working with money in e.g. existing JSON APIs. You lose a few bits of range relative to fixed point storage but it's almost never a practical issue.

Performing arithmetic operations against money in floating point is the dangerous part, as error can accumulate beyond an atomic unit.


> Performing arithmetic operations against money in floating point is the dangerous part, as error can accumulate beyond an atomic unit.

A good example of this is trying to compute the sales tax on $21.15 given a tax rate of 10%. The exact answer would be $2.115, which should round to $2.12.

IEEE 64-bit floating point gives 2.1149999999999998, which is hard to get to round to 2.12 without breaking a bunch of other cases.

Here are three functions that try to compute tax in cents given an amount and a rate, in ways that seem quite plausible:

  def tax_f1(amt, rate):
    tax = round(amt * rate,2)
    return round(tax * 100)
  
  def tax_f2(amt, rate):
    return round(amt*rate*100)
  
  def tax_f3(amt, rate):
    return round(amt*rate*100+.5)
On these four problems:

   1% of $21.50
   3% of $21.50
   6% of $21.50
  10% of $21.15
the right answers are 22, 65, 129, and 212. Here are what those give:

  tax_f1:  21  65 129 211
  tax_f2:  22  64 129 211
  tax_f3:  22  65 130 212
Note that none of the get all four right.

I did some exhaustive testing and determined that storing a money amount in floating point is fine. Just convert to integer cents for computation. Even though the floating point representation in dollars is not exact, it is always close enough that multiplying by 100 and rounding works.

Similar for tax rates. Storing in floating point is fine, but convert to an integer by multiplying by an appropriate power of 10 first. In all the jurisdictions I have to deal with, tax rate x 10000 will always be an integer so I use that.

Give amt and rate, where amt is the integer cents and rate is the underlying rate x 10000, this works to get the tax in cents:

  def tax(amt, rate):
    tax = (amt * rate + 5000)//10000
    return tax
I'm not fully convinced that you cannot do all the calculations in floating point, but I am convinced that I can't figure it out.

> IEEE 64-bit floating point gives 2.1149999999999998, which is hard to get to round to 2.12 without breaking a bunch of other cases.

Your issue is on how to print the float, not with the precision of fp. For instance, `21.15 * 0.1` can be print both as 2.115 or 1.12 depending on how many decimal digits of precision you set your print function. I manage to get those results with printf using `%.3f` and `%.2f`, respectively.

To produce one cent (0.0x) error with the default FP rounding, it takes more than 1 Quadrillion of operation. Each operation can only introduce 1*10^17/2 error.

The "you shouldn't be using float to do monetary computation" is likely one the most spread float point misinformation.

The issues with your others examples is that you are rounding the data (therefore, discarding information). If you don't do any manual round, the result should be correct (I haven't test thought).


> Your issue is on how to print the float, not with the precision of fp. For instance, `21.15 * 0.1` can be print both as 2.115 or 1.12 depending on how many decimal digits of precision you set your print function. I manage to get those results with printf using `%.3f` and `%.2f`, respectively.

I get 2.115 with %.3f and 2.11 with %.2f. Here's my test program. Same result on my Mac with clang and my Debian 8 server with gcc.

  #include <stdio.h>
  
  double tax_on(double amt, double rate);
  
  int main(void)
  {
      double amt = 21.15;
      double rate = 0.1;
      double tax = tax_on(amt, rate);
      printf("%.3f\n", tax);
      printf("%.2f\n", tax);
      return 0;
  }
  
  double tax_on(double amt, double rate)
  {
      return amt * rate;
  }

The thing is that if 2.115 represents a calculated dollar figure, such as the value of some transaction or the cost of something or whatever, then we should round it to 2.12. (Unless we are working in a financial domain that deals with fractions of a cent.) Now in floating-point, we don't exactly have the exact value 2.12, but we have something that is extremely close. So close that if we happen to print it to %.3f, we better get 2.120, and if we print it to %.4f, we better see 2.1200.

That some monetary calculation works out to $2.115 (and is left that way) instead of being correctly rounded $2.12 doesn't add up to a valid argument against using floating-point for money.

I think piadodjanho does have a point there in the grandparent comment; "don't use floating-point for money" may just be a repeated mantra that doesn't entirely hold water. If extremely accurate engineering and scientific calculations can be done with floating-point, surely we can get floating-point values to measure stacks of pennies with the proper care in the programming.


> If extremely accurate engineering and scientific calculations can be done with floating-point, surely we can get floating-point values to measure stacks of pennies with the proper care in the programming.

That was for a long time my position. I definitely have commented before either here or in /r/programming to the effect that floating point is fine for money as long as you are aware that it is not exact and not associative, and take that into account when doing your calculations.

Any intermediate result in a calculation chain might be off a tiny amount from the exact value, but if you just rounded to the nearest 0.01 before you accumulated enough error to not < 0.005 off, you'd be fine.

I think that's probably true for addition of money amounts. If you have a large number of costs to add up, for example, you should be able to add thousands of them, round to nearest 0.01, and get the right result.

But for tax calculations, such as 10% of $21.15, 0.1 x 21.15 = 2.1149999999999998 in 64-bit IEEE floating point, and rounding the nearest 0.01 gives 2.11, not the 2.12 that we want. A call to fesetround(FE_UPWARD) makes that come out 2.115, and then rounding to the nearest 0.01 gives the desired 2.12.

Will FE_UPWARD make this work for all amounts and tax rates, or are there amounts and rates where we need FE_TONEAREST or FE_DOWNWARD? If so, how do we tell which one we need? Like I said earlier:

> I'm not fully convinced that you cannot do all the calculations in floating point, but I am convinced that I can't figure it out.

PS: calculating tax in cents given double amt, rate, using this method:

  tax = amt * rate;
  cents_tax = round(100 * tax);
almost works if the rounding mode is FE_UPWARD. For all amounts from 0.01 through 99.99, and all tax rates from 0.01% through 10.99% in increments of 0.01% it works except for 3.75% of $67.60 and 7.5% of $33.80.

> but if you just rounded to the nearest 0.01 before you accumulated enough error to not < 0.005 off, you'd be fine.

And in run-of-the-mill, everyday finance, there simply isn't enough calculation stuffed in between the concrete monetary points that are recorded in the ledger.

> If you have a large number of costs to add up, for example, you should be able to add thousands of them, round to nearest 0.01, and get the right result.

Exactly.

> But for tax calculations, such as 10% of $21.15, 0.1 x 21.15 = 2.1149999999999998 in 64-bit IEEE floating point, and rounding the nearest 0.01 gives 2.11, not the 2.12 that we want.

This problem will be there even if we use integers for the currency amounts, but floating-point only for these fractional calculations.

Luckily for us Canadians, I'm pretty sure the Canada Customs and Revenue Agency won't care which way you call this rounding. They also don't collect or refund overall discrepancies of less than around two dollars in a single tax return. I think I've been mostly rounding taxes down over the years, and tax credits up. E.g. if a tax credit is $235.981..., I make it 235.99.

The myth that has been foisted on programmers is that if you use floating-point for numbers, the actual ledgers won't balance, and sum totals of columns of figures will appear incorrect if verified by pencil-and-paper arithmetic. That will certainly be true if the math is done very carelessly; and it's true that it's easier to get it right with less care using integers.

A percentage calculation whose rounding is called the wrong direction will, in and of itself, not cause such a problem. E.g. if we split some sum of money into two complementary percentages, we can do it such that the two add up to the original.

You have to be careful not to do this as two independent percentages. Like, dont take 10% of 21.15 and then 90% of 21.15, individually round them to a penny, and then expect them to add up to 21.15. It has to be centround(21.15 - centround(.1 * 21.15)) to get the 90% residue.


The trick is that by default rounding happens using banker's rounding. Programming languages use this because this is what CPUs use. When you want to round your way, you need an extra digit and round manually:

    def tax_f4(amt, rate):
      tax = round(amt * rate * 1000)
      return tax // 10 + (tax % 10 > 4)

That works for 10% of $21.15, giving the desired 212.

However, for 10.14% of $21.15, it gives 215, but it should be 214. Another example is 3.5% of $60.70, for which it gives 213 but correct is 212.


You're right. My remainder calculation in my code snippet is incorrect. It should've been a floating point remainder instead.

    import math
    
    def tax_f5(amt, rate):
        t = amt * rate * 1000
        return round(t) // 10 + ((math.fmod(t, 10.0) - 5.0) > -1e-7)
But then since there's now an epsilon, it raises the question of how many digits of precision the tax rates typically need. This is indeed a difficult problem.

Some exhaustive testing on all amounts from $0.01 through $999.99 in $0.01 increments and all taxes from 0.01% through 99.99% in increments of 0.01% show that this is the minimum that does the trick (switching to C from Python for speed):

  unsigned long tax = (unsigned long)round(amt * rate * 1000000);
  return tax/(10000) + (fmod(tax, (double)(10000)) - (double)(5000) > -1e-5 ? 1 : 0);
(Yes, I see that I goofed in translation your code to C and typed -1e-5 instead of -1e-7. It looks like the results are the same with -1e-7).

I also tested that up through $9999.99 with taxes up to 12%, and no problems.

Adding another 0 to the 1000000, the two 10000's, and the 5000 works. And another, and another. Past that it starts to fail, but not the simple off-by-one failures you get when you don't use enough digits. These are way way off, so I'm guessing its running into some new class of problem. I haven't looked to see what that is yet.


> Storing money in floating point is fine. Just round to the nearest atomic unit when displaying.

Well, it's not just a display issue. In accounting, associativity and commutativity are important. People do care that `a + b + c - a == c + b` should evaluate to “true”.


It appears you did not see the critical point in the above comment. "Performing arithmetic operations against money in floating point is the dangerous part, as error can accumulate beyond an atomic unit."

You’re right, I missed that. If you’re not going to do any arithmetic, you might as well store them as strings.

There's very little point in storing money in floats if you're not going to do arithmetic in floats; about the only use case I can think of is JavaScript and JSON APIs.

Aside from the cases you mentioned, there are other dynamic languages in which numbers are by default floating point. e.g. Lua. I agree though.

Pennies (or any equivalents) are not the smallest unit in any currency. Fractions of it are perfectly acceptable and even common.

Even decimal floating point is a bad idea (for dealing with money) since you still can't represent a subset of rational numbers without approximation and without introducing rounding error during some calculations. It's just a different subset than what binary floating point can represent without approximation.

Well, this is one of those things where context matters.

In trading, it's super common to use floating point arithmetic for decision logic since it's very fast and straightforward to write. The actual trade execution, however, almost always relies on integer arithmetic because then money is actually being used (and hence must be tracked properly).

It's not therefore inherently incorrect to do currency conversions with floats in some situations provided that the actual transaction execution relies on fixed precision or decimal arithmetic.


When I was in college the professor of my software engineering class explicitly warned us to never use floating point numbers for money. He went on at length of the dangers of floating points for dealing with money and warned us that people can get really upset if they feel like they've been screwed out of money.

He had decades of experience in the software development industry and I got the feeling that he'd seen the effect of this issue personally.

I still remember that warning well.


[flagged]


Would you please stop posting unsubstantive comments to Hacker News?

I haven't worked in fintech but I've read that money is often represented (at least in storage) as plain integers, since for example US currency only ever goes to two decimal places. But I guess once you start operating on it you run into potential truncation unless you use rationals.

In finance, US dollars are generally stored to four decimal places, because you need to deal with stuff like compounding interest or stock splits.

COBOL has a built in fixed point integer type, which makes defining a 4 digit decimal and doing math on it easy. (IBM designed it from the ground up to cater to people with a lot of money, who spend a lot of money, to work with lots of money, ie banks) Java has the BigDecimal type, which is a class in the class library, which means you need to import it. And because Java lacks operator overloading, doing calculations is tedious.

In the 90s, there was a huge push to replace COBOL with <something else>, and Java was the Rust of its day, so that's what everyone got behind. However, 4 digit COBOL decimals apparently round differently than 4 digit Java BigDecimals, so all the tests failed. And all the stuff like a\x+b had to be written like BigDecimal.add(BigDecimal.multiply(a,x),b) so development was taking forever.

Eventually they said "fuck it" and 20 years later we're still stuck with COBOL and everyone who remembers the original death march says "never again".

I have a feeling a lot of the problems came down to computer science people thinking money has two decimal digits but domain knowledge people knowing it has four. We programmers, as a group, make a lot of assumptions about other peoples' domains and we're wrong a lot*.


I've had the thought that programmers should note assumptions in flagged comments, and those comments should be automatically collected, and then reviewed occasionally. Assumptions might be sustainable, so to speak, but they can also create one kind of technical debt.

> make a lot of assumptions about other peoples' domains and we're wrong a lot

What do you mean this person has no surname? That's unpossible, surname is never null, error error.


US currency can go to more than two decimal places...

http://blogs.reuters.com/ben-walsh/2013/11/18/do-stocks-real...

I guess it's time for someone to write an "Assumptions Programmers make about money" post.


Falsehoods programmers believe about prices: https://gist.github.com/rgs/6509585

Interesting list, though I'm not sure what do they mean by n. 7

For a brief time in 2008, 1 Zimbabwe dollar was very roughly equivalent to one TRILLIONTH of a United State penny. So technically a value of “1” did exist, but it was meaningless. I have some of the 100 Trillion Dollar notes from Zimbabwe from that time period.

I also have a few. You could buy a stack of 100 trillion dollar bills few a few bucks then. They are now selling for $50-$60 on eBay.

Investing in ZWL, bold move!


1. Money in a brokerage account is not US currency.

While it isn't physical US currency, my brokerage account represents the value of the account in units of USD- therefore any rules about how US currency works should apply.

Additionally, fractional cents are often presented to the consumer when purchasing gas/fuel.


Money is not as wierd as anyone might guess. I work on a financial application, and money is almost always just a BigDecimal with the scale set to 2 (and stored in a database as a bigint type or equivalent). When its not, its just a higher scale (for say, compound daily interest on small amounts for a significant period of time).

How do you store Bitcoin?

Bitcoin uses fixed-width unsigned integers. The smallest unit is 10^-8 of 1 Bitcoin, which is represented as just 1.

I like this approach. Is it compatible with the parent comment?

Yes, the parent approach is the same thing but with a decimal point added for convenience. The main thing is that the scale is fixed; it is effectively an integer count of 1s of the least significant index. It is impossible to truncate values or round upward and create money that didn't exist. This makes it perfect for representing actual money.

No it can't. There are systems that track things worth less than a penny for later billing, but at the end of the month when they bill someone, they do some sort of rounding.

If you are earning interest at a bank, and you've earned a fraction of a penny, they will eventually pay it to you once you've earned enough for a whole penny.

i.e. they track your account balance to more than 2 digits, they just only show you 2 digits.


Someone should tell that to everyone who ever used a ½¢ coin in the US. Also, US law explicitly states (31 USC §5101) that the unit of 1/1000th of a dollar is a mill.

>Someone should tell that to everyone who ever used a ½¢ coin in the US.

All 10 of them?


I’ve got some at home, but admittedly I’d never use one as currency. I also have US ½¢ paper notes too.

Go to a bank and ask for a half penny or a thousandth of a dollar. Let me know how it goes.

Go to a bank and ask for a $500 or $1000 note too. They won’t give you one as they aren’t in circulation anymore, but most (all?) will let you deposit it for face value.

Because, unlike a thousandth of a dollar, those are valid amounts for real transactions.

It goes without saying that half pennies are dead. A mill seems to be from the Coinage Act of 1792, which is perhaps a tad outdated.


Take two half penny coins or notes to the bank and they’ll credit your account a penny. Of course, just like with the $500 or $1000 notes, you’ll be losing money on the deal.

> A mill ... is perhaps a tad outdated.

Yet you use mills every time you pay for gas. Pointless in that case? Probably. Still used all the time? Certainly.


I pull the gas lever intermittently until the gal goes up but the cents don't. It takes about 5 tries, but always makes me smile that I beat the game.

>>...since for example US currency only ever goes to two decimal places

That is not correct. Stock settlement transactions often list four decimal places.


> Stock settlement transactions often list four decimal places.

That's not a significant difference compared to two decimal places, so brundolf's point still stands. There's no need for arbitrary precision.

Just store all dollars in PIPs so 5$ will be stored as 50000.


You can still get reasonable enough approximations with more than two decimals if you do something like `int64 myWorkingMoneyVal = currentMoney * 100000`, do your work, then divide the final result by 100000. You still risk some potential truncation if your work involves division, but the larger your multiplier that you're working with, the larger divisor at the end, which will help minimize how much of an error this ends up being. The 64 bit integer space is pretty darn big, so you typically don't risk an overflow, and you will typically get better performance than using a regular "decimal" type, since on-chip integer operations are usually very fast.

EDIT: Just a note, there's nothing special about the number 100000; pick the largest exponent of 10 that you can get away with a reasonable assurance that no overflow is possible. For a vast majority of money applications, I seriously doubt you're going to be hitting the limits of int64, so you could probably even get away with something like 1000000000.


Google uses 1000000 as multiplier in their APIs.

Edit: And they forbid equality comparisons for rationals. For some reason even >= is not allowed.


I didn't know that, but it doesn't surprise me (I suspected I wasn't the first person to come to the realization that there's no reason not to choose a giant number :) ).

I have developed a payment plan calculator for asset based finance and you would be amazed how many different rounding schemes and day counters (for fractional periods) exist and are actively used.

Counterexample: gas prices in the US are frequently displayed with 3 decimals (tenth of cents)

It isn't really a price though, it is a rate for an infinitely divisible good, i. e. $/L. You get the price when you multiply with the quantity purchased.

So how much do 2 CCs of gasoline cost?

The basic representation is usually good enough at 2 decimals (so a plain int), but it is often needed to have a transient representation during calculations.

For instance if one needs to apply discounts, add taxes, split in equal parts, all of the above one after the other, there will be a more precise intermediary representation before rounding everything in a way that keeps the total amount consistent with the original amount.


> It's actually in use in many places, for things like handling currency and money

Hm, are you sure? I don't believe "rational" types which encode numbers as a numerator and denominator are typically used for currency/money.

If they were, would the denominator always be 100 or 1000? I guess you could use a rational type that way, although it'd be a small subset of what rational data types are intended for. But I guess it'd be "safe"? Not totally sure actually, one question would be if rounding works how you want when you do things like apply an interest percentage to a monetary amount. (I am not very familiar with rational data types, and am not sure how rounding works -- or even if they do rounding at all, or just keep increasing the magnitude of the denominator for exact precision, which is probably _not_ what you'd want with currency, for reasons not only of performance but of desired semantics).

You are correct an IEEE-754 floating point type is inappropriate for currency. I believe for currency you would generally use a fixed-point type (rather than floating point type), or non-IEEE "arbitrary precision floating point" type like ruby's BigDecimal (ruby also offers a Rational type. https://ruby-doc.org/core-2.5.0/Rational.html . This is a different thing than the arbitrary-precision BigDecimal. I have never used Rational or seen it used. It is not generally used for money/currency.) Or maybe even a binary-coded decimal value? (Not sure if that's the same thing as "arbitrary-precision floating point" of ruby's BigDecimal or not).

There are several possible correct and appropriate data encodings/types for currency, that will have the desired precision and calculation semantics... I am not sure if rational data type is one of them, and I don't believe it is common (and it would probably be much less performant than the options taht are common). Postgres, for instance, does not have a "rational" type built in, although there appears to be a third-party extension for it. Yet postgres is obviously frequently used to store currency values! I believe many other popular rdbms have no rational data type support at all.

I'm not actually sure what domains rational data types are good for. Probably not really anything scientific measurement based either (the IEEE-754 floating point types ARE usually good for that, that is their use case!) The wikipedia page sort of hand-wavily says "algebraic computation", which I'm not enough about math to really know what that means. I have never myself used rational data types, I don't think! Although I was aware of them; they are neat.


Good catch! I'm thinking of fixed-point number types. Ruby's Rational was/is cool, but looks like an inherently difficult number-type to work with and keep sanity high.

For currency, business side should decide the rules (* 100 or * 1000000), and where to funnel the pennies ;) Fixed-point has it's own sort of gotchas, ie. multiplication, power, division, sqrt, etc. So there are fancy techniques to work with the numbers, like https://en.wikipedia.org/wiki/Bresenham%27s_line_algorithm


When you start looking into all of this, it's interesting to see how many ways there are to represent numbers in a computer. It isn't actually obvious or trivial at all, there is no one "true" or "accurate" representation, and they all had to be invented by past computer scientists!

If you're not a bank or similar but just dealing with currency to buy and sell things like for ecommerce, the default semantics of a fixed point type (like postgres 'money' type) or "arbitrary precision floating point" type like ruby's BigDecimal or are probably good enough, and just fine in a way that IEEE-754 floating point definitely are NOT. And probably don't require any additional business side decisions or involve any significant gotchas. Just using them instead of IEEE-754 floating point and not thinking too hard after that is probably just fine.

https://ruby-doc.org/stdlib-2.5.1/libdoc/bigdecimal/rdoc/Big...

https://www.postgresql.org/docs/9.6/datatype-money.html

If you ARE a bank or something similar -- I wouldn't know, I haven't done that! A relevant question: Am I concerned with specifying exactly how fractional pennies get rounded?


Ask the COBOL guys and gals for the true answer ;)

There are accounting, balancing, laws, regulations and reconciliation issues where for the really serious stuff, you use whatever fit spec and requirements, not the other way around. Ruby's BigDecimal will be fine, if you implement the detailed specification about how to calculate each operation every step of the way, with designated precision at various steps, together with truncations along the way that may not make much sense to the developer (or anyone else, but are required to get correct numbers).

Point is, sometimes other parties need to be able to replicate the exact numbers, unrelated to any internal library or coding standards. Code using just plain integers could be easier to certify than a library dependency.

In such cases, you don't just round to make numbers prettier, but may even keep the truncated part of the equation. It's then nice to use simple stuff that are proven to work and not change over time.


Right, sometimes, but quite frequently not.

Have you worked on ecommerce where you've been given such detailed specs? I have worked on ecommerce, I never have been.

I don't even understand exactly what you mean by "implementing detailed specifications about how to calculate each operation every step of the way" with ruby BigDecimal. Can you provide an example? Ordinarily, you just use ruby `+` and `*` etc operators with BigDecimal values. (Or SQL/postgres arithmetic operators with postgres money type). I don't even know what a "specification about how to calculate each operation every step of the way" would look like. This is something you've had to do with basic ecommerce apps?

I'm not sure what solution you are suggesting could be characterized as "simple stuff that are proven to work and not change over time"

Seriously, most everyone just uses something like BigDecimal or postgres Money type, and it's fine. (IEEE-754 float is NOT though. Neither, probably, is the rational type that you initially suggested... )


>If they were, would the denominator always be 100 or 1000?

The numerator and denominator get automatically reduced to lowest terms (just like you learned in elementary school, so 15/100 becomes 3/20) internally by every implementation I know of. This comes at a performance cost for every operation, but it helps keeping the numerator and denominator from blowing up.

>not sure how rounding works -- or even if they do rounding at all

They do not. The point of a Rational type is to keep precise values, so it's up to the programmer to decide when and how values are rounded.

>"arbitrary precision floating point" type like ruby's BigDecimal

Not sure how Ruby implements BigDecimal, but Java internally represents it as an BigInteger of digits, and a second integer that represents where in the number the decimal point should go. This means that BigDecimal still can't truly represent a value such as 1/3, since you can't have an infinite amount of 3's, but a Rational can.

>I'm not actually sure what domains rational data types are good for.

I'll be honest and say I've never had to use them either, but it's nice to know they exist. The intended use case is when you need to perform calculations and maintain as much precision and accuracy as possible in the intermediate values and such accuracy is more important than speed.


> If they were, would the denominator always be 100 or 1000?

Only if you never use anything but addition and subtraction. So, no currency coversions, interest rates, complex taxes or rebate schemes or the like.


Currency in banking is handled with bigints. Not rationals, just bigints of the smallest unit (i.e., 1 cent). This forces you to order operations so that divisions are done last or not at all.

The bigint you describe is just a poor man's rational, given that no computer architecture or mainstream language support them natively.

There is a slight difference where it forces decision-making about rounding at every step. Most importantly, it makes errors that print money (or burn money) impossible states. I'm aware of BigRational libraries or more clever currency abstractions, but this is the least magical way of doing it, kind of like representing datetimes as 64-bit UNIX epochs.

Well, you could look at it as x/100 rationals representing a dollar value. But you could also look at it as an integer amount of a smaller unit (cents). The difference is insignificant; computers support it natively.

> It's actually in use in many places, for things like handling currency and money

Which specific places have you seen it used in?


Reminds me of this Inigo Quilez article on experimenting with rendering using rational numbers: https://iquilezles.org/www/articles/floatingbar/floatingbar....

I actually ran into a bug recently while implementing my first raytracer, where the point calculated from the sphere-intersect test would just occasionally end up inside the sphere due to floating point imprecision, so the diffuse sample rays would have their origins completely in the dark, leading to randomly black pixels. Solved it by bumping every intersection out by 0.01 in the direction of its normal.

And then of course there have been several other "x.abs() < 0.01" cases for various purposes. So I could definitely see that being an interesting experiment.


Here's some good reading on robust ways to fix this without an arbitrary epsilon bump: http://www.pbr-book.org/3ed-2018/Shapes/Managing_Rounding_Er...

This phenomenon has a name, by the way: "shadow acne" (which is actually a more general phenomenon, but this is an example of it).

That's really interesting - hadn't thought of that before. To fix that, would you be able to do a square of the magnitude comparison with the radius and just bump the borderline cases, or is it more efficient without the extra branching?

I just did it across the board; since the error is in the floating-point noise I don't know if I'd even trust a comparison on that. Plus, the discrepancy between "bumped" and "unbumped" samples might cause some visible artifacts.

Direct3D used to have a Z-bias for a similar problem: rendering pictures hanging on walls at a far Z depth. Their whql tests even tested for it.

It was fun discovering all the corner cases while debugging drivers.


"Why don't we just" because it's harder than one thinks.

https://en.m.wikipedia.org/wiki/Arbitrary-precision_arithmet...

and gets harder when you want exact irrationals too https://www.google.com/search?q=exact+real+arithmetic


Although this does make me wonder what happens if you round the rational once the numerator/denominator becomes too big.

But maybe that just results in all the floating point weirdness again, just not for small rationals.


The result is that your number system (a) makes many common operations dramatically more computationally expensive, (b) has less predictable rounding which is very tricky to reason about, (c) generally gives worse results for the same bit budget. Instead of e.g. evenly spaced numbers, you get spacing like https://en.wikipedia.org/wiki/Minkowski%27s_question-mark_fu...

One thing you can try is storing a floating point numerator and a floating point denominator, and renormalizing them by bit shifts instead of finding GCDs. This lets you avoid rounding errors for small ratios. For general purposes this advantage isn’t really worth doubling the number of bits and complicating arithmetic for though.

See e.g. https://observablehq.com/@jrus/qang


Also, it's a lot less efficient, so it should only be used if absolutely necessary.

But rationals are more expensive to compute with (compared to floating-point; this is another example of the trade-off between performance and accuracy.)

it's also a range-storage trade-off. if you use two fixed width integers to represent a rational, the minimum and maximum values are the same as that of the integer type. floating point gives a far wider range for the same number of bits.

I'm sure there's some subtlety I'm missing, but isn't it actually the same trade-off? A 64-bit float can only represent 52-bit integers exactly. Anything above that, and you don't even have integer-level precision on the number anymore... This sliding scale of precision is exactly why floats are terrible at the kinds of operations that would cause you to use a rational instead.

> I'm sure there's some subtlety I'm missing, but isn't it actually the same trade-off?

not exactly, unless you consider space efficiency to be an aspect of performance (which is certainly reasonable). a naive implementation of rationals using two int32_t's only covers the range of a single int32_t, despite using as many bits as the double. it's also a trade-off between range and consistent precision, of course.

this certainly isn't some deep insight into number representation, just a quick point for the benefit of people who haven't thought much about rational data types before.


Once you care about that level of performance, you can surely optimise your representation to have a greater range (use more bits for the numerator) or greater precision (more bits for the denominator) or some boutique solution like using three integers to store the number a + b/c.

You can store slightly fewer numbers with rationals, because it's hard to avoid having a representation for both 2/4 and 3/6. But the loss of range or precision due to that is pretty small.


it's not just that they are expensive, it's that there is a nondetermistic compute time.

Let's say we need to do a comparison. Set

    a = 34241432415/344425151233 
and

    b = 45034983295/453218433828
Which is greater?

Or even more feindish, Set

    a = 14488683657616/14488641242046
and

    b = 10733594563328/10733563140768
which is greater?

By what algorithm would you do the computation, and could you guarantee me the same compute time as comparing 2/3 and 4/5?


I’m not sure I follow. Isn’t it just two integer multiplications followed by a comparison.

a/b > x/y is the same as ay > xb

Assuming you don’t overflow your integer type.


> Assuming you don’t overflow your integer type.

There's your answer :)

It's far too easy to overflow your integer type by simply adding a bunch of rationals whose denominators happen to be coprime, or just by multiplying rationals. For this reason, the vast majority of rational implementations use arbitrary precision integers, and of course arithmetic on those isn't constant time.


One approach would be to hold on to rationals for as long as possible, to eliminate drift, and then dump them out to the nearest floating-point at the very last moment

IEEE floats are pretty complicated, but today’s CPUs have dedicated support for those and not for rationals, so we use them where we probably shouldn’t.

IEEE floats are absolutely great for many applications where rationals would be overkill or even inappropriate. A videogame doesn't care if the result is 0.3 or 0.30000000000000004. Even some scientific applications can use floats if the coder knows what they're doing.

The problem is devs who don't understand what they're doing and just think that they can use floats in every situation and it'll just work out fine. This is not helped by many popular scripting languages who just default to floats when a result doesn't fit an integer (something more reasonable languages like Common Lisp don't do for instance).

For instance, to speak of videogames again, very tight precision isn't usually an issue but loss of granularity when numbers get very big can cause problems, especially if you have very large levels. That being said rationals wouldn't really help you here, you'd have the same problem except now you have to keep two numbers within bound instead of one. Imagine having a very small offset in a complex operation and ending up with a number like 100000000000000000000000000/100000000000000000000000001 !


I love this demonstration of that phenomenon: https://twitter.com/schteppe/status/1143111757751357440?s=20

Why 'even some scientific applications'? Don't nearly all scientific applications use floats?

Yeah, I assumed as much. I wasn't really thinking about that at the time, but knowing now that it exists in the wild, the only conceivable reason for it not to be used everywhere would be some kind of performance penalty.

When I was reading about this I thought why don't the print functions just by default round to the nearest 10 decimal places or similar so 0.30000000000000004 prints as 0.3 unless you specify you don't want that. And I wrote a function in javascript to round like that though it was surprisingly tricky and messy to do so.

Some langs have that in their standard included batteries.

(Shameless Common Lisp plug: http://clhs.lisp.se/Body/t_ration.htm)


Yes including C++, the language mentioned in the parent post (`std::ratio`)


You either feel smart by wondering why people don't use rationals, or feel smart by wondering why people use rationals.

what do you mean by "rationals"? infinite precision? because finite precision rationals are not associative and much worse than floats in many senses.

Handling quantities with varying Unit of Measures is made quite a bit easier by using numerator and denominator pairs.

Not all numbers are rational.

And not all numbers are real. And IEE 754 floating point numbers do not even cover all real numbers.

All floating point numbers are rational numbers.


FWIW, both of those can be expressed exactly by floating-point numbers ;)

Whew. I almost put some non-zeros in there.

Right all integers up to 2^53 (or something like that) can be (in double precision).

I assume that's the reason they made the mantissa linear, even though having the whole thing logarithmic makes more sense.


Addition/subtraction are also much simpler/cheaper than they would be in an entirely logarithmic model. If floats were just 2^x with some 64 bit fixed point x, it's not clear to me how to do addition efficiently.

What. Mantissa is already logarithmic, bit number n has value 2^(n - N-1) for an N-bit mantissa.

This is how positional number systems work at all.


The mantissa is linear. It's unrelated to how positional number systems work. A floating point value is split into two numbers - the exponent and the mantissa. Normally they are used to represent a final number like:

    x = 2^e * (1 + m)
Where e is the exponent and m is the mantissa (varying linearly from 0 to 1).

But you could have a fully exponential number format:

    x = 2^(m + o)
As pointed out though, it makes addition much more complicated, you can't exactly represent integers, and someone told me it makes quantisation noise worse too. Bad idea.

I'd always wondered how automated these comments were

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: